- Open Access
Oxalobacter formigenes-associated host features and microbial community structures examined using the American Gut Project
Microbiomevolume 5, Article number: 108 (2017)
Increasing evidence shows the importance of the commensal microbe Oxalobacter formigenes in regulating host oxalate homeostasis, with effects against calcium oxalate kidney stone formation, and other oxalate-associated pathological conditions. However, limited understanding of O. formigenes in humans poses difficulties for designing targeted experiments to assess its definitive effects and sustainable interventions in clinical settings. We exploited the large-scale dataset from the American Gut Project (AGP) to study O. formigenes colonization in the human gastrointestinal (GI) tract and to explore O. formigenes-associated ecology and the underlying host–microbe relationships.
In >8000 AGP samples, we detected two dominant, co-colonizing O. formigenes operational taxonomic units (OTUs) in fecal specimens. Multivariate analysis suggested that O. formigenes abundance was associated with particular host demographic and clinical features, including age, sex, race, geographical location, BMI, and antibiotic history. Furthermore, we found that O. formigenes presence was an indicator of altered host gut microbiota structure, including higher community diversity, global network connectivity, and stronger resilience to simulated disturbances.
Through this study, we identified O. formigenes colonizing patterns in the human GI tract, potential underlying host–microbe relationships, and associated microbial community structures. These insights suggest hypotheses to be tested in future experiments. Additionally, we proposed a systematic framework to study any bacterial taxa of interest to computational biologists, using large-scale public data to yield novel biological insights.
Oxalate is both a dietary constituent  and a product of endogenous human metabolism [2, 3]. Excessive oxalate accumulation can promote pathological conditions, including kidney stones [4,5,6], joint effusions, arthralgias [7, 8], and breast cancer . However, unable to catabolize oxalate, humans rely on oxalate degradation performed by commensal bacteria and intestinal and urinary excretion to decrease the circulating oxalate levels .
Oxalobacter formigenes degrades oxalate as its sole energy and carbon source  within the gastrointestinal (GI) tract of its hosts, in contrast to other known oxalate-degrading bacteria, nearly all of which only metabolize oxalate using detoxification pathways under specific conditions [11, 12]. A second physiologic role of O. formigenes in host homeostasis is in stimulating oxalate transport through the gut epithelium, promoting its release into the GI tract lumen .
Although the potential of O. formigenes to remove excessive oxalate from the host has been long recognized [14,15,16,17,18,19], re-introduction of the organism to humans have not yielded definitive results [15, 17, 20, 21]; this may reflect the insufficiency of in vivo models, small study sizes, failure to select proper O. formigenes strains, and/or the lack of sustained colonization.
In the present study, we aimed to examine O. formigenes colonization patterns in humans and to explore the underlying ecological relationships using the American Gut Project (AGP). The AGP has surveyed the intestinal microbiome in more than 8000 people using standard pipelines, and with detailed host metadata, which permits studies of O. formigenes-centered ecology. Our study illustrates a systematic framework to examine key bacteria present in large public datasets to ascertain their biological relationships with their hosts.
Detection of O. formigenes OTUs in AGP samples
In total, 9746 and 9550 AGP samples were processed using QIIME’s closed- and open-reference operational taxonomic unit (OTU)-picking methods [10, 22, 23] (Additional file 1: Figure S1) yielding a total of 3 and 260 OTUs, respectively, that were classified as O. formigenes by the closed- and open-reference OTU-picking methods (Additional file 1: Figure S1). Samples with ≥1000 seqs/sample were used for the initial examination of O. formigenes prevalence and abundance (Table 1). The three OTUs that were detected by both methods were Greengenes  OTUs 7366, 360508, and 7369 (Table 1); the other 257 OTUs were detected only by the open-reference OTU-picking method at low abundance, each accounting for <1% of the total O. formigenes-associated reads (Table 1). As such, results of the closed-reference OTU-picking method were used for downstream analyses.
The most dominant OTU 7366 was detected in 27.6% of all samples, accounting for 96.4% of the O. formigenes-associated sequencing reads (Table 1). OTU 360508 was detected in 7% of the samples (Table 1), accounting for 3.5% of the total O. formigenes reads. OTU 7369 was detected in <1% of the subjects (Table 1).
O. formigenes was chiefly found in fecal samples (Table 2), consistent with prior findings on a smaller scale [10, 15]; as such, we focused only on fecal samples in subsequent analyses. In fecal samples in which OTU 7366 was detected, its geometric mean relative abundance was 2.9 × 10−4 and ranged from 10−6 to 10−3. For OTUs 360508 and 7369, the mean relative abundances were ~10-fold lower in the samples in which they were detected (Fig. 1a).
Classification of O. formigenes OTUs
O. formigenes strains studied to date have been divided into two subgroups based on biological heterogeneity, including cellular fatty acid content [15, 25] and length variation of key genes (frc and oxc) . Based on the full 16S rRNA sequences, the 13 group I and group II O. formigenes strains also cluster into distinct clades by a deep branching of the phylogenetic tree (Additional file 2: Figure S2A). The AGP O. formigenes OTUs differ in their 16S V4 sequence similarity to the 13 reference O. formigenes strains (Fig. 1b). The V4 region of OTU 7366 is 100% identical to that of O. formigenes strain BLISS (Additional file 2: Figure S2B), initially isolated from human feces in 1996 [27, 28]. Strain BLISS is located on a separate branch of the phylogenetic tree but shares a common root with all group I O. formigenes strains (Additional file 2: Figure S2A). OTUs 360508 and 7366 are most similar to group II O. formigenes strains HOxBLS and OXCR, at 98.8 and 100% identity, respectively (Fig. 1b, Additional file 2: Figure S2B).
Selection of samples for downstream analysis
More O. formigenes OTUs were detected as sequence depth became higher (Additional file 3: Figure S3A). To alleviate potential undersampling, we raised inclusion criteria from ≥1000 seqs/sample (Additional file 3: Figure S3A) to ≥10,000 (Additional file 3: Figure S3B), which included 5336 fecal specimens from 4945 subjects. Among those, we focused on the 4945 AGP fecal specimens, with the subjects who provided multiple specimens (n = 184; Additional file 4: Figure S4) represented only by the first specimen provided.
Co-detection of O. formigenes
Since our preliminary analyses suggested that the observed O. formigenes OTUs may differ from known O. formigenes strains (Fig. 1b), we reasoned that interactions between those OTUs should provide broad insight into the intraspecies O. formigenes dynamics. Frequent co-colonization of the two dominant OTUs was observed (Fig. 1c) (p < 10−27, significance of overlap test performed via ). In 96.8% of the samples in which OTU 360508 was detected, OTU 7366 was co-detected. Similarly, OTU 7369 completely overlapped with the samples in which OTUs 7366 and 360508 were detected (Fig. 1c). The relative abundances of both OTUs 7366 (Fig. 1d, p < 0.001) and 360508 (Fig. 1e, p < 0.001) were significantly elevated when co-detected.
Longitudinal colonization of O. formigenes
To investigate O. formigenes colonization over time, the 184 subjects who provided multiple fecal samples (Additional file 4: Figure S4) were divided into three groups depending on whether O. formigenes was detected in the following: (i) none of the samples (n = 100 subjects), (ii) at least one sample (n = 44 subjects), or (iii) all the samples (n = 40 subjects) provided (Fig. 2a). OTU 7366 was significantly more abundant in samples with detection from group III than from group II (Fig. 2b, left). A parallel trend was noted for OTU 360508 (Fig. 2b, middle) and for the sum of both OTUs (Fig. 2b, right). These observations were likely not due to sequencing depth bias, since total sequences/sample did not differ between samples with detection from groups II and III (p value = 0.96, Mann–Whitney test).
Host features associated with detection of O. formigenes colonization
Previous studies [14, 30,31,32] examining the association of O. formigenes colonization with age, sex, and antibiotic exposure history were conducted in small populations. The AGP, with large sample size and detailed metadata available, is a robust data source to explore the underlying O. formigenes–host relationships. Based on the prior literature, data availability, and biological relevance, we focused on 14 candidate covariates (Table 3) describing host demographic and clinical features to predict O. formigenes abundance.
Univariate analyses between covariates and O. formigenes abundance were performed for each covariate independently using the 4945 samples (Table 3). To avoid the confounding effects from highly correlated covariates and to minimize false discovery, we further performed multivariate analysis using a multiple zero-inflated negative binomial (ZINB) model . The rationales for model selection and strategies were described in detail in the “Methods” section. The fitted ZINB model consists of two different components, the logistic regression for modeling excessive zero abundances and the negative binomial regression for modeling the remaining count values. Here, we refer to the population for excessive zero abundances as the population from which O. formigenes is not detected and the other population for the remaining abundances as the population in which it is detected . Based on the fitted ZINB model, we found that age, sex, race, BMI, alcohol drinking frequency, antibiotic use history, country of residence, and level of education are significantly associated with the probability of O. formigenes detection by the logistic regression component (Table 4) and BMI and thyroid status are significantly associated with O. formigenes detection by the negative regression component (Table 4) .
To estimate the overall effect direction and magnitude, we calculated the overall fitted mean proportions (%) (see the “Methods” section) as measurements of O. formigenes relative abundance for the nine covariates that were significant in either the logistic or negative binomial regression component (Fig. 3, Table 4) . To summarize the outcomes, adjusted for the other covariates, we estimate that relative abundance of O. formigenes is associated with increased age (Fig. 3a), female sex (Fig. 3b), Caucasian ethnicity (compared to Asians, Pacific Islanders, Hispanics, African Americans, or for persons of other ethnicities) (Fig. 3c), non-USA residence (Fig. 3d), normal BMI (compared with underweight, overweight, or obese) (Fig. 3e), absence of antibiotic exposure within a year (Fig. 3f), alcohol consumption (Fig. 3g), higher educational attainment (Fig. 3h), and normal thyroid function (Fig. 3i). In another analysis, we showed relationships of O. formigenes presence and the locality of the subject’s birth and present residence (Additional file 5: Table S1).
In the AGP, participants could complete the Vioscreen questionnaire , a validated dietary instrument calculating dietary intake in the preceding 90 days. We were specifically interested in assessing the relationships between O. formigenes with dietary oxalate and calcium because of the role of oxalate in O. formigenes growth and its unavailability when complexed with calcium [1, 5]. Among the 197 participants who provided the questionnaire, O. formigenes relative abundance was not significantly associated with dietary oxalate (Additional file 6: Figure S5, left), but was inversely and significantly associated with dietary calcium (Additional file 6: Figure S5, middle; p = 0.028 by Spearman rank correlation test), and also inversely associated with the ratio of dietary oxalate to calcium (Additional file 6: Figure S5, right; p = 0.002).
Microbial community characteristics in relation to detection of O. formigenes
Among the 4945 subjects, as the number of O. formigenes OTUs detected increased, phylogenetic diversity  also increased (Fig. 4a). This relationship was noted when subjects of the USA (Fig. 4b) and UK–Ireland (Fig. 4c) were analyzed independently. The association between increasing α-diversity and detection of O. formigenes remained consistent when the α-diversity was assessed using Chao1 , Shannon index , or observed number of OTUs, with rarefaction (Additional file 7: Figure S6) to adjust for sequencing depth bias.
To examine microbial community structure (β-diversity), we built a sample of 824 specimens that met rigorous inclusion criteria (Additional file 5: Table S2). Based on both unweighted UniFrac distances  (Fig. 5a) and Bray–Curtis dissimilarities  (Additional file 8: Figure S7A) visualized by principal coordinate analysis (PCoA), there was substantial overlap in the samples in which either one or two O. formigenes OTUs were detected. However, both significantly differed from the samples in which O. formigenes was absent. The mean pairwise intergroup distances between the O. formigenes+ and O. formigenes− groups were significantly greater than the corresponding intragroup distances within the O. formigenes+ groups (Fig. 5e, Additional file 8: Figure S7E). The same pattern was observed in samples from subjects from the USA (Fig. 5b, f, Additional file 8: Figure S7B, F), or UK–Ireland (Fig. 5c, g, Additional file 8: Figure S7C, G), when analyzed independently. We also compared the true intragroup distances with average intragroup distances from bootstrapping samples but did not see any scalable differences.
Microbial networks in relation to O. formigenes colonization
We next examined the microbial networks using SPIEC-EASI , where the pairwise microbial interactions were estimated via absence of conditional independence, using the stability approach to regularization selection (StARS) method (as implemented in the pulsar package in R) for model selection. For StARS, we used the default β = 0.05 as the threshold parameter, which measures the fraction of the network that is 1 − β stable over random subsamples. To predict interactions between O. formigenes and other microbial species, we constructed three networks using all 4945 samples, or subsets of the 3935 US or 830 UK samples. Five bacterial species and an Archaeaon (Table 5) were predicted to interact with O. formigenes in at least one of the networks. A negative interaction between Ruminococcus gnavus and O. formigenes was observed in all three networks (Table 5).
To understand the observed differences in microbial community structure between O. formigenes+ and O. formigenes− samples (Fig. 5), we then inferred two separate networks. We first compared the two networks in terms of centrality of nodes, in which higher values indicate that the node is involved in more ecological interactions. Nodes in the O. formigenes+ network had significantly higher degree  and betweenness  centrality (p = 0.03 and 0.02, one-sided Mann–Whitney tests) than in the O. formigenes− network (Fig. 6a), suggesting greater dispersion within the O. formigenes− network.
We then compared the resilience of the networks to disturbance, using sequential node removal to simulate “attacks” to the networks, an approach previously described . Natural connectivity—the number of closed walks —was used to assess robustness of remaining networks (Fig. 6b). Node removals were performed either by first removing the hub species—nodes with the highest degree (Fig. 6b, left) or betweenness centrality (Fig. 6b, middle)—or at random (Fig. 6b, right). In the degree-based node removal, natural connectivity dropped faster in the O. formigenes− network compared to O. formigenes+, indicating that the O. formigenes− networks might collapse faster under “ecological attack” when important species were affected.
O. formigenes+ networks also were associated with significantly higher natural connectivity before any nodes were removed (Fig. 6b). To alternatively examine this observation, we performed network inference repeatedly, obtaining a distribution of natural connectivity associated with the O. formigenes+ or O. formigenes− networks. The O. formigenes+ networks had significantly higher natural connectivity (Fig. 6c, first panel). Seven other microbial species, whose prevalence and abundance were at the same order of magnitude as O. formigenes, were selected as control taxa. Among them, two unclassified species from family Dehalobacterium or rc4-4 showed the contrasting pattern that their presence were associated with networks with lower connectivity; the presence of Rothia mucilaginosa, Rothia dentocariosa, Campylobacter unclassified, cc_115 unclassified, and Neisseria subflava showed the same differential pattern as O. formigenes (Fig. 6c).
This study has four major findings:
We confirm extensive variation in O. formigenes relative abundance , now with much larger numbers of subjects, indicating a range of approximately 3 log10.
We confirmed and extended relationships between O. formigenes prevalence and host features, at considerably larger scale, and now with analyses in relation to both O. formigenes prevalence and abundance.
We showed that the host gut microbiota displayed greater community diversity, global network connectivity, and greater resilience to simulated disturbance, in the samples in which O. formigenes was detected.
Our data suggest that O. formigenes strains resembling strain BLISS, a putative phylogenetic group I strain, might be the most prevalent and abundant strains in the human gastrointestinal tract, while group II O. formigenes strains (as exemplified by HOxBLS and OXK) are less common, consistent with a prior study . Knowing which strains are naturally dominant human gut colonizers is important for designing long-term clinical interventions. For example, strain HC1 was administered to primary hyperoxaluria patients, achieving promising short-term urinary oxalate reduction but failed to maintain long-term colonization and significant oxalate reduction [17, 20, 21]. All three O. formigenes-associated OTUs we detected showed low 16S sequence similarity to HC1 (Additional file 2: Figure S2B), suggesting that it may not be a common human colonizing strain.
Our findings that O. formigenes OTUs tend to be co-detected are consistent with our prior work, which used metagenomic data from the Human Microbiome Project . Thus, the shotgun metagenomic and 16S analyses provide congruent results, despite the methodological and source population differences. We also now show that those OTUs are detected at elevated abundance when co-present; these observations suggest inter-strain cooperation among O. formigenes strains, which needs experimental support. If proven, these findings would suggest that at a clinical level, O. formigenes re-introduction for kidney stone prevention might be facilitated if multiple strains could be administered simultaneously. Alternatively, those observations may reflect cross-feeding, co-aggregation, niche overlap, or use of the same host resource, but without competition. We also cannot rule out technical artifacts, in which 16S amplicon sequencing and metagenomic sequencing (with two reference genomes) were both unable to distinguish O. formigenes at the strain level and the single taxonomy units that we and others  used were mixtures of multiple O. formigenes strains.
The longitudinal analyses showed a considerable proportion of subjects switching O. formigenes status. It is unlikely that O. formigenes colonization is lost and regained at such high frequency; more likely, limited sequencing depth in the assays underestimates O. formigenes presence in samples in which they are present at low abundance. This explanation is supported by our finding that in subjects in whom O. formigenes was detected at lower relative levels, detection was more variable (Fig. 2a, middle) than those who are colonized at relatively high levels (Fig. 2a, right). This difference justifies restricting our analysis to samples with >10,000 assigned sequences. Better understanding and investigation of O. formigenes biology and its therapeutic applications may require deeper sequencing and more curated reference genomes. These limitations will be relevant to approaches for other low abundant taxa of potential medical interest.
Prior smaller studies have linked O. formigenes prevalence with host age, sex, location, education, race/ethnicity, and oxalate consumption [31, 32, 48]. Our multivariate analysis further interrogated the underlying relationships, dealing with the technical issue of microbial data sparsity using a robust statistical model [49, 50]. Our observations that males carry O. formigenes at significantly lower abundance than females (Fig. 3b) correlate with the unexplained twofold higher kidney stone incidence in males compared to females [4, 51].
Prior studies have identified oxalate deposition within the thyroid gland in humans and other mammals [52,53,54,55,56,57,58,59,60,61,62]. A study of healthy thyroid tissues obtained at autopsy from 182 individuals found a negative correlation between spatial distributions of calcium oxalate crystals and triiodothyronine (T3)-producing colloids , suggesting that oxalate precipitation may be inversely related to normal thyroid function. Our observations that subjects diagnosed with thyroid diseases were colonized by O. formigenes at lower prevalence (Table 3) and abundance (Fig. 3i) are consistent with the hypothesis that O. formigenes may be beneficial to host thyroid function by lowering circulating oxalate. However, the sample size of AGP subjects with (n = 363) or without (n = 2490) thyroid disease is unbalanced, and the AGP includes no information of specific type of thyroid abnormalities. Thus, future targeted studies are needed to further examine this hypothesis.
That O. formigenes detection is correlated with higher phylogenetic diversity in the host microbiota (Fig. 4) is consistent with comparisons of US subjects and Amerindian hunter gatherers . The Amerindians had significantly higher diversity and nearly universal O. formigenes colonization at high abundance; O. formigenes was one of the most differentiating taxa between those two populations . These results may indicate that a more diverse gut microbial community has greater likelihood of harboring O. formigenes or alternatively could reflect technical issues such as differences in sequencing depth.
If the prevalence and abundance of O. formigenes is indeed an indicator of the ecological state of the microbiota, then the disappearing microbiota theory  might predict association with particular disease states that have risen while we have been losing diversity. Of interest to us are the recent studies describing rising prevalence of kidney stones in the USA  and other countries [65,66,67]. Furthermore, the linkage observed of O. formigenes absence to both obesity (Fig. 3e) and thyroid disease (Fig. 3i) is consistent with that hypothesis. Our analysis of inferred ecological networks also suggests that O. formigenes presence may be one indicator of host microbiota integrity and ultimately may be a marker for host physiology.
In conclusion, evidence is growing that O. formigenes has medical significance in humans [4, 5, 13, 15, 30, 32]. We show that the AGP is a valuable resource and present a systematic framework to explore the biology of O. formigenes in humans. The relationships that we observed illustrate the power of a crowd-sourced enterprise, if done on sufficient scale, to answer biological and medical questions. Nevertheless, the observations we report should be considered as hypothesis-generating; carefully designed experiments are needed to establish the underlying causal relationships and to help design targeted clinical interventions. Nevertheless, the systematic framework we built can be extended to study other bacterial taxa of interest.
Data acquisition and processing
In the AGP, samples were self-obtained by study participants using sample kits containing detailed instructions  and then shipped by mail to the AGP home lab, accompanied by completion of metadata questionnaires. Sample DNA extraction, library preparation, and sequencing were performed as described . The AGP consortium processed data with closed-reference OTU-picking pipeline . We also performed open-reference OTU picking using Uclust  to search for novel O. formigenes-associated OTUs. In our study, the raw sequences  and the closed-reference OTU-picking results  were directly downloaded from the ftp website and the metadata were shared by the AGP. All those data could also be alternatively acquired via the Qiita  website under study number 10317.
Phylogenetic analysis of O. formigenes strains and detected OTUs
Complete 16S rRNA gene sequences of 13 O. formigenes strains, three O. formigenes-associated OTUs, and Oxalobacter vibrioformis were downloaded from the Greengenes website . The V4 region was a subset based on the primers used for the AGP (FWD:GTGYCAGCMGCCGCGGTAA, REV:GGACTACNVGGGTWTCTAAT; 515FB-806RB) . Phylogenetic trees were constructed with the sequences of the complete 16S (Additional file 2: Figure S2A) or V4 region only (Fig. 1b) via phylogeny.fr . Four steps were performed: (i) initial alignment by MUSCLE 3.8.1, (ii) alignment refinement by Gblock, (iii) maximum likelihood phylogeny analysis by PhyML, and (iv) tree rendering by TreeDyn [75,76,77,78,79,80]. For detailed methods and parameters, refer to phylogene.fr website documentation section 2.1 . The sequence identity matrix (Additional file 2: Figure S2B) was calculated with (1 − dist)% by the dist.alignment function  from the seqinr package in R, based on the alignment results that were obtained during the 16S V4 region sequence alignment for Fig. 1b.
Sample diversity analyses
Intra-sample α-diversity was calculated using QIIME, using four metrics, phylogenetic diversity , Shannon index , Chao1 index , and observed number of OTUs, at rarefaction depths from 1000 to 10,000 sequences/sample. Pairwise inter-sample β-diversity was calculated using the unweighted UniFrac distance metric  and Bray-Curtis dissimilarities .
(i) Description of the multivariate model: A multiple zero-inflated negative binomial regression (ZINB)  model was used for the differential abundance analysis on O. formigenes, to handle its excessive zero abundances in its read count (64%) and the overdispersion (the mean 5.7 is much smaller than the variance 284.4). The ZINB model consists of two different components, a logistic regression for modeling the excessive zeros and a negative binomial regression for modeling the remaining count values. To adjust for the varying number of total read counts, both components of the ZINB model included the log(total read counts) as the offset variable such that the ZINB model assesses the proportions of O. formigenes rather than the count . The canonical link functions were used with logit for the logistic regression and log for the negative binomial regression. Missing data in each categorical variable was included into a separated hidden category . (ii) Variable selection: The variables were selected based on prior literature, data availability and quality, biological relevance, and investigators’ interest yielding 14 variables to be included in the analysis (Table 1). Among those, age, sex, and race were included as the baseline covariates to both the logistic and the negative binomial regression components. The other covariates were subsequently added to the regressions if they had univariate p value <0.1 either in logistic or negative binomial regression model (Table 1). The criterion p value <0.1 was used, which is less stringent than the canonical criterion p < 0.05, in considering that some covariates that are weak predictors in univariate analyses might be influential when in combination with other covariates. (iii) Model selection: We compared the performance of the ZINB model with zero-inflated Poisson (ZIP) , in terms of fitness to the data based on the same link functions, offset variable, and covariates. The ZINB model outperformed the ZIP model with smaller (a) Akaike information criterion (AIC)  [ZINB 18,177 vs ZIP 41,456], (b) Bayesian information criterion (BIC)  [ZINB 18,765 vs ZIP 42,035], and (c) log likelihood [ZINB −8997.6 (DF 91) vs ZIP −20,637.4 (DF 90); p < 0.001] by the likelihood ratio test, suggesting that the goodness-of-fit was significantly improved in ZINB over ZIP. (iv) Overall fitted mean proportions (%): Overall fitted mean proportions were calculated by the average predicted value (APV) method , which is predicted O. formigenes count values divided by the mean total read counts under each exposure status, as measurements to estimate the effect direction and magnitude on O. formigenes relative abundance.
Construction of microbial association networks
Network was constructed at species level by summing up all associated OTUs for each species. Species that were present in at least 20% of the samples were selected for network inference by SPIEC-EASI . The default setting of SPIEC-EASI accepts absolute abundance of taxa as input and applies centered log-ration transformation to eliminate the unit-sum constraint of data . Networks were constructed with the SPIEC-EASI  package in R in neighborhood selection mode with parameters set as method = “mb,” sel.criterion = “bstars,” lambda.min.ratio = 2e−1, nlambda = 100, pulsar.params = list(rep.num = 20, ncores = 2). The species–species interacting directions were predicted based on the average coefficients calculated via the beta matrix. Edge centrality and betweenness centrality were calculated using degree and centralization.betweenness functions with the igraph in R, normalized against the theoretical maximum. Natural connectivity, a variant of the Estrada index , was calculated as described , through the following equation:
where N is the number of nodes in the network and x is the average of eigenvalues of the graph adjacency matrix, which was calculated using get.adjacency and eigen R functions.
The distributions of natural connectivity  associated with O. formigenes+ and O. formigenes– networks were acquired from 100 iterations of network inference, randomly selecting 1000 O. formigenes+ and O. formigenes– samples in each iteration. Seven control species were selected based on their similarity to O. formigenes population parameters: detection rates of 30–40% and relative abundances of 10−2–10−3. The distributions of natural connectivity associated with presence or absence of the control species were calculated the same way as with O. formigenes.
American Gut Project
- GI tract:
Operational taxonomic units
Zero-inflated negative binomial model
Zero-inflated Poisson model
Sparse Inverse Covariance Estimation for Ecological Association and Statistical Inference
Holmes RP, Ambrosius WT, Assimos DG. Dietary oxalate loads and renal oxalate handling. J Urol. 2005;174:943–7. discussion 947.
Holmes RP, Assimos DG. Glyoxylate synthesis, and its modulation and influence on oxalate synthesis. J Urol. 1998;160:1617–24.
Simpson GL, Ortwerth BJ. The non-oxidative degradation of ascorbic acid at physiological conditions. Biochim Biophys Acta. 2000;1501:12–24.
Brzica H, Breljak D, Burckhardt BC, Burckhardt G, Sabolic I. Oxalate: from the environment to kidney stones. Arh Hig Rada Toksikol. 2013;64:609–30.
Chai W, Liebman M, Kynast-Gales S, Massey L. Oxalate absorption and endogenous oxalate synthesis from ascorbate in calcium oxalate stone formers and non-stone formers. Am J Kidney Dis. 2004;44:1060–9.
Siener R, Schade N, Nicolay C, von Unruh GE, Hesse A. The efficacy of dietary intervention on urinary risk factors for stone formation in recurrent calcium oxalate stone patients. J Urol. 2005;173:1601–5.
Verbruggen LA, Bourgain C, Verbeelen D. Late presentation and microcrystalline arthropathy in primary hyperoxaluria. Clin Exp Rheumatol. 1989;7:631–3.
Sarraf P, Kay J, Reginato AM. Non-crystalline and crystalline rheumatic disorders in chronic kidney disease. Curr Rheumatol Rep. 2008;10:235–48.
Castellaro AM, Tonda A, Cejas HH, Ferreyra H, Caputto BL, Pucci OA, Gil GA. Oxalate induces breast cancer. BMC Cancer. 2015;15:761.
Allison MJ, Dawson KA, Mayberry WR, Foss JG. Oxalobacter formigenes gen. nov., sp. nov.: oxalate-degrading anaerobes that inhabit the gastrointestinal tract. Arch Microbiol. 1985;141:1–7.
Miller AW, Dearing D. The metabolic and ecological interactions of oxalate-degrading bacteria in the mammalian gut. Pathogens. 2013;2:636–52.
Azcarate-Peril MA, Bruno-Barcena JM, Hassan HM, Klaenhammer TR. Transcriptional and functional analysis of oxalyl-coenzyme A (CoA) decarboxylase and formyl-CoA transferase genes from lactobacillus acidophilus. Appl Environ Microbiol. 2006;72:1891–9.
Arvans D, Jung YC, Antonopoulos D, Koval J, Granja I, Bashir M, Karrar E, Roy-Chowdhury J, Musch M, Asplin J, et al. Oxalobacter formigenes-derived bioactive factors stimulate oxalate transport by intestinal epithelial cells. J Am Soc Nephrol. 2017;28:876–87.
Sidhu H, Schmidt M, Cornelius J, Thamilselvan S, Khan S, Hesse A, Peck A. Direct correlation between hyperoxaluria/oxalate stone disease and the absence of the gastrointestinal tract-dwelling bacterium Oxalobacter formigenes: possible prevention by gut recolonization or enzyme replacement therapy. J Am Soc Nephrol. 1999;10:S334–40.
Duncan SH, Richardson AJ, Kaul P, Holmes RP, Allison MJ, Stewart CS. Oxalobacter formigenes and its potential role in human health. Appl Environ Microbiol. 2002;68:3841–7.
Stewart CS, Duncan SH, Cave DR. Oxalobacter formigenes and its role in oxalate metabolism in the human gut. FEMS Microbiol Lett. 2004;230:1–7.
Hoppe B, von Unruh G, Laube N, Hesse A, Sidhu H. Oxalate degrading bacteria: new treatment option for patients with primary and secondary hyperoxaluria? Urol Res. 2005;33:372–5.
Hatch M, Cornelius J, Allison M, Sidhu H, Peck A, Freel RW. Oxalobacter sp. reduces urinary oxalate excretion by promoting enteric oxalate secretion. Kidney Int. 2006;69:691–8.
Hatch M, Freel RW. A human strain of Oxalobacter (HC-1) promotes enteric oxalate secretion in the small intestine of mice and reduces urinary oxalate excretion. Urolithiasis. 2013;41:379–84.
Hoppe B, Groothoff JW, Hulton SA, Cochat P, Niaudet P, Kemper MJ, Deschenes G, Unwin R, Milliner D. Efficacy and safety of Oxalobacter formigenes to reduce urinary oxalate in primary hyperoxaluria. Nephrol Dial Transplant. 2011;26:3609–15.
Hoppe B, Niaudet P, Salomon R, Harambat J, Hulton SA, Van't Hoff W, Moochhala SH, Deschenes G, Lindner E, Sjogren A, Cochat P. A randomised phase I/II trial to evaluate the efficacy and safety of orally administered Oxalobacter formigenes to treat primary hyperoxaluria. Pediatr Nephrol. 2017;32:781–90.
Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Pena AG, Goodrich JK, Gordon JI. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7:335–6.
Hamady M, Lozupone C, Knight R. Fast UniFrac: facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data. ISME J. 2010;4:17–27.
DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, Huber T, Dalevi D, Hu P, Andersen GL. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol. 2006;72:5069–72.
Jensen NS, Allison MJ. Studies on the diversity among anaerobic oxalate-degrading bacteria now in the species Oxalobacter formi- genes. In Abstracts of the 94th General Meeting of the American Society for Microbiology 1994. Abstr. I-12. Washington, D.C: American Society for Microbiology; 1994. p. 255.
Sidhu H, Allison M, Peck A. Identification and classification of Oxalobacter formigenes strains by using oligonucleotide probes and primers. J Clin Microbiol. 1997;35:350–3.
The Greengenes database: Oxalobacter formigenes str. BLISS. http://greengenes.lbl.gov/cgi-bin/show_one_record_v2.pl?prokMSA_id=7366. Accessed 15 Apr 2017.
Oxalobacter formigenes str. BLISS 16S ribosomal RNA gene sequence at National Center for Biotechnology Information. 1998. https://www.ncbi.nlm.nih.gov/nuccore/U49750.1. Accessed 15 Apr 2017.
Calculate statistical significance of the overlap between two groups. http://nemates.org/MA/progs/overlap_stats.html. Accessed 15 Arp 2017.
Kaufman DW, Kelly JP, Curhan GC, Anderson TE, Dretler SP, Preminger GM, Cave DR. Oxalobacter formigenes may reduce the risk of calcium oxalate kidney stones. J Am Soc Nephrol. 2008;19:1197–203.
Kelly JP, Curhan GC, Cave DR, Anderson TE, Kaufman DW. Factors related to colonization with Oxalobacter formigenes in U.S. adults. J Endourol. 2011;25:673–9.
Mittal RD, Kumar R, Bid HK, Mittal B. Effect of antibiotics on Oxalobacter formigenes colonization of human gastrointestinal tract. J Endourol. 2005;19:102–6.
Agresti A. Foundations of linear and generalized linear models. Hoboken: John Wiley & Sons; 2015.
Preisser JS, Stamm JW, Long DL, Kincade ME. Review and recommendations for zero-inflated count regression modeling of dental caries indices in epidemiological studies. Caries Res. 2012;46:413–23.
Albert JM, Wang W, Nelson S. Estimating overall exposure effects for zero-inflated regression models with application to dental caries. Stat Methods Med Res. 2014;23:257–78.
The VioScreen questionnaire website. http://www.viocare.com/vioscreen.html. Accessed 27 Apr 2017.
Faith DP. Conservation evaluation and phylogenetic diversity. Biol Conserv. 1992;61:1–10.
Chao A. Nonparametric estimation of the number of classes in a population. Scand J Stat. 1984;11:265–70.
Lin J. Divergence measures based on the Shannon entropy. IEEE Trans Inf Theory. 1991;37:145–51.
Lozupone C, Knight R. UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol. 2005;71:8228–35.
Bray JR, Curtis JT. An ordination of the upland forest communities of southern Wisconsin. Ecol Monogr. 1957;27:325–49.
Kurtz ZD, Müller CL, Miraldi ER, Littman DR, Blaser MJ, Bonneau RA. Sparse and compositionally robust inference of microbial ecological networks. PLoS Comput Biol. 2015;11:e1004226.
Hakimi SL. On realizability of a set of integers as degrees of the vertices of a linear graph. I. J Soc Ind Appl Math. 1962;10:496–506.
Freeman LC. A set of measures of centrality based on betweenness. Sociometry. 1977;40:35–41.
Mahana D, Trent CM, Kurtz ZD, Bokulich NA, Battaglia T, Chung J, Muller CL, Li H, Bonneau RA, Blaser MJ. Antibiotic perturbation of the murine gut microbiome enhances the adiposity, insulin resistance, and liver disease associated with high-fat diet. Genome Med. 2016;8:48.
Wu J, Mauricio B, Tan Y-J, Deng H-Z. Natural connectivity of complex networks. Chin Phys Lett. 2010;27:078902.
Barnett C, Nazzal L, Goldfarb DS, Blaser MJ. The presence of Oxalobacter formigenes in the microbiome of healthy young adults. J Urol. 2016;195:499–506.
Knight J, Deora R, Assimos DG, Holmes RP. The genetic composition of Oxalobacter formigenes and its relationship to colonization and calcium oxalate stone disease. Urolithiasis. 2013;41:187–96.
Lambert D. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics. 1992;34:1–14.
Meinshausen N, Bühlmann P. High-dimensional graphs and variable selection with the lasso. Ann Stat. 2006;34:1436–62.
Soucie JM, Thun MJ, Coates RJ, McClellan W, Austin H. Demographic and geographic variability of kidney stones in the United States. Kidney Int. 1994;46:893–9.
Furuta M, Ozaki M. Calcium oxalate crystals in the kidney and thyroid of leprosy patients. Int J Lepr Other Mycobact Dis. 1970;38:286–93.
Chaplin AJ. Some observations on the demonstration of calcium oxalate in tissue sections. Stain Technol. 1974;49:165–73.
Goldman M, Doering GJ. The effect of dietary ingestion of oxalic acid on thyroid function in male and female Long-Evans rats. Toxicol Appl Pharmacol. 1979;48:409–14.
Reid JD, Choi CH, Oldroyd NO. Calcium oxalate crystals in the thyroid. Their identification, prevalence, origin, and possible significance. Am J Clin Pathol. 1987;87:443–54.
Hackett RL, Khan SR. Presence of calcium oxalate crystals in the mammalian thyroid gland. Scanning Microsc. 1988;2:241–6.
Reid JD. Calcium oxalate in mammalian thyroids: a re-evaluation. J Comp Pathol. 1991;105:109–15.
Wahl R, Fuchs R, Kallee E. Oxalate in the human thyroid gland. Eur J Clin Chem Clin Biochem. 1993;31:559–65.
Katoh R, Kawaoi A, Muramatsu A, Hemmi A, Suzuki K. Birefringent (calcium oxalate) crystals in thyroid diseases. A clinicopathological study with possible implications for differential diagnosis. Am J Surg Pathol. 1993;17:698–705.
Katoh R, Suzuki K, Hemmi A, Kawaoi A. Nature and significance of calcium oxalate crystals in normal human thyroid gland. A clinicopathological and immunohistochemical study. Virchows Arch A Pathol Anat Histopathol. 1993;422:301–6.
Shimizu M, Hirokawa M, Kanahara T, Manabe T. Calcium oxalate crystals in thyroid fine needle aspiration cytology. Acta Cytol. 1999;43:575–8.
Pusztaszeri M. Birefringent crystals in thyroid fine-needle aspiration cytology. Diagn Cytopathol. 2016;44:814–5.
Clemente JC, Pehrsson EC, Blaser MJ, Sandhu K, Gao Z, Wang B, Magris M, Hidalgo G, Contreras M, Noya-Alarcón Ó, Lander O. The microbiome of uncontacted Amerindians. Science advances. 2015;1(3):e1500183.
Blaser MJ. Who are we? Indigenous microbes and the ecology of human diseases. EMBO Reports. 2006;7:956.
Yasui T, Iguchi M, Suzuki S, Kohri K. Prevalence and epidemiological characteristics of urolithiasis in Japan: national trends between 1965 and 2005. Urology. 2008;71:209–13.
Trinchieri A, Coppi F, Montanari E, Del Nero A, Zanetti G, Pisani E. Increase in the prevalence of symptomatic upper urinary tract stones during the last ten years. Eur Urol. 2000;37:23–5.
Hesse A, Brandle E, Wilbert D, Kohrmann KU, Alken P. Study on the prevalence and incidence of urolithiasis in Germany comparing the years 1979 vs. 2000. Eur Urol. 2003;44:709–13.
The American Gut Project sample collection instructions. https://www.microbio.me/AmericanGut/static/img/full_instructions.pdf. Accessed 12 Dec 2016.
Earth microbiome project 16S Illumina amplicon protocal. http://press.igsb.anl.gov/earthmicrobiome/protocols-and-standards/16s/. Accessed 12 Apr 2017.
The American Gut Project data analysis pipeline. http://nbviewer.jupyter.org/github/biocore/American-Gut/blob/master/ipynb/module2_v1.0.ipynb. Accessed 12 Dec 2016.
The American Gut Project sequencing reads. ftp://ftp.microbio.me/AmericanGut/ag-May-31-2016/01_raw/sequences.fna. Accessed 12 Dec 2016.
The American Gut Project closed-reference OTU-picking biom table. ftp://ftp.microbio.me/AmericanGut/ag-May-31-2016/03-otus/notrim/gg-13_8-97-percent/otu_table.biom. Accessed 12 Dec 2016.
The American Gut Project via Qiita. https://qiita.ucsd.edu/study/description/10317. Accessed 10 June 2017.
Dereeper A, Guignon V, Blanc G, Audic S, Buffet S, Chevenet F, Dufayard JF, Guindon S, Lefort V, Lescot M, et al. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008;36:W465–9.
Dereeper A, Audic S, Claverie JM, Blanc G. BLAST-EXPLORER helps you building datasets for phylogenetic analysis. BMC Evol Biol. 2010;10:8.
Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–7.
Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17:540–52.
Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003;52:696–704.
Anisimova M, Gascuel O. Approximate likelihood-ratio test for branches: a fast, accurate, and powerful alternative. Syst Biol. 2006;55:539–52.
Chevenet F, Brun C, Banuls AL, Jacq B, Christen R. TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics. 2006;7:439.
Phylogeny.fr documentation. http://www.phylogeny.fr/documentation.cgi - oneclick. Accessed 17 May 2017.
Fitch WM. An improved method of testing for evolutionary homology. J Mol Biol. 1966;16:9–16.
Gelman A, Hill J. Data analysis using regression and multilevel/hierarchical models. New York; Cambridge university press; 2007.
Akaike H. A new look at the statistical model identification. IEEE Trans Autom Control. 1974;19:716–23.
Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6:461–4.
Aitchison J. A new approach to null correlations of proportions. Math Geol. 1981;13:175–89.
This work has utilized computing resources at the High-Performance Computing Facility at NYU Langone Medical Center. We thank Rob Knight and Embriete Hyde from AGP for the useful advice and thank Christian Mueller, Steven Daniels, Kelly Ruggles, Stuart Brown, and Aristotellis Tsirigos for their insights.
This study was supported in part by U01AI122285 and R01DK090989 and R01DK110014 from the National Institutes of Health and by the C & D, Belfer, and Knapp Family Funds and the Saul Farber scholarship. The authors also gratefully acknowledge support of the Rare Kidney Stone Consortium (U54KD083908), a part of NIH Rare Diseases Clinical Research Network (RDCRN), funded by the NIDDK and the National Center for Advancing Translational Sciences.
Availability of data and materials
The data used in the present study can be accessed through the Qiita website under study number 10317 [https://qiita.ucsd.edu/study/description/10317] and downloaded via the following link [https://qiita.ucsd.edu/download_study_bioms/10317]. The R scripts used for the multivariate analysis and the network analysis are included in Additional files 13, 14 and 15.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Workflow of American Gut Project data processing. The approaches of open- and closed-reference picking are shown, with the accompanying statistics on sequence numbers. (PDF 34 kb)
Phylogenetic analyses of O. formigenes strains and OTUs detected. Panel A. Maximum likelihood O. formigenes phylogenetic tree. The tree was built from full-length 16S rRNA gene sequences of O. formigenes strains [group I, purple; group II, green and strain BLISS (group unknown), in black; and O. vibrioformis selected from the Oxalobacter family], downloaded from Greengenes. The log-likelihood of tree is −3191.23. Branch support values are designated in red. Statistical details of the tree are included in the Additional files 11 and 12. Panel B. Sequence similarity matrix using 16S V4 region of 13 O. formigenes strains, O. vicrioformis, and 3 OTUs. Refer to the “Methods” section for the calculation of similarity. (PDF 600 kb)
Sequencing depth in relation to the number of O. formigenes OTUs present in a sample. Panels A, B. Panels focus on fecal samples with over 1000 (panel A) or 10,000 (panel B) sequences per sample, considering OTUs 7366 and 360508. Fold changes were calculated using the median sequencing depths of each group. (PDF 250 kb)
Number of study subjects, by the number of samples provided. (PDF 25 kb)
Prevalence of O. formigenes in 4945 AGP fecal samples, by locality of birth and current residence. Table S2. Inclusion criteria for individuals whose samples were used in the β-diversity (Fig. 5) analyses. (DOCX 19 kb)
Correlation between diet and O. formigenes relative abundance in 197 subjects. Panels focus on oxalate (left), calcium (middle), or oxalate/calcium (right). Dietary intake over the 90 days preceding sample collection was estimated through the Vioscreen questionnaire. *p value <0.05, by Spearman rank correlation. (PDF 96 kb)
α-Diversity measurements in 4945 fecal samples, by number of O. formigenes OTUs detected. The number of samples is indicated in parentheses. Rarefaction depths from 1000 to 10,000 seqs/sample are shown. All comparisons are significant at 10,000 seqs/sample, Bonferroni-corrected nonparametric two-sample t tests with 999 Monte Carlo permutations. (PDF 285 kb)
β-Diversity of 824 samples based on Bray–Curtis dissimilarities, by number of O. formigenes OTUs detected. Panels A–D. Visualization of β-diversity ordination through PCoA of all 824 samples meeting the inclusion criteria (Additional file 5: Table S2) (A), 604 US samples (B), 144 UK or Ireland samples (C), the rest of the 70 samples (D). Samples with 0, 1, or 2 O. formigenes OTUs are represented in blue, green, and red dots. Ellipses were drawn with ggplot2 stat_ellipse function using multivariate t-distribution. By Adonis test, *p value <0.05. Panels E–H. Bar plots (mean ± S.E.M) of intra- and intergroup pairwise sample distance by Bray–Curtis dissimilarities for all (E), 604 US (F), and 144 UK or Ireland (G) samples or for the remaining 70 samples (H). By Bonferroni-corrected t tests, *p < 0.05. (PDF 158 kb)
Alignment results of 16S rRNA gene V4 region sequence of 13 O. formigenes strains, 3 O. formigenes-associated OTUs, and O. vicrioformis. (TXT 14 kb)
Alignment results of full 16S rRNA gene sequences of 13 O. formigenes strains and O. vicrioformis. (FASTA 22 kb)
Statistical details of the phylogenetic tree in Additional file 2: Figure S2A. (FASTA 424 bytes)
R script for multivariate analysis. (R 33 kb)
R script for network analysis. (R 9 kb)