Skip to main content

Cardiorespiratory fitness as a predictor of intestinal microbial diversity and distinct metagenomic functions



Reduced microbial diversity in human intestines has been implicated in various conditions such as diabetes, colorectal cancer, and inflammatory bowel disease. The role of physical fitness in the context of human intestinal microbiota is currently not known. We used high-throughput sequencing to analyze fecal microbiota of 39 healthy participants with similar age, BMI, and diets but with varying cardiorespiratory fitness levels. Fecal short-chain fatty acids were analyzed using gas chromatography.


We showed that peak oxygen uptake (VO2peak), the gold standard measure of cardiorespiratory fitness, can account for more than 20 % of the variation in taxonomic richness, after accounting for all other factors, including diet. While VO2peak did not explain variation in beta diversity, it did play a significant role in explaining variation in the microbiomes’ predicted metagenomic functions, aligning positively with genes related to bacterial chemotaxis, motility, and fatty acid biosynthesis. These predicted functions were supported by measured increases in production of fecal butyrate, a short-chain fatty acid associated with improved gut health, amongst physically fit participants. We also identified increased abundances of key butyrate-producing taxa (Clostridiales, Roseburia, Lachnospiraceae, and Erysipelotrichaceae) amongst these individuals, which likely contributed to the observed increases in butyrate levels.


Results from this study show that cardiorespiratory fitness is correlated with increased microbial diversity in healthy humans and that the associated changes are anchored around a set of functional cores rather than specific taxa. The microbial profiles of fit individuals favor the production of butyrate. As increased microbiota diversity and butyrate production is associated with overall host health, our findings warrant the use of exercise prescription as an adjuvant therapy in combating dysbiosis-associated diseases.


The interactions between humans, their environment, and intestinal microbiota form a tripartite relationship that is fundamental to the physiological homeostasis and overall health of the host [1]. The human intestinal microbiota aids their host in several important biological functions such as digestion, absorption, stimulating immune responses, and protection against enteropathogens. The bacteria break down partially digested complex carbohydrates via fermentation and produce short-chain fatty acids (SCFAs) such as butyrate, acetate, and propionate as by-products. These SCFAs act as the primary food source of the colonocytes which consume up to 10 % of the dietary energy expenditure in humans. In particular, butyrate has been shown to play a critical role in overall gut homeostasis and health [2]. Lasting disturbances in the microbial community composition, termed dysbiosis, can have deleterious health effects in the host (reviewed in [3]). Gut microbiome diversity has emerged as a candidate indicator of overall host health. Low community richness has been correlated with metabolic markers such as adiposity, insulin resistance, and overall inflammatory phenotypes [4], as well as gastrointestinal (GI) conditions such as inflammatory bowel disease [5], Clostridium difficile infection [6], colorectal cancer [7], and irritable bowel syndrome [8]. As a result, considerable research in recent years has focused on understanding and developing strategies to promote overall GI health via community manipulation in attempt to resolve dysbiosis-associated diseases.

Various extrinsic variables such as stress, probiotic and antibiotic use, alcohol consumption, and diet have been identified as factors that can instigate changes in the microbiome [1, 9]. The link between physical activity and gut microbiota however is currently not well understood. Matsumoto et al. (2008) first identified increases in butyrate levels in cecum of physically active rats which they suggested was a result of compositional changes in butyrate-producing bacteria [10]. Evans et al. explored the effects of voluntary wheel running in mice fed with low- or high-fat diets and found that microbial communities clustered based on both diet and physical activity [11]. Allen et al. further showed that the mode of physical activity, for example, forced treadmill running versus volunteer wheel running, differently altered the microbiota [12]. Recently, Clarke et al. also found clustering of bacterial communities between professional rugby players and high/low body mass index (BMI) controls [13]. They further identified increases in bacterial community richness in these elite athletes compared to both control groups. In their study, however, extreme dietary differences, especially high protein intakes amongst the athletes, confounded interpretations regarding the specific role of physical activity and microbial changes.

To better isolate how physical fitness may moderate microbial diversity, we analyzed the fecal microbiota of individuals with varied fitness levels with comparable diets. We used peak oxygen uptake (VO2peak), the gold standard of cardiorespiratory fitness (CRF), as an indicator of physical fitness. We asked the questions (a) does taxonomical richness vary with CRF alone, (b) do abundances of particular taxa vary systematically in relation to variation in CRF, and (c) is this variation associated with functional pathways of the microbiome. We show that VO2peak, independent of diet, correlates with increased microbial diversity and production of fecal butyrate amongst physically fit participants.


Study design

Healthy young adults between 18 and 35 years old were recruited. Exclusion criteria included antibiotic treatment within the previous 6 months, current prescribed pharmaceutical drug utilization, or active acute or chronic diseases. All participants were verbally interviewed on their dietary habits and CRF was determined using a VO2peak cycle test. Participants were then provided a stool collection kit with instructions and were asked to provide a sample within a week following their lab visit.

Nutritional data collection

On the day of VO2peak testing, nutritional data, including supplements, was collected by means of a 24-h dietary recall interview and assessed by a research nutritionist using FoodWorks nutrient analysis software (version 16.0). Food items described by participants that were not available in the software were manually added as needed. A sample copy of a completed questionnaire is available in Additional file 1. On average, over 100 food categories per participant were produced by the FoodWorks software. A manual screening was applied to select categories of interest based on a priori interest and existing literature showing a significant interaction between those categories and intestinal microbiota. The selected 24 food category data are available in the uploaded metadata mapping file.

Cardiorespiratory fitness testing

Participants initially completed a physical activity readiness questionnaire (PAR-Q) to rule out any contraindications to vigorous exercise. A continuous incremental ramp maximal exercise test on an electronically braked cycle ergometer (Lode Excalibur, the Netherlands) was used to determine VO2peak and peak power output (Wpeak). Expired gas was collected continuously by a metabolic cart (Parvomedics TrueOne 2400, Salt Lake City, Utah, USA) calibrated with gases of known concentration. The test started at 50 W and increased by 30 W/min. The test was terminated upon volitional exhaustion or when revolutions per minute fell below 50. VO2peak was defined as the highest 30-s average for VO2 (in ml/kg/min). Criteria for achieving VO2peak were the following: (i) respiratory exchange ratio >1.15; (ii) plateau in VO2; (iii) reaching age-predicted HRpeak (220-age); and/or (iv) volitional exhaustion. Following VO2peak assessment, participants were categorized to either low (LOW), average (AVG), or high (HI) fitness based on their sex and age according to a modified Heyward normal VO2max reference chart (Additional file 2).

Stool collection and storage

Participants were provided with a home stool collection kit including a sterile 120 ml polypropylene container (Starplex, Etobicoke, Ontario), sterile tongue depressor and gloves, and an ice box. Participants were instructed to avoid alcohol for 3 days prior to stool collection. Stool samples were immediately stored in the participant’s freezer overnight and transported on ice to the lab and stored in −80 °C until further analysis. Frozen portions from the inner area of the samples were scrapped using sterile razor blades for DNA extraction and SCFA analysis.

SCFA analysis

SCFAs (acetic, propionic, heptanoic, valeric, caproic, and butyric acid) were analyzed from the feces by gas chromatography (GC) as described previously [14]. In brief, ~50 mg of stool was homogenized with isopropyl alcohol, containing 2-ethylbutyric acid at 0.01 % v/v as internal standard, at 30 Hz for 13 min using metal beads. Homogenates were centrifuged twice, and the cleared supernatant was injected to Trace 1300 Gas Chromatograph, equipped with Flame-ionization detector, with AI1310 auto sampler (Thermo Fisher Scientific) in splitless mode. Data was processed using Chromeleon 7 software. An aliquot of 50 mg of stool was freeze dried to measure the dry weight, and measurements are expressed as mass % (g of SCFA per g of dry weight stool).

High-throughput sequencing

DNA was extracted from feces using QIAmp DNA Stool Mini Kit (Qiagen) according to the manufacturer’s instructions following 3 × 30 s of homogenization using metal beads on a Retsch MixerMill MM 400 homogenizer. Sequencing libraries were prepared according to the Illumina MiSeq system instructions. In brief, the V3 and V4 region of the 16S bacterial rRNA gene was amplified using recommended primers [15] (IDT, Vancouver, Canada): Forward 5′ TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG, and Reverse 5′ GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC, which create amplicons of ~460 bp. Amplicons were cleaned using AMPure XP bead step, and then, adapters and dual-index barcodes (Nextera XT) were attached to the amplicons to facilitate multiplex sequencing. After another clean-up step, libraries were validated on an agarose gel, quantified, normalized, and sent to The Applied Genomic Core (TAGC) facility at the University of Alberta (Edmonton, Canada) for sequencing using the Illumina MiSeq platform. The resulting ~16,000,000 paired-end reads were merged using PEAR software [16] and screened to exclude sequences containing one or more base calls with a Phred score <20. The average read per sample was ~350,000 with a min/max of ~165,000/452,000 reads. Rarefaction curves demonstrated that sufficient sampling depth had been reached amongst all samples (Additional file 3).


Bioinformatics analyses on the demultiplexed paired reads were conducted using QIIME 1.8.0 software suites [17]. Reads were clustered at 97 % identity using the uclust method into operational taxonomic units (OTUs) then aligned to the most recent available version (2013/08) of Greengenes bacterial database [18]. Singleton and doubletons were removed, and the produced OTU table was normalized using phylogenetic investigation of communities by reconstruction of unobserved states (PICRUSt) [19] to adjust for different 16S rRNA gene copy numbers. Instead of rarefying our OTU table to the lowest sample depth [20], uneven variance as a result of differential sample sequencing depth was stabilized using the cumulative sum scaling (CSS) method [21] of “metagenomeSeq” package in R. Alpha diversity indexes, rarefaction curves, OTU tables, and distance metrics were also generated using QIIME.

Statistical analysis

All statistical analyses were performed using R [22] version 3.2.0 unless stated otherwise.

The groups’ age and VO2peak data were tested for normality using Shapiro-Wilk test, and a one-way analysis of variance (ANOVA) with Tukey’s multiple-comparison test was used to compare mean differences amongst groups. Kruskal-Wallis non-parametric test was used for comparing BMI as this dataset failed normality tests even after several transformation attempts. For comparison of dietary intake amongst groups, a permutational multivariate ANOVA (PERMANOVA) with 999 random permutations was used. Due to the inherent high variability of dietary data, we further searched for dietary patterns amongst groups by looking at a principal component analysis (PCA) plot of participants’ dietary scores using the ggbiplot package [23]. To facilitate comparisons with previous work, we first compared average alpha diversity amongst the three fitness categories using a one-way ANOVA, followed by a Tukey’s multiple comparison. To simultaneously evaluate the role of CRF alongside other potential predictors of alpha diversity (sex, age, BMI, and dietary components), we performed a multiple regression analysis. Given our comparatively low sample size (n = 39), and the general rule that multiple regressions should include at least 10 observations per predictor variable [24], we first screened potential predictors using a Spearman correlation matrix. Those that showed a significant correlation with alpha diversity were retained for entry in the multiple regression model. Multicolinearity was checked using the variable inflation factor (VIF) index with a maximum cutoff score of 10.

Microbial communities in fecal samples were ordinated using the Bray-Curtis and weighted and unweighted UniFrac distance metrics. Principal coordinate analysis (PCoA) based on the Bray-Curtis dissimilarity metric was conducted using the cmdscale function in the base “stats” package in R, while PCoA based on the weighted and unweighted unifrac distances was made using EMPeror tool [25]. Microbial communities were analyzed using two complementary multivariate approaches: (1) constrained ordination and (2) generalized linear models (GLM). For the constrained ordination approach, redundancy analysis (RDA) was used, which focuses on assemblage composition differences in relation to predictors of interest (VO2peak, sex, age, BMI, and dietary components). This was implemented using the “vegan” package [26] version 2.2-1 in R. Abundance data at each taxonomical resolution (phyla, class, order, family, and genus) were first Hellinger-transformed [27] to accommodate counts data with large occurrences of low and zero abundance. Variable selection in RDA was implemented using the ordistep function of vegan using both forward and backward stepwise inclusion. Predictors selected by this method at each classification level are presented in Additional file 4. To identify genera that significantly contributed to total variance, we evaluated Spearman correlations between genus abundance and the first two RDA axes. OTUs with a significant correlation coefficient (evaluated at Bonferroni adjusted alpha level) were displayed on the RDA plots with type II scaling. To evaluate the association of genus abundance with explanatory variables, we implemented multiple negative binomial GLMs using the “mvabund” package [28]. This multiple GLM method utilizes a series of univariate F tests of the effects of predictor variables on the abundance of each taxon. Regression assumptions were assessed using residual diagnostics. Taxa that made up less than 0.1 % of the total count and occurring in less than 75 % of samples were first removed (cf. [29]). Fifty taxa met the inclusion criteria and were included in the model. The default implementation of the multi-GLM method adjusts P values to account for multiple tests. Classification of relative abundance data according to the previously described enterotypes [30] was carried out using the Calinski-Harabasz (CH) index as described online (

Bacterial phylogeny is sufficiently linked to their functional capabilities and can be used to computationally predict the functional composition of the community metagenome [19]. The normalized genus abundance OTU table was used to predict the microbiome’s metagenomic functions using PICRUSt’s extended ancestral-state reconstruction approach. A new abundance matrix of predicted functional categories based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database was created. We constructed a biplot from the output of a PCA of functional category data and visually assessed clustering patterns based on CRF groupings. Next, to isolate the influence of specific predictor variables, an RDA was also performed using these functional categories as response variables and the same variables and selection methods described above.

Similarly, to determine the role of our exploratory variables in explaining variance in fecal SCFAs, an RDA was conducted using SCFA abundance data as the response variables (cf. [31]).


Diet was not a confounding factor across fitness groups

Twenty-two males and 19 females participated in the study. Two female participants were removed from sequencing analysis due to technical errors. Table 1 represents a summary of the 39 participants’ characteristics and dietary intake. Only one participant followed a vegetarian diet, and all 39 participants reported consuming dairy products (data not shown). Age distribution was similar across all groups. The LOW group had a marginally higher BMI (25.5, SD 3.9) compared to the AVG (23.5, SD .5) and HI (22.8, SD 1.5) groups; however, the difference was not statistically significant. BMI of AVG and HI groups falls within the “normal weight” range (18.5–24.9) as defined by Health Canada, while the LOW group is marginally above the “overweight” threshold of 25. The results of the PERMANOVA (Additional file 5) showed no main differences (permutation P = 0.56) across any nutritional classes based on fitness groups. PCA plot (Fig. 1) of dietary patterns amongst the different fitness groups also showed no discrete clusters, further supporting a lack of distinct dietary patterns amongst fitness groups.

Table 1 Summary of group characteristics and dietary intake
Fig. 1
figure 1

Dietary patterns amongst fitness groups. Scores of the two first components of the PCA of dietary data for all 39 subjects are presented. Each circle represents one participant, colored based on their CRF fitness levels. A lack of distinct clustering amongst groups suggests comparable dietary patterns amongst groups

CRF is correlated with increased microbial diversity

Species diversity of each participant (alpha diversity) was determined using several indexes: species richness (SR), chao1, Shannon, Simpson, and Faith’s phylogenetic diversity (PD). As all the alpha diversity indexes were highly correlated (Additional file 6), SR was chosen as a proxy in the regression model. After screening of potential predictors via Spearman correlation analysis, three variables were included in the multiple regression model: VO2peak, sex, and relative fat intake. Of these, only VO2peak was a significant predictor of alpha diversity (Table 2), with SR significantly (P = 0.011) associated with increasing VO2peak (R adj 2 = 0.204, coefficient estimate = 5.36, t = 2.17) (Fig. 2). Replacing SR with chao1, Shannon, and Simpson index in the regression model produced identical results in that VO2peak was the only significant predictor of these indices. Faith’s PD index showed a similar relationship with VO2peak (Pearson’s R = 0.30, P = 0.062); however, it did not reach statistical significance within the regression model (R adj 2 = 0.07, coefficient estimate = 0.14, t = 1.12).

Table 2 Multiple regression summary table of species richness data
Fig. 2
figure 2

Correlation between VO2peak and species richness (SR). Result of a multiple regression model showing a significant association between VO2peak and SR when holding all other variables constant. Shaded area represent 95 % confidence intervals

CRF levels do not promote distinct clustering of beta diversity data

Overall, 207 genera from 14 phyla were represented across all participants (Table 3). The HI group included representation from 173 genera, while the AVG and LOW groups were made up of 152 and 153, respectively. PCoA plots constructed using Bray-Curtis (Fig. 3); weighted and unweighted unifrac dissimilarity indices (Additional file 7) did not show group clustering based on fitness levels. Clustering of our dataset based on the CH index favored a two cluster partitioning (Additional file 8) rather than the proposed three enterotypes [30].

Table 3 Identified known taxa across fitness groups
Fig. 3
figure 3

Beta diversity amongst fitness groups. PCoA plot of genus abundance data based on Bray-Curtis dissimilarity measure shows no clear clustering when grouped according to CRF levels

Protein intake and age but not CRF explain overall community composition

The global RDA model which selected sex, age, and protein as meaningful explanatory variables was significant (P = 0.005) as assessed by Monte Carlo Permutation Procedure (MCPP) (1000 permutations). A total of 12.7 % of the overall variation in taxon composition was attributed to these explanatory variables, of which the majority were explained by the first and second axes (Fig. 4) which accounted for 7.9 and 2.3 % of the total variation, respectively. The RDA indicated that VO2peak did not significantly explain beta diversity at any taxonomic resolution, whereas total protein intake was significant at each resolution tested (Additional file 4). In addition, age, sex, and the omega6-omega3 ratio (n6:n3) were also marginally significant explanatory variables, though only at particular taxanomic resolutions. In Fig. 4b, we highlight 19 genera that were significantly correlated with one or both of the first two RDA axes. Amongst these, Bacteroides was strongly associated with protein intake along RDA2 while Odoribacter, Rikenellaceae, Oscillospira, and an unclassified RF39 were most strongly correlated with age along RDA1. Other genera that strongly aligned with RDA1, but not correlated with any explanatory variables, included Blautia and unclassified genera from Lachnospiraceae, Christensenellaceae, Ruminococcaceae, and Clostridiales.

Fig. 4
figure 4

Bacterial abundance RDA correlation biplots constrained by selected explanatory variables. The sites and explanatory variables (a) and genera (b) plots are presented separately for clarity; however, they are derived from the same RDA model, note the difference in axes scales. RDA1 and RDA2 which explain over 10 % of total variation in beta diversity are plotted. The global model’s P value was calculated using the Monte Carlo Permutation Procedure (MCPP). In plot A, subjects are color coded according to their CRF levels for illustrative purposes only as groupings were not included in the model. Black circles represent centroids for the categorical variable sex

CRF is associated with distinct microbiome functions rather than abundances of specific bacterial taxa

Results of our GLMs suggest that, overall, genus abundances vary significantly in relation to our exploratory variables (model P = 0.003) with VO2peak and sex identified as significant factors (P < 0.002). After stringent adjustments for multiple testing, the univariate follow-up tests revealed no significant response amongst the 50 individual taxa included. Without adjusting for multiple testing (and keeping in mind the increased potential for type-1 errors), several taxa exhibited positive relationship with VO2peak (P < 0.05). These include, Coprococcus, Roseburia, Adlercreutzia, and unknown members of Clostridiales, Lachnospiraceae, and Erysipelotrichaceae. We further explored whether the functional composition of the microbiomes were associated with CRF. Similar to the beta diversity analyses, no clear group clustering emerged based on CRF classification alone (Additional file 9). The RDA, however, showed that the variables VO2peak, sex, fiber, and sugar intake collectively had a marginally significant role in explaining compositional variation in functional categories (MCPP P = 0.063) (Additional file 4). Overall, 15.5 % of the total variation of the functional category composition was accounted for by these explanatory variables, of which 11 and 2.2 % were accounted for by the first and second axes, respectively (Fig. 5a, b). Of the 274 functional categories observed across all participants, we identified 65 significant categories. A complete list of the 65 identified functional categories is presented in Additional file 10. The RDA plots illustrate a pattern of VO2peak and fiber intake explaining variation amongst participants with high CRF levels. VO2peak was most strongly correlated with KEGG functional categories: sporulation, bacterial motility proteins including proteins involved in flagella assembly, and chemotaxis while negatively correlated with lipopolysaccharide (LPS) biosynthesis and LPS biosynthesis proteins. Total sugar intake was strongly correlated with the transporters, ABC transporters, and transcription factors while inversely associated with membrane and intracellular structural molecules and pore ion channels. Sex of participants did not play a significant role in any of the described parameters. Given the importance of SCFAs in gut health, we had a priori interest in “fatty acid biosynthesis” despite its exclusion from the RDA selection process. We found VO2peak to be positively correlated (P = 0.046, Spearman’s rho = 0.322) with fatty acid biosynthesis (Fig. 6). Thus, to understand which SCFAs correlated with VO2peak, we quantified fecal SCFAs via GC.

Fig. 5
figure 5

RDA correlation biplots of predicted metagenomics functions constrained by selected explanatory variables. The sites and explanatory variables (a) and genera (b) plots are presented separately for clarity; however, they are derived from the same RDA model, note the difference in axes scales. RDA1 and RDA2 which explain over 13 % of the total variation in data are plotted. The global model’s P value was calculated using the Monte Carlo Permutation Procedure (MCPP). In plot A, subjects are color coded according to their CRF for illustrative purposes only as groupings were not included in the model. Black circles represent centroids for the categorical variable sex

Fig. 6
figure 6

Correlation between VO2peak and fatty acid biosynthesis. Spearman correlation plot showing a positive relationship between VO2peak and the functional category “fatty acid biosynthesis.” rho Spearman’s correlation coefficient

CRF is positively correlated with fecal butyric acid

RDA triplot corresponding to fecal SCFAs as constrained by our exploratory variables is presented in Fig. 7. The global model selected sex, age, carbohydrate intake, and VO2peak as significant (MCPP P = 0.001) explanatory variables. Overall, 30.1 % of the total variation of SCFA data could be explained by these variables of which 17.9 and 11.9 % were accounted for by RDA1 and RDA2, respectively. Along RDA1, age was strongly positively correlated with valeric acid and to a lesser degree with hepatonoic and caproic acid, both which were strongly inversely correlated with carbohydrate intake. Along RDA2, VO2peak was strongly correlated with butyric acid which is represented mainly across HI and AVG fitness participants. Proprionic and acetic acid on the other hand were inversely correlated to VO2peak and were represented across an area with more LOW fitness participants. Sex of the participants as represented by centroids on the triplot did not play a major role in observed variance.

Fig. 7
figure 7

RDA correlation triplot of SCFA abundance data constrained by selected explanatory variables. RDA1 and RDA2 which explain over 29 % of the total variation in SCFA data are plotted. Subjects are color coded according to their CRF for illustrative purposes only as groupings were not included in the model. Black circles represent centroids for the categorical variable sex. The global model’s P value was calculated using the Monte Carlo Permutation Procedure (MCPP)


CRF is considered a better predictor of mortality than clinical variables including established risk factors such as smoking, diabetes, and hypertension [32, 33]. Its role as a possible indicator of intestinal microbial diversity, however, has not been investigated. Our regression model showed that ~20 % of variation in gut bacterial alpha diversity could be explained by VO2peak alone; in fact, VO2peak stood as the only variable that significantly contributed to increased alpha diversity. The primary findings from this study suggest that CRF is a good predictor of gut microbial diversity in healthy humans, outperforming several other variables including sex, age, BMI, and dietary components. Although no single bacterial taxon or group of taxa showed significant variation in abundance in relation to CRF levels, the overall function of the microbiome in high CRF individuals seems to favor an increase in chemotaxis-related genes and decreased LPS biosynthetic pathways. In addition, a strong positive correlation was observed between VO2peak and fecal butyric acid, a SCFA associated with gut health [2]. In support of this, when results from the multivariate GLMs were explored without adjustment for multiple testing, abundances of key butyrate-producing members from Clostridiales, Roseburia, Lachnospiraceae, and Erysipelotrichaceae genera were found to be significantly associated with VO2peak (P < 0.05). These results suggest an important role of these taxa in relation to increased butyrate production amongst more aerobically fit individuals; however, future studies should test these ideas under controlled settings.

A recent study by Clarke et al. showed increased gut community richness amongst professional rugby players compared to sedentary BMI-matched and non-matched populations [13]. Due to extreme dietary differences amongst their groups, however, the contribution of physical fitness could not be isolated from possible diet-driven influences. For example, it has been shown that increased species richness as a result of voluntary wheel running in mice is only robust under high-fat but not low-fat feeding conditions [11], highlighting the importance of the background diet. In our study, we minimized the potential influence of diet as a confounding factor by examining LOW, AVG, and HI fitness participants with no significant differences in a comprehensive number of dietary variables. In addition, we quantify fitness using VO2peak, a measure of capacity for aerobic work and the gold standard of CRF. In their study, Clarke et al. highlighted the importance of protein intake by showing its positive correlation with alpha diversity. Interestingly, the magnitude of this correlation was comparable to our correlation coefficient between VO2peak and alpha diversity in the absence of a correlation between protein intake and alpha diversity. This may suggest that the reported correlation between protein intake and alpha diversity may have been a secondary product of increased CRF amongst the elite athletes. The mechanisms by which physical activity may promote a rich bacterial community are not known but likely involve a combination of intrinsic and extrinsic factors. For example, physically active individuals are more likely to be exposed to their environmental biosphere and follow an overall healthy lifestyle and as so harbor a richer microbiota. Simultaneously, intrinsic adaptations to endurance training can lead to changes in the GI tract, for example, decreased blood flow, tissue hypoxia, and increased transit and absorptive capacity [34, 35]. These and other potential adaptation mechanisms such as change in gut pH may create an environmental setting allowing for richer community diversity.

Beta diversity analysis of our cohort did not show distinct clustering of bacterial communities based on fitness categories. This contrasts with previous reports [11], which showed distinct clustering resulting from wheel running in mice, as well as those by Clarke et al. who showed clustering of rugby players’ microbiota [13]. In addition to extreme dietary differences, several mechanisms may explain these discrepancies. Community clustering amongst cohabited animals or the “cage-effect” is known to show high community structure concordance [36, 37]; it is therefore plausible that this phenomena extends to humans. As team members are likely to spend extended periods of time together on and off the field, there is an increased likelihood of microbial exchange leading to distinct similar bacterial profiles. Participants in the current study on the other hand did not belong to a common organization and did not show any detectable dietary differences. Other components of fitness not accounted for in the current study such as anaerobic capacity and resistance muscle training may also influence community composition, though to date, no existing work has examined these parameters in relation to gut microbiota.

Total protein intake was consistently seen as a significant contributor to beta diversity at each taxonomic rank tested, while sex and age were only influential beyond the phyla level. Unlike dietary carbohydrates and fats, which are commonly studied, the role of protein in the context of intestinal microbiota is considerably less understood. Protein-rich diets have been associated with prevalence of Bacteroides genus [38]. Echoing this, results from our RDA analysis showed a strong correlation between protein intake and Bacteroides without bias towards any specific fitness groups. Excessive fermentation of dietary protein in the GI tract is generally considered detrimental due to the production of toxic by-products such as amines, phenols, indoles, thiols, and ammonia [39, 40]. Further research however is needed to determine the synthesis kinetics and clinical consequence of these by-products during increased nutritional status and metabolic demands such as during prolonged exercise training. The RDA results further showed significant contribution of members of the Ruminococcaceae and Lachnospiraceae, two of the most abundant families in gut environments [41], in explaining community diversity. These plant degraders persist in fibrolytic gut communities and are considered an important component of a healthy gut, while their depletion has been observed in IBD patients [42, 43]. Ruminococcaceae and Bacteroides were anticorrelated, likely reflecting the persistence of these groups in plant carbohydrate- versus protein-rich gut environments, respectively. Interestingly, an unclassified member of the Christensenellaceae family was seen significantly correlated with age; this was true despite the limited range of our participants’ age (18–35 years). Though there is limited published work regarding its role, a recent study identified Christensenellaceae as the most heritable member of the gut microbiota and highlighted their role in promoting a lean phenotyope [44].

An increase in CRF demands various phenotypic and metabolic adaptations by the host which subsequently may require adaptation by the commensal bacteria. The results of our RDA showed that although VO2peak was not significantly associated with variation in community composition, it was associated with changes in the metagenomic functions of the microbiome. Functional categories most strongly correlated with VO2peak were related to bacterial motility (categories: bacterial motility proteins, flagella assembly, and bacterial chemotaxis), sporulation, and to a lesser extend the two-component system which enables bacterial communities to sense and respond to environmental factors. One possible mechanism behind these associations may derive from the observation that butyrate, which was more abundant amongst fit participants, can modulate neutrophil chemotaxis [45, 46]. VO2peak was inversely correlated with LPS biosynthesis and LPS biosynthesis proteins which were more aligned amongst less fit participants. LPS is a major component of the cell wall of gram-negative bacteria and is considered an endotoxin when present in the blood. By binding to extracellular toll-like receptor 4 (TLR4) found on many cell types, LPS elicits strong inflammatory responses that may be detrimental to the host. Continuous low-level translocation of LPS into circulation can induce chronic low-level inflammatory states that are associated with development of obesity and other metabolic syndromes [47]. These inflammatory states are thought to derive to some extent from inflammatory responses to blood LPS which is elevated in sedentary humans [48]. Exercise training attenuates inflammation in part by reducing elevated blood LPS [48]. The inverse relationship between VO2peak and LPS biosynthesis pathways observed in the current study therefore extends previous research, suggesting a beneficial consequence of increased physical activity to derive from decreased LPS biosynthesis. The findings here suggest that the gut microbiota adapt to metabolic demands of a physically active lifestyle, anchored around a set of physiological functions.

Production of SCFAs is the primary result of carbohydrate fermentation under anaerobic conditions in the gut. Butyric acid or butyrate is the most commonly studied of these SCFAs in regard to intestinal health. As the primary food source of colonocytes, butyrate plays an important role in gut homeostasis and health. It has been shown to possess anticancer and anti-inflammatory properties [49] and be involved in gut motility [50, 51], energy expenditure [52], intestinal permeability [53], and appetite control [54], while a decrease in butyrate levels has been suggested in etiology of ulcerative colitis [55]. We observed a strong positive correlation between VO2peak and fecal butyrate levels, which could not be accounted for by ingested dietary butyrate or its substrate, fiber. This suggests that the microbial profiles of physically fit individuals favor butyrate producing taxa leading to increased fecal butyrate. This is in accordance with Matsumoto et al. who observed increases in butyrate levels in cecum of rats exposed to 5 weeks of wheel running [10].


The primary findings from this correlative study suggest that gut microbial diversity in healthy humans is associated with aerobic fitness and that dietary protein moderates microbial community composition. They further suggest that adaptation of the microbiota to demands of increasing physical fitness is anchored around a set of functional cores rather than specific bacterial groups. In particular, the microbiome profile of fit individuals appears to favor butyrate production, a common indicator of gut health, potentially through increases in Clostridiales, Roseburia, Lachnospiraceae, and Erysipelotrichaceae genera. Overall, our findings are consistent with a role for physical activity in promoting gut intestinal health via associated changes in the microbial community composition. Based on these findings, we encourage further research on the use of aerobic exercise prescription as an adjuvant therapy in prevention and treatment of dysbiosis-associated diseases.


BMI, body mass index; CRF, cardiorespiratory fitness; LPS, lipopolysaccharide; OTU, operational taxa unit; PCA, principal component analysis; RDA, redundancy analysis; SCFAs, short-chain fatty acids; VO2peak, peak oxygen uptake


  1. Sekirov I, Russell SL, Antunes LCM, Finlay BB. Gut microbiota in health and disease. Physiol Rev. 2010;90:859–904.

    Article  CAS  PubMed  Google Scholar 

  2. Leonel AJ, Alvarez-Leite JI. Butyrate: implications for intestinal function. Curr Opin Clin Nutr Metab Care. 2012;15:474–9.

    Article  CAS  PubMed  Google Scholar 

  3. Chan YK, Estaki M, Gibson DL. Clinical consequences of diet-induced dysbiosis. Ann Nutr Metab. 2013;63 Suppl 2:28–40.

    Article  CAS  PubMed  Google Scholar 

  4. Le Chatelier E, Nielsen T, Qin J, Prifti E, Hildebrand F, Falony G, Almeida M, Arumugam M, Batto JM, Kennedy S, et al. Richness of human gut microbiome correlates with metabolic markers. Nature. 2013;500:541–6.

    Article  PubMed  Google Scholar 

  5. Ott SJ, Musfeldt M, Wenderoth DF, Hampe J, Brant O, Folsch UR, Timmis KN, Schreiber S. Reduction in diversity of the colonic mucosa associated bacterial microflora in patients with active inflammatory bowel disease. Gut. 2004;53:685–93.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  6. Chang JY, Antonopoulos DA, Kalra A, Tonelli A, Khalife WT, Schmidt TM, Young VB. Decreased diversity of the fecal microbiome in recurrent Clostridium difficile-associated diarrhea. J Infect Dis. 2008;197:435–8.

    Article  PubMed  Google Scholar 

  7. Ahn J, Sinha R, Pei Z, Dominianni C, Wu J, Shi J, Goedert JJ, Hayes RB, Yang L. Human gut microbiome and risk for colorectal cancer. J Natl Cancer Inst. 2013;105:1907–11.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  8. Giamarellos-Bourboulis E, Tang J, Pyleris E, Pistiki A, Barbatzas C, Brown J, Lee CC, Harkins TT, Kim G, Weitsman S, et al. Molecular assessment of differences in the duodenal microbiome in subjects with irritable bowel syndrome. Scand J Gastroenterol. 2015;50:1–12.

    Article  Google Scholar 

  9. Sommer F, Backhed F. The gut microbiota—masters of host development and physiology. Nat Rev Microbiol. 2013;11:227–38.

    Article  CAS  PubMed  Google Scholar 

  10. Matsumoto M, Inoue R, Tsukahara T, Ushida K, Chiji H, Matsubara N, Hara H. Voluntary running exercise alters microbiota composition and increases n-butyrate concentration in the rat cecum. Biosci Biotechnol Biochem. 2008;72:572–6.

    Article  CAS  PubMed  Google Scholar 

  11. Evans CC, LePard KJ, Kwak JW, Stancukas MC, Laskowski S, Dougherty J, Moulton L, Glawe A, Wang Y, Leone V, et al. Exercise prevents weight gain and alters the gut microbiota in a mouse model of high fat diet-induced obesity. PLoS One. 2014;9:e92193.

    Article  PubMed  PubMed Central  Google Scholar 

  12. Allen JM, Berg Miller ME, Pence BD, Whitlock K, Nehra V, Gaskins HR, White BA, Fryer JD, Woods JA. Voluntary and forced exercise differentially alters the gut microbiome in C57BL/6J mice. J Appl Physiol. 2015;118(1985):1059–66.

    Article  CAS  PubMed  Google Scholar 

  13. Clarke SF, Murphy EF, O'Sullivan O, Lucey AJ, Humphreys M, Hogan A, Hayes P, O'Reilly M, Jeffery IB, Wood-Martin R, et al. Exercise and associated dietary extremes impact on gut microbial diversity. Gut. 2014;63:1913–20.

    Article  CAS  PubMed  Google Scholar 

  14. Brown K, Godovannyi A, Ma C, Zhang Y, Ahmadi-Vand Z, Dai C, Gorzelak MA, Chan Y, Chan JM, Lochner A, et al. Prolonged antibiotic treatment induces a diabetogenic intestinal microbiome that accelerates diabetes in NOD mice. ISME J. 2015.

  15. Klindworth A, Pruesse E, Schweer T, Peplies J, Quast C, Horn M, Glockner FO. Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Res. 2013;41:e1.

    Article  CAS  PubMed  Google Scholar 

  16. Zhang J, Kobert K, Flouri T, Stamatakis A. PEAR: a fast and accurate Illumina paired-end reAd mergeR. Bioinformatics. 2014;30:614–20.

    Article  CAS  PubMed  Google Scholar 

  17. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Pena AG, Goodrich JK, Gordon JI, et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7:335–6.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  18. McDonald D, Price MN, Goodrich J, Nawrocki EP, DeSantis TZ, Probst A, Andersen GL, Knight R, Hugenholtz P. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J. 2012;6:610–8.

    Article  CAS  PubMed  Google Scholar 

  19. Langille MG, Zaneveld J, Caporaso JG, McDonald D, Knights D, Reyes JA, Clemente JC, Burkepile DE, Vega Thurber RL, Knight R, et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol. 2013;31:814–21.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  20. McMurdie PJ, Holmes S. Waste not, want not: why rarefying microbiome data is inadmissible. PLoS Comput Biol. 2014;10:e1003531.

    Article  PubMed  PubMed Central  Google Scholar 

  21. Paulson JN, Stine OC, Bravo HC, Pop M. Differential abundance analysis for microbial marker-gene surveys. Nat Methods. 2013;10:1200–2.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  22. R Core Team. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2016.

  23. Wickham H. ggplot2: elegant graphics for data analysis. New York: Springer; 2009.

    Book  Google Scholar 

  24. McNamee R. Regression modelling and other methods to control confounding. Occup Environ Med. 2005;62:500–6. 472.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  25. Vazquez-Baeza Y, Pirrung M, Gonzalez A, Knight R. EMPeror: a tool for visualizing high-throughput microbial community data. Gigascience. 2013;2:16.

    Article  PubMed  PubMed Central  Google Scholar 

  26. Oksanen J, Blanchet FG, Kindt R, Legendre P, Minchin PR, O'Hara RB, Simpson GL, Solymos P, Stevens MHH, Wagner H. Vegan: Community ecology package. R package version 2.3-5. 2016. [].

  27. Legendre P, Gallagher ED. Ecologically meaningful transformations for ordination of species data. Oecologia. 2001;129:271–80.

    Article  Google Scholar 

  28. Wang Y, Naumann U, Wright S, Eddelbuettel D, Warton D. mvabund: Statistical methods for analysing multivariate abundance data. R package version 3.11.9. 2016. [].

  29. Nielsen S, Needham B, Leach ST, Day AS, Jaffe A, Thomas T, Ooi CY. Disrupted progression of the intestinal microbiota with age in children with cystic fibrosis. Sci Rep. 2016;6:24857.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Arumugam M, Raes J, Pelletier E, Le Paslier D, Yamada T, Mende DR, Fernandes GR, Tap J, Bruls T, Batto JM, et al. Enterotypes of the human gut microbiome. Nature. 2011;473:174–80.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  31. You YM, Wang J, Huang XM, Tang ZX, Liu SR, Sun OJ. Relating microbial community structure to functioning in forest soil organic carbon transformation and turnover. Ecol Evol. 2014;4:633–47.

    Article  PubMed  PubMed Central  Google Scholar 

  32. Myers J, Prakash M, Froelicher V, Do D, Partington S, Atwood JE. Exercise capacity and mortality among men referred for exercise testing. N Engl J Med. 2002;346:793–801.

    Article  PubMed  Google Scholar 

  33. Kodama S, Saito K, Tanaka S, Maki M, Yachi Y, Asumi M, Sugawara A, Totsuka K, Shimano H, Ohashi Y, et al. Cardiorespiratory fitness as a quantitative predictor of all-cause mortality and cardiovascular events in healthy men and women: a meta-analysis. JAMA. 2009;301:2024–35.

    Article  CAS  PubMed  Google Scholar 

  34. Rosa EF, Silva AC, Ihara SS, Mora OA, Aboulafia J, Nouailhetas VL. Habitual exercise program protects murine intestinal, skeletal, and cardiac muscles against aging. J Appl Physiol. 2005;99(1985):1569–75.

    Article  PubMed  Google Scholar 

  35. Gisolfi CV. Is the GI system built for exercise? News Physiol Sci. 2000;15:114–9.

    PubMed  Google Scholar 

  36. Lees H, Swann J, Poucher SM, Nicholson JK, Holmes E, Wilson ID, Marchesi JR. Age and microenvironment outweigh genetic influence on the Zucker rat microbiome. PLoS One. 2014;9:e100916.

    Article  PubMed  PubMed Central  Google Scholar 

  37. McCafferty J, Muhlbauer M, Gharaibeh RZ, Arthur JC, Perez-Chanona E, Sha W, Jobin C, Fodor AA. Stochastic changes over time and not founder effects drive cage effects in microbial community assembly in a mouse model. ISME J. 2013;7:2116–25.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  38. Wu GD, Chen J, Hoffmann C, Bittinger K, Chen YY, Keilbaugh SA, Bewtra M, Knights D, Walters WA, Knight R, et al. Linking long-term dietary patterns with gut microbial enterotypes. Science. 2011;334:105–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  39. Macfarlane GT, Macfarlane S. Bacteria, colonic fermentation, and gastrointestinal health. J AOAC Int. 2012;95:50–60.

    Article  CAS  PubMed  Google Scholar 

  40. Rist VT, Weiss E, Eklund M, Mosenthin R. Impact of dietary protein on microbiota composition and activity in the gastrointestinal tract of piglets in relation to gut health: a review. Animal. 2013;7:1067–78.

    Article  CAS  PubMed  Google Scholar 

  41. Jalanka-Tuovinen J, Salonen A, Nikkila J, Immonen O, Kekkonen R, Lahti L, Palva A, de Vos WM. Intestinal microbiota in healthy adults: temporal analysis reveals individual and common core and relation to intestinal symptoms. PLoS One. 2011;6:e23035.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Fujimoto T, Imaeda H, Takahashi K, Kasumi E, Bamba S, Fujiyama Y, Andoh A. Decreased abundance of Faecalibacterium prausnitzii in the gut microbiota of Crohn’s disease. J Gastroenterol Hepatol. 2013;28:613–9.

    Article  CAS  PubMed  Google Scholar 

  43. Frank DN, St Amand AL, Feldman RA, Boedeker EC, Harpaz N, Pace NR. Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases. Proc Natl Acad Sci U S A. 2007;104:13780–5.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  44. Goodrich JK, Waters JL, Poole AC, Sutter JL, Koren O, Blekhman R, Beaumont M, Van Treuren W, Knight R, Bell JT, et al. Human genetics shape the gut microbiome. Cell. 2014;159:789–99.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  45. Vinolo MA, Ferguson GJ, Kulkarni S, Damoulakis G, Anderson K, Bohlooly YM, Stephens L, Hawkins PT, Curi R. SCFAs induce mouse neutrophil chemotaxis through the GPR43 receptor. PLoS One. 2011;6:e21205.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  46. Bocker U, Nebe T, Herweck F, Holt L, Panja A, Jobin C, Rossol S, BS R, Singer MV. Butyrate modulates intestinal epithelial cell-mediated neutrophil migration. Clin Exp Immunol. 2003;131:53–60.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  47. Monteiro R, Azevedo I. Chronic inflammation in obesity and the metabolic syndrome. Mediators Inflamm 2010, 2010. doi:10.1155/2010/289645.

  48. Lira FS, Rosa JC, Pimentel GD, Souza HA, Caperuto EC, Carnevali LC, Seelaender M, Damaso AR, Oyama LM, de Mello MT, Santos RV. Endotoxin levels correlate positively with a sedentary lifestyle and negatively with highly trained subjects. Lipids Health Dis. 2010;9:82.

    Article  PubMed  PubMed Central  Google Scholar 

  49. Hamer HM, Jonkers D, Venema K, Vanhoutvin S, Troost FJ, Brummer RJ. Review article: the role of butyrate on colonic function. Aliment Pharmacol Ther. 2008;27:104–19.

    Article  CAS  PubMed  Google Scholar 

  50. Scheppach W. Effects of short chain fatty acids on gut morphology and function. Gut. 1994;35:S35–38.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  51. Hurst NR, Kendig DM, Murthy KS, Grider JR. The short chain fatty acids, butyrate and propionate, have differential effects on the motility of the guinea pig colon. Neurogastroenterol Motil. 2014;26:1586–96.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  52. Gao Z, Yin J, Zhang J, Ward RE, Martin RJ, Lefevre M, Cefalu WT, Ye J. Butyrate improves insulin sensitivity and increases energy expenditure in mice. Diabetes. 2009;58:1509–17.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  53. Kanauchi O, Iwanaga T, Mitsuyama K, Saiki T, Tsuruta O, Noguchi K, Toyonaga A. Butyrate from bacterial fermentation of germinated barley foodstuff preserves intestinal barrier function in experimental colitis in the rat model. J Gastroenterol Hepatol. 1999;14:880–8.

    Article  CAS  PubMed  Google Scholar 

  54. Sleeth ML, Thompson EL, Ford HE, Zac-Varghese SE, Frost G. Free fatty acid receptor 2 and nutrient sensing: a proposed role for fibre, fermentable carbohydrates and short-chain fatty acids in appetite regulation. Nutr Res Rev. 2010;23:135–45.

    Article  CAS  PubMed  Google Scholar 

  55. Kumari R, Ahuja V, Paul J. Fluctuations in butyrate-producing bacteria in ulcerative colitis patients of North India. World J Gastroenterol. 2013;19:3404–14.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

Download references


We thank Wade Klaver (UBCO) for the computer technical support and Juan Jovel (U of A) for the data analysis support.


ME was supported by a PGS-D through the Natural Science and Engineering Research Council (NSERC). SKG was supported by a USRA from NSERC. JP is supported financially by an NSERC Discovery Grant. JPL is a Canadian Institutes of Health Research (CIHR) New Investigator and is supported by an NSERC Discovery Grant. SG is a current MSFHR scholar and supported through grants from Dairy Farmers of Canada and the Canadian Diabetes Association. This work was supported by grants funded through NSERC to D.L.G.

Availability of data and materials

All sequence reads and associated metadata files are available from the Sequence Read Archive (accession number#: SRP068480, R scripts are available from the corresponding author on reasonable request.

Authors’ contributions

ME was involved in all aspects of the design, experimentation, data collection, analysis, and manuscript writing, editing, and submission; JP is the primary advisor for community analysis, statistical analyses, and critical edits and review of manuscript; PB and KRM performed all exercise lab testing, participants interviews, and human data collection; SKG performed GC analysis; SG supervised and funded GC analysis; JPL supervised all aspects of exercise lab including participant recruitment, training, ethical approval acquisition, and critical review of manuscript; ZAV performed the food data analysis; DLG supervised all aspects of the study including design, experimentation, data collection, critical review, and editing of manuscript and funded the project. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

This study was conducted according to the Declaration of Helsinki guidelines, and all procedures were approved by University of British Columbia Clinical Research Ethics Board (Reference ID: H14-00482).

Author information

Authors and Affiliations


Corresponding author

Correspondence to Deanna L. Gibson.

Additional files

Additional file 1:

Sample food survey. A detailed description of all foods and supplemented consumed by the subjects was written during the interview and later analyzed using FoodWorks nutrient analysis software (Version16.0) by a research nutritionist. (PDF 162 kb)

Additional file 2:

Heyward’s 2006 normal VO2max reference chart. Subjects characterized as “Superior” or “Excellent” according to the Heyward classification were grouped under the “HI” group, “Fair” and “Good” subjects were placed into the “AVG” group, and “Poor” was renamed to “LO.” (TIF 2121 kb)

Additional file 3:

Sampling depth rarefaction curves. Rarefaction curves of all subjects at 97 % similarity levels shown as a function of Shannon diversity index and number of sequence tags sampled. (TIF 823 kb)

Additional file 4:

Predictor variables included in the RDA models. A manual pre-screening of dietary variables based on existing literature and categories of interest was initially carried. Next, a combination of “both” forward and backward stepwise inclusion selection method using vegan’s ordistep function was used on the remaining 23 variables plus VO2peak, Sex, BMI, and Age. (TIF 240 kb)

Additional file 5:

Table summary of PERMANOVA for dietary intake amongst different fitness groups. df degrees of freedom, SS sum of squares, MS mean of squares, Pr P value as computed by 999 permutations. (TIF 101 kb)

Additional file 6:

Correlation matrix of various alpha diversity matrices. A correlation matrix using Spearman’s r showing strong correlation between all alpha diversity matrices used. Species richness (S) was thus used as a proxy for the response variable in the multiple regression model. (TIFF 11074 kb)

Additional file 7:

Beta diversity amongst fitness groups. Three dimensional PCoA plots of genus abundance data transformed with weighted (A) and unweighted (B) unifrac dissimilarity matrices show no clear clustering based on CRF levels. (TIF 256 kb)

Additional file 8:

Optimal clustering selection of bacterial data. The number of optimal clustering of all data was determined using the Calinski-Harabasz (CH) index. Optimal number of clusters did not identify the classical three enterotypes but rather favored a two cluster partitioning. (TIFF 11074 kb)

Additional file 9:

Ordination of predicted metagenomic functions data. PCA plot of centered functional category abundance data showing no clear clustering of groups based on their CRF levels. Plots were created using Statistical Analysis of Metagenomic Profiles (STAMP) tool. (TIF 398 kb)

Additional file 10:

Significant functional categories included in RDA model. A complete list of predicted functional categories and their corresponding RDA 1–4 coordinates determined to be significant in our RDA model. Using a series of Spearman correlations between each category abundance data and RDA1 then RDA2 (alpha adjusted using Bonferroni correction), we identified 65 significant categories out of a total of 274. (DOCX 22 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Estaki, M., Pither, J., Baumeister, P. et al. Cardiorespiratory fitness as a predictor of intestinal microbial diversity and distinct metagenomic functions. Microbiome 4, 42 (2016).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: