A dog gut microbiome gene catalog
A total of 129 dog stool samples were collected from 64 dogs (32 Labrador retrievers and 32 beagles; see Additional file 1: Table S1 for physical characteristics of the study cohort), with two samples from each dog (except for a single case where three samples were collected from the same dog). DNA was extracted and Illumina-sequenced in paired-end mode (125 bases per read). Each metagenome contains an average of 117 million paired-end reads (s.d. 32 million), leading to a total of 1.9 terabasepairs over all samples (Fig. 1a and Additional file 2: Table S2). Following previously developed approaches [27], we assembled the metagenomic reads from each sample into contigs, predicted genes on these contigs, and, finally, clustered the predicted genes from all samples into a non-redundant gene catalog (see Fig. 1a; “Methods” section; [28]). This catalog contains 1,247,405 non-redundant (at 95% nucleotide sequence identity) coding sequences, of which 630,230 (50.5%) are complete genes with an average size of 884 base pairs, compared to an average of 571 base pairs for incomplete ones.
As many as 97% of the reads can be recruited back to the catalog, indicating that the catalog already captures almost all of the genomic content in these samples (Fig. 1e). Two published dog metagenomes [29] from pooled dog fecal samples of six hound-cross dogs, sequenced using 454 technology with only ca. 500,000 reads each, were used to assess the generality of this catalog beyond the study cohort. When mapping these against our catalog using the same identity cutoff as for the catalog generation (95%), we were able to recruit 90.4 and 92.4% of reads to our catalog, for the two metagenomes, respectively. This implies that our catalog already contains most of the genomic content of the gut microbiome of dogs in a Western pet care center.
Taxonomic annotation (see the “Methods” section) showed that the dog gut microbiome gene catalog is predominantly composed of five phyla: Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria, and Fusobacteria, with the first two contributing more than half the detected genes (Fig. 1c).
A comparison of the dog gut gene catalog with those from other mammals
We compared our gene catalog with three previously published gut microbial gene catalogs: from human [28], pig [7], and mouse [30] hosts, which had been built based on similar (Illumina sequencing) data with similar computational procedures. We applied the same taxonomic annotation to all catalogs (see the “Methods” section). The phylum-level distribution of genes in the dog gut is most similar to that of the human gut catalog, although we observe a higher proportion of genes from Fusobacteria (Fig. 1c). The mouse catalog contains the largest fraction of Firmicutes genes among the four species considered, while the pig catalog has a higher fraction of genes which cannot be annotated (Additional file 7).
As this analysis does not account for differences in abundance of genes or microbes, we compared the microbiomes using genus-level relative abundances (using abundance-weighted Jaccard as the basis for an ordination, Fig. 1d). The dog gut microbiome is closer to that of humans than the other non-human microbiomes (all pairwise comparisons are statistically significant, p values below computational precision limits, two-tailed Mann-Whitney-Wilcoxon test; see Additional file 8: Figure S2).
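The abundance-weighted comparison can be illustrated with a minimal sketch. It assumes that "abundance-weighted Jaccard" refers to the commonly used weighted Jaccard (Ruzicka) distance, which may differ in detail from the implementation used for Fig. 1d; the function name and the toy profiles are hypothetical and are not data from the study.

```python
def weighted_jaccard_distance(a, b):
    """Weighted Jaccard (Ruzicka) distance between two abundance profiles.

    `a` and `b` map genus name -> relative abundance; a genus missing
    from a profile is treated as having abundance 0.
    """
    genera = set(a) | set(b)
    shared = sum(min(a.get(g, 0.0), b.get(g, 0.0)) for g in genera)
    total = sum(max(a.get(g, 0.0), b.get(g, 0.0)) for g in genera)
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - shared / total


# Hypothetical toy profiles (assumption: illustrative values only).
sample_x = {"Bacteroides": 0.5, "Prevotella": 0.3, "Fusobacterium": 0.2}
sample_y = {"Bacteroides": 0.4, "Prevotella": 0.1, "Lactobacillus": 0.5}
distance_xy = weighted_jaccard_distance(sample_x, sample_y)
print(f"weighted Jaccard distance: {distance_xy:.3f}")
```

A matrix of such pairwise distances between samples is the kind of input an ordination method would take.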
To further quantify the overlap of the three animal gut microbiomes with that of the human, we recruited short sequencing reads from each host-associated gut microbiome to the human gut gene catalog [28], accounting for gene differential abundance (Fig. 1e). As expected, human reads from the MetaHIT [31] and the HMP projects [32] mapped at the highest rate to the human catalog. Among the animal microbiomes, a much larger fraction of dog reads map to the human catalog than is the case for pigs: 63% of dog reads could be mapped to the human catalog, compared to only 32.9% of pig and 19.9% of mouse reads. When mapping human reads to the animal catalogs, 28% of reads can be mapped to the dog catalog, just slightly more than the fraction that can be mapped to the pig catalog, 27.2%. A lower rate, 22.5%, maps to the mouse catalog (Additional file 9: Figure S3).
To evaluate the overlap between the gene catalogs, we clustered all the catalogs together using the same parameters as were used when building the catalogs (see Fig. 1d). Of the three animal catalogs, the dog catalog shares the largest fraction of its genes with the human catalog (309,232 out of 1,247,405; 24.8%) and the mouse catalog the smallest (122,131 out of 2,487,431; 4.9%), with the pig catalog in between (797,746 out of 7,238,249; 11.0%), the last figure being very similar to a previous report [7]. These conclusions are robust to removing low-abundance genes or equalizing the number of genes by random sampling (see Additional file 10: Figure S4). Because the human catalog is much larger (9,780,814 genes), the shared genes make up a much smaller fraction of it, namely 3.2% for dogs, 8.2% for pigs, and only 1.2% for mice.
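The overlap fractions follow directly from the gene counts quoted in this paragraph. The short sketch below recomputes them from both perspectives (as a share of each animal catalog and as a share of the human catalog); the dictionary and function names are introduced here for illustration only.

```python
# Gene counts as quoted in the text.
CATALOG_SIZE = {
    "dog": 1_247_405,
    "mouse": 2_487_431,
    "pig": 7_238_249,
    "human": 9_780_814,
}
SHARED_WITH_HUMAN = {"dog": 309_232, "mouse": 122_131, "pig": 797_746}


def overlap_percentages(host):
    """Return (% of the host catalog, % of the human catalog) that is shared."""
    shared = SHARED_WITH_HUMAN[host]
    return (
        100.0 * shared / CATALOG_SIZE[host],
        100.0 * shared / CATALOG_SIZE["human"],
    )


for host in SHARED_WITH_HUMAN:
    of_host, of_human = overlap_percentages(host)
    print(f"{host}: {of_host:.1f}% of its own catalog, "
          f"{of_human:.1f}% of the human catalog")
```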
The four-way intersection contains only a small number of genes (7513 out of a total of 21,385,247 genes considered). This suggests that although there are similar bacteria at the genus and even species level (Additional file 3: Table S3), most strains harbor host-specific genes [33]. To test this hypothesis and to ensure that the similarity of the dog and human microbiomes was not due to direct transmission of microbes from humans to dogs, we examined the host specificity of strains by profiling single-nucleotide polymorphisms (SNPs) for species present in our dog samples and in publicly available human microbiome samples. Among the species with high enough coverage using default metaSNV parameters [34], a minimal overlap in SNP space between any human and dog strains was observed for only a single species, Bacteroides sp. D20, and this was due to a single dog sample (the two most abundant species are shown in Fig. 1f; for all six species that could be reliably profiled given the depth of sequencing, see Additional file 11: Figure S5). Thus, we conclude that persistent sharing of microbial strains between hosts of different species is a rare event.
These different analyses consistently show that, of the three animal gut microbiomes considered, the mouse gut microbiome (the current go-to model system) is the least similar to the human gut microbiome. When comparing pig and dog gut microbiomes to the human one, considering in particular the analyses that are robust to the presence in the catalog of rare and low-abundance genes (by taking the abundances into account), we conclude that, overall, the dog gut microbiome has a higher taxonomic and functional overlap with the human gut microbiome than the pig gut microbiome does. As microbial gut strains are host-specific, this similarity cannot be explained solely by direct transmission between dogs and humans. Rather, it is likely to be a function of similar physiology and lifestyle. To further explore the behavior of the dog gut microbiome, particularly in comparison to that of humans, we investigated the dog microbiome response to dietary intervention.
Effect of diet on the dog gut microbiome
Sixty-four dogs from two breeds were fed for 4 weeks on a common baseline diet (Base; for diet details, see [35]), followed by random assignment to one of two possible diet interventions: high-protein/low-carbohydrate (HPLC) or lower protein/higher carbohydrate (LPHC). The Base diet was more similar to the LPHC diet (Fig. 2a (top); Additional file 5: Table S5). To avoid the confounding effect of changes in the host phenotype, dogs were fed to maintain initial body weight (minimum energy requirement). Stool samples were collected before and at the end of the diet intervention (Fig. 2a). To control for possible batch effects, the study subjects were randomly split into two groups of 32 dogs and the procedure was repeated for each group, at the same pet care center, 1 month apart. One of the dogs had to be excluded from analysis due to an antibiotic treatment for an infection unrelated to the study.
In response to the diets, we see a large shift in the overall taxonomic composition of the microbiome (Fig. 2b–d; p ≤ 0.0001 using PERMANOVA [36] for diet effect; see also Additional file 12: Figure S6, which presents the distance boxplots for all samples, and Additional file 13: Figure S7, which presents the same results using UniFrac [37] and PINA [38] distances as an alternative). Specifically, the microbiome of HPLC-fed dogs shows a larger shift than that of LPHC-fed dogs, when compared to the Base diet, which is in line with the similarity between the LPHC and Base diets (Fig. 2a (top)). The consistency of the community shift argues for a direct effect of the diet as, in the absence of intervention, the dog microbiota has been reported to be stable over time, using 16S rRNA profiling [39].
In human studies, there have been several conflicting reports of the relationship of the Firmicutes:Bacteroidetes phylum ratio with obesity, with some authors reporting a higher ratio in obese individuals [40], no difference [41], or even a lower ratio [42]. For the dogs, we see a non-significant difference between overweight/obese (OW) and lean/normal (LN) dogs at the end of the baseline period, with higher Bacteroidetes in OW dogs (p = 0.064, two-tailed Wilcoxon test). However, we observe a large and significant difference induced by the diet, with the HPLC resulting in a higher Firmicutes:Bacteroidetes ratio in both OW and LN dogs than LPHC (Additional file 14: Figure S8).
At the genus level, the ratio of Bacteroides to Prevotella has also been found to be important in the human gut microbiome. It has been shown to change in response to diet, with higher Prevotella relative abundance being observed in high-carbohydrate diets, while higher relative abundance of Bacteroides has been associated with a high-protein diet [43, 44]. In our dog data, we observe that the ratio of Prevotella to Bacteroides is higher in the baseline and LPHC when compared to the HPLC (p = 4·10⁻¹⁰, Kruskal-Wallis test over the three diets; all pairwise comparisons are also significant, with p < 0.001; see Additional file 15: Figure S9), reproducing the observations in human diet studies. A differential impact of two diets differing in protein/carbohydrates on the gut microbiome of kittens was also previously reported [45, 46]. However, in that case, no large global shift was observed in the overall Firmicutes:Bacteroidetes ratio between diets, while, at the genus level, Megasphaera represented a large fraction of the microbiome of kittens fed an MPMC (moderate-protein/moderate-carbohydrate) diet. In the dog microbiome, this taxon represents only a small fraction of the microbiota (average relative abundance of 1.1·10⁻³), as it does in humans (average relative abundance of 2.8·10⁻⁴).
The highest overall shift in community composition relative to pre-treatment baseline was observed in HPLC-fed OW dogs (p = 0.00014, two-tailed Wilcoxon test on compositional dissimilarities between baseline and post-intervention samples, comparing HPLC/OW to the rest of the data; see also Fig. 2d). This effect cannot be explained by any single genus, as it remains statistically significant in every case after removing any single genus, pair, or triplet of genera (all tests have p < 0.05; two-tailed Wilcoxon test, comparing HPLC/OW to the rest of the data, as above). Rather, the shift seems to be driven by a combination of four genera: Lactobacillus, Prevotella, Streptococcus, and Turicibacter, all of which showed significantly higher abundance variation in HPLC/OW dogs than in all other subcohorts. Thus, the OW dogs’ microbiome was more sensitive to the dietary shift from Base to HPLC (which was a more drastic intervention than the switch to LPHC). This is consistent with the view that their microbiome resides in a less stable state compared to that of the healthy LN population [47, 48].
Some taxa became detectable or undetectable in response to diet (the detection limit is ca. 2·10⁻⁵ in relative abundance). For example, Lactobacillus ruminis was not detected in any of the HPLC-fed dogs, even though it was present in 22% of the samples taken after the baseline diet and was detected in 59% of LPHC-fed dogs (p = 8·10⁻⁶, Fisher’s exact test after Bonferroni correction; see Fig. 3a; Additional file 16: Figure S10). This is consistent with previous genome-based suggestions that this immuno-modulatory microbe may have an advantage in utilizing complex carbohydrates as a carbon source [49]. On the other hand, both Intestinibacter bartlettii and the entire Streptococcus genus are more frequently detected in dogs on the HPLC diet compared to both Base and LPHC. These strong prevalence effects suggest that these species may be amenable to modulation with prebiotics or with foods that selectively suppress taxa. One possible explanation for this increased prevalence is that higher protein content directly favors proteolytic fermenters, or species that benefit indirectly from their metabolism. Future, more detailed annotation of the metabolic potential encoded by gut microbiome genes will allow comprehensive testing of whether this effect explains the taxonomic changes observed under HPLC.
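The prevalence comparison can be sketched as a two-sided Fisher's exact test on a 2×2 detected/not-detected table, followed by a Bonferroni correction. This is a minimal standard-library illustration rather than the analysis code of the study: the function names, the counts, and the number of tested taxa below are hypothetical placeholders.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # tolerance for float rounding
    )
    return min(1.0, total)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts (assumption, not the study's table):
# rows = diet groups, columns = (taxon detected, taxon not detected).
p_raw = fisher_exact_two_sided(0, 30, 18, 12)
p_adjusted = bonferroni(p_raw, n_tests=100)  # n_tests is a placeholder
print(f"raw p = {p_raw:.2e}, Bonferroni-adjusted p = {p_adjusted:.2e}")
```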
To identify major changes in functional composition, we linked the genes in the catalog to KEGG and CAZy enzyme classes and obtained functional profiles of the metagenomes [27]. The strength of the functional signals is exemplified by a penalized logistic regression classifier (see the “Methods” section; [50]) that can, based on either the functional or taxonomic profile of a sample, predict the diet which the dog was placed on (estimated by leave-one-out cross-validation; see Fig. 3b).
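The classification setup can be sketched as follows. This is a minimal standard-library illustration that assumes an L2 (ridge) penalty fitted by plain gradient descent; the actual penalty type, software, and parameters are those described in the “Methods” section and [50], and may differ. All function names, hyperparameters, and the toy profiles are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l2_logistic(X, y, penalty=0.1, learning_rate=0.5, n_iter=500):
    """Fit logistic regression with an L2 penalty on the weights
    (the intercept is not penalized) by batch gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - learning_rate * (gj / n + penalty * wj)
             for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b / n
    return w, b


def predict(model, x):
    """Predict class 1 when the linear score is non-negative."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


def loocv_accuracy(X, y, **fit_kwargs):
    """Leave-one-out cross-validation: each sample is predicted by a
    model trained on all the other samples."""
    hits = 0
    for i in range(len(X)):
        model = fit_l2_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:],
                                **fit_kwargs)
        hits += predict(model, X[i]) == y[i]
    return hits / len(X)


# Hypothetical toy profiles: two features per sample, label 1 = HPLC,
# label 0 = LPHC (illustrative values, not data from the study).
X = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.75], [0.1, 0.8],
     [0.9, 0.1], [0.8, 0.2], [0.75, 0.1], [0.85, 0.15]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
accuracy = loocv_accuracy(X, y)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

With real functional or taxonomic profiles there are far more features than samples, which is why a penalty term is needed to keep the fit from overfitting.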
Of the genes that changed abundance in response to diet, five CAZy enzyme classes showed the strongest signal (Gehan’s test for doubly censored data [51], at a false discovery rate of 5%; Fig. 3c). Four glycoside hydrolase (GH) classes become less abundant in the HPLC-fed dogs, which is consistent with these enzymes being involved in the metabolism of complex carbohydrates, while glycosyltransferase 6 (GT6) is more abundant in the guts of HPLC-fed dogs. Although the function of this ubiquitous enzyme in bacteria is still unclear [52], glycosyltransferases (GTs) in general catalyze formation of many different types of glycoproteins with important roles in cell-to-cell communication and recognition, thus perhaps utilizing or recycling carbohydrates.
We subsequently searched for functionally interacting species by identifying co-abundant taxa across the dog samples. We found two large groups of microbial genera whose gut abundances are significantly correlated within each group (Spearman r > 0.5 in absolute value, statistical significance tested with SparCC [53], FDR set at 5%) but anticorrelated with those of the other group (Fig. 4). The first group, more abundant in dogs fed the HPLC diet, consists mainly of genera in the Clostridiales order, while the second one is enriched for Bacteroidales. In mice, a decrease in Clostridiales was accompanied by an increase in Bacteroidales in response to induced inflammation [54], while increased Clostridiales and decreased Bacteroidetes have been reported in response to high-fat and high-sucrose diets [55]. As discussed above, better resolution in metabolic annotation of gut microbial genomes and metagenomes may allow testing to what extent direct diet effects such as higher nutrient availability for different fermenters drive these compositional changes.
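The co-abundance screen can be sketched as a pairwise Spearman correlation over per-sample genus abundances, keeping pairs with |r| above 0.5. This minimal standard-library illustration covers only the correlation threshold; the SparCC-based significance testing and FDR control are not reproduced. The function names, genus labels, and abundance values are hypothetical.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def correlated_pairs(profiles, threshold=0.5):
    """Return {(genus_a, genus_b): r} for all pairs with |r| > threshold."""
    names = sorted(profiles)
    pairs = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = spearman(profiles[a], profiles[b])
            if abs(r) > threshold:
                pairs[(a, b)] = r
    return pairs


# Hypothetical abundances of four genera across five samples.
profiles = {
    "genus_a": [1, 2, 3, 4, 5],
    "genus_b": [2, 3, 5, 7, 11],   # rises with genus_a
    "genus_c": [9, 7, 6, 3, 1],    # falls as genus_a rises
    "genus_d": [3, 1, 4, 1, 5],    # only weakly related
}
pairs = correlated_pairs(profiles)
for (a, b), r in sorted(pairs.items()):
    print(f"{a} vs {b}: r = {r:+.2f}")
```

Positive pairs would fall within one group and negative pairs would link the two anticorrelated groups.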