Chemostat culture systems support diverse bacteriophage communities from human feces
© Santiago-Rodriguez et al. 2015
Received: 28 August 2015
Accepted: 27 October 2015
Published: 9 November 2015
Most human microbiota studies focus on bacteria inhabiting body surfaces, but these surfaces also are home to large populations of viruses. Many are bacteriophages, and their role in driving bacterial diversity is difficult to decipher without the use of in vitro ecosystems that can reproduce human microbial communities.
We used chemostat culture systems known to harbor diverse fecal bacteria to decipher whether these cultures also are home to phage communities. We found that there are vast viral communities inhabiting these ecosystems, with estimated concentrations similar to those found in human feces. The viral communities are composed entirely of bacteriophages and likely contain both temperate and lytic phages based on their similarities to other known phages. We examined the cultured phage communities at five separate time points over 24 days and found that they were highly individual-specific, suggesting that much of the subject-specificity found in human viromes also is captured by this culture-based system. A high proportion of the community membership is conserved over time, but the cultured communities maintain more similarity with other intra-subject cultures than they do to human feces. In four of the five subjects, estimated viral diversity between fecal and cultured communities was highly similar.
Because the diversity of phages in these cultured fecal communities have similarities to those found in humans, we believe these communities can serve as valuable ecosystems to help uncover the role of phages in human microbial communities.
Human body surfaces are inhabited by diverse viral communities, and the majority of those identifiable viruses are bacteriophages [1, 2]. The most well studied viral communities to date are those in the human gut and the oral cavity. Viruses in the gut are rapidly evolving , exhibiting subject-specificity , responding to dietary changes, and persisting over time . Similarly, viruses in the oral cavity are highly personalized  and highly persistent, possibly as a result of their ability to evade defense mechanisms utilized by oral bacteria [6–8]. The oral cavity has different biogeographic sites with specific viral communities whose membership differs significantly in periodontal health and disease . Limitations due to virus assembly from metagenome data, however, often result in overestimations of the diversity present in human viral communities . There also are viral communities inhabiting human skin  and the respiratory tract , but the role of viruses as members of the human microbial communities is not well understood.
Bacteriophages possess the capacity to alter microbial communities by either lysing their hosts or providing phenotypic advantages to recipient bacteria [13–15]. Phages may influence biogeochemical cycles in aquatic environments by decreasing the relative abundances of specific bacterial species [16, 17] or by supporting bacterial populations with the wide repertoire of metabolic-associated genes they possess [18, 19]. The ability of phages to influence microbial diversity has been hypothesized to have consequences for human health by altering the normal microbiota that may have protective effects against colonization by more pathogenic microorganisms. While some phages have lysogenic lifestyles and may contribute gene functions to their hosts in human microbial communities, others have lytic lifestyles and may be responsible for driving microbial diversity on human body surfaces. The subgingival crevice in periodontal disease is enriched for myoviruses  that typically have lytic lifestyles, which implicates them as potential drivers of microbial diversity in human disease. That microbial diversity in human disease that may be potentiated by viruses provides a significant rationale for more studies focused on the role of viruses as members of human microbial communities.
Studies of microbiota can be limited due to restrictions imposed by working with human subjects . For example, examining the effects of a lytic virus on human microbiota meets with ethical challenges that limit experimental design. Some studies have been valuable in discerning the effects of perturbations on human gut microbiota, but personalized individual microbial profiles and individual confounders such as age, diet, and medications may influence gut microbial composition [21–26]. The restrictions associated with working with human subjects and the potential for significant confounders despite well-designed experiments necessitate the development of cultivation-based systems capable of reproducing the complex interactions of the microbiota on human body surfaces. There currently are no known cultivation-based ecosystems that effectively reproduce the diversity of viral populations in humans.
Model microbial systems must be reproducible, highly stable, and must retain similar levels of diversity to the inocula from which they were derived [27, 28]. Interactions between host and phage have been previously modeled in gnotobiotic mice , which allowed for the tracking of phage dynamics with their individual hosts. The cellular microbiota of the human gut has been modeled by cultivating fecal samples in complex chemostat systems [20, 30]. These microbial communities reach an equilibrium that resembles the gut community structure from which they originated [27, 31–33]. Models of the oral microbiota in other culture-based systems also have been shown to reproduce much of the taxonomical and functional characteristics of the oral biofilm . These systems offer the potential to study the response of microorganisms to perturbations in controllable and reproducible environments that reduce the potential for confounders that often are encountered working with human subjects [20, 35]. None of these cultivation-based ecosystems have been shown to be inhabited by viral communities, but researchers previously have successfully cultivated individual viruses in these types of systems [36, 37]. We believe that because chemostat culture systems effectively reproduce much of the cellular microbiota in the gut, they may also be home to substantial viral communities. The goals of this study were as followings: (1) to demonstrate that cultured fecal microbial communities are home to robust viral communities, (2) to develop techniques to investigate the diversity of the viral communities in cultured fecal communities, and (3) to determine whether cultured fecal communities have similarities to viral communities present in human feces.
Human subjects and chemostat cultured communities
We recruited five human subjects through the University of Guelph and sampled their feces. Donors #1, #2, and #10 were a co-habiting family unit of father, mother, and child, respectively, while donors #8 and #9 were unrelated. Each fecal sample was homogenized and processed immediately into a chemostat vessel, and the remainder of the feces was stored until processing of the viromes could take place. Chemostat vessels were operated under conditions designed to mimic the human distal colon . Cultured microbial communities were taken from donors #1, #2, and #10 on days 4, 8, 12, 16, and 24 and from donors #8 and #9 on days 3, 6, 12, 18, and 24. Previous studies have shown that cellular microbial communities reach stable state in culture by day 24 . We tested for the presence of fluorescent-staining particles (FSPs) in feces and cultures, similar to those previously described, to indicate the likely presence of viruses in both sample types . On day 24, we found that there were numerous FSPs in both sample types, with a mean 3.7 × 109 FSPs in feces and 1.4 × 109 FSPs in the chemostat cultures for all subjects studied. The presence of such high densities of FSPs in both sample types strongly suggested the presence of substantial viral communities.
We isolated and processed viruses from both feces and chemostat cultures utilizing sequential filtration followed by cesium chloride density gradient centrifugation according to our previously described protocols . We sequenced the resulting viral DNA from the feces and chemostat cultures of the five donors using semiconductor sequencing  and produced 18,584,604 reads (619,487 reads per time point and sample type) of mean length 215 nucleotides (Additional file 1: Table S1). We used BLASTN to compare all viromes to the RDP 16S rRNA genes database (E-value <10−5) and found that all were free of 16S rRNA gene homologues. We also used BLASTN to search the viromes for homologues against a human reference genome, and some similar sequences (E-value <10−5) were identified and removed prior to further analysis. These data suggest that these chemostat and fecal viromes were relatively free of contaminating cellular nucleic acids.
Identification of viruses and viral families in cultured fecal material
We assembled virome reads from each subject to construct longer contigs, as this generally results in more productive searches for sequence similarities. We used several different assemblers including CLC Genomics Workbench, MetaVelvet , and IDBA-UD  in the construction of contigs and found that CLC Genomics Workbench produced fewer contigs with longer mean and maximum lengths and higher N50 values than the other assemblers tested (Additional file 2: Table S2). With the CLC assembler, 96.4 ± 1.3 % of all reads were assembled into contigs. Therefore, we utilized the contigs constructed using CLC Genomics Workbench throughout the study. The mean GC content for all contigs was 45.5 % (mean of 46.3 % for cultured viromes and 41.7 % for fecal viromes; p = 0.01) (Additional file 3: Figure S1, Panel A). The mean length amongst all contigs was 941 nucleotides (mean of 908 nucleotides for cultured viromes and 965 nucleotides for fecal viromes; p = 0.35) (Additional file 3: Figure S1, Panel B). The differences identified in GC content suggest that there may be features of viromes that are specific to both fecal and cultured communities.
We also analyzed the cultured viral communities using the MG-RAST Server  and categorized those with sequences similar to known phages according to their families. We found that in all five donors, their cultured communities at all time points had reads similar to known myoviruses, podoviruses, and siphoviruses (Additional file 3: Figure S5). The proportions of reads similar to those different phage families generally differed substantially within a subject over time and between different subjects. The relative proportions of phage families observed do not necessarily reflect those found in the fecal viromes of each subject, particularly in donors #8 and #9. Viruses from the family Microviridae also were found in the feces of four of the five subjects, but none were identified in the chemostat cultures.
Comparisons of fecal and cultured viral communities
We identified significant sequence similarities to each assembled viral contig by BLASTX analysis against the NCBI non-redundant database to decipher which viral genes had similarities in the chemostat cultures. The vast majority of the contigs had similarities to hypothetical phage proteins, proteins involved in replication/integration, restriction/modification enzymes, or tail fibers (Additional file 3: Figure S6). There were no significant differences identified in the relative proportions of contigs similar to different phage categories for either fecal or cultured phage communities regardless of the time point examined.
Viral homologues within and between subject groups
Percent similar within groupa
Percent similar between groupsa
25.04 ± 12.81
8.69 ± 5.44
25.41 ± 10.23
13.15 ± 8.11
Chemostat and fecal virome homologues within and between subjects
Percent similar within subjecta
Percent similar between subjectsa
54.38 ± 15.58
12.00 ± 5.26
47.07 ± 14.63
9.38 ± 5.47
44.01 ± 15.82
11.76 ± 6.26
50.43 ± 12.84
8.59 ± 3.70
54.19 ± 24.04
12.09 ± 8.85
Viral diversity in fecal and cultured communities
We next used the HVDI to perform rarefaction analysis to determine whether the viruses in the viromes had been adequately sampled and as a measure of whether the richness of viruses differed substantially between fecal and cultured communities. In this case, we calculated the HVDI based on the Chao1 index  because it penalizes more heavily for the presence of the rarer viral contigs in each sample. We found that there was no association between the sample type and the richness within the viral communities and that the diversity estimates approached asymptote in many cases, indicating that little additional viral diversity would have been identified through further sampling (Additional file 3: Figure S17).
Human body surfaces are home to large populations of viruses, whose role as members of human microbial communities is not well understood. In some environments, viral communities have been shown to be involved in driving bacterial diversity, yet no such data exists for human viral communities. We know that many of the phages that populate these communities are highly persistent and carry numerous pathogenic gene functions such as antibiotic resistance, which could help to shape bacterial community membership and its response to certain perturbations. Measuring their effects in humans is not without its challenges, which include the need to use drugs such as antibiotics, the need to reduce confounders such as diet variability, and the need to improve compliance with study protocols. The development of ecosystems that can approximate the dynamic interactions between phage communities and their cellular hosts can greatly reduce the reliance on human subjects in the characterization of human microbial communities.
Each virome in this study was subjected to MDA amplification due to the small amounts of viral DNA recovered from each donor. MDA amplification is known to introduce biases into sequence data , and it is unclear how MDA amplification biases could have affected these viromes. Many of the biases introduced often result when relatively low levels of starting DNA are utilized for amplification, which could have occurred in some of the viromes in this study. The relative conservation of viral genotypes across different time points in the chemostat cultures (Fig. 3) suggests that if MDA amplification biases affected the analysis of the viromes, the effects may have been relatively uniform across time points. The relative abundances of Microviridae, however, in the feces of four of the five subjects, may have been overrepresented through the use of MDA, as has previously been described . The absence of microviruses in the chemostat cultures suggests that their bacterial host species may not be well represented in chemostat cultures.
Different bacteriophage families often have different lifestyles. Because phages can be significant drivers of diversity in different ecosystems [3-5, 11, 12, 39, 50], we examined whether there were phages with significant sequence similarities to known phage families in our cultured viral communities. Caudoviruses are the phage families most often found in our prior analyses of human oral viral communities [9, 10] and were commonly identified in the viral cultures in this study (Additional file 3: Figure S5). Of the different types of caudoviruses, siphoviruses generally have lysogenic lifestyles, while podoviruses and myoviruses more often have lytic lifestyles. We identified similar sequences to each of these viral families in our analysis of cultured viral communities, which strongly suggests that both primarily lytic and lysogenic phages were present. Not all myoviruses and podoviruses, however, have lytic lifestyles, as demonstrated by the presence of Enterobacteria phages FIAA91ss, IME10 (Fig. 2), and HK620 (Additional file 3: Figure S4), which have predicted gene functions that indicate their probable lysogenic lifestyles. The presence of each of these caudovirus families indicates that each of these virus types is viable in cultured communities. We found reads matching Enterobacteria phages FIAA91ss, IME10, and HK620 in the husband and wife but not in their daughter or the other donors. This suggests that these viruses were transmitted between husband and wife, but not to their offspring, potentially due to a lack of suitable host bacteria in the daughter.
Because of the high diversity in human phage communities and the varying relationships between host and phages, the relative abundances of phages do not necessarily reflect those of their host bacteria. This phenomenon was demonstrated in a prior study of human oral viral communities  and in a more recent study in the guts of humans with inflammatory bowel disease . The data in this study support this finding, as the relative abundances of phage BLASTX hits from chemostat cultures and feces (Fig. 5) were not reflective of the relative abundances of the bacteria (Additional file 3: Figure S14). While BLASTX hit profiles do not necessarily represent taxonomic classifications, the abundances of phylum Bacteroidetes found in the phage BLASTX profiles and the 16S rRNA gene profiles suggest that many phages parasitizing these bacteria were present in the feces and chemostat cultures. Few Proteobacteria were identified through analysis of 16S rRNA genes (Additional file 3: Figure S14), yet the abundance of phage hits to Proteobacteria was high (Fig. 5). These results may have been influenced by an unequal representation of phage from Proteobacteria in available databases, where a high relative abundance of phage from Proteobacteria renders us more likely to identify hits to Proteobacteria even when Proteobacteria are not the hosts of these phage.
Our prior studies of human viral communities have demonstrated that diversity in these communities is generally overestimated . In that study, we found numerous different contigs that could not assemble, which actually belonged to the same phages. We developed a method to reduce the overestimation of viral diversity by finding high levels of similarity amongst these contigs and assigning their contig spectra to the same virus genotype. We do not believe that in every case high levels of similarity necessarily represent the same virus; however, utilizing this technique allowed us to estimate viral community diversity within 12 % of the actual diversity in our simulated communities across most evenness levels. This technique allowed for us to provide estimates of sequencing depth adequacy as well as comparisons of diversity amongst fecal and cultured viral communities. While the data indicate that most of the viral diversity in these communities could be identified by sequencing <20,000 reads (Additional file 3: Figure S15), the greater number of reads used in this study was necessary to assemble more reliable contig spectra. The rarefaction analyses strongly suggested that further sequencing would not have added substantially to the diversity estimates present in cultured or fecal viral communities. We believe that tools such as the HVDI add to available methodologies for examining viral community diversity and will be of great utility in understanding the responses of viral communities to perturbations.
One of the major goals of this work was to examine how closely cultured communities might approximate human indigenous phage communities. Identifying numerous FSPs and many reconstructed phages in the cultured communities (Figs. 2 and 4, and Additional file 3: Figures S10, S11, S12 and S13) strongly suggests that chemostat culture systems are fully capable of supporting robust phage communities. Chemostat cultured communities have been known to support the viability of individual phages ; however, the data presented here indicate that chemostat cultures can support entire communities of phages as well. Our focus on contigs rather than virome reads allowed us to characterize 94.9 % of the viromes (far greater than often is reported in virome studies), which we believe provides a broad overview of the phage present in chemostat cultures and feces. The significant drop in diversity and evenness in the viromes in donors #8 and #9 by day 18 cannot be attributed to a drop in bacterial diversity (Additional file 3: Figure S14), and the BLASTX profiles in these subjects (Fig. 5) suggest that while diversity was diminished, it likely was diminished across different bacterial host lineages. While these cultured communities do not perfectly approximate the diversity of phages in the human gut, as evidenced by the generally higher similarities within chemostat viromes compared with fecal viromes (Fig. 3), the ability to approximate diversity in some human viromes (Fig. 8) should prove useful for furthering our understanding of host/phage interactions in humans. We did not perform detailed comparisons of phages and their putative bacterial hosts in each donor because BLASTX is generally considered insufficient to accurately predict the hosts of each phage.
By establishing phage communities that have some similarities to those found on human body surfaces, several important questions can be addressed. These questions include the following: (1) what is the role of phages in driving the diversity of the bacteria in human microbial communities?, (2) how do perturbations such as antibiotics impact human phage communities?, (3) how does the sharing of phage communities between individuals impact microbial community membership?, and (4) what are the dynamics of most phages as members of human microbial communities? While there are quantifiable differences between the phage communities present in feces and cultured communities, there also are many similarities. The relative number of FSPs in both sample types are similar, the profiles of beta diversity strongly suggest a conservation of some phage community members across fecal and cultured communities, and the relative levels of phage diversity between communities in some subjects were highly similar. By establishing cultured phage communities, we can begin to understand the role and contributions of phages to human microbial communities.
Ethics, consent, and permissions
Human subject recruitment and enrollment in this study was approved by The Research Ethics Board of the University of Guelph REB#13AP008 and 10JL002. Each subject signed a written informed consent confirming their willingness to participate in this study.
Five healthy donors provided fecal samples: donor #1 (male, 44 years old), donor #2 (female, 41 years old), donor #8 (female, 26 years old), donor #9 (male, 25 years old), and donor #10 (female, 7 years old). Donors #1, #2, and #10 were a co-habiting family unit of father, mother, and child, respectively. Donors #8 and #9 were unrelated. None of the donors had a recent history of antibiotic treatment for 9 months prior to sampling. Each donor provided at least 5 g of fresh fecal samples for the chemostat cultures. Donor #10 provided a sample in the home environment which was immediately wrapped in plastic cling wrap to exclude air and then frozen at −20 °C for overnight transportation to the lab. The remaining donors provided fresh samples.
All samples were placed immediately into an anaerobic chamber (90 % N2, 5 % CO2, 5 % H2) upon receipt at the lab; for the fresh samples, this was within 30 min of collection. The sample from donor #10 was allowed to thaw in the chamber. A modified Infors Multifors system was used to run chemostat cultures modeling the human distal colon environment, as described by McDonald et al. . A 10 % (w/v) slurry was prepared from each donor by homogenizing feces in pre-reduced growth medium using a stomacher. For every 1 L of medium, the following components were included in the growth medium: peptone water, 2 g; yeast extract, 2 g, NaHCO3, 2 g; CaCl2, 0.01 g; pectin (from citrus), 2 g; xylan (from beechwood), 2 g; arabinogalactan, 2 g; starch (from wheat, unmodified), 5 g; casein, 3 g; inulin (from Dahlia tubers), 1 g; bile salts, 0.5 g NaCl, 0.1 g; L-cysteine HCl, 0.5 g; K2HPO4, 0.04 g; KH2PO4, 0.04 g; MgSO4, 0.01 g; hemin, 0.005 g; and menadione, 0.001 g (all components from Sigma Aldrich). Growth media was stored at 4 °C until use for a maximum of 2 weeks. The fecal slurry in growth medium was then centrifuged to remove large particles , and the supernatant was used as the inoculum for each experiment by adding 100 mL into 300 mL of sterile growth medium in each vessel. The pH within each vessel was then adjusted to 6.9–7.0, and the cultures gently and continually agitated and maintained at 37 °C. Vessels were run in batch mode for 24 h to allow inoculum recovery time (adjustment from in vivo to in vitro conditions) and to avoid wash-out. The pumps were then switched on, and the retention rate set to 16.67 mL/h−1 with constant sparging of O2 free N2 gas to maintain positive pressure and anaerobiosis. Each chemostat vessel was sampled daily by aseptically removing 4 mL of culture directly from the vessel contents, and all samples were archived at −80 °C. Aliquots of the original fecal samples also were archived at −80 °C for virome processing.
Preparation and sequencing of viromes
Fecal viromes were prepared by diluting 0.4 g of feces in 4 mL of SM buffer. The fecal samples were vortexed vigorously for 40 min to separate viral particles, spun at 4000×g for 10 min to pellet the remaining solid material, and the supernatant was treated in an identical manner to that of the chemostat cultures. A small portion (10 μL) of the supernatant from each donor was resuspended in 190 μL of 0.02-μm filtered PBS and their counts per milliliter determined by epifluorescence microscopy . Chemostat samples and fecal supernatants were filtered sequentially using 0.45 and 0.2 μm filters (VWR, Radnor, PA) to remove cellular and other debris and then purified on a cesium chloride gradient according to previously described protocols . Only the fraction with a density corresponding to most known bacteriophages  was retained, further purified on Amicon YM-100 protein purification columns (Millipore, Inc., Bellerica, MA), treated with DNase I, and subjected to lysis and DNA purification using the Qiagen UltraSens Virus kit (Qiagen, Valencia, CA). Recovered DNA was screened for the presence of contaminating bacterial nucleic acids by quantitative 16S rRNA gene PCR using primers 8F (AGAGTTTGATCCTGGCTCAG) and 357R (CTGCTGCCTYCCGTA) in Power SYBR Green PCR Mastermix (Thermo Fisher Scientific, Carlsbad, CA). No products were detected in any of the viromes after 35 cycles, which does not exclude the presence of contaminating bacterial nucleic acids but indicates that they were not present at dominant levels. Resulting DNA was amplified using GenomiPhi Hy MDA amplification (GE Healthcare, Pittsburgh, PA), fragmented to roughly 200 to 400 bp using a Bioruptor (Diagenode, Denville, NJ), and utilized as input to create libraries using the Ion Plus Fragment Library Kit according to manufacturer’s instructions. Libraries then were sequenced using 314 or 316 chips on an Ion Torrent Personal Genome Machine (PGM; Life Technologies, Grand Island, NY)  producing an average read length of approximately 215 bp for each sample.
Analysis of viromes
Due to the error rate of semiconductor sequencing , we trimmed each read according to modified Phred scores of 0.5 using CLC Genomics Workbench 8.01 (CLC bio USA, Cambridge, MA), removed any low complexity reads with ≥8 consecutive homopolymers, and removed any reads with substantial length variation (<50 nucleotides or >300 nucleotides) or ambiguous characters prior to further analysis. Each virome was screened for contaminating bacterial and human nucleic acids using BLASTN analysis (E-value <10−5) against the Ribosomal Database Project 16S rRNA genes database  and the human reference database available at ftp://ftp.ncbi.nlm.nih.gov/genomes/H_sapiens/. Any reads with significant sequence similarities to human sequences were removed prior to further analysis. Length and GC content variation amongst contigs was assessed using Box and Whiskers plots created using Microsoft Excel 2007 (Microsoft Corp., Redman, WA). All reads were assembled using CLC Genomics Workbench 8.01 based on 98 % identity with a minimum of 50 % read overlap, which were more stringent than criteria developed to discriminate between highly related viruses . The assembly method works by using a de Bruijn graph technique and various word lengths, similar to that used in the assembler Velvet . We also used MetaVelvet  and IDBA-UD  in the construction of contigs, but the CLC assembler produced fewer contigs of higher mean lengths with greater N50 values (Additional file 2: Table S2). Because the shortest reads were 50 nucleotides, the minimum tolerable overlap was 25 nucleotides, and the average overlap was no less than 100 nucleotides depending on the characteristics of each virome. The consensus sequence for each contig was constructed according to majority rule, and any contigs <200 nucleotides or with ambiguous characters were removed prior to further analysis. Read mapping of viromes to a combined database of viruses (www.phantome.org; ftp://ftp.ncbi.nih.gov/genomes/Viruses/) was performed using CLC Genomics Workbench 8.01 (CLC bio USA, Cambridge, MA) and were mapped using 98 % identity over a minimum of 50 % of the read length. For each donor, we also utilized a separate technique for assembly by constructing global assemblies from all contigs from all time points using 98 % identity over a minimum of 50 % overlap. Viral contigs were analyzed using FGenesV (Softberry Inc, Mount Kisco, NY) for ORF prediction and individual ORFs analyzed using BLASTP analysis against the NCBI non-redundant database (E-value <10−5). If the best hit was to a gene with no known function, lower level hits were used for the annotation as long as they had known putative function and still met the E-value cutoff (10−5).
Contigs were annotated using BLASTX against the NCBI NR database with an E-value cutoff value of 10−5. Specific viral sequences were identified by parsing BLASTX results for known viral genes including replication, structural, transposition, restriction/modification, hypothetical, and other genes previously found in viruses for which the E-value was at least 10−5. Each individual virome contig was annotated using this technique; however, if the best hit for any portion of the contig was to a gene with no known function, lower level hits were used as long as they had known function and still met the E-value cutoff. The annotation data were compiled by the number of reads used to assemble each contig for each subject and used to determine the relative proportions of contigs that contained viral sequences. The phyla of the BLASTX best hits for each annotated contig were used to create profiles in each donor and sample type. We utilized the average coverage in the assemblies of each contig to determine the relative abundance profiles of different phyla to compensate for viruses that may be more abundant than others. This technique prevented reads involved in the assembly of the same virus contigs from being assigned to different putative host phyla based on different BLASTX similarities. Determination of the relative abundances of virus families was determined by BLASTX analysis of the SEED database using MG-RAST .
Analysis of shared sequence similarities present in each virome was performed by creating custom BLAST databases for each virome, comparing each database with all other viromes using BLASTN analysis (E-value <10−10). Principal coordinates analysis (PCOA) was performed on virome contigs with Bray Curtis distances using Qiime . We also utilized a separate technique for assembly by constructing global assemblies from all contigs from all subjects and time points using 98 % identity over a minimum of 50 % overlap. The contribution of each subject and time point to each assembly was assessed and utilized to determine relative persistence of phages over time in the chemostat cultures and as input for PCOA and for heat matrix analysis using Microsoft Excel. We also used the profiles of BLASTX best hits amongst all subjects and time points as input for PCOA. Beta diversity based on Bray Curtis distances was used as input for each PCOA.
Analysis of 16S rRNA genes
Genomic DNA was prepared from the feces of each subject and time point using the Qiagen QIAamp DNA Stool Mini kit (Qiagen, Valencia, CA). We amplified the bacterial 16S rRNA gene V1-V2 hypervariable region using the forward primer 8F fused with the Ion Torrent Adaptor A sequence and one of 23 unique 10 base pair barcodes and reverse primer 357R fused with the Ion Torrent Adaptor P1 from the each donor and sample type . PCR reactions were performed using Platinum PCR High Fidelity SuperMix (Invitrogen, Carlsbad, CA) with the following cycling parameters: 94 °C for 10 min, followed by 30 cycles of 94 °C for 30 s, 53 °C for 30 s, 72 °C for 30 s, and a final elongation step of 72 °C for 10 min. Resulting amplicons were purified on a 2 % agarose gel stained with SYBR Safe (Invitrogen, Carlsbad, CA) using the MinElute PCR Purification kit (Qiagen, Valencia, CA). Amplicons were further purified with Ampure XP beads (Beckman-Coulter, Brea, CA), and molar equivalents were determined for each sample using a Bioanalyzer 2100 HS DNA Kit (Agilent Technologies, Santa Clara, CA). Samples were pooled into equimolar proportions and sequenced on 314 chips using an Ion Torrent PGM according to manufacturer’s instructions (Life Technologies, Grand Island, NY) . Resulting sequence reads were removed from the analysis if they were <180 nucleotides, had any barcode or primer errors, contained any ambiguous characters, or contained any stretch of >8 homopolymers. Sequences were assigned to their respective samples based on a 10-nucleotide barcode sequence and were analyzed further using the Qiime pipeline . Briefly, representative OTUs from each set were chosen at a minimum sequence identity of 97 % using UClust  and aligned using PyNast  against the Greengenes database . Multiple alignments then were used to create phylogenies using FastTree , and taxonomy was assigned to each OTU using the RDP classifier [62, 63]. PCOA was performed based on beta diversity using weighted Unifrac distances .
To assess whether virome contigs had significant overlap within or between donors and sample types, we performed a permutation test based on resampling (10,000 iteration). We simulated the distribution of the fraction of virome sequences with significant similarities from two different sample types within a donor. For each set, we computed the summed fraction of sequences with significant similarities using 1000 random contigs between different donors, and from these computed an empirical null distribution of our statistic of interest (the fraction of shared similar sequences). The simulated statistics within each donor were referred to the null distribution of inter-donor comparisons, and the p value was computed as the fraction of times the simulated statistic for each exceeded the observed statistic. For analysis of fecal versus cultured viromes, a randomly chosen donor from the cultured virome group was compared with a randomly selected donor from the fecal group to determine the null distribution of the fraction of shared contigs. We then estimated the fraction of shared sequences with significant similarities from randomly chosen donors within the cultured virome group and compared with the empirical null distribution from simulated inter-group values. Intra-subject comparisons were excluded from this analysis. We estimated the p value based on the fraction of times the intra-group statistic exceeded that for the null statistic.
Homologous virus diversity index and virome construction
To measure alpha diversity in the viral communities, we utilized a technique termed the Homologous Virus Diversity Index . The technique is based on finding high levels of similarity amongst contigs within viromes that potentially belong to the same or a highly similar viral genotype but were placed into separate contigs due to the limitations of the assembly process . We validated the technique by constructing viromes composed of randomly selected viruses amongst both the NCBI (ftp://ftp.ncbi.nlm.nih.gov/genomes/Viruses/) and Phantome viral databases (www.phantome.org). First, a random set of viruses was chosen from amongst the databases for each virome, and viromes were constructed that contained 10 viruses, 50 viruses, 100 viruses, 500 viruses, and 1000 viruses. We constructed 10 different viromes at each level of different viral genotypes. The reads in each virome were of mean length 200 bp and were chosen randomly across the genomes of each virus. The shortest reads were 150 bp, and the longest reads were 250 bp in each constructed virome. Each virome also was constructed to meet specific evenness requirements, where an evenness of close to 1 would indicate a population consisting of different viruses all of similar relative abundances, and an evenness close to 0 would indicate a population where the relative abundance of a few viruses are much greater than others. We utilized evenness as criteria to construct the viromes because higher evenness values would prove more difficult for the assembly process. For each constructed virome, we created contig spectra by counting the number of reads assigned to each different virus in the virome. The contig spectra then were used as surrogates for population structures in determination of the Shannon Index . The actual evenness values for each constructed virome were determined using the equation H/ln(S), where H is the Shannon Index value and S is the total number of viruses in each virome. We also determined the HVDI for each constructed virome by assembling the reads using 98 % identity over a minimum of 50 % of the read length using CLC Genomics Workbench 8.01 (CLC bio USA, Cambridge, MA). The resulting contigs then were subjected to BLASTN analysis against a database of contigs from the exact same subject, and contigs with high degrees of similarity (E-value < 10−20) over 50 % of the length of the shorter contig were assigned to the same viral genotype. The spectra from each individual contig that were assigned to the same genotype were added to a corrected contig spectrum for each subject and those spectra used as inputs for the Shannon Index  to calculate the HVDI. The actual Shannon Index values for each constructed virome were compared to the results of the HVDI in each case, and the HVDI estimates exceeded the Shannon Index by approximately 12 % across most evenness levels. Rarefactions were determined for each donor and time point by randomly sampling up to 20,000 reads. The randomly sampled reads then were assigned to their respective contigs by comparing them with the corrected contig spectra generated after the BLASTN analysis. We then constructed new contig spectra based on the sampled reads and their assignments to different viral genotypes and used those spectra in calculations of the HVDI. For the rarefactions, we calculated the HVDI using the Chao1 index , which utilizes the relative numbers of viral contigs created from only a single read (singleton) or from only two reads (doubleton) as input.
Availability of data and materials
All sequences including viromes and 16S rRNA genes are available for download in the MG-RAST database (metagenomics.anl.gov/) under the project “Chemostat,” or project #10563.
Supported by the Burroughs Wellcome Fund grant to DTP, an Ontario Ministry of Agriculture Food and Rural Affairs (OMAFRA)/University of Guelph partnership grant, a Physician’s Services Incorporated grant, and a US Department of Defense CDMRP grant (W81XWH-13-1-0376) to E.A-V.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Hofer U. Viral evolution: variation in the gut virome. Nat Rev Microbiol. 2013;11:596.PubMedGoogle Scholar
- Haynes M, Rohwer F. The human virome. In: Metagenomics of the human body. New York: Springer; 2011. p. 63–77.View ArticleGoogle Scholar
- Minot S, Bryson A, Chehoud C, Wu GD, Lewis JD, Bushman FD. Rapid evolution of the human gut virome. Proc Natl Acad Sci U S A. 2013;110:12450–5.PubMed CentralView ArticlePubMedGoogle Scholar
- Reyes A, Haynes M, Hanson N, Angly FE, Heath AC, Rohwer F, et al. Viruses in the faecal microbiota of monozygotic twins and their mothers. Nature. 2010;466:334–8.PubMed CentralView ArticlePubMedGoogle Scholar
- Minot S, Sinha R, Chen J, Li H, Keilbaugh SA, Wu GD, et al. The human gut virome: inter-individual variation and dynamic response to diet. Genome Res. 2011;21:1616–25.PubMed CentralView ArticlePubMedGoogle Scholar
- Naidu M, Robles-Sikisaka R, Abeles SR, Boehm TK, Pride DT. Characterization of bacteriophage communities and CRISPR profiles from dental plaque. BMC Microbiol. 2014;14:175.PubMed CentralView ArticlePubMedGoogle Scholar
- Labrie SJ, Samson JE, Moineau S. Bacteriophage resistance mechanisms. Nat Rev Microbiol. 2010;8:317–27.View ArticlePubMedGoogle Scholar
- Pride DT, Sun CL, Salzman J, Rao N, Loomer P, Armitage GC, et al. Analysis of streptococcal CRISPRs from human saliva reveals substantial sequence diversity within and between subjects over time. Genome Res. 2011;21:126–36.PubMed CentralView ArticlePubMedGoogle Scholar
- Ly M, Abeles SR, Boehm TK, Robles-Sikisaka R, Naidu M, Santiago-Rodriguez T, et al. Altered oral viral ecology in association with periodontal disease. MBio. 2014;5:e01133–14.PubMed CentralView ArticlePubMedGoogle Scholar
- Abeles SR, Robles-Sikisaka R, Ly M, Lum AG, Salzman J, Boehm TK, et al. Human oral viruses are personal, persistent and gender-consistent. ISME J. 2014;8:1753–67.PubMed CentralView ArticlePubMedGoogle Scholar
- Foulongne V, Sauvage V, Hebert C, Dereure O, Cheval J, Gouilh MA, et al. Human skin microbiota: high diversity of DNA viruses identified on the human skin by high throughput sequencing. PLoS One. 2012;7:e38499.PubMed CentralView ArticlePubMedGoogle Scholar
- Willner D, Furlan M, Haynes M, Schmieder R, Angly FE, Silva J, et al. Metagenomic analysis of respiratory tract DNA viral communities in cystic fibrosis and non-cystic fibrosis individuals. PLoS One. 2009;4:e7370.PubMed CentralView ArticlePubMedGoogle Scholar
- Heldal M, Bratbak G. Production and decay of viruses in aquatic environments. Mar Ecol Prog Ser. 1991;72:205–12.View ArticleGoogle Scholar
- Doolittle M, Cooney J, Caldwell D. Lytic infection of Escherichia coli biofilms by bacteriophage T4. Can J Microbiol. 1995;41:12–8.View ArticlePubMedGoogle Scholar
- Fortier LC, Sekulovic O. Importance of prophages to evolution and virulence of bacterial pathogens. Virulence. 2013;4:354–65.PubMed CentralView ArticlePubMedGoogle Scholar
- Ventura M, Sozzi T, Turroni F, Matteuzzi D, van Sinderen D. The impact of bacteriophages on probiotic bacteria and gut microbiota diversity. Genes Nutr. 2011;6:205–7.PubMed CentralView ArticlePubMedGoogle Scholar
- Suttle CA. Marine viruses—major players in the global ecosystem. Nat Rev Microbiol. 2007;5:801–12.View ArticlePubMedGoogle Scholar
- Dinsdale EA, Edwards RA, Hall D, Angly F, Breitbart M, Brulc JM, et al. Functional metagenomic profiling of nine biomes. Nature. 2008;452:629–32.View ArticlePubMedGoogle Scholar
- Clokie MR, Millard AD, Letarov AV, Heaphy S. Phages in nature. Bacteriophage. 2011;1:31–45.PubMed CentralView ArticlePubMedGoogle Scholar
- Macfarlane GT, Macfarlane S. Models for intestinal fermentation: association between food components, delivery systems, bioavailability and functional interactions in the gut. Curr Opin Biotechnol. 2007;18:156–62.View ArticlePubMedGoogle Scholar
- Zoetendal EG, Akkermans AD, De Vos WM. Temperature gradient gel electrophoresis analysis of 16S rRNA from human fecal samples reveals stable and host-specific communities of active bacteria. Appl Environ Microbiol. 1998;64:3854–9.PubMed CentralPubMedGoogle Scholar
- Russell SL, Gold MJ, Reynolds LA, Willing BP, Dimitriu P, Thorson L, et al. Perinatal antibiotic-induced shifts in gut microbiota have differential effects on inflammatory lung diseases. J Allergy Clin Immunol. 2014;135(1):100–9. doi:10.1016/j.jaci.2014.06.027.View ArticlePubMedGoogle Scholar
- Ellekilde M, Selfjord E, Larsen CS, Jakesevic M, Rune I, Tranberg B, et al. Transfer of gut microbiota from lean and obese mice to antibiotic-treated mice. Sci Rep. 2014;4:5922.PubMed CentralView ArticlePubMedGoogle Scholar
- Lu N, Hu Y, Zhu L, Yang X, Yin Y, Lei F, et al. DNA microarray analysis reveals that antibiotic resistance-gene diversity in human gut microbiota is age related. Sci Rep. 2014;4:4302.PubMed CentralPubMedGoogle Scholar
- Perez-Cobas AE, Artacho A, Knecht H, Ferrus ML, Friedrichs A, Ott SJ, et al. Differential effects of antibiotic therapy on the structure and function of human gut microbiota. PLoS One. 2013;8:e80201.PubMed CentralView ArticlePubMedGoogle Scholar
- Hu Y, Yang X, Qin J, Lu N, Cheng G, Wu N, et al. Metagenome-wide analysis of antibiotic resistance genes in a large cohort of human gut microbiota. Nat Commun. 2013;4:2151.PubMedGoogle Scholar
- Brace C, Gloor GB, Ropeleski M, Allen-Vercoe E, Petrof EO. Microbial composition analysis of Clostridium difficile infections in an ulcerative colitis patient treated with multiple fecal microbiota transplantations. J Crohns Colitis. 2014;8:1133–7.View ArticlePubMedGoogle Scholar
- Freter R, Brickner H, Botney M, Cleven D, Aranki A. Mechanisms that control bacterial populations in continuous-flow culture models of mouse large intestinal flora. Infect Immun. 1983;39:676–85.PubMed CentralPubMedGoogle Scholar
- Reyes A, Wu M, McNulty NP, Rohwer FL, Gordon JI. Gnotobiotic mouse model of phage-bacterial host dynamics in the human gut. Proc Natl Acad Sci U S A. 2013;110:20236–41.PubMed CentralView ArticlePubMedGoogle Scholar
- McDonald JA, Schroeter K, Fuentes S, Heikamp-Dejong I, Khursigara CM, de Vos WM, et al. Evaluation of microbial community reproducibility, stability and composition in a human distal gut chemostat model. J Microbiol Methods. 2013;95:167–74.View ArticlePubMedGoogle Scholar
- Crowther GS, Chilton CH, Todhunter SL, Nicholson S, Freeman J, Baines SD, et al. Development and validation of a chemostat gut model to study both planktonic and biofilm modes of growth of Clostridium difficile and human microbiota. PLoS One. 2014;9:e88396.PubMed CentralView ArticlePubMedGoogle Scholar
- Petrof EO, Gloor GB, Vanner SJ, Weese SJ, Carter D, Daigneault MC, et al. Stool substitute transplant therapy for the eradication of Clostridium difficile infection: “RePOOPulating” the gut. Microbiome. 2013;1:3.PubMed CentralView ArticlePubMedGoogle Scholar
- Shahinas D, Silverman M, Sittler T, Chiu C, Kim P, Allen-Vercoe E, et al. Toward an understanding of changes in diversity associated with fecal microbiome transplantation based on 16S rRNA gene deep sequencing. MBio. 2012;3:5. doi:10.1128/mBio.00338-12.View ArticleGoogle Scholar
- Edlund A, Yang Y, Hall AP, Guo L, Lux R, He X, et al. An in vitro biofilm model system maintaining a highly reproducible species and metabolic diversity approaching that of the human oral microbiome. Microbiome. 2013;1:25.PubMed CentralView ArticlePubMedGoogle Scholar
- Crowther GS, Chilton CH, Todhunter SL, Nicholson S, Freeman J, Baines SD, et al. Comparison of planktonic and biofilm-associated communities of Clostridium difficile and indigenous gut microbiota in a triple-stage chemostat gut model J Antimicrob Chemother. 2014;69:2137–47.View ArticlePubMedGoogle Scholar
- Fischer CR, Yoichi M, Unno H, Tanji Y. The coexistence of Escherichia coli serotype O157:H7 and its specific bacteriophage in continuous culture. FEMS Microbiol Lett. 2004;241:171–7.View ArticlePubMedGoogle Scholar
- Mizoguchi K, Morita M, Fischer CR, Yoichi M, Tanji Y, Unno H. Coevolution of bacteriophage PP01 and Escherichia coli O157:H7 in continuous culture. Appl Environ Microbiol. 2003;69:170–6.PubMed CentralView ArticlePubMedGoogle Scholar
- Noble RT, Fuhrman JA. Use of SYBR Green I for rapid epifluorescence counts of marine viruses and bacteria. Aquat Microb Ecol. 1998;14:113–8.View ArticleGoogle Scholar
- Pride DT, Salzman J, Haynes M, Rohwer F, Davis-Long C, White 3rd RA, et al. Evidence of a robust resident bacteriophage population revealed through analysis of the human salivary virome. ISME J. 2012;6:915–26.PubMed CentralView ArticlePubMedGoogle Scholar
- Rothberg JM, Hinz W, Rearick TM, Schultz J, Mileski W, Davey M, et al. An integrated semiconductor device enabling non-optical genome sequencing. Nature. 2011;475:348–52.View ArticlePubMedGoogle Scholar
- Namiki T, Hachiya T, Tanaka H, Sakakibara Y. MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads. Nucleic Acids Res. 2012;40:e155.PubMed CentralView ArticlePubMedGoogle Scholar
- Peng Y, Leung HC, Yiu SM, Chin FY. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics. 2012;28:1420–8.View ArticlePubMedGoogle Scholar
- Meyer F, Paarmann D, D’Souza M, Olson R, Glass EM, Kubal M, et al. The metagenomics RAST server—a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics. 2008;9:386.PubMed CentralView ArticlePubMedGoogle Scholar
- Robles-Sikisaka R, Ly M, Boehm T, Naidu M, Salzman J, Pride DT. Association between living environment and human oral viral ecology. ISME J. 2013;7:1710–24.PubMed CentralView ArticlePubMedGoogle Scholar
- Dutilh BE, Cassman N, McNair K, Sanchez SE, Silva GG, Boling L, et al. A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nat Commun. 2014;5:4498.PubMed CentralView ArticlePubMedGoogle Scholar
- Gotelli NJ, Colwell RK. Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness. Ecol Lett. 2001;4:379–91.View ArticleGoogle Scholar
- Santiago-Rodriguez TM, Ly M, Bonilla N, Pride DT. The human urine virome in association with urinary tract infections. Front Microbiol. 2015;6:14.PubMed CentralPubMedGoogle Scholar
- Chao A. Non-parametric estimation of the number of classes in a population. Scand J Stat. 1984;11:265–70.Google Scholar
- Kim KH, Bae JW. Amplification methods bias metagenomic libraries of uncultured single-stranded and double-stranded DNA viruses. Appl Environ Microbiol. 2011;77:7663–8.PubMed CentralView ArticlePubMedGoogle Scholar
- Willner D, Furlan M, Schmieder R, Grasis JA, Pride DT, Relman DA, et al. Metagenomic detection of phage-encoded platelet-binding factors in the human oral cavity. Proc Natl Acad Sci U S A. 2011;108 Suppl 1:4547–53.PubMed CentralView ArticlePubMedGoogle Scholar
- Norman JM, Handley SA, Baldridge MT, Droit L, Liu CY, Keller BC, et al. Disease-specific alterations in the enteric virome in inflammatory bowel disease. Cell. 2015;160:447–60.View ArticlePubMedGoogle Scholar
- Murphy FA, Fauquet CM, Bishop DHL, Ghabrial SA, Jarvis AW, Martelli GP, et al. Virus taxonomy: sixth report of the international committee on taxonomy of viruses, vol. Supplement 10. New York: Springer; 1995.Google Scholar
- Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, et al. The ribosomal database project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res. 2009;37:D141–5.PubMed CentralView ArticlePubMedGoogle Scholar
- Breitbart M, Salamon P, Andresen B, Mahaffy JM, Segall AM, Mead D, et al. Genomic analysis of uncultured marine viral communities. Proc Natl Acad Sci U S A. 2002;99:14250–5.PubMed CentralView ArticlePubMedGoogle Scholar
- Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–9.PubMed CentralView ArticlePubMedGoogle Scholar
- Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7:335–6.PubMed CentralView ArticlePubMedGoogle Scholar
- Whiteley AS, Jenkins S, Waite I, Kresoje N, Payne H, Mullan B, et al. Microbial 16S rRNA Ion Tag and community metagenome sequencing using the Ion Torrent (PGM) Platform. J Microbiol Methods. 2012;91:80–8.View ArticlePubMedGoogle Scholar
- Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26:2460–1.View ArticlePubMedGoogle Scholar
- Caporaso JG, Bittinger K, Bushman FD, DeSantis TZ, Andersen GL, Knight R. PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics. 2010;26:266–7.PubMed CentralView ArticlePubMedGoogle Scholar
- DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, et al. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol. 2006;72:5069–72.PubMed CentralView ArticlePubMedGoogle Scholar
- Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22:1658–9.View ArticlePubMedGoogle Scholar
- Price MN, Dehal PS, Arkin AP. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol. 2009;26:1641–50.PubMed CentralView ArticlePubMedGoogle Scholar
- Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007;73:5261–7.PubMed CentralView ArticlePubMedGoogle Scholar
- Lozupone C, Hamady M, Knight R. UniFrac—an online tool for comparing microbial community diversity in a phylogenetic context. BMC Bioinformatics. 2006;7:371.PubMed CentralView ArticlePubMedGoogle Scholar