Optimizing protocols for extraction of bacteriophages prior to metagenomic analyses of phage communities in the human gut
Microbiome volume 3, Article number: 64 (2015)
The human gut is densely populated with archaea, eukaryotes, bacteria, and their viruses, such as bacteriophages. Advances in high-throughput sequencing (HTS) as well as bioinformatics have opened new opportunities for characterizing the viral communities harbored in our gut. However, limited attention has been given to the efficiency of protocols dealing with extraction of phages from fecal communities prior to HTS and their impact on the metagenomic dataset.
We describe two optimized methods for extraction of phages from fecal samples based on tangential-flow filtration (TFF) and polyethylene glycol precipitation (PEG) approaches using an adapted method from a published protocol as control (literature-adapted protocol (LIT)). To quantify phage recovery, samples were spiked with low numbers of c2, ϕ29, and T4 phages (representatives of the Siphoviridae, Podoviridae, and Myoviridae families, respectively) and their concentration (plaque-forming units) followed at every step during the extraction procedure. Compared with LIT, TFF and PEG had higher recovery of all spiked phages, yielding up to 16 times more phage particles (PPs) and up to 68 times more phage DNA per volume, increasing thus the chances of extracting low abundant phages. TFF- and PEG-derived metaviromes showed 10 % increase in relative abundance of Caudovirales and unclassified phages infecting gut-associated bacteria (>92 % for TFF and PEG, 82.4 % for LIT). Our methods obtained lower relative abundance of the Myoviridae family (<16 %) as compared to the reference protocol (22 %). This decline, however, was not considered a true loss of Myoviridae phages but rather a greater level of extraction of Siphoviridae phages (TFF and PEG >32.5 %, LIT 22.6 %), which was achieved with the enhanced conditions of our procedures (e.g., reduced filter clogging). A high degree of phage diversity in samples extracted using TFF and PEG was documented by transmission electron microscopy.
Two procedures (TFF and PEG) for extraction of bacteriophages from fecal samples were optimized using a set of spiked bacteriophages as process control. These protocols are highly efficient tools for extraction and purification of PPs prior to HTS in phage-metavirome studies. Our methods can be easily modified, being thus applicable and adjustable for in principle any solid environmental material in dissolution.
The human gut hosts trillions of microbial cells belonging to all three domains of life, but among them, bacteria are the microorganisms dominating this highly competitive environment . During the last decade, characterization of the gut microbiome (GM) has received extensive interest driven by a number of novel findings showing the influence of GM on human health and disease [2–4]. The focus, however, has mainly been oriented to characterize the prokaryote members of the GM, while efforts for characterizing the diversity and structure of bacteriophage communities have so far been relatively limited.
Bacteriophages (phages) are the most abundant biological entities found on earth , and their importance has been highlighted in many habitats such as aquatic environments, soil, food manufacturing environments, and the gastrointestinal tract [6–9]. A number of recent publications have reported the diversity of bacteriophages within GM [9–11], the factors driving their dynamics [1, 12, 13], their influence on the GM structure [14–16], and their association with dysbiosis and human disease [17–19]. Recent advances in high-throughput sequencing (HTS) technologies have opened new opportunities for exploring phage diversity, as well as their evolution and influence on the structure of bacterial communities in the human gut. However, despite this, little attention has been given to the efficiency and optimization of current protocols dealing with extraction of fecal bacteriophages prior to metavirome studies.
Current protocols employed for extraction of viral particles from fecal and environmental samples incorporate variations over the following steps: (i) dissolution, (ii) centrifugation, (iii) sequential filtrations (either dead-end or tangential) used first for removing cells and then for concentrating viruses, and (iv) final purification by CsCl gradient ultracentrifugation [20–25]. In spite of this shared backbone, protocols differ from study to study, as different centrifuge speeds, filter cut-off values, and concentration techniques are being used. Recently, Hoyles et al.  reported that through the inclusion of 0.45-μm filters, instead of 0.22-μm cut-off filters, it is possible to double the yields of viral DNA isolated from human feces and cecal fluids. However, determining the efficiency of extraction of viral particles from fecal samples by (i) spiking with known phages, (ii) measuring the impact of extraction procedures on DNA purity (presence of bacterial and host derived DNA) for mining auxiliary metabolic genes, and (iii) profiling the metagenomic virome as measurements of true success have not been performed to date.
Different kits for extraction of DNA from fecal samples have been shown to influence community structure as determined by HTS [27, 28]. Further, it is well known that viruses differ in size and structure, so differences in extraction procedures are likely to influence the structure of the phage community and amount and quality of extracted DNA and further metagenomic analyses, which in turn could lead to inaccurate or biased conclusions . Further, it is important to point out that obtaining higher enrichments of viral particles will not only improve the yields of DNA but also give more accurate estimations of their concentration in stool samples (and the human gut). In addition, high phage yield will increase the probability of retaining relatively rare phages that otherwise would not be represented in the metagenomic dataset .
Here, we report two optimized protocols for extraction of bacteriophages from fecal samples prior to phage-metavirome studies. The optimization of protocols was based on two well-known laboratory techniques, tangential-flow filtration (TFF), and polyethylene glycol precipitation (PEG), using an adapted method from a published protocol  that has been used in several previously published studies [1, 9, 31] as control (literature-adapted protocol (LIT)). We analyzed the recovery rate of the three methods for a set of spiked bacteriophages (c2, ϕ29, and T4; representatives of the Siphoviridae, Podoviridae, and Myoviridae families, respectively), their efficiency to concentrate phage particles (PPs), and their impact on the phage community structure through metagenomic analyses. Our optimized methods not only yielded a higher number of PPs and DNA compared to the control but also demonstrated a higher efficiency for recovering spiked bacteriophages and increased the proportion of phage-derived sequences.
Strategy for optimization of protocols
The strategy for the optimization of protocols is illustrated in Fig. 1 and relied basically on minimizing the loss of known (spiked) bacteriophages at each step of extraction. Several approaches including different concentrations of NaCl and speeds of centrifugation/ultracentrifugation, dead-end and tangential-flow filtrations, two PPs concentration techniques (PEG and TFF), as well as four- and two-layer CsCl gradients (Fig. 1) were investigated. For convenience, the procedures for extraction were divided in two sections: the pre-processing (or part 1) was used to homogenize and remove large particles, whereas the purification section (or part 2) aimed at eliminating low molecular weight particles (impurities) and microbial cells and to concentrate phages. Representative populations of the Siphoviridae, Podoviridae, and Myoviridae bacteriophage families (a set of c2, ϕ29, and T4 phages) were inoculated to a concentration of 5.0 × 105 phages per sample (5 g of feces re-suspended in 45 ml of SM buffer). If losses of more than 50 % of the spiked phages were observed in a given step of extraction (data not shown), these were consequently diverged into a new extraction route until a higher recovery of the inoculated populations was reached (free of impurities and contaminating cells). Instead of applying a 0.22-μm cut-off for tangential-flow filtration as described in previous studies [9, 12, 20], the reference (LIT) method was coupled with a 0.45-μm membrane cut-off as recently suggested . The routes with the highest phage recovery, which eventually became the optimized protocols (PEG, route 5; TFF; route 8), were compared to the reference method (LIT, route 6)  based on their concentration of PPs (in the final phage concentrate collected from CsCl gradients) and yields of DNA. Each of the optimized routes of pre-processing and purification were assessed with fecal samples from three subjects and using two biological replicates as shown in Additional file 1.
Recovery of spiked phage-representatives
Throughout the entire extraction procedure, PEG and TFF (Fig. 2b, c) showed an enhanced efficiency by retrieving greater phage populations than the control method (Fig. 2a). Compared with LIT, the optimized procedures were able to recover 8–13 times, 13–36 times, and 286–324 times larger phage populations of c2, ϕ29, and T4, respectively. Using the TFF and PEG protocols, virtually no spiked phages were lost until the ultracentrifugation step (CsCl gradient), whereas the literature-adapted protocol (Fig. 2a) lost up to 90 % of the spiked phages at the tangential-flow filtration step using a 0.45-μm membrane cut-off filter. The infectivity of the spiked T4 phage was compromised during the ultracentrifugation step (CsCl gradients), which occurred independently from the two speeds of ultracentrifugation used (using 82,000 and 150,000×g, for the control and optimized procedures, respectively) and possibly due to tail alterations at this step. Nevertheless, by performing a T4-specific qPCR analysis , we observed that T4 DNA was equally detected in samples that were treated with and without DNAse I after ultracentrifugation and prior to DNA extraction (data not shown), thereby suggesting that in spite of the infectivity loss, the phage head remained undamaged and therefore retained the genetic material.
Quantification of PPs and yields of phage DNA
As determined by epifluorescence microscopy [20, 33], the abundance of phages determined per gram of wet material in samples extracted with LIT varied from 5.2 × 107 to 6.2 × 108 PPs, whereas for PEG and TFF the abundance ranged from 7.5 × 108 to 1.0 × 1010 and 2.6 × 108 to 1.1 × 1010 PPs, respectively (Fig. 3). A high variability in the number of PPs quantified within each extraction method was seen, but this was primarily an outcome of high interpersonal (fecal sample to fecal sample) variation. On average, the number of PPs extracted with LIT represented only 6 and 10 % of the PPs yields obtained with TFF and PEG, respectively. Furthermore, DNA yields (determined by fluorometry) obtained with LIT varied from 0.2 to 3.2 ng of DNA per gram of fecal material (Additional file 2) and resembled concentrations previously obtained  from 1 g of feces (~1–6 ng of DNA). In our optimized procedures, the quantity of DNA obtained ranged from 36 to 170 ng g−1 and from 13 to 58 ng g−1 in samples extracted with TFF and PEG, respectively.
Characterization of PPs by transmission electron microscopy
Phage characterization by transmission electron microscopy (TEM)  was carried out in samples collected from two subjects (samples X1-LIT, X1-PEG, X1-TFF, Y1-LIT, Y1-PEG, and Y1-TFF, see Additional file 1). For Y1-LIT and X1-LIT, no phage particles were observed by TEM as the low yield of PPs extracted were below the detection limit. For X1, the number of PPs extracted with our procedures was estimated to be 108 PPs g−1 of wet material, and various Myoviridae (including Peduovirinae) Siphoviridae and Podoviridae and putative unclassified phage morphotypes were seen as the most common viral families (see Fig. 4a, b). For the largest Myoviridae-like phages, unique long whiskers attached to the lower (tail-orientated) part of the capsids were detected. In principle, all phage morphotypes were observed with both TFF and PEG. However, in the latter procedure, a high number of very small Podoviridae-like or unclassified phages with capsid sizes smaller than 50 nm were observed (shown on bottom of Fig. 4b) that were not present in the corresponding TFF extraction (Fig. 4a). This result indicates that the PEG methodology might be superior to the TFF procedure with respect to the recovery of small-size phages (putatively not sufficiently retarded by the TFF filtration membrane). For subject Y1 (Fig. 5a, b), different phage types were again detected, one Myoviridae phage with long whiskers uniformly attached to the capsids, as well as long capsid-anchored whiskers for a tail-less Podoviridae-like phage were identified. Differences in phage population were observed again, as large Siphoviridae phages with distinct cross-striations on the tail were only present in the TFF procedure (Fig. 5a) (that apparently could not be collected by PEG precipitation, Fig. 5b). Our detailed TEM analysis confirmed that both TFF and PEG methods are suitable for recovery of intact phages of various morphotypes, as phage ghosts with empty capsids were only observed occasionally. It was furthermore verified that TFF and PEG samples were not contaminated with cells or cellular components (flagella, cell wall debris, etc.).
Metagenomic analysis of metaviromes
Diversity and composition of extracted metaviromes
Phage DNA sequencing libraries of replicates X1 and X2 (both extracted with LIT, PEG, and TFF; see Additional file 1) were prepared using the Illumina Nextera® XT kit and subjected to HTS (MiSeq Illumina platform). The number of pair-ended reads obtained by sequencing of six metaviromes was 43,676,130. After dataset pre-processing (trimming and discarding reads less than 50 nt), 41,180,458 reads remained. Since the number of sequences per metavirome varied between 6,107,696 and 8,457,587, these were subsampled to 5,000,000 sequences pr. virome (mean sequence length of 210 ± 64 nt).
Variations in the percentage of reads that matched with the NCBI Refseq complete viral genomes protein database (E value <10−3) were observed. The highest sequence affiliation (known fraction) was found in TFF metavirome (22.6 %), followed by PEG (19.7 %) and LIT (10.7 %). The levels of host and bacterial/eukaryotic DNA in metaviromes based on our optimized protocols were considerably lower (approximately twofold) than what was obtained using LIT (mapped within the unknown fraction or unaffiliated sequences). The percentage of sequences that matched with host DNA in metaviromes extracted with LIT and optimized procedures was >0.3 and 0.1 %, respectively, while for bacterial/eukaryotic DNA (based on mapping against human genome and 16S and 18S rRNA gene databases), >44 and <23 % was determined for LIT and optimized protocols, respectively (Additional file 3). The genetic diversity within the known and unknown fractions (after removing bacterial/eukaryotic and host associated DNA sequences) were screened using alpha-diversity estimates, computed with the UCLUST pipeline , and assessed through cluster richness analysis (75 % threshold sequence similarity) . On the rarefaction curves (Fig. 6), similar cluster richness (sequence diversity) was observed between our optimized procedures but slightly lower than LIT. Nevertheless all our observations resembled other phage alpha-diversity that have previously been reported in fecal samples .
Among the known fractions, the composition and abundance at the highest taxonomic level, including the most abundant orders and families, is summarized in Fig. 7. In all metaviromes (using 480,000 reads, accounting for 90 % of the number of affiliated sequences within the most indigent sample), the Caudovirales represented the vast majority of the affiliated reads, and these were significantly (p value <0.05) higher in PEG (88.2 %) and TFF (86.9 %) compared to LIT (79.6 %). The values are in agreement with results found in studies analyzing the viral diversity of the human gut [11, 12]. We did not attempt to relate the recovery of the spiked bacteriophages to HTS results but rather to evaluate the overall influence of the extraction procedures on the metagenomic dataset. Furthermore, amid the low abundant groups, the proportions of unclassified phages and unclassified dsDNA phages (many of them infecting gut-associated bacteria) were higher (p<0.05) in the optimized protocols (>5.6 %) as compared with LIT (4 %). Contrasting with the PEG and TFF, the metaviromes constructed with the LIT protocol had considerable higher relative abundance (p <0.05) of non-phage groups (viruses infecting eukaryotes), such as unclassified dsDNA viruses and non-Caudovirales groups.
Caudovirales structure community
The phages of the Caudovirales order are generally the most abundant viruses found in the human gut and perhaps one of the most influential entities modulating our GM . Based on their importance, we carried out a separate assessment of the reads belonging only to this viral order and the analysis revealed differences in the distribution of up to 21 bacteriophage subfamilies/genera (Table 1). The LIT procedure (22 %) showed a higher relative abundance of Myoviridae compared to the optimized procedures (PEG, 14 % and TFF, 16 %), which was partly given by a substantial increase of unassigned Myoviridae and Peduovirinae (Table 1). However, the metaviromes extracted with PEG and TFF obtained a larger proportion of the Siphoviridae family sequences (>32 %) than LIT (22 %), whose increase was particularly influenced by a boost of unclassified Siphoviridae phages.
Recent developments in HTS have generated new opportunities for investigating the diversity and impact of bacteriophages on the bacterial communities in the human gut and on human health as such [1, 17, 36–38]. However, little attention has been oriented towards the efficiency of current protocols dealing with the extraction of bacteriophages. To our knowledge, only one recent study has optimized methodologies to quantify viral particles and to describe their morphological attributes in steps prior to metagenomic studies , but the influence of different protocols on the recovery of different phage families through spiking and their overall impact on the phage community structure determined by HTS has not yet been investigated.
Here, we introduced a set of defined, known, and well-characterized bacteriophages (i.e., phages c2, ϕ29, and T4, respectively) that were monitored throughout all extraction steps and were used as representatives of the most common viral families found in the human gut. As our and other studies have demonstrated [9, 31], tangential-flow filtration is indeed a reliable and efficient technique for concentration of viral particles (e.g., 100 kDa filter) but also has limitations when it is employed towards removing large particles other than phages (i.e. microbial cells) as in the case of the 0.45-μm membrane cut-off (possibly due to filter clogging). Our results demonstrated that the recovery rate of the different bacteriophages is very variable and seems highly dependent on the method of extraction chosen. The optimized methods were at least eight times more efficient for c2 and ϕ29 phages and over 286 times more efficient for the T4 phage compared to the reference (LIT) procedure (Fig. 2). The infectivity of T4 (Myoviridae representative) was compromised in all protocols during the last step of purification using ultracentrifugation in CsCl gradients. Bourdin et al.  showed that various T4-like phages contract their tails and have a tendency to aggregate across the exposed tail tube during ultracentrifugation procedures. Corroborating those observations, qPCR-based quantification of T4 phage DNA showed that this loss in infective T4 phages is not an impediment for further recovery of their DNA as their phage heads are still intact protecting the phage DNA, which could be equally subjected to HTS and metagenome studies.
TFF and PEG protocols extracted between 6 to 10 times more PPs per gram of feces than the LIT protocol. Remarkable interpersonal variation (fecal sample to fecal sample) in the number (Fig. 3) as well as in diversity (e.g. Fig. 4 and Fig. 5) of PPs within each extraction procedure was observed. It can be speculated that this could be an effect of the metabolic state and health status of the individuals donating the fecal sample as these differences have been also observed in bacterial GM structure . DNA yields obtained through LIT did not exceed 3 ng g−1 wet sample and are in agreement with yields reported by Thurber et al. , from whom the LIT protocol has been adapted. The concentrations of DNA obtained using the reference (LIT) method were much lower than yields achieved with PEG and TFF (see Additional file 2).
The degree of bacterial/eukaryotic and host contamination in the phage DNA was considerably lower in the extractions performed with TFF and PEG. This was accomplished using centrifugation steps (at 5,000×g) to remove remaining bacterial cells/host DNA in the phage concentrate, rather than relying on the filtration system performed with 0.45-μm cut-off. Despite our low degree of contamination (in TFF and PEG), we do not rule out the idea that sequences identified as bacterial may have also been derived from active/induced prophages found within our purified PPs what could potentially lead to overestimate bacterial DNA levels. The genetic diversity of every metavirome was explored using cluster richness analysis . None of the rarefaction curves reached a plateau, meaning that the genetic diversity of the metaviromes was not completely covered. Metaviromes derived from PEG and TFF showed similar degree of sequence diversity with each other but slightly lower than the control method (Fig. 6). However, it must be pointed out that the optimization of our protocols was addressed towards recovering a higher yield of bacteriophages, specifically those PPs located in a density fraction ranging from 1.41 to 1.50 g ml−1 (see Fig. 1; adjusting the sample density to 1.39 g ml−1 allowed us to remove impurities from the phage concentrates). Following these optimized procedures, we increased the proportion of reads belonging particularly to bacteriophages (i.e., Caudovirales, unclassified dsDNA phages and unclassified phages, accounting for >92 % of the affiliated sequences) and reduced simultaneously the abundance of sequences associated with eukaryotic viruses (non-bacteriophage groups) (Fig. 7).
Taxonomic assignment and analysis were carried out using raw reads instead of assembled contigs. We chose this approach since contigs may reduce the diversity of the environmental samples, as unique reads of less abundant viruses are difficult to assemble . The percentage of sequences that matched with the NCBI Refseq complete viral genomes protein database was dependent on the extraction protocols: The reads of LIT-based metaviromes had the lowest affiliation (10.7 %), while PEG and TFF had a approximately twofold higher affiliation (19.7 and 22.6 %, respectively). Even though the proportion of known reads for all our metaviromes were similar to those reported in previous investigations [10, 36], the divergence in the percentage of affiliations observed between the adapted and optimized protocols may also have been a response associated with the density fraction recovered from the CsCl gradients (Fig. 1) (i.e., increasing the proportion of non-affiliated reads derived from eukaryotic viruses or non-phage groups).
The metaviromes derived from our optimized protocols yielded a larger proportion of reads classified as Siphoviridae (mostly unclassified siphophages) but a lower proportion of Myoviridae (including Peduovirinae and unassigned myophages) as compared with LIT (Table 1). This relative increase of Siphoviridae in the metagenomic dataset does not necessarily suggest a true loss of Myoviridae phages (as their relative distribution was higher in LIT) but rather indicate an enhanced recovery of the siphophages compared to the LIT method. Unlike the TFF and PEG procedures, in LIT, the 0.45-μm cut-off tangential-flow filtration demonstrated to have poor performance possibly due to filter clogging, thereby triggering a massive loss of PPs what may have negatively influenced the recovery of long-tailed bacteriophages (Siphoviridae).
Here, we report optimized and standardized methods that allow the extraction and recovery of bacteriophages from fecal samples in high yields. Our implemented strategy was focused on minimizing the loss of a spiked set of bacteriophages at the distinct steps of extraction, thereby generating higher yields of DNA prior to library preparation, reducing alterations in the phage DNA extracted and consequently the distribution of phage communities. The directness and robustness of our tested protocols make them suitable for any situation, where dissolution of the solid environmental sample/material is possible. Small adjustment and/or modification of steps, however, may be required according to the nature of the samples of interest.
Stool samples (anonomyzed) from three subjects were obtained from an intervention trial (3G—gut, grain and greens” carried out at the Department of Nutrition, Exercise and Sports (NEXS), University of Copenhagen. The study protocol was approved by the Ethical Committee of Copenhagen and Frederiksberg Municipalities and performed according to the Helsinki declaration. Stool samples were kept at 4 °C for maximum 24 h after voidance; hereafter, all samples were frozen and stored at −80 °C until use.
The strategy for optimization of protocols is described in Fig. 1. Five grams of stool were suspended in 45 ml of sterile SM (sodium chloride magnesium sulphate) buffer (200 mM NaCl, 10 mM MgSO4, 50 mM Tris-HCl (pH 7.5), 0.01 % gelatin) and homogenized in filter bags for 2 min at medium speed (Lab Seward, BA7021). As process control, the filtered fraction was spiked with c2 , ϕ 29 , and T4  phages, each to a concentration of 1.0 × 105 phages g−1 feces (Siphoviridae, Podoviridae, and Myoviridae families, respectively). Since these phages could naturally occur in fecal samples, prior to spiking, an aliquot of the filtrated fraction was used to confirm their absence by plaque assays. The concentration of NaCl were left as such (0.2 M contained initially in SM buffer) or adjusted to 0.5 and 1.0 M. Afterwards, the samples were centrifuged at 5,000, 8,000 and 11,000×g for 45 min at 4 °C (Fig. 1). The supernatants were recovered and pre-filtrated (dead-end filtration) using glass fiber filters (Millipore, AP2504200).
PPs enrichment routes
Polyethylene glycol precipitation approach (routes 1–5)
Samples were supplemented with 10 % PEG 6,000 (Millipore, 8074915000) [44, 45] and kept at 4 °C overnight after dissolution. PPs were sedimented at 4 °C using two speeds (15,000 and 25,000×g) and times (20 and 45 min) of centrifugation as shown in Fig. 1. The obtained pellets (PPs enriched) were re-suspended overnight in 7 ml of SM buffer at 4 °C. To remove low molecular weight contaminants, samples were dialyzed using a 14 kDa MWCO membrane (Sigma, D9277) in 350 ml of SM buffer at room temperature. This step was performed overnight and the SM buffer was exchanged once after 2 h of dialysis.
Tangential-flow filtration approach (routes 6–8)
The LIT was adapted from earlier protocols [9, 20, 31] with a slight modification using a larger pore size for micro-filtration . Briefly, samples were filtrated through a 0.45-μm cut-off membrane (Millipore, PXHVMPC50) and the retentate, holding residual cells and micro-particles, was continuously cycled in order to collect the maximum volume of permeate containing the PPs. The 0.45-μm permeates (LIT protocol; route 6) and also pre-filtrated samples (those from the TFF protocol; route 8) were passed through a 100-kDa MWCO membrane (Millipore, PXC100C50) to concentrate PPs and remove low molecular weight residuals. Unlike the former membrane, the PPs were concentrated in the retentate and therefore, its volume reduced down to 7 ml.
Purification of PPs with CsCl gradients
The purification of PPs using a four-layer CsCl gradient was performed according to procedures earlier suggested [9, 20, 31]. Samples were adjusted to a density of 1.15 g ml−1 in a final volume of 10 ml and loaded on top of a 6-ml step gradient, containing 2 ml of 1.35, 1.50, and 1.70 g ml−1 CsCl, respectively. These gradients were centrifuged at 82,000×g for 2 h at 4 °C, and the fractions between 1.36 to 1.50 g ml−1 were collected. For the two-layer CsCl gradient (optimized procedures), samples were brought up to a final volume of 14 ml. Their density adjusted to 1.39 g ml−1, loaded on top of 2 ml 1.7 g ml−1 CsCl, and centrifuged at 150,000×g for 4.5 h at 4 °C. Unlike the four-layer gradient, here, the density fraction comprised between 1.41 and 1.50 g ml−1 was recovered. All collected fractions were placed in sterile containers and stored at 4 °C until further use.
Epifluorescence and transmission electron microscopy
Epifluorescence microscopy was performed following procedures previously described [20, 33]. A small fraction of the phage concentrate (collected from CsCl gradients) was stained with 250× (final concentration) SYBR Gold solution (Invitrogen, S-11494), and PPs were visualized (100× Zeiss Plan-Neofluar) under blue excitation (~495 nm). PPs concentration was estimated using three images (Photometrics CoolSNAP cf) per analyzed sample. For phage quantification, pictures were processed with ImageJ 1.46 using similar parameters of analysis (particle analysis: size, 10–infinity; circularity, 0–1). Negative staining of samples with 2 % (w/v) uranyl acetate and subsequent TEM analyses were carried out as recently described by Hoyles et al.  with a Tecnai 10 transmission electron microscope (FEI Company, Eindhoven, the Netherlands) and a Megaview G2 CCD camera (Olympus SIS, Münster, Germany).
Extraction of phage DNA
The collected CsCl fractions were dialyzed overnight in autoclaved MilliQ water as described above and then centrifuged in Amicon Ultra filters (Millipore UFC203024) for 15 min at 4,000×g at 15 °C. The volume of each sample was then brought up to 450 μl (using sterile MilliQ water), and 50 μl of 10× DNase I buffer were added to a reach a final volume of 500 μl. To remove external DNA, samples were treated with 2.5 U ml−1 of DNase I and incubated at 37 °C for 1 h. The extraction of virions was carried out as described by Thurber et al. , 0.1 volumes of 2 M Tris HCl/0.2 M EDTA, 5 μl of 0.5 M EDTA (used for inactivation of DNase I), and 1 volume of formamide were added to each sample and incubated at room temperature for 30 min. DNA was spun down by adding 1 volume of 99.9 % ethanol and centrifuging at 14,000×g for 20 min at 4 °C; the pellet was washed twice using 70 % ethanol and re-suspended overnight in 537 μl of TE buffer at 4 °C. Subsequently, 30 μl of 10 % SDS and 3 μl of 20 mg ml−1 proteinase K were added and incubated for 1 h at 55 °C. DNA was washed and concentrated using DNA Clean & Concentrator (Zymo Research D4010) and eluted in 15 μl of elution buffer (the DNA binding buffer to sample ratio was 1:2), and the DNA concentrations were measured using Qubit® (dsDNA HS assay, Invitrogen).
Phage recovery efficiency
c2, ϕ29, and T4 phages (Siphoviridae, Podoviridae, and Myoviridae families, respectively) were quantified by plaque assay (top layer 0.4 % agarose; bottom layer 1.5 % agar) using Lactococcus lactis subsp. cremoris MG1363, Bacillus subtilis DSM 547, and Escherichia coli DSM613 as host strains, respectively. c2 titers were determined on M17 (5 g l−1 glucose (Merck, Germany)), ϕ29 on tryptic soy agar (TSA (Sigma-Aldrich, USA)), and T4 on LB media (per liter, 10 g peptone, 5-g yeast extract, 10 g NaCl, pH 7.2). All substrates were supplemented with 10-mM CaCl2 (Merck, Germany) and 5-mM MgCl2 (Merck, Germany). Phages (100 μl of serial dilutions) were mixed with host strains in the melted top layer (48 °C). Plates were incubated for 16–18 h at 28 °C for M17 and 37 °C for TSA and LB.
PP-associated DNA sequencing and data pre-processing
Libraries for DNA sequencing were prepared using the Illumina Nextera® XT DNA kit (San Diego, USA) following the manufacturer’s instructions. Library pooling and normalization were based on the product concentration of the final libraries determined by Qubit® (dsDNA HS). Tagged libraries were sequenced as part of a flow cell of 2 × 300 bp pair-ended MiSeq (Illumina, CA) sequencing. The raw reads were trimmed using CLC Genomic Workbench (CLC bio, Århus, DK, version 7.0.4) using a quality limit of 0.05, allowing zero nucleotides ambiguity. Subsequent analyses were carried out using pair-ended reads and sequences less than 50 nt in length were discarded.
Analysis of the metagenomic dataset
The composition of the extracted metaviromes was determined using raw paired-end reads using the MetaVir web server (http://metavir-meb.univ-bpclermont.fr; ). Since Metavir has limitations analyzing metaviromes containing more than 2,500,000 sequences, the non-assembled datasets (metavirome) were first adjusted to 5,000,000 reads and then split in two sections (2,500,000 sequences each). Composition and relative abundance generated by Metavir were converted into operational taxonomic units (OTU) tables, and subsequent analyses were performed in QIIME (1.7.0 and 1.8.0)  . Metaviromes were subsampled using 90 % of the number of sequences within the most indigent sample (multiple rarefactions computed with 1,000 OTU tables). The differences in the relative abundance of phage-taxa (dependent variable) given by paired-comparison between extraction procedures (independent variable) were assessed using one-way ANOVA. The Kraken program  was used to map host (human genome NCBI release 71) and bacterial (Kraken Minidatabase, as of 01.08.2015) derived DNA. To get better estimations of uncultured bacteria/eukaryotes, which may not be represented in the previous database, the unclassified reads were also blasted (UBLAST algorithm , using -evalue 1e-3, -id 0.7, -query_cov 0.6), against representative sequences of 16S (GreenGenes ) and 18S rRNA (Silva ) gene databases. Cluster richness analysis  was carried out with the UCLUST pipeline  using a 75 % sequence similarity threshold (2,000,000 reads were used in each metavirome). Rarefactions curves were computed using 10 subsampled cluster tables and expressed as number of clusters for each metavirome.
Availability of supporting data
All metaviromes are available through the European Nucleotide Archive (ENA) under the accession number ENA: PRJEB8354 in http://www.ebi.ac.uk/ena/data/view/PRJEB8354.
polyethylene glycol precipitation
transmission electron microscopy
Reyes A, Haynes M, Hanson N, Angly FE, Heath AC, Rohwer F, et al. Viruses in the faecal microbiota of monozygotic twins and their mothers. Nature. 2010;466:334–8.
Tremaroli V, Bäckhed F. Functional interactions between the gut microbiota and host metabolism. Nature. 2012;489:242–9.
Musso G, Gambino R, Cassader M. Interactions between gut microbiota and host metabolism predisposing to obesity and diabetes. Annu Rev Med. 2011;62:361–80.
Nielsen DS, Krych L, Buschard K, Hansen CHF, Hansen AK. Beyond genetics. Influence of dietary factors and gut microbiota on type 1 diabetes. FEBS Lett. 2014;588:4234–43.
Chibani-chennoufi S, Bruttin A, Dillmann M, Bru H. Phage-host interaction: an ecological perspective. J Bacteriol. 2004;186:3677–86.
Roux S, Enault F, Robin A, Ravet V, Personnic S, Theil S, et al. Assessing the diversity and specificity of two freshwater viral communities through metagenomics. PLoS One. 2012;7, e33641.
Vos M, Birkett PJ, Birch E, Griffiths RI, Buckling A. Local adaptation of bacteriophages to their bacterial hosts in soil. Science. 2009;325:833.
Verreault D, Gendron L, Rousseau GM, Veillette M, Massé D, Lindsley WG, et al. Detection of airborne lactococcal bacteriophages in cheese manufacturing plants. Appl Environ Microbiol. 2011;77:491–7.
Breitbart M, Hewson I, Felts B, Mahaffy JM, Nulton J, Salamon P, et al. Metagenomic analyses of an uncultured viral community from human feces. J Bacteriol. 2003;185:6220–3.
Kim M-S, Park E-J, Roh SW, Bae J-W. Diversity and abundance of single-stranded DNA viruses in human feces. Appl Environ Microbiol. 2011;77:8062–70.
Waller AS, Yamada T, Kristensen DM, Kultima JR, Sunagawa S, Koonin EV, et al. Classification and quantification of bacteriophage taxa in human gut metagenomes. ISME J. 2014;8:1391–402.
Breitbart M, Haynes M, Kelley S, Angly F, Edwards R a, Felts B, et al. Viral diversity and dynamics in an infant gut. Res Microbiol. 2008;159:367–73.
Minot S, Bryson A, Chehoud C, Wu GD, Lewis JD, Bushman FD. Rapid evolution of the human gut virome. Proc Natl Acad Sci U S A. 2013;110:12450–5.
Mills S, Shanahan F, Stanton C, Hill C, Coffey A, Ross RP. Influence of bacteriophages in shaping the mammalian gut microbiota. Gut Microbes. 2013;4:4–16.
Minot S, Wu GD, Lewis JD, Bushman FD. Conservation of gene cassettes among diverse viruses of the human gut. PLoS One. 2012;7, e42342.
Reyes A, Wu M, McNulty NP, Rohwer FL, Gordon JI. Gnotobiotic mouse model of phage-bacterial host dynamics in the human gut. Proc Natl Acad Sci U S A. 2013;110:20236–41.
Lepage P, Colombet J, Marteau P, Sime-Ngando T, Doré J, Leclerc M. Dysbiosis in inflammatory bowel disease: a role for bacteriophages? Gut. 2008;57:424–5.
Pérez-Brocal V, García-López R, Vázquez-Castellanos JF, Nos P, Beltrán B, Latorre A, et al. Study of the viral and microbial communities associated with Crohn’s disease: a metagenomic approach. Clin Transl Gastroenterol. 2013;4, e36.
Norman JM, Handley SA, Baldridge MT, Droit L, Liu CY, Keller BC, et al. Disease-specific alterations in the enteric virome in inflammatory bowel disease. Cell. 2015;160:447–60.
Thurber RV, Haynes M, Breitbart M, Wegley L, Rohwer F. Laboratory procedures to generate viral metagenomes. Nat Protoc. 2009;4:470–83.
Hall RJ, Wang J, Todd AK, Bissielo AB, Yen S, Strydom H, et al. Evaluation of rapid and simple techniques for the enrichment of viruses prior to metagenomic virus discovery. J Virol Methods. 2014;195:194–204.
Iker BC, Bright KR, Pepper IL, Gerba CP, Kitajima M. Evaluation of commercial kits for the extraction and purification of viral nucleic acids from environmental and fecal samples. J Virol Methods. 2013;191:24–30.
Wommack KE, Williamson KE, Helton RR, Bench SR, Winget DM. Methods for the isolation of viruses from environmental samples. In: Clokie M, Kropinski A, editors. Methods in molecular biology. New York, USA: Humana Press; 2009. p. 3–14.
Pelzek AJ, Schuch R, Schmitz JE, Fischetti VA. Isolation of bacteriophages from environmental sources, and creation and functional screening of phage DNA libraries. In: Gallagher SR, Wiley EA, Hoboken, editors. Current protocols essential laboratory techniques. New Jersey, USA: John Wiley & Sons, Inc; 2013. p. 13.3.1–13.3.31.
Kleiner M, Hooper LV, Duerkop BA. Evaluation of methods to purify virus-like particles for metagenomic sequencing of intestinal viromes. BMC Genomics. 2015;16:7.
Hoyles L, McCartney AL, Neve H, Gibson GR, Sanderson JD, Heller KJ, et al. Characterization of virus-like particles associated with the human faecal and caecal microbiota. Res Microbiol. 2014;165:803–12.
Wesolowska-Andersen A, Bahl MI, Carvalho V, Kristiansen K, Sicheritz-Pontén T, Gupta R, et al. Choice of bacterial DNA extraction method from fecal material influences community structure as evaluated by metagenomic analysis. Microbiome. 2014;2:19.
Dollive S, Peterfreund GL, Sherrill-Mix S, Bittinger K, Sinha R, Hoffmann C, et al. A tool kit for quantifying eukaryotic rRNA gene sequences from human microbiome samples. Genome Biol. 2012;13:R60.
Sachsenröder J, Twardziok S, Hammerl J a, Janczyk P, Wrede P, Hertwig S, et al. Simultaneous identification of DNA and RNA viruses present in pig faeces using process-controlled deep sequencing. PLoS One. 2012;7:e34631.
Daly GM, Bexfield N, Heaney J, Stubbs S, Mayer AP, Palser A, et al. A viral discovery methodology for clinical biopsy samples utilising massively parallel next generation sequencing. PLoS One. 2011;6, e28879.
Zhang T, Breitbart M, Lee WH, Run J-Q, Wei CL, Soh SWL, et al. RNA viral community in human feces: prevalence of plant pathogenic viruses. PLoS Biol. 2006;4, e3.
Fittipaldi M, Pino NJ, Codony F, Adrados B, Peñuela GA, Morató J. Discrimination of infectious bacteriophage T4 virus by propidium monoazide real-time PCR. J Virol Methods. 2010;168:228–32.
Patel A, Noble RT, Steele JA, Schwalbach MS, Hewson I, Fuhrman JA. Virus and prokaryote enumeration from planktonic aquatic environments by epifluorescence microscopy with SYBR Green I. Nat Protoc. 2007;2:269–76.
Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26:2460–1.
Dutilh BE, Cassman N, McNair K, Sanchez SE, Silva GGZ, Boling L, et al. A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nat Commun. 2014;5:1–11.
Minot S, Sinha R, Chen J, Li H, Keilbaugh S a, Wu GD, et al. The human gut virome: inter-individual variation and dynamic response to diet. Genome Res. 2011;21:1616–25.
Barr JJ, Auro R, Furlan M, Whiteson KL, Erb ML, Pogliano J, et al. Bacteriophage adhering to mucus provide a non-host-derived immunity. Proc Natl Acad Sci U S A. 2013;110:10771–6.
Kot W, Vogensen FK, Sørensen SJ, Hansen LH. DPS—a rapid method for genome sequencing of DNA-containing bacteriophages directly from a single plaque. J Virol Methods. 2014;196:152–6.
Bourdin G, Schmitt B, Marvin Guy L, Germond J-E, Zuber S, Michot L, et al. Amplification and purification of T4-like escherichia coli phages for phage therapy: from laboratory to pilot scale. Appl Environ Microbiol. 2014;80:1469–76.
Walujkar SA, Dhotre DP, Marathe NP, Lawate PS, Bharadwaj RS, Shouche YS. Characterization of bacterial community shift in human Ulcerative Colitis patients revealed by Illumina based 16S rRNA gene amplicon sequencing. Gut Pathog. 2014;6:22.
Lubbers MW, Waterfield NR, Beresford TP, Le Page RW, Jarvis AW. Sequencing and analysis of the prolate-headed lactococcal bacteriophage c2 genome and identification of the structural genes. Appl Environ Microbiol. 1995;61:4348–56.
Meijer WJ, Horcajadas JA, Salas M. Phi29 family of phages. Microbiol Mol Biol Rev. 2001;65:261–87.
Miller ES, Kutter E, Mosig G, Arisaka F, Kunisawa T, Rüger W. Bacteriophage T4 genome. Microbiol Mol Biol Rev. 2003;67:86–156.
Mahony J, Kot W, Murphy J, Ainsworth S, Neve H, Hansen LH, et al. Investigation of the relationship between lactococcal host cell wall polysaccharide genotype and 936 phage receptor binding protein phylogeny. Appl Environ Microbiol. 2013;79:4385–92.
Lawrence JE, Steward GF. Purification of viruses by centrifugation. In: Wilhelm SW, Weinbauer MG, Suttle, CA, editors. Waco, TX, USA: American Society of Limnology and Oceanography 2010. p. 166–181.
Roux S, Tournayre J, Mahul A, Debroas D, Enault F. Metavir 2: new tools for viral metagenome comparison and assembled virome analysis. BMC Bioinformatics. 2014;14:76.
Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7:335–6.
Wood DE, Salzberg SL. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 2014;15:R46.
McDonald D, Price MN, Goodrich J, Nawrocki EP, DeSantis TZ, Probst A, et al. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J. 2012;6:610–8.
Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013;41:D590–6.
This project was supported by the Counteracting Age-related Loss of Skeletal Muscle (CALM) project (University of Copenhagen Excellence Programme for Interdisciplinary Research) and the MetaPhageLAB, from the University of Copenhagen (FTP project no. 11–106991). Special thanks to Lukasz Krych for his advice in bioinformatics and Angela Back for her technical help in TEM analysis. JLC-M is supported by CALM and MKM by MetaPhageLAB. The Danish Stratetic Research Council funded project “3G—gut, grain and greens” is acknowledged for making fecal samples available.
The authors declare that they have no competing interests.
JLC-M carried out most of the experiments, conceived steps for optimization, carried out metagenomic and statistical analyses, and drafted the manuscript. MKM conceived the implementation of a two-layer CsCl. WK and LHH carried out DNA sequencing and helped to draft the manuscript. HN and CF performed transmission electron microscopy analyses and wrote parts of the manuscript. FKV and DSN conceived the study and experimental design and wrote parts of the manuscript. All authors contributed to and approved the final version of the manuscript.
Description of samples (per subject) used for evaluation of the optimized routes for pre-processing and purification of fecal bacteriophages for virome characterization. The first biological replicates/extractions are represented by X1, Y1 and Z1, while the second one by X2, Y2 and Z2 (each of them were extracted with LIT, PEG and TFF for a total of 18 extractions). All samples and replicates were used for analysis of spiked phage recovery (c2, ϕ29 and T4 phages) and quantification of PPs/DNA. TEM analyses were only carried out with replicates obtained from extractions X1 and Y1 (carried out with LIT, PEG and TFF). The construction of metaviromes was performed using the bacteriophages obtained from Subject X and extracted with LIT, PEG and TFF. (874 KB)
Summary of DNA concentrations obtained from the PPs extracted with optimized and adapted methods. The concentrations of DNA were determined by fluorometry. Values shown represent the measurements of three fecal samples using two biological replicates. (1.08 MB)
Metaviromes reads associated with human and bacterial genomes. Percentage of reads are based on 5,000,000 sequences per phage metavirome. (55.7 KB)
About this article
Cite this article
Castro-Mejía, J.L., Muhammed, M.K., Kot, W. et al. Optimizing protocols for extraction of bacteriophages prior to metagenomic analyses of phage communities in the human gut. Microbiome 3, 64 (2015). https://doi.org/10.1186/s40168-015-0131-4
- Quantification of bacteriophages
- Extraction procedures
- Gut microbiome
- Phage metavirome