Sputum DNA sequencing in cystic fibrosis: non-invasive access to the lung microbiome and to pathogen details
© The Author(s). 2017
Received: 1 August 2016
Accepted: 24 January 2017
Published: 10 February 2017
Cystic fibrosis (CF) is a life-threatening genetic disorder, characterized by chronic microbial lung infections due to abnormally viscous mucus secretions within airways. The clinical management of CF typically involves regular respiratory-tract cultures in order to identify pathogens and to guide treatment. However, culture-based methods can miss atypical or slow-growing microbes. Furthermore, the isolated microbes are often not classified at the strain level due to limited taxonomic resolution.
Here, we show that untargeted metagenomic sequencing of sputum DNA can provide valuable information beyond the possibilities of culture-based diagnosis. We sequenced the sputum of six CF patients and eleven control samples (including healthy subjects and chronic obstructive pulmonary disease patients) without prior depletion of human DNA or cell size selection, thus obtaining the most unbiased and comprehensive characterization of CF respiratory tract microbes to date. We present detailed descriptions of the CF and healthy lung microbiome, reconstruct near complete pathogen genomes, and confirm that the CF lungs consistently exhibit reduced microbial diversity. Crucially, the obtained genomic sequences enabled a detailed identification of the exact pathogen strain types, when analyzed in conjunction with existing multi-locus sequence typing databases. We also detected putative pathogenicity islands and indicators of antibiotic resistance, in good agreement with independent clinical tests.
Unbiased sputum metagenomics provides an in-depth profile of the lung pathogen microbiome, which is complementary to and more detailed than standard culture-based reporting. Furthermore, functional and taxonomic features of the dominant pathogens, including antibiotics resistances, can be deduced—supporting accurate and non-invasive clinical diagnosis.
KeywordsWGS metagenomic sequencing Cystic fibrosis Sputum COPD Lung metagenome
Cystic fibrosis (CF) is one of the most prevalent genetic disorders in the Caucasian population, affecting about one in 2500 newborns . This autosomal recessive condition affects mostly secretory organs, such as the pancreas, liver, and lungs. CF is caused by mutations in the Cystic Fibrosis Transmembrane conductance Regulator (CFTR) gene, whose protein product is involved in the transport of chloride ions across the apical membrane of epithelial and blood cells. Loss of CFTR protein function causes thickened extracellular mucus to accumulate, which impairs mucociliary clearance in the airways . CF prominently leads to microbial pathogen colonization in the lung, followed by recurrent pulmonary infection and chronic inflammation . Treatment options exist, including mechanical and enzymatic mobilization of mucus, drug therapy to improve residual CFTR function , antibiotic therapy to reduce pathogen load, anti-inflammatory drugs, and lung transplantations. Nevertheless, for the majority of patients, the condition leads to progressive pulmonary damage and eventually respiratory failure and death.
The CF lungs are colonized by a number of pathogenic bacteria, commonly including Staphylococcus aureus, Pseudomonas aeruginosa, Haemophilus influenzae, and Burkholderia cepacia . While prompt and aggressive antibiotic therapies can often control infections, prolonged antibiotic treatments may favor the emergence of antibiotic resistances and can facilitate colonization by multidrug-resistant pathogens such as Achromobacter xylosoxidans and Stenotrophomonas maltophilia [6, 7]. Currently, culture-based techniques are routinely employed to identify and classify lung pathogens, often using selective culture media designed for specific groups of pathogens . However, the culture conditions and procedures are necessarily biased towards known, previously encountered pathogens—whereas novel, slow-growing or rare microbes might potentially be missed (e.g. atypical mycobacteria). Meanwhile, the taxonomic identification of observed pathogens often has limited resolution, and the physiology and resistance profiles of the colonies are not backed up using genomic information. Lastly, the “background” community—opportunistic or accidental members of the lung microbiome—is not routinely studied for clinical use [9, 10], despite its potential to harbor antibiotic resistance genes and to elicit or modulate immune responses.
Culture-independent, genomic sequencing techniques offer potential alternatives for identifying pathogens and opportunistic colonizers and for guiding therapeutic decision-making. However, such methods are not routinely applied for CF management. In research settings at least, what has been used most frequently are PCR-based surveys of the 16S ribosomal RNA gene [11, 12], which are however of limited taxonomic and functional resolution.
Here, we develop a pragmatic approach that aims to maximize molecular information, while minimizing patient discomfort and risk exposure. To achieve this, we sequence DNA from non-invasive sputum samples, without prior removal of host DNA and without complex enrichment or depletion protocols. Forgoing host DNA depletion yields a substantial fraction of sequence reads that are of human origin, and there will be contributions from the upper respiratory tract and oral cavity , but the simplicity of the approach has the unique advantage of providing an unbiased, comprehensive, and reproducible set of reads from the lower respiratory tract as well. There have been only few studies so far that took a somewhat similar approach [14–18], each with slight differences in sample processing, sequence analysis, and focus.
For the current study, we collected and sequenced sputum samples from the following categories: adult and pediatric CF patients (6 samples), chronic obstructive pulmonary disease (COPD) patients (4 samples), and healthy individuals (7 samples). Using whole-genome shotgun sequencing, we observed several known pulmonary pathogens whose genome coverage routinely exceeded 95%, allowing us to type the strains with very high precision using existing multi-locus sequence typing (MLST) databases. Patient-specific differences from reference strains are noted and discussed. Using the workflow we established, a detailed genomic profile can be generated for any lung pathogen from which sufficient DNA can be prepared, including also potentially unknown pathogens.
Results and discussion
Sputum samples contain diagnostically useful DNA
The observed DNA concentrations in sputum varied across subject groups, with CF and COPD patients presenting on average higher DNA concentrations than the healthy controls (Fig. 1b). In all samples, a large fraction of sequenced DNA (>90%) was of human origin (Fig. 1c). This agrees with previous studies, which suggested that free human DNA in the lung stems from disintegrating inflammatory cells such as neutrophils that release DNA into sputum, particularly in the case of the diseased lung tissue, leading to an increased sputum viscosity [19, 20]. Conversely, healthy subjects produced smaller volumes of sputum with lower DNA concentrations, but their residual DNA was also mostly of human origin (Additional file 2: Figure S2).
Lung microbiota composition varies strongly across individuals
After a limited (conservative) assembly of the non-human sequence fraction into contigs, approximately half of the assembled nucleotides could be assigned a taxonomic identity through homology searches, irrespective of subject groups (Fig. 1d). For healthy subjects, we found an overall higher diversity (average Shannon entropy of 3.07 in the healthy non-smoker lungs, versus 1.08 in the CF lungs, p < 0.038). Interestingly, the diversity was found to be reduced not only in the sputum of CF and COPD patients but also to some extent in smokers (Additional file 2: Figure S3). The most abundant taxa in healthy subjects were Prevotella, Streptococcus, Veilonella, Haemophilus, and Neisseria (Fig. 1e). Previous studies have indicated that the healthy lung does not harbor a stable and specific microbiome, but rather a mixture of microbes from the upper respiratory tract and oral cavity [21–24]. In agreement, many of the microbes we identified in healthy subjects corresponded to known oral (and occasionally also gastrointestinal or vaginal) flora.
In contrast, the microbiome composition in CF subjects was highly variable and distinct from standard oral microbiomes. Each patient harbored a unique community, often dominated by one or a few principle colonizers/pathogens such as Pseudomonas, Staphylococcus, Stenotrophomonas, or Achromobacter. (Fig. 1e, Additional file 2: Figure S4). Compositionally, the CF flora only marginally overlapped with those of the healthy and smoker populations.
Observed microbiota in CF subjects match clinical diagnosis
For five of the six CF subjects, all bacterial pathogens identified in the clinical culture diagnosis were also identified through the DNA sequencing, each with at least 10,000 bp mapped to their genome (Additional file 4: Table S2). For example, we confirmed the presence of the typical CF pathogens Achromobacter and Staphylococcus in the sample CF-85 (for all subjects, see Additional file 5: Table S3). In this particular patient, we also observed multiple instances of antibiotic resistance genes, including genes annotated to potentially diminish effectiveness of the ongoing antibiotic treatment (see below, “Prediction of antibiotic resistances and fitness-conferring mutations”). We also observed Prevotella, Neisseria, Streptococcus, Haemophilus, and others in lower abundances. These latter genera are typically viewed as commensals, but it cannot be ruled out that they contribute to pathogenesis as well. In pediatric CF patient CF-00, we detected Pseudomonas, a multidrug-resistant (MDR) Stenotrophomonas, and several other common microbial inhabitants. In contrast, the microbiomes of adult CF patients CF-82 and CF-76 were primarily dominated each by a single pathogen, Pseudomonas, with very low relative abundances of others, such as Streptococcus. Here and in similar “consolidated” situations, competition between microbes may have suppressed diversity .
Patients CF-99 and CF-94 exhibited a microbial community largely dominated by Staphylococcus and Streptococcus, respectively. These were two of the patients for which antibiotic treatment was ongoing; use of antibiotics is thought to correlate inversely with diversity and to instigate significant changes in bacterial community structure, especially in younger patients that harbor a relatively rich and susceptible microbial community as compared to older patients that often develop resilient communities [26–30]. Hence, in patients CF-94 and CF-99, lack of a diverse microbial community is consistent with the ongoing treatment at the time of sample collection (Additional file 1: Table S4 and Additional file 3: Table S5); in this case, no resistances against the administered antibiotics had been detected in clinical testing.
The samples from COPD patients reveal intermediate microbial complexity
We analyzed the microbiome diversity for each COPD patient and found that, of the four samples collected, CD-47 had the largest and most diverse population, closely resembling the composition of a healthy microbiome. Subject CD-34 was unique in that it was the sole sample in which we detected significant amounts of a virus, Herpes simplex virus. This virus was covered deeply enough to be partially assembled and was seen against a background of Streptococcus, Rothia, and Haemophilus, with Fusobacterium and Prevotella greatly reduced (Additional file 2: Figure S4). Overall, the samples from COPD patients were somewhat more difficult to characterize. On several occasions, we failed to detect eukaryotic genera that had been observed in the clinical culture-based diagnostics. For example, in patient CD-42, we were able to reliably detect the presence of several bacterial community members (confirmed by clinical microscopy results) but were unable to confirm the presence of Candida. Since eukaryotic genomes tend to be larger and more challenging to assemble, they may sometimes fail to be detected in sufficient numbers in our approach.
Entropy landscapes allow the detection of clonally expanded pathogens
In Fig. 2, entropy landscapes are used to visualize the lung community composition of two CF patients with distinct dominant pathogens, as well as one representative sample each from COPD, smoker, and non-smoker groups. For CF-00, the plot shows two likely clonal overgrowths with distinct GC content (Fig. 2a), suggesting chronic infections by two distinct pathogen species. Indeed, annotation revealed these contig groups to consist exclusively of members of the genera Stenotrophomonas and Haemophilus, respectively (Fig. 2b).
These observed landscapes are characteristic of microbial communities with one or a few dominant members that have grown clonally to occupy a sizeable proportion of their niche. In contrast, the healthy and smoker groups generally were not dominated by few clonal species, as reflected in the absence of clustered low-entropy contigs (see Additional file 2: Figure S5 for plots of each subject sampled).
The entropy landscapes allow the visual separation of likely oral contaminants and low-abundance colonizers from the clonal pathogen(s) growing chronically. Furthermore, any non-annotated contigs that visually cluster within the pathogen contigs may indicate undocumented genomic regions, which would have been recently introduced into the pathogen genome and may not be known from reference strains in databases. Hypothetically, even atypical pathogens that are not yet annotated in any database would become discernable, although we have not encountered such a case among our samples.
Multi-locus sequence typing of pathogens using unbiased sputum sequences
Multi-locus sequence typing (MLST) is a well-established method to characterize isolates of a given microbial species in the context of previously observed strains of that species, via DNA sequencing of a limited, pre-defined group of diagnostic genes . Traditional MLST requires the isolation and culture of microbes of interest, followed by specific PCR assays targeting the genes used for strain typing. In contrast, by using WGS sequencing data, we omit these steps and directly proceed to characterizing strains of interest from the mixture. Importantly, there is no need to decide, ahead of the experiment, which strains are to be typed since no specific PCR is required. As long as a species or genus has been previously subjected to MLST genotyping (i.e., a well-populated MLST database with corresponding marker genes is available), it can be characterized. In the following, we describe two examples of MLST strain genotyping in the sputum of CF patients—one yielding a previously observed strain from a well-sampled strain collection, the other yielding a more exotic strain for which even the exact species designation remains unclear (it may belong to a new, as yet unnamed species).
Next, we observed and characterized a putative Achromobacter strain from patient CF-85. Members of the Achromobacter genus form a group of gram negative, strictly aerobic, motile bacteria of which more than 10 species are currently known. The majority of strains isolated from CF patients belong to the species A. xylosoxidans, which is also an intrinsically multidrug-resistant opportunistic pathogen . Aside from the patients, strains can also be found in a variety of aquatic environments ranging from moist soils to dialysis solutions . Achromobacter infections have been generally observed in older patients with pulmonary diseases, but their implication in deteriorating lung function has remained unclear [40, 41]. Accurate identification and discrimination of different Achromobacter species has been a challenging task due to limited taxonomic delineation. A recent study revealed that several commercial test systems used in different diagnostic laboratories were unable to distinguish different Achromobacter species infecting CF patients and would often identify them incorrectly as A. xylosoxidans .
Using the appropriate gene set for this genus, we again performed MLST. In this case, we did not find any matches to previously documented strains. Instead, we placed our sequences on a phylogenetic tree encompassing the entire genus, constructed using concatenated housekeeping genes from all species and type strains available in the PubMLST database (http://pubmlst.org/achromobacter/)  (Fig. 3b and see “Methods”). Our observed strain did not cluster with other identified strains, with the exception of a single unnamed and uncharacterized isolate from another CF case. The closest neighbors of these two strains in the tree were A. marplatensis and A. pulmonis. Although routine clinical analysis using microscopy had identified the strain as A. xylosoxidans, our phylogenetic analysis suggested otherwise; the two sequences were sufficiently removed from A. xylosoxidans to suggest a novel but previously unidentified clade. Independent studies  have also provided evidence to support the presence of as-yet unnamed and uncharacterized species in the Achromobacter genus, responsible for CF infections in patients. Further species divisions in this clinically relevant but undersampled genus are needed, and patient-derived genomes such as ours might provide valuable context.
Prediction of antibiotic resistances and fitness-conferring mutations
Chronic colonizers can adapt to their host environment to sustain themselves under varying selection pressures, such as antibiotic treatments or the presence of potentially competing co-infections. This is particularly problematic in the case of infections caused by antibiotic-resistant bacteria in CF patients. For example, a five-fold increase in MRSA infections has been observed in the past 15 years , and 18.1% of P. aeruginosa infections in a population of primarily young adult CF patients were reported to be MDR . These pathogens acquire various resistance mechanisms to therapeutic agents including altered membrane permeability, efflux pumps, or induced enzymatic modifications. Isolates with identical resistance patterns sometimes exhibit different genetic modifications, indicating that these bacteria can use distinct strategies to respond to similar environmental pressures .
The predictions were validated independently by clinical resistance reports, where the majority of both “resistant” as well as “non-resistant” calls were confirmed (Fig. 4b). False predictions were limited to false negatives, i.e., clinically observed resistances which were not predicted based on sequence analysis. Interestingly, we additionally observed a virulence gene, mprF, annotated as providing resistance against naturally secreted antimicrobial peptides known as defensins. These cationic peptides are largely secreted by neutrophils and by the airway epithelium in the CF lungs to protect the epithelia against infections. S. aureus strains with resistance to defensins show a greater pathogenic potential . Thus, the presence of such virulence factors is of general relevance to clinicians when designing treatments for CF patients.
Apart from antibiotic resistances, other phenotypes such as biofilm formation or exo-polysaccharide secretion may also be actively adapting in chronic colonizers. Indeed, it is known that chronic colonizers develop considerable genomic heterogeneity, which is perhaps maintained by specialization or balanced selection [48, 49]. We conducted a systematic search for genomic heterogeneity using the FreeBayes tool and observed varying levels of sequence variation (see Additional file 6: Table S7 and “Methods”). With our genomic read coverage often below 10×, we are somewhat limited in power, but a conservative search shows that two of the pathogens are nearly clonal, with less than 20 variants observed for each (Staphylococcus in patient CF-85 and Pseudomonas in patient CF-76). In contrast, the remaining nine pathogens tested show considerable heterogeneity, with hundreds of sequence variants observed in each strain (Additional file 6: Table S7). In the case of P. aeruginosa, this kind of sequence heterogeneity has been particularly well studied, and a set of 52 genes has been shown to be likely adaptive inside the lung . Of these, we indeed found five with SNPs in one of our patients, CF-82. As an example, we highlight the mucA gene, which controls the mucoid/non-mucoid phenotype. Initial isolates of P. aeruginosa from CF patients are generally non-mucoid and responsive to antibiotics. However, during protracted infections, these pathogens start overproducing an exo-polysaccharide known as alginate which is a polymer of α-D-manuronic acid and L-glucuronic acid , eventually leading to their transition into a mucoid phenotype. Mucoid P. aeruginosa is immune to several antibiotics and to phagocytosis . Correspondingly, the mucoid phenotype is directly linked with poor clinical outcome for patients. According to the clinical laboratory report, the sputum of CF-82 harbored both mucoid and non-mucoid P. aeruginosa strains. To better understand the underlying genetic modifications that lead to this phenotypic transition, we inspected the mucA gene, which encodes for a trans-membrane σ-factor responsible for limiting expression of the 12-gene alginate operon (algA-algD). Loss of function mutations in the mucA gene typically result in production of alginate, in turn giving rise to a mucoid phenotype. In CF-82, we indeed identified 7 sequence reads showing a single-nucleotide deletion at position 429 in the mucA gene (Fig. 4c), leading to a truncated and presumably non-functional protein. In contrast, 11 reads supported the presence of a non-mutated, fully functional protein; in combination, these reads confirm and explain the clinical observation.
Genome comparisons reveal patient-specific pathogen features
We next made gene predictions for all the unique genomic regions of size 1 kb or greater present in the reconstructed genome. Interestingly, we found a large region of 23 genes (Fig. 5b), 14 of which were found to code for a virulence-associated type VI secretion system (T6SS). T6SS systems typically consist of a conserved cluster of 13 core genes, 10 of which were observed in our gene set together with 4 additional non-conserved T6SS accessory genes (Fig. 5c) responsible for post translational regulation based on orthology predictions. T6SS secretion systems (Fig. 5d) were first described a decade ago in Vibrio cholerae  and since then have been studied in several gram-negative bacteria, including P. aeruginosa . They allow the secretion of a range of substrates such as toxins, adhesins, hydrolytic enzymes, and effector proteins and have been classified into four subtypes . T6SS have been associated with biofilm formation and antagonistic or bactericidal functions towards competing bacterial species . They are frequently observed in genomic islands, and their presence shows little correlation to bacterial taxonomy, suggesting that they are frequently acquired through horizontal gene transfer .
In the past decade, T6SS have repeatedly been reported to be absent in all S. maltophilia isolates, both in clinical and environmental isolates [57–59]. Its presence in our clinical isolate thus highlights its outstanding ability to adapt to new hosts and surroundings. In addition to our observation, recent observations in new S. maltophilia strains  were also supportive of T6SS systems (although the genomes are yet to be published).
The spread of such virulent secretory systems in multidrug-resistant CF pathogens provides a unique opportunity to identify new targets for designing innovative treatments. Anti-virulence drugs  that target specific secretion systems to disarm the bacteria can be a likely alternative to conventional antibiotics. Thus, whole genome reconstruction of highly abundant bacteria in the patient samples provides an exclusive possibility to study idiosyncratic genomic regions and to identify potential drug targets for targeted treatments.
In this study, we have introduced a culture-independent technique for characterizing airway pathogens in the chronically inflamed lung, using routine non-invasive sputum sampling coupled to unbiased WGS sequencing. We chose not to subject the sputum samples to any particle-size selection or human host DNA depletion, although protocols specifically suitable for sputum samples have been developed [16, 62]. This means that our sequence reads will include not only a large number of human genome-derived sequences but also extracellular DNA from microbial cells that might no longer be viable. On the other hand, the simplified processing means that experimental biases are minimized, and most DNA-containing pathogens should be accessible (particularly given the ongoing increases in sequencing throughput). Our approach should provide valuable information complementing routine culture-based clinical microbiology results in the future. This could help to design tailored treatment regimes by reducing the risk of ineffective treatment.
We find, firstly, that WGS sequencing can serve to describe the broad taxonomic composition of the lung microbiome, particularly when combined with reference databases. Reference information is still incomplete and can bias the results, but for human-associated microbes, it is growing at a remarkable pace [63, 64] and should make WGS-based taxonomic classification ever more accurate in the future.
Secondly, various binning approaches  can be used to partially assemble genomes of interest from the WGS data. In the case of chronic infections originating from one or a few clonal invasions, we find that contig-by-contig entropy is a good measure for isolating pathogens from contaminants and from the complex colonizing background of microbes. Such an approach may help clinicians to more precisely identify the disease-causing pathogen.
Thirdly, for those pathogens that have already been well studied in molecular terms, we demonstrated that MLST can be applied directly using WGS data. This allows the exact strain identity to be established, leveraging the power of MLST databases and cataloged strain observations. Importantly, this can significantly simplify the process of tracing infection outbreaks at clinics using untargeted, retrospective data.
Lastly, the partially assembled genomes and the remaining unassembled contigs could be informative with regard to the expected efficacy of treatment options. Antibiotics susceptibility testing is an important part of clinical management, although its efficacy has been questioned [66, 67]. We find that antibiotic resistances can be predicted from the metagenome, but while this is substantially supported by confirmatory clinical tests, it is not yet entirely error-free. This is likely to improve with better data curation in the resistance databases, but the most challenging resistances to predict correctly will be those that arise from specific mutations in normal, cellular genes . Importantly, aside from predicting resistances, WGS data may guide the decision as to which antibiotics to include in clinical testing in the first place, particularly for the second-generation antibiotics designed to counteract or circumvent known resistances.
Overall, WGS sequencing of sputum may become one of the building blocks supporting the advancement of a more personalized medicine. It yields not only deep insights into the lung microbiome by allowing an unbiased metagenomic dissection of microbial pathogens but also enables analysis of human genomic DNA for host genotyping (e.g., for host susceptibility to infection or for unexpected treatment responses). For routine tracking, deep WGS sequencing could be alternated with more shallow survey sequencing or 16S sequencing; the latter are likely sufficient to quantify changes in community composition, with deep sequencing only necessary when the new pathogens invade.
Sputum samples and DNA extraction
Sputum was either produced spontaneously (in the case of CF and COPD patients) or after induction by hypertonic saline nebulization (in the case of healthy control subjects). The sampling was conducted at the Cantonal Hospital St. Gallen and at the Children’s Hospital of Eastern Switzerland. Healthy control subjects were free of symptoms of respiratory discomfort and did not show overt infections. All study participants provided informed consent. The study was approved by the Cantonal Ethics Committee, St. Gallen (EKSG 13/112). The sputum samples were weighed and aliquoted into sterile tubes. After dilution in Sputolysin (Calbiochem Corp., San Diego, CA, USA), total DNA was extracted using the High Pure PCR Template Preparation Kit (Roche, Basel, Switzerland) according to manufacturer’s instructions. DNA concentration was measured using the ACTgene UV99 spectro-photometer at a wavelength of 260 nm, and samples were stored at −20 °C. As the starting material was not limiting, and sufficient amounts of DNA were available, no extra amplification step was deemed necessary and no extraction blanks for PCR/sequencing contamination control were processed.
Whole-genome shotgun sequencing
The TruSeq DNA Sample Prep Kit v2 (Illumina Inc., California, USA) was used for library generation. The quality and quantity of the enriched libraries were validated using a Qubit® (1.0) fluorometer and the Caliper GX LabChip® GX (Caliper Life Sciences, Inc., USA). The libraries were normalized to 10 nM in Tris-Cl 10 mM, pH 8.5 with 0.1% Tween 20. The TruSeq PE Cluster Kit v3-cBot-HS (Illumina, Inc., California, USA) was used for cluster generation using 2 pM of pooled normalized libraries on the cBOT. Paired-end sequencing was performed on the Illumina HiSeq 2000 at 2 X101 bp using the TruSeq SBS Kit v3-HS (Illumina, Inc., California, USA). Reads were quality-checked with FastQC .
Removal of the host genome and assembly
We used Bowtie2  to align the paired-end reads against the human reference genome, assembly Hg19 . Read pairs were omitted from any further downstream analysis if one or both mates from the pair aligned to the human genome. We assembled contigs from the remaining read pairs using the SPAdes assembler  under the “only assembly” setting.
Taxonomic annotation and diversity estimation
We searched the contigs against the NCBI nucleotide database (as of June 2014) using Blastn  with an e value cutoff of e −15. The most common recent ancestor of all genomic sequences that aligned to a given contig with a bit score within 10% range of the highest scoring alignment was used to taxonomically annotate the contig. This procedure largely excluded the analysis of phages and viruses because these tend to be poorly represented in databases and taxonomies. We independently cross-checked our microbial assignments by running MiDAS, a strain-level taxonomy analysis tool based on a large collection human-associated microbial pathogens . As shown in Additional file 7: Table S6, there is a good agreement between both approaches for the more abundant genera, while some low-abundant genera are missed by MiDAS (which restricts its search space to a set of suitable gene families). We discarded all contigs with annotations belonging to the Metazoan kingdom, to further remove host-genome sequences from further downstream analysis. We used nucleotide counts from assembled contigs with genus level taxonomic assignments to calculate Shannon entropy as a measure of diversity. We used Mann-Whitney U test for significance testing and subsequently adjusted the p values using the Bonferroni correction for multiple testing. To assess SNPs in the extracted pathogens, the FreeBayes tool was used (version 1.1.0, http://arxiv.org/abs/1207.3907). Fastq read pairs used for contig assembly were mapped back to contigs using bwa, version bwakit-0.7.12, under standard paired-end setting. Next, FreeBayes was used to call variants. Ploidy was set to 10 to predict variants present at differing frequencies, and the minimum read count for supporting alternate alleles was set to 3. The predicted variants were further filtered with the help of VcfFilter (available with the FreeBayes package), using a minimum quality score of 30 and minimum total read depth of 6 reads per position.
Contig binning via “entropy landscapes”
We recruited all paired-end reads that had contributed to the assembly process back against the non-human contigs using Bowtie2. This recruitment was used to calculate the average number of mismatches and gaps over the length of the contig, per base pair (entropy). This score was depicted on the z-axis of the plots, together with contig length on the x-axis, and GC content on the y-axis. The axes “entropy” and “GC content” are intrinsically normalized, i.e., largely independent of the number of reads per sample (or per contig). In contrast, the axis “contig size” is not normalized. Contigs from genera constituting more than 5% of the annotated non-human contigs were color-coded according to their annotation. We depicted the unannotated and the remaining contigs from the low abundance genera in a single color. The code for this analysis is available at https://github.com/rfeigelman/Microbe-Entropy-Analysis.git.
MLST and phylogenetic placement of abundant clonal species
We constructed a maximum likelihood tree for S. maltophilia using RAxML  under the GTRCAT model, with 1000 bootstraps, using the concatenated sequence composed of the seven housekeeping genes atpD, gapA, guaA, mutM, nuoD, ppsA, and recA. In patient CF-00, these genes showed 100% identity over their entire length to a previously observed strain. All the sequences of the previously typed strains used for building the tree were downloaded from http://pubmlst.org/smaltophilia/. The tree was rooted using Xanthomonas Campestris 8004 [GenBank accession NC_007086.1] as an outgroup. For the Achromobacter genus, we built another phylogenetic tree using the concatenated sequences of the seven housekeeping genes eno, gltB, lepA, nrdA, nuoL, nusA, and rpoB. Full-length sequences of all the seven genes were recovered from the assembled contigs of the patient CF-85. There was no exact match found to any previously documented strain. The sequences of previously typed strains used for building the tree were downloaded from http://pubmlst.org/achromobacter/. The tree was rooted using Bordetella pertussis [GenBank accession LN849008] as an outgroup.
Identification of antibiotic resistance genes and cassettes
We used the Comprehensive Antibiotic Resistance Database  to search for resistance-conferring genes in our samples. The results from the web server were filtered to include only matches with over 90 amino acid length and over 90% identity. We further compared these findings with the clinical laboratory reports on observed antibiotic resistances.
We used Mauve  for ordering the assembled Stenotrophomonas contigs from the patient sample CF-00 and subsequently performing whole-genome alignments against reference strains.
Cystic fibrosis transmembrane regulator
Chronic obstructive pulmonary disease
Multi-locus sequence typing
Polymerase chain reaction
Whole-genome shotgun sequencing
We thank João F. Matias Rodrigues for productive discussions, and Thomas S. Schmidt and Justin Feigelman for helpful suggestions on the manuscript. Catharine A. Fournier and Hubert Rehrauer are gratefully acknowledged for library preparation, sequencing, and initial data processing.
RF and CVM were funded by the Swiss National Science Foundation (grant #31003A_160095).
Availability of data and materials
The datasets supporting the conclusions of this article are available in the NCBI SRA repository, BioProject identifier PRJNA316588.
RF, CRK, MHB, and CVM conceived and planned the study. CRK, FB, FR, PK, RLK, and MHB selected suitable patients, obtained informed consent, and processed the samples. RF performed the bioinformatics analysis. RF and CVM wrote the manuscript, with the help from all co-authors. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Written informed consent was obtained from all study participants.
Ethics approval and consent to participate
The study was approved by the Cantonal Ethics Committee, St. Gallen (EKSG 13/112), and all study participants provided informed consent.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Walters S, Mehta A. Epidemiology of cystic fibrosis. In: Hodson M, Geddes DM, Bush A, editors. Cystic fibrosis, 3rd edn. London: Edward Arnold Ltd; 2007. p. 21–45.
- Boucher RC. An overview of the pathogenesis of cystic fibrosis lung disease. Adv Drug Deliv Rev. 2002;54(11):1359–71.View ArticlePubMedGoogle Scholar
- Mahenthiralingam E. Emerging cystic fibrosis pathogens and the microbiome. Paediatr Respir Rev. 2014;15:13–5.PubMedGoogle Scholar
- Bell SC, De Boeck K, Amaral MD. New pharmacological approaches for cystic fibrosis: promises, progress, pitfalls. Pharmacol Ther. 2015;145:19–34.View ArticlePubMedGoogle Scholar
- Laura GAO, Filkins M. Cystic fibrosis lung infections: polymicrobial, complex, and hard to treat. 2015. p. 1–8.Google Scholar
- Tan K, Conway SP, Brownlee KG, Etherington C, Peckham DG. Alcaligenes infection in cystic fibrosis. Pediatr Pulmonol. 2002;34(2):101–4.View ArticlePubMedGoogle Scholar
- Talmaciu I, Varlotta L, Mortensen J, Schidlow DV. Risk factors for emergence of Stenotrophomonas maltophilia in cystic fibrosis. Pediatr Pulmonol. 2000;30(1):10–5.View ArticlePubMedGoogle Scholar
- Zhou J, Garber E, Desai M, Saiman L. Compliance of clinical microbiology laboratories in the United States with current recommendations for processing respiratory tract specimens from patients with cystic fibrosis. J Clin Microbiol. 2006;44(4):1547–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Surette MG. The cystic fibrosis lung microbiome. Ann Am Thorac Soc. 2014;11:S61–5.View ArticlePubMedGoogle Scholar
- Chmiel JF, Aksamit TR, Chotirmall SH, Dasenbrook EC, Elborn JS, LiPuma JJ, Ranganathan SC, Waters VJ, Ratjen FA. Antibiotic management of lung infections in cystic fibrosis. I. The microbiome, methicillin-resistant Staphylococcus aureus, gram-negative bacteria, and multiple infections. Ann Am Thorac Soc. 2014;11(7):1120–9.View ArticlePubMedGoogle Scholar
- Dickson RP, Erb-Downward JR, Martinez FJ, Huffnagle GB. The microbiome and the respiratory tract. Annu Rev Physiol. 2016;78:481–504.View ArticlePubMedGoogle Scholar
- Coburn B, Wang PW, Diaz Caballero J, Clark ST, Brahma V, Donaldson S, Zhang Y, Surendra A, Gong Y, Elizabeth Tullis D, Yau YCW, Waters VJ, Hwang DM, Guttman DS. Lung microbiota across age and disease stage in cystic fibrosis. Sci Rep. 2015;5:10241.View ArticlePubMedPubMed CentralGoogle Scholar
- Whiteson KL, Bailey B, Bergkessel M, Conrad D, Delhaes L, Felts B, Harris JK, Hunter R, Lim YW, Maughan H, Quinn R, Salamon P, Sullivan J, Wagner BD, Rainey PB. The upper respiratory tract as a microbial source for pulmonary infections in cystic fibrosis. Parallels from island biogeography. Am J Respir Crit Care Med. 2014;189(11):1309–15.View ArticlePubMedPubMed CentralGoogle Scholar
- Moran Losada P, Chouvarine P, Dorda M, Hedtfeld S, Mielke S, Schulz A, Wiehlmann L, Tummler B. The cystic fibrosis lower airways microbial metagenome. ERJ Open Res. 2016;2(2):00096. 2015–96–2015.View ArticlePubMedPubMed CentralGoogle Scholar
- Quinn RA, Lim YW, Maughan H, Conrad D, Rohwer F, Whiteson KL. Biogeochemical forces shape the composition and physiology of polymicrobial communities in the cystic fibrosis lung. MBio. 2014;5(2):e00956. 13–e00956–13.View ArticlePubMedPubMed CentralGoogle Scholar
- Lim YW, Haynes M, Furlan M, Robertson CE, Harris JK, Rohwer F. Purifying the impure: sequencing metagenomes and metatranscriptomes from complex animal-associated samples. J Vis Exp. 2014;94:e52117.Google Scholar
- Lim YW, Schmieder R, Haynes M, Willner D, Furlan M, Youle M, Abbott K, Edwards R, Evangelista J, Conrad D, Rohwer F. Metagenomics and metatranscriptomics: windows on CF-associated viral and microbial communities. J Cyst Fibros. 2013;12(2):154–64.View ArticlePubMedGoogle Scholar
- Lim YW, Evangelista JS, Schmieder R, Bailey B, Haynes M, Furlan M, Maughan H, Edwards R, Rohwer F, Conrad D, Forbes BA. Clinical insights from metagenomic analysis of sputum samples from patients with cystic fibrosis. J Clin Microbiol. 2014;52(2):425–37.View ArticlePubMedPubMed CentralGoogle Scholar
- Smith AL, Redding G, Doershuk C, Goldmann D, Gore E, Hilman B, Marks M, Moss R, Ramsey B, Roblo T, Schwartz RH, Thomassen MJ, Williams-Warren J, Weber A, Wilmott RW, Wilson HD, Yogev R. Sputum changes associated with therapy for endobronchial exacerbation in cystic fibrosis. J Pediatr. 1988;112(4):547–54.View ArticlePubMedGoogle Scholar
- Dhooghe B, Noël S, Huaux F, Leal T. Lung inflammation in cystic fibrosis: pathogenesis and novel therapies. Clin Biochem. 2014;47(7):539–46.View ArticlePubMedGoogle Scholar
- Charlson ES, Bittinger K, Chen J, Diamond JM, Li H, Collman RG, Bushman FD. Assessing bacterial populations in the lung by replicate analysis of samples from the upper and lower respiratory tracts. PLoS ONE. 2012;7(9):e42786.View ArticlePubMedPubMed CentralGoogle Scholar
- Charlson ES, Bittinger K, Haas AR, Fitzgerald AS, Frank I, Yadav A, Bushman FD, Collman RG. Topographical continuity of bacterial populations in the healthy human respiratory tract. Am J Respir Crit Care Med. 2011;184(8):957–63.View ArticlePubMedPubMed CentralGoogle Scholar
- Bassis CM, Erb-Downward JR, Dickson RP, Freeman CM, Schmidt TM, Young VB, Beck JM, Curtis JL, Huffnagle GB. Analysis of the upper respiratory tract microbiotas as the source of the lung and gastric microbiotas in healthy individuals. MBio. 2015;6(2):e00037.View ArticlePubMedPubMed CentralGoogle Scholar
- Venkataraman A, Bassis CM, Beck JM, Young VB, Curtis JL, Huffnagle GB, Schmidt TM. Application of a neutral community model to assess structuring of the human lung microbiome. MBio. 2015;6(1):e02284–14.View ArticlePubMedPubMed CentralGoogle Scholar
- Costello A, Reen FJ, O'Gara F, Callaghan M, McClean S. Inhibition of co-colonizing cystic fibrosis-associated pathogens by Pseudomonas aeruginosa and Burkholderia multivorans. Microbiology (Reading, Engl). 2014;160(7):1474–87.View ArticleGoogle Scholar
- Zhao J, Schloss PD, Kalikin LM, Carmody LA, Foster BK, Petrosino JF, Cavalcoli JD, VanDevanter DR, Murray S, Li JZ, Young VB, LiPuma JJ. Decade-long bacterial community dynamics in cystic fibrosis airways. Proc Natl Acad Sci U S A. 2012;109(15):5809–14.View ArticlePubMedPubMed CentralGoogle Scholar
- Carmody LA, Zhao J, Schloss PD, Petrosino JF, Murray S, Young VB, Li JZ, LiPuma JJ. Changes in cystic fibrosis airway microbiota at pulmonary exacerbation. Ann Am Thorac Soc. 2013;10(3):179–87.View ArticlePubMedPubMed CentralGoogle Scholar
- Brown PS, Pope CE, Marsh RL, Qin X, McNamara S, Gibson R, Burns JL, Deutsch G, Hoffman LR. Directly sampling the lung of a young child with cystic fibrosis reveals diverse microbiota. Ann Am Thorac Soc. 2014;11(7):1049–55.View ArticlePubMedPubMed CentralGoogle Scholar
- Fodor AA, Klem ER, Gilpin DF, Elborn JS, Boucher RC, Tunney MM, Wolfgang MC. The adult cystic fibrosis airway microbiota is stable over time and infection type, and highly resilient to antibiotic treatment of exacerbations. PLoS ONE. 2012;7(9):e45001.View ArticlePubMedPubMed CentralGoogle Scholar
- Tunney MM, Klem ER, Fodor AA, Gilpin DF, Moriarty TF, McGrath SJ, Muhlebach MS, Boucher RC, Cardwell C, Doering G, Elborn JS, Wolfgang MC. Use of culture and molecular analysis to determine the effect of antibiotic treatment on microbial community diversity and abundance during exacerbation in patients with cystic fibrosis. Thorax. 2011;66(7):579–84.View ArticlePubMedGoogle Scholar
- Cramer N, Wiehlmann L, Tümmler B. Clonal epidemiology of Pseudomonas aeruginosa in cystic fibrosis. Int J Med Microbiol. 2010;300(8):526–33.View ArticlePubMedGoogle Scholar
- Coutinho CP, Dos Santos SC, Madeira A, Mira NP, Moreira AS, Sá-Correia I. Long-term colonization of the cystic fibrosis lung by Burkholderia cepacia complex bacteria: epidemiology, clonal variation, and genome-wide expression alterations. Front Cell Infect Microbiol. 2011;1:12.View ArticlePubMedPubMed CentralGoogle Scholar
- Maiden MCJ, van Rensburg MJ, Bray JE, Earle SG, Ford SA, Jolley KA, McCarthy ND. MLST revisited: the gene-by-gene approach to bacterial genomics. Nat Rev Microbiol. 2013;11(10):728–36.View ArticlePubMedPubMed CentralGoogle Scholar
- Brooke JS. Stenotrophomonas maltophilia: an emerging global opportunistic pathogen. Clin Microbiol Rev. 2012;25(1):2–41.View ArticlePubMedPubMed CentralGoogle Scholar
- Martínez JL. Antibiotics and antibiotic resistance genes in natural environments. Science. 2008;321(5887):365–7.View ArticlePubMedGoogle Scholar
- Gherardi G, Creti R, Pompilio A, Di Bonaventura G. Diagnostic microbiology and infectious disease. Diagn Microbiol Infect Dis. 2015;81(3):219–26.View ArticlePubMedGoogle Scholar
- Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25(7):1043–55.View ArticlePubMedPubMed CentralGoogle Scholar
- Hansen CR, Pressler T, Nielsen KG, Jensen PO, Bjarnsholt T, Hoiby N. Inflammation in Achromobacter xylosoxidans infected cystic fibrosis patients. J Cyst Fibros. 2010;9(1):51–8.View ArticlePubMedGoogle Scholar
- Amoureux L, Bador J, Fardeheb S, Mabille C, Couchot C, Massip C, Salignon AL, Berlie G, Varin V, Neuwirth C. Detection of Achromobacter xylosoxidans in hospital, domestic, and outdoor environmental samples and comparison with human clinical isolates. Appl Environ Microbiol. 2013;79(23):7142–9.View ArticlePubMedPubMed CentralGoogle Scholar
- De Baets F, Schelstraete P, Van Daele S, Haerynck F, Vaneechoutte M. Achromobacter xylosoxidans in cystic fibrosis: prevalence and clinical relevance. J Cyst Fibros. 2007;6(1):75–8.View ArticlePubMedGoogle Scholar
- Rønne Hansen C, Pressler T, Høiby N, Gormsen M. Chronic infection with Achromobacter xylosoxidans in cystic fibrosis patients; a retrospective case control study. J Cyst Fibros. 2006;5(4):245–51.View ArticlePubMedGoogle Scholar
- Spilker T, Vandamme P, LiPuma JJ. Identification and distribution of Achromobacter species in cystic fibrosis. J Cyst Fibros. 2013;12(3):298–301.View ArticlePubMedGoogle Scholar
- Jolley KA, Maiden MCJ. BIGSdb: scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics. 2010;11:595.View ArticlePubMedPubMed CentralGoogle Scholar
- Ridderberg W, Wang M, Norskov-Lauritsen N. Multilocus sequence analysis of isolates of Achromobacter from patients with cystic fibrosis reveals infecting species other than Achromobacter xylosoxidans. J Clin Microbiol. 2012;50(8):2688–94.View ArticlePubMedPubMed CentralGoogle Scholar
- Cystic Fibrosis Foundation. Cystic fibrosis patient registry 2014 annual data report. 2015. p. 1–92.Google Scholar
- Yang L, Jelsbak L, Marvig RL, Damkiær S, Workman CT, Rau MH, Hansen SK, Folkesson A, Johansen HK, Ciofu O, Høiby N, Sommer MOA, Molin S. Evolutionary dynamics of bacteria in a human host environment. 2011.Google Scholar
- Peschel A, Jack RW, Otto M, Collins LV, Staubitz P, Nicholson G, Kalbacher H, Nieuwenhuizen WF, Jung G, Tarkowski A, van Kessel KP, van Strijp JA. Staphylococcus aureus resistance to human defensins and evasion of neutrophil killing via the novel virulence factor MprF is based on modification of membrane lipids with l-lysine. J Exp Med. 2001;193(9):1067–76.View ArticlePubMedPubMed CentralGoogle Scholar
- Lieberman TD, Flett KB, Yelin I, Martin TR, McAdam AJ, Priebe GP, Kishony R. Genetic variation of a bacterial pathogen within individuals with cystic fibrosis provides a record of selective pressures. Nat Genet. 2014;46(1):82–7.View ArticlePubMedGoogle Scholar
- Diaz Caballero J, Clark ST, Coburn B, Zhang Y, Wang PW, Donaldson SL, Tullis DE, Yau YCW, Waters VJ, Hwang DM, Guttman DS. Selective sweeps and parallel pathoadaptation drive pseudomonas aeruginosa evolution in the cystic fibrosis lung. MBio. 2015;6(5):e00981–15.View ArticlePubMedPubMed CentralGoogle Scholar
- Marvig RL, Sommer LM, Molin S, Johansen HK. Convergent evolution and adaptation of Pseudomonas aeruginosa within patients with cystic fibrosis. Nat Genet. 2014;47(1):57–64.View ArticlePubMedGoogle Scholar
- Pulcrano G, Iula DV, Raia V, Rossano F, Catania MR. Different mutations in mucA gene of Pseudomonas aeruginosa mucoid strains in cystic fibrosis patients and their effect on algU gene expression. New Microbiol. 2012;35(3):295–305.PubMedGoogle Scholar
- Li Z, Kosorok MR, Farrell PM, Laxova A, West SEH, Green CG, Collins J, Rock MJ, Splaingard ML. Longitudinal development of mucoid Pseudomonas aeruginosa infection and lung disease progression in children with cystic fibrosis. JAMA. 2005;293(5):581–8.View ArticlePubMedGoogle Scholar
- Pukatzki S, Ma AT, Revel AT, Sturtevant D, Mekalanos JJ. Type VI secretion system translocates a phage tail spike-like protein into target cells where it cross-links actin. Proc Natl Acad Sci U S A. 2007;104(39):15508–13.View ArticlePubMedPubMed CentralGoogle Scholar
- Hood RD, Singh P, Hsu F, Güvener T, Carl MA, Trinidad RRS, Silverman JM, Ohlson BB, Hicks KG, Plemel RL, Li M, Schwarz S, Wang WY, Merz AJ, Goodlett DR, Mougous JD. A type VI secretion system of Pseudomonas aeruginosa targets a toxin to bacteria. Cell Host and Microbe. 2010;7(1):25–37.View ArticlePubMedPubMed CentralGoogle Scholar
- Bingle LE, Bailey CM, Pallen MJ. Type VI secretion: a beginner’s guide. Curr Opin Microbiol. 2008;11(1):3–8.View ArticlePubMedGoogle Scholar
- Boyer F, Fichant G, Berthod J, Vandenbrouck Y, Attree I. Dissecting the bacterial type VI secretion system by a genome wide in silico analysis: what can be learned from available microbial genomic resources? BMC Genomics. 2009;10(1):1–14.View ArticleGoogle Scholar
- Ryan RP, Monchy S, Cardinale M, Taghavi S, Crossman L, Avison MB, Berg G, van der Lelie D, Dow JM. The versatility and adaptation of bacteria from the genus Stenotrophomonas. Nat Rev Microbiol. 2009;7(7):514–25.View ArticlePubMedGoogle Scholar
- Ormerod KL, George NM, Fraser JA, Wainwright C, Hugenholtz P. Comparative genomics of non-pseudomonal bacterial species colonising paediatric cystic fibrosis patients. Peer J. 2015;3(23):e1223.View ArticlePubMedPubMed CentralGoogle Scholar
- Alavi P, Starcher MR, Thallinger GG, Zachow C, ller HM, Berg G. Stenotrophomonas comparative genomics reveals genes and functions that differentiate beneficial and pathogenic bacteria. BMC Genomics. 2014;15(1):482.
- Adamek M, Linke B, Schwartz T. Virulence genes in clinical and environmental Stenotrophomas maltophilia isolates: a genome sequencing and gene expression approach. Microb Pathog. 2014;67(C):20–30.View ArticlePubMedGoogle Scholar
- Baron C. Antivirulence drugs to target bacterial secretion systems. Curr Opin Microbiol. 2010;13(1):100–5.View ArticlePubMedGoogle Scholar
- Rogers GB, Stressmann FA, Koller G, Daniels T, Carroll MP, Bruce KD. Assessing the diagnostic importance of nonviable bacterial cells in respiratory infections. Diagn Microbiol Infect Dis. 2008;62(2):133–41.View ArticlePubMedGoogle Scholar
- Human Microbiome Jumpstart Reference Strains Consortium, Nelson KE, Weinstock GM, Highlander SK, Worley KC, Creasy HH, Wortman JR, Rusch DB, Mitreva M, Sodergren E, Chinwalla AT, Feldgarden M, Gevers D, Haas BJ, Madupu R, Ward DV, Birren BW, Gibbs RA, Methé B, Petrosino JF, Strausberg RL, Sutton GG, White OR, Wilson RK, Durkin S, Giglio MG, Gujja S, Howarth C, Kodira CD, Kyrpides N, Mehta T, Muzny DM, Pearson M, Pepin K, Pati A, Qin X, Yandava C, Zeng Q, Zhang L, Berlin AM, Chen L, Hepburn TA, Johnson J, McCorrison J, Miller J, Minx P, Nusbaum C, Russ C, Sykes SM, Tomlinson CM, Young S, Warren WC, Badger J, Crabtree J, Markowitz VM, Orvis J, Cree A, Ferriera S, Fulton LL, Fulton RS, Gillis M, Hemphill LD, Joshi V, Kovar C, Torralba M, Wetterstrand KA, Abouellleil A, Wollam AM, Buhay CJ, Ding Y, Dugan S, FitzGerald MG, Holder M, Hostetler J, Clifton SW, Allen-Vercoe E, Earl AM, Farmer CN, Liolios K, Surette MG, Xu Q, Pohl C, Wilczek-Boney K, Zhu D. A catalog of reference genomes from the human microbiome. Science. 2010;328(5981):994–9.View ArticleGoogle Scholar
- Wylie KM, Truty RM, Sharpton TJ, Mihindukulasuriya KA, Zhou Y, Gao H, Sodergren E, Weinstock GM, Pollard KS. Novel bacterial taxa in the human microbiome. PLoS ONE. 2012;7(6):e35294.View ArticlePubMedPubMed CentralGoogle Scholar
- Chatterji S, Yamazaki I, Bai Z, Eisen JA. CompostBin: a DNA composition-based algorithm for binning environmental shotgun reads. Research in Computational Molecular Biology. 2008:17–28. doi:10.1007/978-3-540-78839-3_3.
- Aaron SD, Vandemheen KL, Ferris W, Fergusson D, Tullis E, Haase D, Berthiaume Y, Brown N, Wilcox P, Yozghatlian V, Bye P, Bell S, Chan F, Rose B, Jeanneret A, Stephenson A, Noseworthy M, Freitag A, Paterson N, Doucette S, Harbour C, Ruel M, MacDonald N. Combination antibiotic susceptibility testing to treat exacerbations of cystic fibrosis associated with multiresistant bacteria: a randomised, double-blind, controlled clinical trial. Lancet. 2005;366(9484):463–71.View ArticlePubMedGoogle Scholar
- Keays T, Ferris W, Vandemheen KL, Chan F, Yeung S-W, Mah T-F, Ramotar K, Saginur R, Aaron SD. A retrospective analysis of biofilm antibiotic susceptibility testing: a better predictor of clinical response in cystic fibrosis exacerbations. J Cyst Fibros. 2009;8(2):122–7.View ArticlePubMedGoogle Scholar
- Martinez JL, Baquero F, Andersson DI. Predicting antibiotic resistance. Nat Rev Microbiol. 2007;5(12):958–65.View ArticlePubMedGoogle Scholar
- Bioinformatics B. FastQC: a quality control tool for high throughput sequence data. Cambridge: Babraham Institute; 2011.Google Scholar
- Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Publ Group. 2012;9(4):357–9.Google Scholar
- Meyer LR, Zweig AS, Hinrichs AS, Karolchik D, Kuhn RM, Wong M, Sloan CA, Rosenbloom KR, Roe G, Rhead B. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res. 2013;41(1):D64–9.View ArticlePubMedGoogle Scholar
- Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19(5):455–77.View ArticlePubMedPubMed CentralGoogle Scholar
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.View ArticlePubMedGoogle Scholar
- Nayfach S, Rodriguez-Mueller B, Garud N, Pollard KS. An integrated metagenomics pipeline for strain profiling reveals novel patterns of bacterial transmission and biogeography. Genome Res. 2016;26(11):1612–25.View ArticlePubMedPubMed CentralGoogle Scholar
- Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22(21):2688–90.View ArticlePubMedGoogle Scholar
- McArthur AG, Waglechner N, Nizam F, Yan A, Azad MA, Baylay AJ, Bhullar K, Canova MJ, De Pascale G, Ejim L, Kalan L, King AM, Koteva K, Morar M, Mulvey MR, O'Brien JS, Pawlowski AC, Piddock LJV, Spanogiannopoulos P, Sutherland AD, Tang I, Taylor PL, Thaker M, Wang W, Yan M, Yu T, Wright GD. The comprehensive antibiotic resistance database. Antimicrob Agents Chemother. 2013;57(7):3348–57.View ArticlePubMedPubMed CentralGoogle Scholar
- Darling ACE, Mau B, Blattner FR, Perna NT. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 2004;14(7):1394–403.View ArticlePubMedPubMed CentralGoogle Scholar