In silico analyses of metagenomes from human atherosclerotic plaque samples
- Suparna Mitra1, 2, 3,
- Daniela I. Drautz-Moses1,
- Morten Alhede5,
- Myat T. Maw1,
- Yang Liu1,
- Rikky W. Purbojati1,
- Zhei H. Yap1,
- Kavita K. Kushwaha1,
- Alexandra G. Gheorghe4,
- Thomas Bjarnsholt5, 7,
- Gorm M. Hansen5, 8,
- Henrik H. Sillesen4,
- Hans P. Hougen4,
- Peter R. Hansen8,
- Liang Yang1,
- Tim Tolker-Nielsen5,
- Stephan C. Schuster1 and
- Michael Givskov1, 5
© Mitra et al. 2015
Received: 21 May 2015
Accepted: 12 August 2015
Published: 3 September 2015
Several observational and mechanistic studies indicate that microbial infection promotes cardiovascular disease. Direct infection of the vessel wall, along with the cardiovascular risk factors, is hypothesized to play a key role in atherogenesis by promoting an inflammatory response that leads to endothelial dysfunction and generates a proatherogenic and prothrombotic environment, ultimately leading to clinical manifestations of cardiovascular disease, e.g., acute myocardial infarction or stroke. There are many reports of microbial DNA isolation and even a few studies of viable microbes isolated from human atherosclerotic vessels. However, high-resolution investigation of microbial infectious agents from human vessels that may contribute to atherosclerosis is very limited. In spite of recent progress in sequencing technologies, analyzing host-associated metagenomes remains a challenge.
To investigate microbiome diversity within human atherosclerotic tissue samples, we employed high-throughput metagenomic analysis on (1) atherosclerotic plaques obtained from a group of patients who underwent endarterectomy due to recent transient cerebral ischemia or stroke and (2) presumed stable atherosclerotic plaques obtained at autopsy from a control group of patients who all died from causes not related to cardiovascular disease. Our data provide evidence suggesting a wide range of microbial agents in atherosclerotic plaques, together with an intriguing new observation: the microbiota differed between symptomatic and asymptomatic plaques, as judged from the taxonomic profiles in these two groups of patients. Additionally, functional annotations reveal significant differences in basic metabolic and disease pathway signatures between these groups.
We demonstrate the feasibility of novel high-resolution techniques aimed at identification and characterization of microbial genomes in human atherosclerotic tissue samples. Our analysis suggests that distinct groups of microbial agents might play different roles during the development of atherosclerotic plaques. These findings may serve as a reference point for future studies in this area of research.
Cardiovascular disease is the leading cause of death worldwide, and atherosclerosis, its primary cause, is a chronic inflammatory disease traditionally associated with risk factors such as male sex, age, smoking, hypertension, hyperlipidemia, obesity, and diabetes. However, elevated circulating levels of inflammatory markers, e.g., high-sensitivity C-reactive protein, are also risk markers for atherosclerotic disease, and increasing evidence indicates that infections and chronic inflammatory diseases, e.g., rheumatoid arthritis, are also linked with increased risk of atherosclerosis [1–3]. Indeed, direct microbial infection in the vessel wall can promote inflammatory responses and generation of a proatherogenic and prothrombotic environment, while secreted bacterial products, e.g., lipopolysaccharide and heat shock proteins, can also indirectly induce autoimmunity and other immunoinflammatory mechanisms that may contribute to the atherosclerotic process [1, 2].
Specifically, a number of infectious agents have been suggested to promote atherosclerosis, e.g., Chlamydia pneumoniae, Helicobacter pylori, Hepatitis C virus, Pseudomonas aeruginosa, and Cytomegalovirus. In addition, oral bacteria involved in marginal periodontitis, such as Porphyromonas gingivalis, Aggregatibacter actinomycetemcomitans, and Prevotella intermedia, have been associated with atherosclerotic disease, e.g., acute myocardial infarction [4, 5]. Indeed, previous studies have identified DNA from a broad variety of bacteria in atherosclerotic plaques. Bacterial 16S rDNA signatures from environmental microorganisms and several nosocomial pathogens and viruses were recently found in atherosclerotic lesions of patients with coronary heart disease [7, 8]. Compared with asymptomatic stable plaques, culprit lesions associated with atherosclerotic disease manifestations, e.g., myocardial infarction or stroke, show increased inflammation and thrombosis, but differences in bacterial profiles of unstable vs. stable atherosclerotic plaques have not been investigated.
Even though the above pathogens have been reported to be present in plaques, there is a lack of published high-resolution investigations of the dynamic composition of microbial infectious agents that may be associated with development of atherosclerosis. Traditional approaches for identifying microbes from target sites, such as culturing and 16S rRNA PCR sequencing, are quite limited for characterizing numerous infectious agents [4, 6]. Recent progress in deep sequencing technology has provided another potential approach for high-throughput and comparative analysis of atherosclerosis-associated microbial agents. Also, metagenomics has the potential to characterize microbial communities in various sites of the human body and give a deeper understanding of their impact on human physiology and diseases. However, many challenges remain in applying metagenomic analysis to atherosclerotic plaque samples. For example, atherosclerotic plaques contain large amounts of human genomic DNA, which reduces the fraction of microbial reads obtained during sequencing. Similar considerations apply to analyses of target tissue samples from other host inflammation-associated diseases such as periodontal diseases, non-healing ulcers, or traumatic wounds.
Here, we used two different types of atherosclerotic tissue samples: symptomatic atherosclerotic plaques from patients with symptomatic atherosclerotic disease, and asymptomatic atherosclerotic plaques from persons who had atherosclerotic tissue but died from causes other than atherosclerosis. With these, we performed a comparative metagenomic analysis of the contribution of the microbial community to the development of atherosclerotic disease. The atherosclerotic microbial communities were analyzed by applying deep sequencing and a multistage approach using different methods to reduce human genomic reads and thereby achieve an improved understanding of the bacterial communities potentially involved in human disease. We identified bacterial families of potential importance for eliciting the transition from asymptomatic to symptomatic atherosclerotic disease.
Sequencing and data processing
Table. Sample statistics and read assignments. Column headings: RapSearch processed (as in MEGAN); Assigned in MEGAN (MSc: 50, MSp: 25, MinCompl: 0.44, paired protocol). Row label: 977 + APD1.
Additional file 1: Figure S1 provides a multiple comparison tree view using taxonomic annotation of all 12 samples at the “family” level of NCBI taxonomy. In this figure, the samples from patients with symptomatic atherosclerosis (unstable plaques) are displayed in white and the control samples of stable lesions are displayed in gray.
Despite our efforts to remove human sequences from the samples, we found that some human-like sequences remained detectable. This was probably caused by the high level of variation in our samples compared to the hg19 reference genome. Indeed, some human sequences can cause false hits to species which have similar sequence structure. The species hits that are likely to be of human origin were those for Microbacterium laevaniformans, Cyanothece, Coprobacillus, and Aster Yellows Phytoplasma. These were therefore marked with black crosses in Additional file 1: Figure S1, and we discarded these four taxa before clustering (Fig. 1).
Functional analyses and comparison of total biome
MEGAN’s functional analyses using SEED and KEGG classification are shown in Additional file 2: Figure S2 and Additional file 3: Figure S3 at the second level of the SEED and KEGG hierarchy. From both figures, it is apparent that all of the represented metabolic categories were driven mostly by samples 233 (violet) and 238 (red), which might be caused by the higher sequencing depth of these two samples, as described in the “Taxonomic annotation” section.
Atherosclerosis is a chronic inflammatory disease that is generally perceived to be driven by classical risk factors (smoking, diabetes, hypertension, etc.). However, evidence accumulated in the last 25 years suggests that infection can also play an important contributory role by direct and indirect mechanisms [1–3]. In the present study, we used deep sequencing and metagenomic analysis to provide a high-resolution investigation of microbial species in atherosclerotic plaques from stable asymptomatic lesions and unstable lesions removed at surgery. We provided details of a multi-step protocol to reduce human DNA reads in order to obtain a comprehensive picture of the microbial community present in the diseased arterial wall. A similar protocol can be applied to any host-associated metagenomics study.
To date, a wide range of infectious agents have been linked to atherosclerosis, and while H. pylori, Cytomegalovirus, Hepatitis C virus, and other species have also been implicated, the current weight of evidence has arguably favored the involvement of C. pneumoniae and periodontal organisms such as P. gingivalis [2–4]. However, these works were based on end-point sampling of symptomatic atherosclerotic plaque samples. The lack of asymptomatic atherosclerotic plaque samples limits our understanding of how microbial agents contribute to the development of atherosclerotic disease. Also, recent research has suggested that the contribution from an aggregate “infectious” burden induced by a large number of pathogens is much more important for atherosclerosis development than any single organism [3, 10, 11].
Our work provides the first comparative metagenomic analysis of atherosclerotic plaques from patients with symptomatic atherosclerotic disease and asymptomatic atherosclerotic samples from a control group of persons who had atherosclerotic tissue but died from causes other than atherosclerosis. The asymptomatic atherosclerotic plaques have a higher abundance of host microbiome-associated microbial families such as Porphyromonadaceae, Bacteroidaceae, Micrococcaceae, and Streptococcaceae than the symptomatic atherosclerotic plaques (Fig. 1). This result suggests that these host microbiome-associated microbial families are among the first colonizers of the arteries. These early colonizers might support the growth of sulfur-consuming families such as sulfur-oxidizing symbionts and Thiotrichaceae and pathogens such as Helicobacteraceae and Neisseriaceae, which were found to be abundant in the symptomatic atherosclerotic plaques (Fig. 1). It is well known that homocysteine, an intermediate in the metabolism of the sulfur-containing amino acid methionine, is closely related to vascular disease, and arteriosclerosis patients have elevated levels of homocysteine in their blood. The elevated levels of homocysteine might facilitate the growth of sulfur-consuming families. The presence of pathogens might enhance the degeneration of elastic fibers and fragmentation of the internal elastic membrane through their elastase activity. Further study is required to investigate the potential interactions of commensal and pathogen populations during the transition from asymptomatic to symptomatic atherosclerotic plaques.
Some of the infectious agents that are most prevalent in our symptomatic atherosclerotic samples, e.g., Acinetobacter, Acidovorax, and N. polysaccharea, have not been reported previously. FISH observation validated the presence of biofilm-like structures of these pathogens in the symptomatic atherosclerotic plaque samples (Fig. 7). In addition, the presence of multiple pathogens in atherosclerotic plaques and their potential organization in biofilms may be part of the explanation underlying other pertinent findings in this area of research, e.g., absence of effect of antibiotic treatment on clinical endpoints in high-risk patients with coronary artery disease, and an apparent correlation between oral, gut, and atherosclerotic plaque microbiotas [5, 15, 16]. Indeed, a very recent study suggested the presence of P. aeruginosa biofilms within the plaque in patients with advanced atherosclerotic disease. In our study, however, we found only a very low number of Pseudomonas reads in four of the 15 symptomatic (unstable) atherosclerotic plaque samples (on average 0.08 % of all reads assigned to bacteria) and in one of the seven asymptomatic atherosclerotic plaque samples (0.048 % of reads). This difference in observed Pseudomonas abundance between their study and ours might be due to the fact that we used a deep shotgun sequencing approach, which might have helped to obtain a broader picture of the microbial community. More studies are clearly needed to determine the putative identity, organization, and role of bacteria detected in atherosclerotic plaques.
The functional analyses reported in the current study represent an innovative method to investigate mechanisms of potential relevance to atherosclerosis. Interestingly, these results show basic pathways (carbohydrate, amino acid, and energy metabolism), while potential pathways related to cardiovascular and infectious diseases are also identified. However, these results should be viewed as preliminary and hypothesis-generating, and more research is required to provide further insight into atherosclerosis pathobiology. In this context, it is worth mentioning that we are currently working on the next phase of this project using transcriptomics. RNA data will help us better understand the activity of these metabolic pathways.
Important limitations to this study include lack of true negative controls, i.e., arterial samples without atherosclerosis, and absence of histologic examinations to establish signs of plaque (in)stability. Although we found an abundance of microbial genetic material, we cannot conclude that these agents were viable within the atheromatous plaque. For example, bacterial DNA found within the sampled atheroma could represent DNA fragments from bacteria engulfed and killed elsewhere in the body by phagocytic cells that subsequently entered the atherosclerotic vessel segment. Study limitations also include risk of microbial contamination of tissue samples.
This paper describes a novel approach that applies high-throughput sequencing and whole-genome shotgun metagenomics to plaque tissue samples to investigate the microbiome diversity of plaques obtained from patients with a thrombotic event and from controls with non-symptomatic plaques. Our data provide evidence suggesting a wide range of microbial agents in atherosclerotic plaques, together with an intriguing new observation: the microbiota differed between symptomatic and asymptomatic plaques, as judged from the taxonomic profiles in these two groups of patients. Additionally, functional annotations reveal significant differences in basic metabolic and disease pathway signatures between these groups.
Patients and samples
For this study, we used atherosclerotic tissue samples from a group of 15 patients who underwent elective carotid endarterectomy following repeated transient ischemic attacks or minor strokes (samples from symptomatic atherosclerotic plaques as cases; Table 1). Our methods and experimental manuals were approved by the Danish National Committee on Health Research Ethics, and approval was granted by the Ethical Committee of the Region of Copenhagen (H-3-2011-013). Further, we used asymptomatic atherosclerotic plaques from seven persons who died from causes not related to atherosclerotic disease (samples from stable plaques as controls; Table 1); these originated from the tissue bank at the Department of Forensic Medicine (Approval No. 1501230). The approvals do not allow us to collect and use patient/person data for the analysis. Consequently, after collection, all 22 samples were treated in a blinded fashion during lab processing, sequencing, and data analyses.
Sample collection and DNA sequencing
Appropriate material from diseased arteries taken at sites of macroscopic atherothrombotic plaque was surgically dissected. Cross-sectional pieces of tissue 3–5-mm thick were prepared and immediately stored at −80 °C.
DNA was extracted using QIAGEN’s DNeasy Blood & Tissue kit, following the “Purification of Total DNA from Animal Tissues (Spin-Column)” protocol. Briefly, tissue was cut into tiny pieces of no more than 25 mg on a clean Petri dish on dry ice and then homogenized in ATL buffer with a 5-mm stainless steel bead (QIAGEN) on a Tissuelyser at 50 Hz for 2 min or until tissue was completely homogenized. After homogenization, proteinase K was added and the sample was incubated at 56 °C until it was completely lysed. The remaining steps of the protocol were performed according to QIAGEN’s recommendation. DNA was eluted with 100 μL of buffer AE.
Prior to library preparation, the quality of the DNA samples was assessed on a Bioanalyzer 2100, using a DNA 12000 Chip (Agilent). Sample quantitation was performed using Quant-iT™ PicoGreen ® dsDNA Reagent.
Next-generation sequencing libraries were prepared following Illumina’s TruSeq DNA Sample Preparation protocol. The samples were sheared on a Covaris S220 to ~500 bp, following the manufacturer’s recommendation. Size selection was performed on a Sage Science Pippin Prep instrument using a 1.5 % EtBr agarose cassette and selecting for a tight peak around 620 bp. Each library was uniquely tagged with one of Illumina’s TruSeq LT DNA barcodes to allow library pooling for sequencing.
Library quantitation was performed using Invitrogen’s PicoGreen assay, and the average library size was determined by running the libraries on a Bioanalyzer DNA 7500 chip (Agilent). Library concentrations were normalized to 4 nM and validated by qPCR on a StepOne Plus real-time thermocycler (Applied Biosystems), using qPCR primers recommended in Illumina’s qPCR protocol and Illumina’s PhiX control library as standard. Libraries were then pooled and sequenced in two lanes of an Illumina HiSeq2000 sequencing run at a read length of 101 bp paired-end. Table 1 provides details of the sequencing batches. Library 977 (sample P0613) was first sequenced on a MiSeq V1 run, but due to the low number of bacterial reads obtained, this sample was later also included in a HiSeq2500 rapid sequencing run at a read length of 101 bp paired-end to achieve a comparable number of bacterial reads. We subsequently merged the reads from both sequencing runs for this sample and treated them as one sample during data processing and comparisons.
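The normalization of libraries to 4 nM can be illustrated with a short calculation. The following is a minimal sketch, not the authors' procedure: it assumes the common approximation of 660 g/mol per base pair of double-stranded DNA, and the function names, the 10 ng/µL input concentration, and the 20 µL final volume are hypothetical values chosen for illustration.

```python
# Sketch only: converts a dsDNA library concentration to molarity and
# computes a dilution to a target molarity. All names and example values
# are hypothetical; 660 g/mol per bp is the usual dsDNA approximation.

AVG_BP_MASS = 660.0  # g/mol per base pair (approximation)


def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA concentration in ng/uL to molarity in nM."""
    # ng/uL equals 1e-3 g/L; dividing by g/mol gives mol/L; times 1e9 gives nM.
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def dilution_volumes_ul(stock_nm: float, target_nm: float, final_volume_ul: float):
    """Return (stock volume, diluent volume) in uL for a C1*V1 = C2*V2 dilution."""
    if stock_nm < target_nm:
        raise ValueError("stock is already below the target molarity")
    stock_ul = target_nm * final_volume_ul / stock_nm
    return stock_ul, final_volume_ul - stock_ul


# Example: a 10 ng/uL library with a mean size of 620 bp, diluted to 4 nM in 20 uL.
stock_nm = library_molarity_nm(10.0, 620.0)
stock_ul, diluent_ul = dilution_volumes_ul(stock_nm, 4.0, 20.0)
print(f"{stock_nm:.2f} nM stock: mix {stock_ul:.2f} uL library with {diluent_ul:.2f} uL diluent")
```

In practice the mean fragment size would come from the Bioanalyzer trace and the concentration from the PicoGreen assay, with qPCR used to validate the result as described above.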
As arterial plaque samples represent a host-associated metagenome, we mapped the reads against the human reference genome (hg19) using Bowtie 2 (version 2.0.0) with “very-sensitive” parameters to filter all human-like sequences from our samples. All unmapped reads (non-hg19) were extracted and aligned against the non-redundant (nr) protein database (version 30.07.2012) using BLASTX (ncbi-blast-2.2.25+; Max e-value 10e-3). After performing the BLASTX alignment, all output files of paired read sequences were imported and analyzed using the paired-end protocol of MEGAN5.
For processing the BLAST files in MEGAN5, we used the parameter settings “Min Score = 50”, “Top Percent = 10”, “Min Support = 25”, and “Minimum sequence complexity threshold = 0.44”. Reads that did not have any match in the database were placed under a “No hit” node, and reads that were originally assigned to a taxon that did not meet our selected threshold criteria were pushed back to higher nodes where the threshold was met, using the lowest common ancestor (LCA) algorithm. After importing the datasets into MEGAN, we obtained MEGAN’s own “rma files” for each dataset mapped onto the NCBI taxonomy based on our selected thresholds. Further, only reads annotated as bacterial were extracted into new documents for each of the 22 samples and used for later analyses.
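The host-read filtering step can be sketched as follows. This is an illustrative example rather than the pipeline used in the study: it assumes the aligner output is in SAM format and selects reads by the standard SAM flag bits for "read unmapped" (0x4) and "mate unmapped" (0x8). The function name and the choice to require both mates to be unmapped are assumptions; the text does not state how partially mapped pairs were handled.

```python
# Sketch only: pick out read pairs that did not align to the host reference,
# based on SAM flag bits. Function name and pair-handling policy are assumptions.

UNMAPPED = 0x4       # SAM flag bit: this read is unmapped
MATE_UNMAPPED = 0x8  # SAM flag bit: the mate of this read is unmapped


def non_host_read_names(sam_lines, require_both_mates=True):
    """Return the set of read names considered non-host."""
    keep = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        read_unmapped = bool(flag & UNMAPPED)
        mate_unmapped = bool(flag & MATE_UNMAPPED)
        if read_unmapped and (mate_unmapped or not require_both_mates):
            keep.add(name)
    return keep


# Toy SAM records: r1 has both mates unmapped, r2 is a mapped pair,
# r3 has one mapped and one unmapped mate.
demo_sam = [
    "@HD\tVN:1.0",
    "r1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r1\t141\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r2\t99\tchr1\t100\t42\t4M\t=\t300\t204\tACGT\tIIII",
    "r2\t147\tchr1\t300\t42\t4M\t=\t100\t-204\tACGT\tIIII",
    "r3\t73\tchr1\t500\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t133\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
strict = non_host_read_names(demo_sam)
lenient = non_host_read_names(demo_sam, require_both_mates=False)
```

Requiring both mates to be unmapped is the more conservative choice for removing host sequence, at the cost of discarding pairs in which only one mate resembles human DNA.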
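The effect of the “Min Score”, “Top Percent”, and “Min Support” settings on LCA-based assignment can be sketched in a few lines. This is a simplified illustration and not MEGAN's implementation: the toy taxonomy, the function names, and the input layout are hypothetical, and details such as the paired-end protocol and the sequence complexity filter are left out.

```python
# Sketch only: a simplified LCA assignment with score, top-percent, and
# min-support filters. Taxonomy fragment and all names are hypothetical.
from collections import Counter

PARENT = {  # toy taxonomy: child -> parent
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Neisseriaceae": "Proteobacteria",
    "Neisseria": "Neisseriaceae",
    "Helicobacteraceae": "Proteobacteria",
    "Helicobacter": "Helicobacteraceae",
}


def lineage(taxon):
    """Return the path from a taxon up to the root, taxon first."""
    path = [taxon]
    while path[-1] != "root":
        path.append(PARENT[path[-1]])
    return path


def lca(taxa):
    """Return the deepest node shared by the lineages of all given taxa."""
    taxa = list(taxa)
    common = set(lineage(taxa[0]))
    for taxon in taxa[1:]:
        common &= set(lineage(taxon))
    for node in lineage(taxa[0]):  # walks from deep to shallow
        if node in common:
            return node


def assign_read(hits, min_score=50.0, top_percent=10.0):
    """hits: list of (taxon, bit_score). Returns a taxon, or None for 'No hit'."""
    good = [(taxon, score) for taxon, score in hits if score >= min_score]
    if not good:
        return None
    best = max(score for _, score in good)
    cutoff = best * (1.0 - top_percent / 100.0)
    return lca(taxon for taxon, score in good if score >= cutoff)


def apply_min_support(counts, min_support=25):
    """Push reads from under-supported taxa up to their parent, deepest first."""
    totals = Counter(counts)
    nodes = {node for taxon in counts for node in lineage(taxon)}
    for node in sorted(nodes, key=lambda n: len(lineage(n)), reverse=True):
        if node != "root" and 0 < totals[node] < min_support:
            totals[PARENT[node]] += totals.pop(node)
    return {taxon: n for taxon, n in totals.items() if n > 0}


ambiguous = assign_read([("Neisseria", 120.0), ("Helicobacter", 115.0), ("Neisseria", 40.0)])
specific = assign_read([("Neisseria", 120.0), ("Helicobacter", 100.0)])
no_hit = assign_read([("Neisseria", 30.0)])
supported = apply_min_support({"Neisseria": 10, "Neisseriaceae": 20, "Helicobacter": 3})
```

In the example, a read with near-equal hits to two genera is placed at their common ancestor, while taxa with fewer than 25 reads lose their reads to the nearest ancestor that accumulates enough support.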
Multiple metagenome comparison
All “rma files” were normalized to the smallest dataset size, excluding the not-assigned reads, to allow inter-comparison of taxonomic abundances and to obtain a comparative tree view for all samples. Additionally, the “family” level taxonomic profile was used for hierarchical clustering with average linkage, where Pearson correlation was used for clustering the families (rows) and Spearman correlation was used for clustering the datasets (columns) (Fig. 1). All computations were performed using R 3.0.2. We also performed principal coordinates analysis (PCoA) and unweighted pair group method with arithmetic mean (UPGMA) hierarchical clustering to compare taxonomic profiles of the samples at the “species” level of NCBI taxonomy using MEGAN5 (Fig. 3).
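Normalizing to the smallest dataset size amounts to rescaling every sample so that its assigned reads sum to the total of the smallest sample. The sketch below illustrates the idea only; the function name, the input layout, and the use of simple rounding are assumptions and may differ from what MEGAN does internally.

```python
# Sketch only: scale per-taxon read counts so every sample has (about) the
# same total as the smallest sample. Names and rounding policy are assumptions.

def normalize_to_smallest(samples):
    """samples: dict sample -> dict taxon -> assigned read count."""
    totals = {name: sum(counts.values()) for name, counts in samples.items()}
    target = min(totals.values())
    return {
        name: {taxon: round(n * target / totals[name]) for taxon, n in counts.items()}
        for name, counts in samples.items()
    }


# Toy example with two samples of different depth.
raw = {
    "deep_sample": {"Neisseriaceae": 100, "Streptococcaceae": 300},
    "shallow_sample": {"Neisseriaceae": 50, "Streptococcaceae": 50},
}
normalized = normalize_to_smallest(raw)
```

After scaling, differences between samples reflect relative abundance rather than sequencing depth, which is what makes the clustering and ordination comparisons meaningful.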
Rarefaction and diversity indices
Total assigned bacterial species
Comparison of total biome: cases vs. controls
In addition to the above analyses, we merged the samples from the two patient groups, i.e., 15 symptomatic atherosclerosis patient samples (cases, Table 1) and the asymptomatic atherosclerotic samples from seven persons who died from causes not related to atherosclerotic disease (control, Table 1), to obtain a profile for the merged datasets in agreement with the view that up to the point of plaque instability, atherosclerosis pathogenesis is similar in symptomatic (unstable), and asymptomatic (stable) lesions. Only three samples (233, 238, and P0613) were not included in the merged samples (see the “Results” section).
Additionally, a functional analysis was performed on all samples using the SEED classification, based on the given BLASTX alignment and for bacterial reads only, using MEGAN5 as previously described. In this classification scheme, genes are assigned to functional roles, and genes with different functional roles are grouped into subsystems. The SEED classification can be represented as a rooted tree in which internal nodes represent different subsystems and leaves represent functional roles.
To obtain a tentative pathway analysis, we performed an analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG), where bacterial reads were mapped onto KEGG orthologous groups using MEGAN5. For this analysis, the MEGAN program matched each read to a KEGG orthology (KO) accession number, using the best hit to a reference sequence for which a KO accession number is known. The program reported the number of hits to each KEGG pathway. For functional annotation using both SEED and KEGG, we first extracted all the reads that were mapped to bacteria in the taxonomic annotations for all the datasets. Subsequently, we processed only these reads to obtain SEED and KEGG classifications to investigate metabolic or disease pathways of potential relevance to atherosclerosis.
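The rooted-tree view of the SEED classification means that counts assigned to functional roles (leaves) can be summed upward to subsystems and higher categories. The following sketch shows that roll-up on a small, hypothetical fragment of such a tree; the node names and the function name are illustrative and are not taken from the SEED database or from MEGAN.

```python
# Sketch only: sum read counts from functional roles (leaves) up to every
# ancestor in a rooted classification tree. The tree fragment is hypothetical.
from collections import Counter

SEED_PARENT = {  # child -> parent; the root "SEED" has no entry
    "Carbohydrates": "SEED",
    "Central carbohydrate metabolism": "Carbohydrates",
    "Glycolysis and Gluconeogenesis": "Central carbohydrate metabolism",
    "Pyruvate kinase": "Glycolysis and Gluconeogenesis",
    "Enolase": "Glycolysis and Gluconeogenesis",
    "Sulfur Metabolism": "SEED",
    "Sulfur oxidation": "Sulfur Metabolism",
    "Sulfur oxidation protein": "Sulfur oxidation",
}


def summarized_counts(role_counts):
    """Return counts for every node: reads on the node plus all its descendants."""
    totals = Counter()
    for role, n in role_counts.items():
        node = role
        while node is not None:
            totals[node] += n
            node = SEED_PARENT.get(node)  # None once the root is passed
    return totals


seed_totals = summarized_counts(
    {"Pyruvate kinase": 5, "Enolase": 7, "Sulfur oxidation protein": 2}
)
```

Reading the totals at a fixed depth of the tree gives a profile at that level of the hierarchy, which is how a comparison at a chosen SEED level can be obtained.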
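The best-hit rule described above can be sketched as follows. This is an illustration under stated assumptions, not MEGAN's code: the reference identifiers, KO labels, pathway labels, and function names are placeholders, and the reference-to-KO and KO-to-pathway mappings are supplied as plain dictionaries.

```python
# Sketch only: assign each read to the KO of its best-scoring hit among
# references with a known KO, then count hits per pathway.
# All identifiers below are placeholders, not real KEGG accessions.
from collections import Counter


def best_ko(hits, ko_of):
    """hits: list of (reference_id, bit_score); ko_of: reference_id -> KO label."""
    known = [(score, ref) for ref, score in hits if ref in ko_of]
    if not known:
        return None
    _, best_ref = max(known)  # highest bit score among references with a KO
    return ko_of[best_ref]


def pathway_hits(reads, ko_of, pathways_of):
    """Count, per pathway, the reads whose KO belongs to that pathway."""
    counts = Counter()
    for hits in reads.values():
        ko = best_ko(hits, ko_of)
        if ko is None:
            continue
        for pathway in pathways_of.get(ko, ()):
            counts[pathway] += 1
    return counts


ko_of = {"refX": "KO_a", "refY": "KO_b"}
pathways_of = {"KO_a": ["path_1"], "KO_b": ["path_1", "path_2"]}
reads = {
    "read1": [("refX", 80.0), ("refY", 95.0)],
    "read2": [("refZ", 60.0)],  # no reference with a known KO
    "read3": [("refX", 70.0)],
}
kegg_counts = pathway_hits(reads, ko_of, pathways_of)
```

Because one KO can belong to several pathways, a single read may contribute to more than one pathway count, so pathway totals need not sum to the number of reads.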
Fluorescence in situ hybridization
Table. Fluorescence in situ hybridization (FISH) probe information. Probe targets listed: clarithromycin-resistant Helicobacter pylori 23S rRNA; subdivision 1 of candidate division TM7. Probe sequences listed: [cy5] ACC TCT CTC GAA CTC CAG; [cy5] GTC CCA GTC TGG CTG ATC.
The sequence data obtained in this study have been deposited in the NCBI database under BioProject/BioSamples with accession number SRP040611.
This research was supported by a grant from the Villum Foundation to MG and by the National Research Foundation and Ministry of Education Singapore under its Research Centre of Excellence Program.
Supplementary information is available at The ISME Journal’s website.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Leinonen M, Saikku P. Evidence for infectious agents in cardiovascular disease and atherosclerosis. Lancet Infect Dis. 2002;2:11–7.
- Libby P, Egan D, Skarlatos S. Roles of infectious agents in atherosclerosis and restenosis: an assessment of the evidence and need for future research. Circulation. 1997;96:4095–103.
- Joshi R, Khandelwal B, Joshi D, Gupta OP. Chlamydophila pneumoniae infection and cardiovascular disease. N Am J Med Sci. 2013;5:169–81.
- Fiehn NE, Larsen T, Christiansen N, Holmstrup P, Schroeder TV. Identification of periodontal pathogens in atherosclerotic vessels. J Periodontol. 2005;76:731–6.
- Lanter BB, Sauer K, Davies DG. Bacteria present in carotid arterial plaques are found as biofilm deposits which may contribute to enhanced risk of plaque rupture. MBio. 2014;5:e01206–14.
- Apfalter P, Blasi F, Boman J, Gaydos CA, Kundi M, Maass M, et al. Multicenter comparison trial of DNA extraction methods and PCR assays for detection of Chlamydia pneumoniae in endarterectomy specimens. J Clin Microbiol. 2001;39:519–24.
- Ott SJ, El Mokhtari NE, Musfeldt M, Hellmig S, Freitag S, Rehman A, et al. Detection of diverse bacterial signatures in atherosclerotic lesions of patients with coronary heart disease. Circulation. 2006;113:929–37.
- Watt S, Aesch B, Lanotte P, Tranquart F, Quentin R. Viral and bacterial DNA in carotid atherosclerotic lesions. Eur J Clin Microbiol Infect Dis. 2003;22:99–105.
- Song S, Jarvie T, Hattori M. Our second genome-human metagenome: how next-generation sequencer changes our life through microbiology. Adv Microb Physiol. 2013;62:119–44.
- Elkind MS. Infectious burden: a new risk factor and treatment target for atherosclerosis. Infect Disord Drug Targets. 2010;10:84–90.
- Tufano A, Di Capua M, Coppola A, Conca P, Cimino E, Cerbone AM, et al. The infectious burden in atherothrombosis. Semin Thromb Hemost. 2012;38:515–23.
- Chen W, Liu F, Ling Z, Tong X, Xiang C. Human intestinal lumen and mucosa-associated microbiota in patients with colorectal cancer. PLoS One. 2012;7:e39743.
- Grunke S, Lichtschlag A, de Beer D, Kuypers M, Losekann-Behrens T, Ramette A, et al. Novel observations of Thiobacterium, a sulfur-storing Gammaproteobacterium producing gelatinous mats. ISME J. 2010;4:1031–43.
- Varga EA, Sturm AC, Misita CP, Moll S. Cardiology patient pages. Homocysteine and MTHFR mutations: relation to thrombosis and coronary artery disease. Circulation. 2005;111:e289–93.
- Gluud C, Als-Nielsen B, Damgaard M, Fischer Hansen J, Hansen S, Helo OH, et al. Clarithromycin for 2 weeks for stable coronary heart disease: 6-year follow-up of the CLARICOR randomized trial and updated meta-analysis of antibiotics for coronary heart disease. Cardiology. 2008;111:280–7.
- Koren O, Spor A, Felin J, Fak F, Stombaugh J, Tremaroli V, et al. Human oral, gut, and plaque microbiota in patients with atherosclerosis. Proc Natl Acad Sci U S A. 2011;108 Suppl 1:4592–8.
- Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.
- Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank. Nucleic Acids Res. 2005;33:D34–8.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
- Huson DH, Mitra S, Ruscheweyh HJ, Weber N, Schuster SC. Integrative analysis of environmental sequences using MEGAN4. Genome Res. 2011;21:1552–60.
- R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2013. http://www.R-project.org/.
- Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang HY, Cohoon M, et al. The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Res. 2005;33:5691–702.
- Mitra S, Rupek P, Richter DC, Urich T, Gilbert JA, Meyer F, et al. Functional analysis of metagenomes and metatranscriptomes using SEED and KEGG. BMC Bioinformatics. 2011;12 Suppl 1:S21.
- Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30.
- Nielsen PH, Daims H, Lemmer H. FISH Handbook for Biological Wastewater Treatment. London: IWA Publishing Company; 2009.