Turnbaugh PJ, Ley RE, Hamady M, Fraser-Liggett CM, Knight R, Gordon JI. The human microbiome project. Nature. 2007;449:804–10.
Turnbaugh PJ, Ley RE, Mahowald MA, Magrini V, Mardis ER, Gordon JI. An obesity-associated gut microbiome with increased capacity for energy harvest. Nature. 2006;444:1027–31.
Muegge BD, Kuczynski J, Knights D, Clemente JC, Gonzalez A, Fontana L, et al. Diet drives convergence in gut microbiome functions across mammalian phylogeny and within humans. Science. 2011;332:970–4.
Morgan XC, Tickle TL, Sokol H, Gevers D, Devaney KL, Ward DV, et al. Dysfunction of the intestinal microbiome in inflammatory bowel disease and treatment. Genome Biol. 2012;13:R79. Available from: http://genomebiology.biomedcentral.com/articles/10.1186/gb-2012-13-9-r79.
Lee STM, Kahn SA, Delmont TO, Shaiber A, Esen ÖC, Hubert NA, et al. Tracking microbial colonization in fecal microbiota transplantation experiments via genome-resolved metagenomics. Microbiome. 2017;5:1–10.
Dinsdale EA, Edwards RA, Hall D, Angly F, Breitbart M, Brulc JM, et al. Functional metagenomic profiling of nine biomes. Nature. 2008;452:629–32.
Yatsunenko T, Rey FE, Manary MJ, Trehan I, Dominguez-Bello MG, Contreras M, et al. Human gut microbiome viewed across age and geography. Nature. 2012;486:222–7.
Abubucker S, Segata N, Goll J, Schubert AM, Izard J, Cantarel BL, et al. Metabolic reconstruction for metagenomic data and its application to the human microbiome. PLoS Comput Biol. 2012;8:e1002358.
Fierer N, Leff JW, Adams BJ, Nielsen UN, Bates ST, Lauber CL, et al. Cross-biome metagenomic analyses of soil microbial communities and their functional attributes. Proc Natl Acad Sci. 2012;109:21390–5.
Breitbart M, Hewson I, Felts B, Mahaffy JM, Nulton J, Salamon P, et al. Metagenomic analyses of an uncultured viral community from human feces. J Bacteriol. 2003;185:6220–3.
Edwards RA, Rohwer F. Viral metagenomics. Nat Rev Microbiol. 2005;3:801–5.
Abbas AA, Diamond JM, Chehoud C, Chang B, Kotzin JJ, Young JC, et al. The perioperative lung transplant virome: torque teno viruses are elevated in donor lungs and show divergent dynamics in primary graft dysfunction. Am J Transplant. 2017;17:1313–24.
Emerson JB, Thomas BC, Andrade K, Allen EE, Heidelberg KB, Banfield JF. Dynamic viral populations in hypersaline systems as revealed by metagenomic assembly. Appl Environ Microbiol. 2012;78:6309–20.
Ma Y, Madupu R, Karaoz U, Nossa CW, Yang L, Yooseph S, et al. Human papillomavirus community in healthy persons, defined by metagenomics analysis of human microbiome project shotgun sequencing data sets. J Virol. 2014;88:4786–97. Available from: http://jvi.asm.org/cgi/doi/10.1128/JVI.00093-14.
Minot S, Bryson A, Chehoud C, Wu GD, Lewis JD, Bushman FD. Rapid evolution of the human gut virome. Proc Natl Acad Sci. 2013;110:12450–5.
Meisel JS, Hannigan GD, Tyldsley AS, SanMiguel AJ, Hodkinson BP, Zheng Q, et al. Skin microbiome surveys are strongly influenced by experimental design. J Invest Dermatol. 2016;136:947–56. https://doi.org/10.1016/j.jid.2016.01.016.
Weiss S, Amir A, Hyde ER, Metcalf JL, Song SJ, Knight R. Tracking down the sources of experimental contamination in microbiome studies. Genome Biol. 2014;15:1–3.
Kim D, Hofstaedter CE, Zhao C, Mattei L, Tanes C, Clarke E, et al. Optimizing methods and dodging pitfalls in microbiome research. Microbiome. 2017;5:1–14.
Lauder AP, Roche AM, Sherrill-Mix S, Bailey A, Laughlin AL, Bittinger K, et al. Comparison of placenta samples with contamination controls does not provide evidence for a distinct placenta microbiota. Microbiome. 2016;4:1–11. https://doi.org/10.1186/s40168-016-0172-3.
Nayfach S, Pollard KS. Toward accurate and quantitative comparative metagenomics. Cell. 2016;166:1103–16. https://doi.org/10.1016/j.cell.2016.08.007.
Knight R, Vrbanac A, Taylor BC, Aksenov A, Callewaert C, Debelius J, et al. Best practices for analysing microbiomes. Nat Rev Microbiol. 2018;16:410–22. https://doi.org/10.1038/s41579-018-0029-9.
Delmont TO, Eren AM. Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies. PeerJ. 2016;4:e1839. Available from: https://peerj.com/articles/1839.
Kjartansdóttir KR, Friis-Nielsen J, Asplund M, Mollerup S, Mourier T, Jensen RH, et al. Traces of ATCV-1 associated with laboratory component contamination. Proc Natl Acad Sci. 2015;112:E925–6. Available from: http://www.pnas.org/lookup/doi/10.1073/pnas.1423756112.
Quince C, Walker AW, Simpson JT, Loman NJ, Segata N. Shotgun metagenomics, from sampling to analysis. Nat Biotechnol. 2017;35:833–44.
Nasko DJ, Koren S, Phillippy AM, Treangen TJ. RefSeq database growth influences the accuracy of k-mer-based lowest common ancestor species identification. Genome Biol. 2018;19:1–10.
Naccache SN, Federman S, Veeraraghavan N, Zaharia M, Lee D, Samayoa E, et al. A cloud-compatible bioinformatics pipeline for ultrarapid pathogen identification from next-generation sequencing of clinical samples. Genome Res. 2014;24:1180–92.
Li PE, Lo CC, Anderson JJ, Davenport KW, Bishop-Lilly KA, Xu Y, et al. Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform. Nucleic Acids Res. 2017;45:67–80.
White RAI, Brown J, Colby S, Overall CC, Lee J-Y, Zucker J, et al. ATLAS (Automatic Tool for Local Assembly Structures) - a comprehensive infrastructure for assembly, annotation, and genomic binning of metagenomic and metatranscriptomic data. PeerJ Prepr. 2017;5:e2843v1. Available from: https://peerj.com/preprints/2843.pdf.
KneadData. 2017 [cited 2018 Feb 1]. Available from: https://bitbucket.org/biobakery/kneaddata
Koster J, Rahmann S. Snakemake—a scalable bioinformatics workflow engine. Bioinformatics. 2012;28:2520–2.
Leinonen R, Sugawara H, Shumway M. The sequence read archive. Nucleic Acids Res. 2011;39:2010–2.
Ellegren H. Microsatellites: simple sequences with complex evolution. Nat Rev Genet. 2004;5:435–45.
Payseur BA, Nachman MW. Microsatellite variation and recombination rate in the human genome. Genetics. 2000;156:1285–98.
Subramanian S, Mishra RK, Singh L. Genome-wide analysis of microsatellite repeats in humans: their abundance and density in specific genomic regions. Genome Biol. 2003;4:R13. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC151303/.
Smit A, Hubley R, Green P. RepeatMasker Open-4.0. 2013. Available from: http://www.repeatmasker.org
Morgulis A, Gertz EM, Schäffer AA, Agarwala R. A fast and symmetric DUST implementation to mask low-complexity DNA sequences. J Comput Biol. 2006;13:1028–40. Available from: http://www.liebertonline.com/doi/abs/10.1089/cmb.2006.13.1028.
JGI. BBMask. 2018. Available from: https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbmask-guide/
Leiby JS, McCormick K, Sherrill-Mix S, Clarke EL, Kessler LR, Taylor LJ, et al. Lack of detection of a human placenta microbiome in samples from preterm and term deliveries. Microbiome. 2018;6:196.
Clarke EL, Lauder AP, Hofstaedter CE, Hwang Y, Fitzgerald AS, Imai I, et al. Microbial lineages in sarcoidosis: A metagenomic analysis tailored for low-microbial content samples. Am J Respir Crit Care Med. 2018;197:225–34.
Abbas AA, Young JC, Clarke EL, Diamond JM, Imai I, Haas AR, et al. Bidirectional transfer of Anelloviridae lineages between graft and host during lung transplantation. Am J Transplant. 2018. Available from: http://doi.wiley.com/10.1111/ajt.15116.
Clarke EL, Connell AJ, Six E, Kadry NA, Abbas AA, Hwang Y, et al. T cell dynamics and response of the microbiota after gene therapy to treat X-linked severe combined immunodeficiency. Genome Med. 2018;10:70.
Taylor JM, Lefkowitz E, Clarke EL, Baker K, Lauder A, Kim D, et al. Evaluation of a therapy for Idiopathic Chronic Enterocolitis in rhesus macaques (Macaca mulatta) and linked microbial community correlates. PeerJ. 2018;6:e4612.
Anaconda Inc. Conda. 2018. Available from: https://anaconda.org
Taylor LJ, Abbas AA. grabseqs: a utility for easy downloading of reads from next-gen sequencing repositories. 2019. Available from: https://github.com/louiejtaylor/grabseqs
Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 2015;17:1–3.
Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20.
Babraham Bioinformatics. FastQC. 2018. Available from: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
Wood DE, Salzberg SL. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 2014;15:R46.
McDonald D, Clemente JC, Kuczynski J, Rideout JR, Stombaugh J, Wendel D, et al. The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome. Gigascience. 2012;464:1–6.
Li D, Luo R, Liu CM, Leung CM, Ting HF, Sadakane K, et al. MEGAHIT v1.0: a fast and scalable metagenome assembler driven by advanced methodologies and community practices. Methods. 2016;102:3–11. https://doi.org/10.1016/j.ymeth.2016.02.020.
Hyatt D, Chen G-L, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–9.
Chapman B, Chilton J, Heuer M, Kartashov A, Leehr D, Ménager H, et al. Common workflow language, v1.0. Specification, common workflow language working group. Amstutz P, Crusoe MR, Tijanić N, editors. 2016.
Nurk S, Meleshko D, Korobeynikov A, Pevzner PA. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 2017;1:30–47.
Menzel P, Ng KL, Krogh A. Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nat Commun. 2016;7:1–9. https://doi.org/10.1038/ncomms11257.
Truong DT, Franzosa EA, Tickle TL, Scholz M, Weingart G, Pasolli E, et al. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nat Methods. 2015;12:902–3.
Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, et al. Integrative genomics viewer. Nat Biotechnol. 2011;29:24–6.
Eren AM, Esen ÖC, Quince C, Vineis JH, Morrison HG, Sogin ML, et al. Anvi’o: an advanced analysis and visualization platform for ‘omics data. PeerJ. 2015;3:e1319. Available from: https://peerj.com/articles/1319.
Lewis JD, Chen EZ, Baldassano RN, Otley AR, Griffiths AM, Lee D, et al. Inflammation, antibiotics, and diet as environmental stressors of the gut microbiome in pediatric Crohn’s disease. Cell Host Microbe. 2015;18:489–500. https://doi.org/10.1016/j.chom.2015.09.008.
Segata N, Waldron L, Ballarini A, Narasimhan V, Jousson O, Huttenhower C. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat Methods. 2012;9:811.
Bahram M, Hildebrand F, Forslund SK, Anderson JL, Soudzilovskaia NA, Bodegom PM, et al. Structure and function of the global topsoil microbiome. Nature. 2018;560:233–7. https://doi.org/10.1038/s41586-018-0386-6.
McCann A, Ryan FJ, Stockdale SR, Dalmasso M, Blake T, Ryan CA, et al. Viromes of one year old infants reveal the impact of birth mode on microbiome diversity. PeerJ. 2018;6:e4694. Available from: https://peerj.com/articles/4694.
Spandole S, Cimponeriu D, Berca LM, Mihăescu G. Human anelloviruses: an update of molecular, epidemiological and clinical aspects. Arch Virol. 2015;160:893–908.
Hillmann B, Al-Ghalith GA, Shields-Cutler RR, Zhu Q, Gohl DM, Beckman KB, et al. Evaluating the information content of shallow shotgun metagenomics. mSystems. 2018;3:1–12.
Breitwieser FP, Salzberg SL. Pavian: interactive analysis of metagenomics data for microbiomics and pathogen identification. bioRxiv. 2016:084715. Available from: https://www.biorxiv.org/content/early/2016/10/31/084715.
Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 2005;110:462–7.
Hubley R, Finn RD, Clements J, Eddy SR, Jones TA, Bao W, et al. The Dfam database of repetitive DNA families. Nucleic Acids Res. 2016;44:D81–9.
Hu X, Yuan J, Shi Y, Lu J, Liu B, Li Z, et al. pIRS: profile-based Illumina pair-end reads simulator. Bioinformatics. 2012;28:1533–5.
Pruitt KD, Harrow J, Harte RA, Wallin C, Diekhans M, Maglott DR, et al. The consensus coding sequence (CCDS) project: identifying a common protein-coding gene set for the human and mouse genomes. Genome Res. 2009;19:1506.
NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2015;44:7–19.
Clarke EL, Taylor LJ, Zhao C, Connell A, Lee J-J, Fett B, et al. Example data for “Sunbeam: an extensible pipeline for analyzing metagenomic sequencing experiments” [Version 2]. Zenodo. 2019.
GNU Time. Available from: https://www.gnu.org/software/time/
Conway JR, Lex A, Gehlenborg N. UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics. 2017;33:2938–40.
Oksanen J, Blanchet FG, Friendly M, Kindt R, Legendre P, McGlinn D, et al. vegan: community ecology package. 2018.
Wickham H. ggplot2: elegant graphics for data analysis. New York: Springer-Verlag; 2016.
JGI. Tadpole. 2018. Available from: https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/tadpole-guide/
Lo CC, Chain PSG. Rapid evaluation and quality control of next generation sequencing data with FaQCs. BMC Bioinformatics. 2014;15:1–8.
JGI. BBDuk. 2018. Available from: https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbduk-guide/
Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27:573–80.
Rognes T, Flouri T, Nichols B, Quince C, Mahé F. VSEARCH: a versatile open source tool for metagenomics. PeerJ. 2016;4:e2584. Available from: https://peerj.com/articles/2584.
Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.
JGI. BBMap. 2018. Available from: https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbmap-guide/
Zaharia M, Bolosky WJ, Curtis K, Fox A, Patterson D, Shenker S, et al. Faster and more accurate sequence alignment with SNAP. arXiv. 2011;1111:e5572v1.
Marçais G, Delcher AL, Phillippy AM, Coston R, Salzberg SL, Zimin A. MUMmer4: a fast and versatile genome alignment system. PLoS Comput Biol. 2018;14:1–14.
Skinner ME, Uzilov AV, Stein LD, Mungall CJ, Holmes IH. JBrowse: a next-generation genome browser. Genome Res. 2009;19:1630–8.
Freitas TAK, Li P-E, Scholz MB, Chain PSG. Accurate read-based metagenome characterization using a hierarchical suite of unique signatures. Nucleic Acids Res. 2015;43:e69.
Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 2014;12:59–60.
Price MN, Dehal PS, Arkin AP. FastTree 2 - approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5:e9490.
Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19:455–77. Available from: http://online.liebertpub.com/doi/abs/10.1089/cmb.2012.0021.
Treangen TJ, Sommer DD, Angly FE, Koren S, Pop M. Next generation sequence assembly with AMOS. Curr Protoc Bioinforma. 2011;33:11.8.1–11.8.18.
Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30:2068–9.
Jensen LJ, Julien P, Kuhn M, von Mering C, Muller J, Doerks T, et al. eggNOG: automated construction and annotation of orthologous groups of genes. Nucleic Acids Res. 2008;36:250–4.
Bairoch A. The ENZYME database in 2000. Nucleic Acids Res. 2000;28:304–5. Available from: https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/28.1.304.
Yin Y, Mao X, Yang J, Chen X, Mao F, Xu Y. DbCAN: a web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Res. 2012;40:445–51.
Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, et al. Primer3-new capabilities and interfaces. Nucleic Acids Res. 2012;40:1–12.
Ye Y, Choi JH, Tang H. RAPSearch: a fast protein similarity search tool for short reads. BMC Bioinformatics. 2011;12:159.
Stamatakis A, Ludwig T, Meier H. RAxML-II: a program for sequential, parallel and distributed inference of large phylogenetic trees. Concurr Comput Pract Exp. 2005;17:1705–23.
Ahmed SA, Lo C-C, Li P-E, Davenport KW, Chain PSG. From raw reads to trees: whole genome SNP phylogenetics across the tree of life. bioRxiv. 2015:032250. Available from: http://biorxiv.org/content/early/2015/11/19/032250.abstract.