In-depth resistome analysis by targeted metagenomics
- Val F. Lanza1, 2, 3, 4,
- Fernando Baquero1, 2, 3,
- José Luís Martínez2, 4,
- Ricardo Ramos-Ruíz5,
- Bruno González-Zorn6,
- Antoine Andremont7,
- Antonio Sánchez-Valenzuela1,
- Stanislav Dusko Ehrlich8, 9,
- Sean Kennedy8, 13,
- Etienne Ruppé7, 8,
- Willem van Schaik10, 14,
- Rob J. Willems10,
- Fernando de la Cruz11, 12 and
- Teresa M. Coque1, 2, 3Email author
© The Author(s). 2018
Received: 13 March 2017
Accepted: 17 December 2017
Published: 15 January 2018
Antimicrobial resistance is a major global health challenge. Metagenomics allows analyzing the presence and dynamics of “resistomes” (the ensemble of genes encoding antimicrobial resistance in a given microbiome) in disparate microbial ecosystems. However, the low sensitivity and specificity of available metagenomic methods preclude the detection of minority populations (often present below their detection threshold) and/or the identification of allelic variants that differ in the resulting phenotype. Here, we describe a novel strategy that combines targeted metagenomics using last generation in-solution capture platforms, with novel bioinformatics tools to establish a standardized framework that allows both quantitative and qualitative analyses of resistomes.
We developed ResCap, a targeted sequence capture platform based on SeqCapEZ (NimbleGene) technology, which includes probes for 8667 canonical resistance genes (7963 antibiotic resistance genes and 704 genes conferring resistance to metals or biocides), and 2517 relaxase genes (plasmid markers) and 78,600 genes homologous to the previous identified targets (47,806 for antibiotics and 30,794 for biocides or metals). Its performance was compared with metagenomic shotgun sequencing (MSS) for 17 fecal samples (9 humans, 8 swine). ResCap significantly improves MSS to detect “gene abundance” (from 2.0 to 83.2%) and “gene diversity” (26 versus 14.9 genes unequivocally detected per sample per million of reads; the number of reads unequivocally mapped increasing up to 300-fold by using ResCap), which were calculated using novel bioinformatic tools. ResCap also facilitated the analysis of novel genes potentially involved in the resistance to antibiotics, metals, biocides, or any combination thereof.
ResCap, the first targeted sequence capture, specifically developed to analyze resistomes, greatly enhances the sensitivity and specificity of available metagenomic methods and offers the possibility to analyze genes related to the selection and transfer of antimicrobial resistance (biocides, heavy metals, plasmids). The model opens the possibility to study other complex microbial systems in which minority populations play a relevant role.
Antimicrobial resistance is considered a major global health challenge that has been recently included in the agendas of major international bodies . The adoption of measures to address the antibiotic resistance crisis  is impaired by the controversy over what resistance is and how and where it should be detected and analyzed [3–5].
Metagenomic methods are increasingly being used to analyze the ensemble of genes that may encode antibiotic resistance in various microbial ecosystems, which are defined as the resistome [6–17]. An important hurdle confronting current resistome analyses is low sensitivity in the detection of minority populations harboring resistance genes (often present at concentrations below the detection level of the methods used)  and/or low specificity in the identification of allelic variants that might confer different susceptibility phenotypes . Such requirements are needed for prioritizing risks of antibiotic resistance genes in metagenomes in terms of public health [20, 21].
The challenge of achieving a sensitive and specific identification of genes influencing antibiotic resistance in a complex metagenome background parallels the difficulties found by scientists studying human inherited diseases years ago . In that case, the limitations of available sequencing technologies were overcome by using capture-based or targeted sequencing strategies of all protein-coding regions (exome). Such approaches reduced the number of sequences to be screened, and therefore represented a cost- and time-effective and high-throughput alternative to the metagenomic technologies for analyzing exomes [22, 23]. In-solution targeted capture platforms (TCPs) also provide technical improvements over array-based platforms and other genome-partitioning approaches in terms of scalability, cost-effectiveness, and enhanced data quality, including lower variance in target coverage, more accurate single nucleotide polymorphism [SNP] calling, higher reproducibility and better quality assembly . Although TCPs are mostly used for the diagnosis of human inherited diseases , the methodology offers tremendous potential for boosting advances in environmental and ecological studies which requires the isolation of sequences of interest from a mixture of DNA from a complex community of organisms .
This study reports the development and validation of the first TCP for the analysis of bacterial resistomes, which was designated ResCap (for Resistome Capture). We show that ResCap could significantly improve the sensitivity and specificity over previous metagenomic analysis in the detection of antibiotic resistance. ResCap also allows the analysis of the presence and diversity of genes conferring resistance to other antimicrobials (heavy metals and biocides), which are frequently co-selected with antibiotic resistance genes and genes from replicons of mobile genetic elements (such as plasmids). An ad hoc advanced bioinformatics pipeline, developed in parallel, exploits the capabilities of ResCap for comparative metagenomic analysis. The metagenomic approach described here paves the way for a future series of applications in the studies directed to the identification, epidemiological surveillance, ecology, and study of evolutionary trajectories of resistance genes in complex microbial environments.
The ResCap capture library is a homemade core reference database (which will be available upon request) that comprises both well-known and hypothetical genes encoding resistance to antimicrobials (antibiotics, heavy metals, biocides) and genes coding for relaxases, enzymes involved in the process of DNA mobilization and transfer of conjugative elements that use to carry antimicrobial resistance genes (plasmids, conjugative transposons). The core reference database was built by downloading sequences associated with non-redundant antimicrobial genes available in the curated databases Arg-ANNOT , CARD , RED-DB (http://www.fibim.unisi.it/REDDB/Default.asp), ResFinder , and Bacmet . These antibiotic resistance databases were combined within a non-redundant set. Proteins were clustered in protein families by homology, using CD-HIT with parameters of 80% identity and 80% coverage. Each protein family was aligned by MUSCLE v. 3.7  with default parameters and a hidden Markov model (HMM) was built for each family with hmmbuild function of the HMMER3  using default parameters. Hmmer search function (hmmsearch) was used against UniProtDB for each HMM profile to search homologous proteins for each family of proteins that confer antibiotic resistance. Manual curation of the database, which consisted on reviewing the annotation and the score of the search of datasets, allowed removing false positives. The proteins of the final data set were translated to a DNA sequences using ENA accession numbers associated with each UniProtDB entry.
The final ResCap-targeted sequence panel consists of 78,600 non-redundant genes (81,117 redundant genes) with a target space of 88.13 Mb, not yet reaching the 200 Mb target capacity offered by the custom SeqCap EZ library format (NimbleGen, Madison, USA). Probes targeting the antibiotic resistome include 47,806 putative antibiotic resistance genes and 7963 functionally characterized genes which are designed here as “canonical, antibiotic resistance genes.” Probes targeting the metal and biocide resistome include 30,794 putative resistance genes and 704 canonical resistance genes. The platform also includes probes for 2517 relaxases genes of the ConjDB database , which are used for plasmid identification and classification) . In addition to the 8667 genes that confer functionally proven resistance to these antimicrobials (canonical genes), the platform also includes targets for 78,600 resistance gene homologs (47,806 for antibiotic and 30,794 for biocide and metal resistance).
We submitted the consolidated list of target sequences to Roche NimbleGen (Madison, USA) for capture, library design, and synthesis, which was further implemented under the custom NimbleGen SeqCap EZ Developer Library format. Redistribution of probes for better capture uniformity, redundancy and comprehensive target base coverage relied on NimbleGen based on patented algorithms. ResCap design covers 98.3% of the 88.13 Mb, and 99.6% of the genes have more than 50% of their sequence covered [Additional file 1].
The ResCap workflow
The ResCap workflow consists of (i) whole-metagenome shotgun library construction; (ii) hybridization and capture, (iii) captured DNA sequencing. All the steps were performed according to NimbleGen standard protocols for Illumina platforms. To evaluate ResCap efficiency, the samples were sequenced before and after capture.
i) Whole-metagenome shotgun library construction. Total nucleic acid was extracted following the standardized Metahit protocol  (http://www.metahit.eu/) and using the FastPrep instrument (MP Biomedicals, USA). Libraries were prepared following the instructions of the Kapa Library Preparation Kit for Illumina platforms (Kapabiosystems, KR0935-v1.13). Briefly, 1.0 μg input DNA (measure by Picogreen) was fragmented to 500–600 bp insert size by sonication with Bioruptor (FastPrep®-24). After end repair, A-tailing, and adapter ligation, we followed Dual-SPRI size selection, adding 0.5 vol in the first cut and 0.2 vol in the second cut to achieve 650–750 bp libraries. Library amplification was performed using an LM-PCR of 7 cycles, as indicated in the SeqCap EZ Library SR User’s Guide v4.2. At this level, samples were labeled with specific barcodes for further sample identification. A first aliquot of the resulting amplified libraries was quality checked on a Bioanalyzer 2100 (Agilent) and pooled in equimolecular amounts for sequencing on an Illumina HiSeq 2000 instrument, generating 100–150 bp paired-end reads (pre-capture samples).
ii) Hybridization and capture. The second part of each DNA library was subjected to targeted sequence capture with the custom ResCap probes prior to sequencing (post-capture samples). Both experiments were made in separate sequencing runs. Targeted sequence capture was performed according to the manufacturer’s specifications. The captured DNA was checked for quality and integrity on a Bioanalyzer and titrated by quantitative polymerase chain reaction using the Kapa-SYBR FAST qPCR kit for LightCycler480 and a reference standard for quantification.
iii) Captured DNA sequencing. The captured libraries were denatured prior to being loaded on a flow-cell at a density of 2.2 pM, where clusters were formed and sequenced using a HiSeq 2000 in a 2 × 100 paired-end mode for swine samples and NextSeq 500 in a 2 × 150 paired-end mode for human samples. Raw sequences were processed using the FastX Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/) with a quality cutoff of 20 and reads shorter than 100 and 150 bp, respectively, being discarded.
Raw data processing
Reads were mapped against our database, comprising ARG-ANNOT , BacMET , and ConjDB  databases independently, using Bowtie2 software . Bowtie2 was set up to retrieve all end-to-end possible alignments and to suppress both discordant alignments and mixed alignments. The output SAM file was parsed to get the fields of Query template NAME, Reference sequence NAME, 1-based leftmost mapping Position, MAPping Quality, Position of the mate/next read. Reads with unavailable information (field Query Template NAME equal to ‘*’) were suppressed. Subsequently, a homemade Perl script was used to count matched reads per gene. Using the SAM parsed file and the length of the reference genes, the Perl script generated a table with the following fields: (i) the number of reads per gene mapped (“RPG”, gene depth coverage); (ii) the number of reads per kb of gene (“RPK”); (iii) the number of the reads that were mapped unequivocally to a given gene (“unique”); and (iv) the percentage of coverage of the gene sequence (“horizontal gene alignment coverage”) of each mapped gene. Table fields RPG, RPK, and unique, were normalized by the total amount of reads in each sample, transforming such fields in reads per gene per million reads and RPKM, respectively, the last being a common unit of gene abundance . Several ways of normalizing abundance data have been applied to different studies (e.g., expression data in RNA-Seq experiments). The aim of our approach was to estimate the proportion of antimicrobial resistance genes among samples that putatively contained the same amount of DNA; thus, normalization using the total amount of DNA (i.e., reads) in the samples fits better with the initial approach.
Each cluster of the allele network represents the set of alleles/genes detected by a set of reads and was defined as “mapping gene clusters” (MGCs). Each MGC can include hundreds of genes or just one gene, and will be detected when at least one read maps against any of the genes within that MGC. Due to allele network are constructed for the whole study, they constitute a set of normalized variables that allow performing qualitative and quantitative comparisons between the samples included in this particular analysis. To quantify the MGCs in each sample, the highest value shown by an allele (node) within a given MGC is the occurrence of such an MGC (abundance). This criterium is used to avoid an over and underestimation of data that would occur when using the mean or median of the reads corresponding to alleles with high homology (thus, sharing a high proportion of reads) but present in different proportions. Therefore, the MGC approach builds a unique variable for each set of possible detected alleles. Figure 2 shows the 839 MGCs of our sample (237 for AbR, 283 for biocides and metals and 319 for relaxases).
A homemade Perl script was used to build the allele network from the SAM parsed files, taking the mapped genes as nodes and searching the ambiguously mapped reads to create the edges. The Perl script calculates the edges’ weight as the number of reads that map the linked nodes at the same time. The allele network was loaded into the R environment  using the igraph package  MGCs were defined using mcl, from the MCL R package , with default parameters except allowing loops and clusters with only one member in the allele network.
The resistome of a given experiment was analyzed in terms of gene abundance and gene diversity according to the methodology described above. The abundance and the diversity of genes in a particular resistome are the (dependent) variables that define this resistome, which are measured as the number of RPKMs per MGC and the number of MGCs per million reads (MPM), respectively.
This MGC system builds a set of normalized variables that allow analyzing abundance and diversity within and between samples, and thus the comparison of datasets from various sources. The MGCs of the antibiotic resistance gene database were divided according to antibiotic families . The MGCs of the relaxase database were organized in known different relaxase families [33, 41]. The MGCs of the biocide and heavy metal resistance gene database were classified according to susceptibility to specific compounds . Biocide and heavy metals resistance genes that belong to more than one functional category (e.g., genes conferring resistance to various metals or genes encoding resistance to different biocides) contribute equally for any of them. Descriptive statistics were performed using the dplyr , tidyr , and ggplot2  packages of R .
A statistical analysis of gene abundance, analogous to that used for comparing the abundance of mRNA among samples in differential expression analysis , was performed to quantify the relative abundance of the MGCs. Differential analysis was performed using the DESeq2 package . Tables containing the original abundance data obtained by ResCap and MSS data sets were used as input for the DESeq2 package. Normalization and statistical analysis were performed with the default parameters of DESeq2. MGCs were classified as differentially detected (only present in either human or swine samples, p value ≤ 10−3) or commonly detected (present in both human and swine samples, p value ≥ 10−3) in different hosts. This p value ≤ 10−3 is also used as cutoff is other differential analysis.
ResCap includes probes for approximately 78,600 genes that are homologous to known resistance genes, with different degrees of sequence identity, which might be involved in antimicrobial resistance. This ensemble of genes has also been considered in the definition of resistome . Assemblies were performed by MegaHit software with default parameters . Prodigal  was used for gene recognition and translation with the specific parameters for metagenomic sequences. Quality assemblies’ quantification was performed by Quast software . Predicted genes were first annotated against the ResCap database by Best Blast Hit approach using blastn software . To identify only the genes belonging to the ResCap database or their homologs and to minimize the false positive ratio, Blast hits were filtered by e value of 10−100 and 80% of reference coverage. Genes with identities higher than 95% and coverage higher than 80% were cataloged as belonging to ResCap database (ResCapDB). The remaining sequences were translated to proteins. Proteins were compared against UniProtKB database by blastp. Again, hits with higher identity than 95%, coverage higher than 80% and e value lower than 10−100 were considered as UniProtKB known proteins. The set of remaining proteins that did not accomplish this threshold were cataloged as novel proteins. Additional file 2 shows this analysis workflow.
ResCap was validated by analyzing fecal samples from nine humans and eight swine, all collected as part of the FP7 European Research Consortium EvoTAR (http://www.evotar.eu). Independent fecal samples from swine were collected on Spanish farms linked to large companies that supply broilers and swine processed meat to the EU. Antibiotics as growth promoters or with preventive purposes are not used on these farms. Fecal human samples were collected at the Hospital Bichat, Paris, France, under the protocol approved by its local ethics committee. The DNA extraction from the fecal samples analyzed was performed according standardized protocols (MetaHIT Protocol; http://www.metahit.eu/). The robustness of the platform was tested by comparative analysis of two technical replicates of two swine samples.
ResCap performance was compared with MSS in two ways. First, by applying a reference-based approach that maps metagenome reads against specific databases (AbR, metals and biocides, and relaxases). Second, by applying a reference-free approach that assembles metagenomic reads and performs a functional annotation. The results of these evaluations appear below.
This section addresses how the abundance and diversity of resistance genes (ResCap or those already validated) were calculated.
ResCap achieves better recovery of target genes than MSS
Summary of mapping results. Comparison of ResCap against MSS technology. The mapping ratios (on-target value) were extracted from the number of reads mapped (SAM files) divided by the total number of reads. The “gain values” are the results of dividing ResCap on-target value by the MSS on-target value
Metagenome shotgun sequence MSS
Furthermore, a significant portion of genes detected by ResCap (42%, 975/2323), was not detected by MSS. The linearity of the system was evaluated by using a linear regression model for the genes only detected in each paired sample (MSS vs. ResCap). An R2 mean of 0.813 (0.85–0.99) shows a good match between both protocols.
The enrichment of canonical resistance genes when using ResCap was similar in samples from humans and swine. Further, the observed differences in the relative abundance of genes encoding resistance to antimicrobials (antibiotics, heavy metals, biocides) and relaxases in various samples (Fig. 3b, c) could be explained by the variability of microbiota from different hosts [50, 51].
ResCap addresses gene diversity
Allele redundancy of some resistance genes hinders the correct estimation of gene diversity and precludes a correct estimation of gene abundance in metagenomes when using most available metagenomic tools. To overcome this issue, we defined a mapping gene cluster (MGC) as a group of alleles/genes detected by the same set of reads and normalized such MGCs by the total number of sequencing reads per each sample expressed in millions of reads (MGC Per Million or MPM) in order to use it as unit of diversity that allow us comparing different samples. The number of MPMs increased 1.3-fold in humans (0.7–1.74, p-value 2·10−1) and 2.1-fold (2.3–1.9, p-value 2·10−4) in swine when using ResCap instead of MSS (Fig. 3b, c).
The robustness of the platform was ascertained by using replicates of swine samples. The correlation between replicates measured by Pearson’s linearity model test was 0.98 (p value 2.2 · 10−16) and 0.89 (p value 2.2 · 10−16), respectively [Additional file 9]. This result is within the range specified by NimbleGene for other SeqCapEZ capture platforms.
Comparative analysis of resistomes from various samples
Figure 7b represents the distribution of the MGCs in different hosts by using MSS and ResCap. Using MSS, the resistome of the total samples analyzed comprises 525 MGCs, 88 MGCs being differentially detected (60 MGCs from humans and 28 MGCs from swine), and 437 being commonly detected in these hosts. Conversely, ResCap detected 831 MGCS, 262 classified as differentially detected (186 from humans and 76 from swine) and 569 as commonly detected. The data reflects a 3-fold increase in the number of MGCs using ResCap in comparison with MSS.
Figure 7c compares the results obtained by ResCap and MSS for both the differentially detected and the commonly detected MGCs in different hosts (262 vs. 88, and 569 vs. 437, respectively).
Of all the MGCs differentially detected (262 by ResCap plus 88 by MSS), 185 were only differentially detected by ResCap, 77 were differentially detected by both approaches, and 11 were only differentially detected by MSS. The average of reads used for the statistical test of these 11 MGCs was 18 using MSS (ranging from 6.02 to 78.3 reads) and 2907 using ResCap (ranging from 1884 to 3947 reads), which make us to suggest these 11 MGCs could be potential false positives.
Of all MGCs commonly detected in both types of hosts (569 by ResCap and 437 by MSS), both methods detected 269 MGCs while 300 MGCs were only detected by ResCap and 168 MCGs were only detected by MSS. Again, the low number of useful reads might explain the higher p values of MSS over ResCap and identify this set of MGCs as common MGCs between humans and swine.
This study reports the development of a novel targeted gene capture platform, ResCap, and its comparative evaluation with MSS in resistance gene identification using a collection of human and swine fecal samples. The results show that ResCap is suited for high-resolution analysis of resistomes and offers the possibility to detect genes homologous to known resistance genes, which will allow performing further comprehensive analyses on the diversity and the evolution of antibiotic resistance .
ResCap also provides several technical advantages to study resistomes in comparison with current metagenomic methods. First, the enrichment of ResCap resides in its targeted metagenomics approach, which significantly increases the recovery of resistance gene sequences. Our results indicate that the resistome represents barely 0.2% of the gut metagenome. As a consequence, MSS would need at least 3.75 · 109 reads per sample to reach a similar coverage to that obtained by using ResCap (average of 1.9 · 107 paired reads, which represents a relative enrichment of 279×). Second, the tiling of capture probes greatly facilitates the higher level of horizontal gene alignment coverage of ResCap as compared to MSS, resulting in increased specificity. Third, ResCap’s ability to detect previously unrecognized DNA fragments with homology to canonical resistance genes will facilitate the discovery of novel genes potentially involved in antimicrobial resistance. In case they were selectable, such novel genes would be enriched in the presence of antimicrobials, an important point to be tested during clinical trials. In addition, ResCap is of interest in public health, because it allows a more accurate risk ranking analysis  of the genes within the resistomes of various microbiota. Finally, the substantial capacity of the platform (200 Mb) makes ResCap extensible up to 2-fold of its current capacity, providing opportunities for updating the platform with probes for newly published resistance genes or for resistance genes added to resistance gene databases. ResCap updates will be publicly available through the GitHub repository (https://github.com/valflanza/ResCap) and the NimbleGene webpage. Nonetheless, the threshold of ResCap detection remains unknown due to the lack of a negative control that demonstrates the ability of ResCap to pick antibiotic resistance genes from quantified minority populations (e.g., mock genomic populations). Although appropriate, the complexity and variability of the metagenomic samples makes it difficult to use a good negative control for this type of study.
The parameters to express “gene abundance” and “gene diversity” allow comparing the resistomes of various samples. Relative abundance parameters are widely used in computational analysis of MSS datasets but require specialized statistics, because these compositional parameters are influenced by the variability in metagenomes of different samples. The novel “mapping gene cluster” (MGC) concept allows to provide a set of normalized variables that can be measured in terms of abundance and diversity among samples. Furthermore, MGC permits the evaluation of diversity within and among various functional groups (in our case, families of antibiotics, groups of genes conferring resistance to heavy metals or biocides and plasmid relaxases). To date, few quantitative metagenomic approaches to analyze resistomes are available, and they do not achieve this level of accuracy [14, 16].
Because of its sensitivity, specificity, and the possibility to compare results between samples, ResCap complies with the needs of public health epidemiology of antibiotic resistance that include (i) the detection of antibiotic resistance threats in various microbial environments [52, 53]; (ii) the need for implementation of accurate risk assessment studies based on resistome analysis in healthy humans, hospitalized patients, animal husbandry, the food industry, and the environment [52, 54]; (iii) the quality control of sewage and water body decontamination of antibiotic resistant genes [5, 54]; (iv) the update and refining of the list of resistance genes to be considered in monitoring the adverse effects of drugs in microbiomes, including pharmacomicrobiomic applications in clinical trials; (v) the close monitoring of the efficacy of microbiome reconstitution/rebiosis, whether through targeted probiotic live culture administration or fecal microbiota transplantation, to alleviate the adverse impact of antibiotic administration; and (vi) to analyze the effect of eco-evo drugs and strategies to combat antibiotic resistance .
This study constitutes the first description of targeted metagenomics to analyze antimicrobial resistance. ResCap, the novel capture platform developed, allows meeting the challenge of analyzing samples with a complex and heterogeneous mix of genes in low and high concentration DNA samples with a high level of specificity and to further explore the presence of novel genes. Thus, ResCap-like approaches might also be used to identify other sequences in minority bacterial populations that are part of complex microbial systems, such as virulence determinants, key ecological traits involved in biosynthesis or biodegradation, or relevant genes of biotechnological interest.
We are grateful for the technical support of Susana Ovalle-Andreu, Genomics Unit, Madrid Science Park (https://fpcm.es/en), Cantoblanco, Madrid, Spain.
This study was supported by the European Commission, Seven Framework Program (EVOTARFP7-HEALTH-282004 for VFL, FB, JLM, AA, DE, ER, RJLW, WvS, FdlC, and TMC), the Joint Programming Initiative in Antimicrobial Resistance (JPIAMR Third call, STARCS, JPIAMR2016-AC16/00039 to TMC, RJLW, WvS), the Joint Programming Initiative in Water (JPI Water StARE JPIW2013-089-C02-01 to JLM) and the Ministry of Economy and Competitiveness of Spain (BIO2014-54507-R to JLM, and PLASWIRES-612146/FP7-ICT-2013-10 and BFU2014-55534-C2-1-P for FdlC). The authors also acknowledge the European Development Regional Fund “A way to achieve Europe” (ERDF) for co-founding the Spanish R&D National Plan 2012-2019 (BIO2014-54507-R to JLM, PI15-0512 to TMC, PI15-00818 to FB, and BFU2014-55534-C2-1-P to FdlC), CIBER (CIBER in Epidemiology and Public Health, CIBERESP; CB06/02/0053 to FB), the Spanish Network for Research on Infectious Diseases (REIPI RD12/0015 to JLM) and the Regional Government of Madrid (InGeMICS- B2017/BMD-3691). Val F. Lanza was further funded by a Research Award Grant 2016 of the European Society for Clinical Microbiology and Infectious Diseases (ESCMID). Additional funding was from the Metagenopolis grant ANR-11-DPBS-0001 to DE.
Availability of data and materials
Metagenomic sequencing data was deposited in the NCBI Bioproject database (accessions: PRJNA371649 and PRJNA371647). Homemade scripts, raw data tables, and databases are available at https://github.com/valflanza/ResCap. The designs of ResCap platform and the NimbleGene ID are available upon request to the authors.
TMC, FB, and JLM designed the study, participated in the analysis of data, and wrote the manuscript. VFL developed the platform design and the bioinformatics tools, participated in the analysis of data, and wrote the manuscript. RR designed the sequencing workflow and participated in the revision of the manuscript. FdlC participated in the design of the study, the analysis of data, and the revision of the manuscript. AS-V, BG-Z, and AA provided expertise and participated in the collection and/or preparation of the samples. DE, SK, ER, WvS, and RJW provided expertise and participated in the revision of the manuscript. All authors read and approved the final version of this manuscript.
Ethics approval and consent to participate
Clinical samples were used after obtaining the approval of the study by the Institutional Review Board of the Hospital Bichat, and all written informed consents from the enrolled subjects, in compliance with national legislation and the Code of Ethical Principles for Medical Research Involving Human Subjects of the World Medical Association (Declaration of Helsinki).
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Årdal C, Outterson K, Hoffman SJ, Ghafur A, Sharland M, Ranganathan N, Smith R, Zorzet A, Cohn J, Pittet D, Daulaire N, Morel C, Rizvi Z, Balasegaram M, Dar OA, Heymann DL, Holmes AH, Moore LS, Laxminarayan R, Mendelson M, Røttingen JA. International cooperation to improve access to and sustain effectiveness of antimicrobials. Lancet. 2016;387:296–307.View ArticlePubMedGoogle Scholar
- Årdal C, Baraldi E, Ciabuschi F, Outterson K, Rex JH, Piddock LJV, Findlay D; DRIVE-AB Steering Committee. To the G20: incentivising antibacterial research and development. Lancet Infect Dis. 2017:799-801.Google Scholar
- Dar OA, Hasan R, Schlundt J, Harbarth S, Caleo G, Dar FK, Littmann J, Rweyemamu M, Buckley EJ, Shahid M, Kock R, Li HL, Giha H, Khan M, So AD, Bindayna KM, Kessel A, Pedersen HB, Permanand G, Zumla A, Røttingen JA, Heymann DL. Exploring the evidence base for national and regional policy interventions to combat resistance. Lancet. 2016;387(10015):285-95.Google Scholar
- Bax R, Bywater R, Cornaglia G, Goossens H, Hunter P, Isham V, et al. Surveillance of antimicrobial resistance--what, how and whither? Clin Microbiol Infect. 2001;7:316–25.View ArticlePubMedGoogle Scholar
- Laxminarayan R, Duse A, Wattal C, Zaidi AKM, Wertheim HFL, Sumpradit N, et al. Antibiotic resistance-the need for global solutions. Lancet Infect Dis. 2013;13:1057–98.View ArticlePubMedGoogle Scholar
- D’Costa VM. Sampling the antibiotic Resistome. Science (80- ). 2006;311:374–7.View ArticleGoogle Scholar
- Wright GD. The antibiotic resistome: the nexus of chemical and genetic diversity. Nat. Rev. Microbiol. 2007;5:175–86.View ArticlePubMedGoogle Scholar
- Wright GD. The antibiotic resistome. Expert Opin Drug Discov. 2010;5:779–88.View ArticlePubMedGoogle Scholar
- Forsberg KJ, Patel S, Gibson MK, Lauber CL, Knight R, Fierer N, et al. Bacterial phylogeny structures soil resistomes across habitats. Nature Nature Publishing Group. 2014;509:612–6.Google Scholar
- Forsberg KJ, Reyes A, Wang B, Selleck EM, Sommer MOA, Dantas G. The shared antibiotic Resistome of soil bacteria and human pathogens. Science (80-. ). 2012;337:1107–11.View ArticleGoogle Scholar
- Sommer MOA, Dantas G, Church GM. Functional characterization of the antibiotic resistance reservoir in the human microflora. Science. 2009;325:1128–31.View ArticlePubMedPubMed CentralGoogle Scholar
- Modi SR, Lee HH, Spina CS, Collins JJ. Antibiotic treatment expands the resistance reservoir and ecological network of the phage metagenome. Nature. 2013;499:219–22.View ArticlePubMedPubMed CentralGoogle Scholar
- Bengtsson-Palme J, Boulund F, Fick J, Kristiansson E, Larsson DGJ. Shotgun metagenomics reveals a wide array of antibiotic resistance genes and mobile elements in a polluted lake in India. Front Microbiol. Frontiers. 2014;5:648.Google Scholar
- Hu Y, Yang X, Qin J, Lu N, Cheng G, Wu N, et al. Metagenome-wide analysis of antibiotic resistance genes in a large cohort of human gut microbiota. Nat Commun. 2013;4:2151.PubMedGoogle Scholar
- Ghosh TS, Gupta S. Sen, Nair GB, Mande SS. In silico analysis of antibiotic resistance genes in the gut microflora of individuals from diverse geographies and age-groups. PLoS One. 2013;8:e83823.View ArticlePubMedPubMed CentralGoogle Scholar
- Forslund K, Sunagawa S, Kultima JR, Mende DR, Arumugam M, Typas A, et al. Country-specific antibiotic use practices impact the human gut resistome. Genome Res. 2013;23:1163–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Perry JA, Westman EL, Wright GD. The antibiotic resistome: What’s new? Curr Opin Microbiol. 2014;21:45–50.View ArticlePubMedGoogle Scholar
- Lynch MDJ, Neufeld JD. Ecology and exploration of the rare biosphere. Nat. Rev. Microbiol. 2015;13:217–29.View ArticlePubMedGoogle Scholar
- Zhou J, He Z, Yang Y, Deng Y, Tringe SG, Alvarez-cohen L. High-throughput metagenomic Technologies for Complex Microbial Community Analysis: open and closed formats. MBio. 2015;6:1–17.View ArticleGoogle Scholar
- Martínez JL, Coque TM, Baquero F. Prioritizing risks of antibiotic resistance genes in all metagenomes. Nat. Rev. Microbiol. 2015;13:396.View ArticlePubMedGoogle Scholar
- Martínez JL, Coque TM, Baquero F. What is a resistance gene? Ranking risk in resistomes. Nat Rev Microbiol. 2014;13:116–23.Google Scholar
- Olson M. Enrichment of super-sized resequencing targets from the human genome. Nat Methods. 2007;4:891–2. Google Scholar
- Tennessen JA, Bigham AW, O’Connor TD, Fu W, Kenny EE, Gravel S, et al. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science. 2012;337:64–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Mamanova L, Coffey AJ, Scott CE, Kozarewa I, Turner EH, Kumar A, et al. Target-enrichment strategies for next-generation sequencing. Nat Methods. 2010;7:111–8.View ArticlePubMedGoogle Scholar
- Ng SB, Turner EH, Robertson PD, Flygare SD, Bigham AW, Lee C, et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature. 2009;461:272–6.View ArticlePubMedPubMed CentralGoogle Scholar
- Jones MR, Good JM. Targeted capture in evolutionary and ecological genomics. Mol Ecol. 2016;25:185–202.View ArticlePubMedGoogle Scholar
- Gupta SSK, Padmanabhan BBR, Diene SM, Lopez-Rojas R, Kempf M, Landraud L, et al. ARG-ANNOT, a new Bioinformatic tool to discover antibiotic resistance genes in bacterial genomes. Antimicrob Agents Chemother. 2014;58:212–20.View ArticlePubMedPubMed CentralGoogle Scholar
- McArthur AG, Waglechner N, Nizam F, Yan A, Azad M a, Baylay AJ, et al. The comprehensive antibiotic resistance database. Antimicrob Agents Chemother. 2013;57:3348–57.View ArticlePubMedPubMed CentralGoogle Scholar
- Zankari E, Hasman H, Cosentino S, Vestergaard M, Rasmussen S, Lund O, et al. Identification of acquired antimicrobial resistance genes. J Antimicrob Chemother. 2012;67:2640–4.View ArticlePubMedPubMed CentralGoogle Scholar
- Pal C, Bengtsson-Palme J, Rensing C, Kristiansson E, Larsson DGJ. BacMet: antibacterial biocide and metal resistance genes database. Nucleic Acids Res. 2014;42:D737–43.View ArticlePubMedGoogle Scholar
- Edgar RC. MUSCLE multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–7.View ArticlePubMedPubMed CentralGoogle Scholar
- Eddy SR. Accelerated profile HMM searches. PLoS Comput Biol. 2011;7:e1002195.Google Scholar
- Guglielmini J, Néron B, Abby SS, Garcillán-Barcia MP, de la Cruz F, Rocha EPC. Key components of the eight classes of type IV secretion systems involved in bacterial conjugation or protein secretion. Nucleic Acids Res. 2014;42:5715–27.View ArticlePubMedPubMed CentralGoogle Scholar
- Garcillán-Barcia MP, Francia MV, de la Cruz F. The diversity of conjugative relaxases and its application in plasmid classification. FEMS Microbiol Rev. 2009;33:657–87.View ArticlePubMedGoogle Scholar
- Wesolowska-Andersen A, Bahl MI, Carvalho V, Kristiansen K, Sicheritz-Pontén T, Gupta R, et al. Choice of bacterial DNA extraction method from fecal material influences community structure as evaluated by metagenomic analysis. Microbiome. 2014;2:19.View ArticlePubMedPubMed CentralGoogle Scholar
- Pongor LS, Vera R, Ligeti B. Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification. PLoS One. 2014;9(7):e103441.Google Scholar
- Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008;5:621–8.Google Scholar
- R Development Core Team. R: a language and environment for statistical computing. Vienna. R Found. Stat. Comput. Vienna: R Foundation for Statistical Computing; 2014.Google Scholar
- Bastian M, Heymann S, Jacomy M. Gephi: An Open Source Software for Exploring and Manipulating Networks. Third Int AAAI Conf Weblogs Soc Media. 2009;361–2.Google Scholar
- Jäger ML. MCL: Markov cluster algorithm. 2015.Google Scholar
- Guglielmini J, de la Cruz F, Rocha EPC. Evolution of conjugation and type IV secretion systems. Mol Biol Evol. 2013;30:315–31.View ArticlePubMedGoogle Scholar
- Wickham H, Francois R. Dplyr: a grammar of data manipulation. 2015.Google Scholar
- Wickham H. Tidy data. J Stat Softw. 2014;59:1–23.View ArticleGoogle Scholar
- Wickham H. ggplot2: elegant graphics for data analysis. New York: Springer; 2009.View ArticleGoogle Scholar
- Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11:R106.View ArticlePubMedPubMed CentralGoogle Scholar
- Li D, Liu C-M, Luo R, Sadakane K, Lam T-W. MEGAHIT an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31:1674–6.View ArticlePubMedGoogle Scholar
- Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119.View ArticlePubMedPubMed CentralGoogle Scholar
- Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST quality assessment tool for genome assemblies. Bioinformatics. 2013;29:1072–5.View ArticlePubMedPubMed CentralGoogle Scholar
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–402.View ArticlePubMedPubMed CentralGoogle Scholar
- Ley RE, Lozupone CA, Hamady M, Knight R, Gordon JI. Worlds within worlds: evolution of the vertebrate gut microbiota. Nat Rev Microbiol. 2008;6:776–88.View ArticlePubMedPubMed CentralGoogle Scholar
- Dethlefsen L, McFall-Ngai M, Relman DA. An ecological and evolutionary perspective on human-microbe mutualism and disease. Nature. 2007;449:811–8.View ArticlePubMedGoogle Scholar
- Joint Programming Initiative on Antimicrobial Resistance. Strategic Research Agenda. Available at https://www.jpiamr.eu/document-library/strategicresearchagenda/.
- Doyle MP, Loneragan GH, Scott HM, Singer RS. Antimicrobial resistance: challenges and perspectives. Compr Rev Food Sci Food Saf. 2013;12:234–48.View ArticleGoogle Scholar
- Ashbolt NJ, Amézquita A, Backhaus T, Borriello P, Brandt KK, Collignon P, et al. Human health risk assessment (HHRA) for environmental development and transfer of antibiotic resistance. Environ Health Perspect. 2013;121:993–1001.PubMedPubMed CentralGoogle Scholar
- Baquero F, Coque TM, de la Cruz F. Ecology and evolution as targets: the need for novel eco-evo drugs and strategies to fight antibiotic resistance. Antimicrob Agents Chemother. 2011;55:3649–60.View ArticlePubMedPubMed CentralGoogle Scholar