Freshwater viral metagenome reveals novel and functional phage-borne antibiotic resistance genes

Background Antibiotic resistance developed by bacteria is a significant threat to global health. Antibiotic resistance genes (ARGs) spread across different bacterial populations through multiple dissemination routes, including horizontal gene transfer mediated by bacteriophages. ARGs carried by bacteriophages are considered especially threatening due to their prolonged persistence in the environment, fast replication rates, and ability to infect diverse bacterial hosts. Several studies employing qPCR and viral metagenomics have shown that viral fraction and viral sequence reads in clinical and environmental samples carry many ARGs. However, only a few ARGs have been found in viral contigs assembled from metagenome reads, with most of these genes lacking effective antibiotic resistance phenotypes. Owing to the wide application of viral metagenomics, nevertheless, different classes of ARGs are being continuously found in viral metagenomes acquired from diverse environments. As such, the presence and functionality of ARGs encoded by bacteriophages remain up for debate. Results We evaluated ARGs excavated from viral contigs recovered from urban surface water viral metagenome data. In virome reads and contigs, diverse ARGs, including polymyxin resistance genes, multidrug efflux proteins, and β-lactamases, were identified. In particular, when a lenient threshold of e value of ≤ 1 × e−5 and query coverage of ≥ 60% were employed in the Resfams database, the novel β-lactamases blaHRV-1 and blaHRVM-1 were found. These genes had unique sequences, forming distinct clades of class A and subclass B3 β-lactamases, respectively. Minimum inhibitory concentration analyses for E. coli strains harboring blaHRV-1 and blaHRVM-1 and catalytic kinetics of purified HRV-1 and HRVM-1 showed reduced susceptibility to penicillin, narrow- and extended-spectrum cephalosporins, and carbapenems. These genes were also found in bacterial metagenomes, indicating that they were harbored by actively infecting phages. Conclusion Our results showed that viruses in the environment carry as-yet-unreported functional ARGs, albeit in small quantities. We thereby suggest that environmental bacteriophages could be reservoirs of widely variable, unknown ARGs that could be disseminated via virus-host interactions. Video abstract.


Background
Continually emerging antibiotic resistance of pathogenic bacteria is a significant threat to global health. The World Health Organization (WHO) reported that from 2017 to 2018, approximately 1.1 million patients from 69 monitoring countries were infected with antibiotic resistant pathogens [1], and this number is expected to rise. Antibiotic resistant bacteria are a compelling health issue because they make antibiotic treatments obsolete, even leading to the death of the patient. Bacteria naturally produce antimicrobial substances to disrupt neighboring competitors, and accordingly, targeted bacteria develop antibiotic resistance genes (ARGs) to defend against such mechanisms. Excessive use of antibiotics in medical practice has been implicated to accelerate development and dissemination of antibiotic resistance among bacterial populations in clinical and natural environments. ARGs can be spread from one bacterial cell to another through horizontal gene transfer (HGT) [2]. HGT of ARGs is commonly mediated by conjugation that requires physical contact between bacterial cells via pili or adhesins [3]. Conversely, transduction of ARGs mediated by bacteriophages (phages) is rare but remains a significant dissemination route of ARGs.
Although phages are the smallest and simplest biological entities on earth, they have significant ecological roles as bacterial population controllers through lytic infections and contributors to bacterial gene diversification. During infection, phages randomly incorporate bacterial genome fragments into phage capsids or genomes as they hijack the hosts' genome replication machinery. Consequently, those genome fragments can be transferred to subsequent bacterial host genomes as phages infect other host cells. Some phages are known to have wide host range, a few with even of different bacterial orders [4], thereby functioning as a bridge of genetic exchange between different bacterial populations. Therefore, phage-associated ARGs are considered especially significant due to their possibility for wide dissemination [5,6]. Since phages are not directly influenced by antibiotics, the presence of ARGs is best evaluated by identifying nucleic acids in phage particles. ARGs conferring resistance to lactamase [7], quinolone [8], and vancomycin [9] antibiotics were detected across diverse environments based on PCR of viral DNA. However, due to the limitations of PCR to detect unreported or variances of ARGs, shotgun metagenomics that produce large amount of genomic information without a confined scope has been used to screen for ARGs.
The rapid development of viral metagenomics over the last decade has led to several viral metagenome (virome) analyses of various environments, including oceans, freshwater, and clinical samples [10][11][12][13][14], revealing a great number of novel viral sequences, including ARGs [15]. Phage-associated ARGs were found in clinical viromes [16,17] where both bacteria and phages are heavily exposed to antibiotics, and in marine and river viromes [18][19][20], indicating that ARGs are commonly carried by phages in natural environments. Although a high proportion of virome reads was found to be encoding ARGs, those genes remain uncertain whether they were collected from phage genomes or bacterial genome remnants [21]. Contrast to ARG detection frequency in raw virome reads, only a few ARGs have been discovered in the viral contigs assembled from virome reads [22]. Also, in a recent study, none of the four ARGs selected from viral contigs expressed antibiotic resistance when cloned into E. coli [21], leading to the conclusion that numbers of phage-carried ARGs might be overestimated. Nevertheless, as ARGs continue to be detected in phage isolates and viromes from diverse environments [18,[23][24][25][26], whether phages are important reservoirs of ARGs or not is still debated.
Because only a small proportion of phages are speculated to encode ARGs, phage-associated ARGs are often neglected in ARG researches, leaving them as unsupervised resistome to threat public health. Although occurrences are rare, it is evident that phages mediate ARG transduction between bacterial cells [6] and contribute to ARG dissemination [27]. Therefore, in our present study, viral metagenomes were thoroughly studied to screen for distribution of ARGs within phage populations, search for bona fide ARGs carried by phage genomes, and verify phenotypes and functionality of the phage-encoded ARGs. Using viromes constructed from urban rivers, we performed a functional assay of ARGs recovered from viral contigs and reported the first convincing evidence of two new types of β-lactamases derived from uncultured phages. Discovery of these previously unreported ARGs in the freshwater virome suggests environmental phages could be an important reservoir of unexplored ARGs.

Virome characteristics and antibiotic resistance genes in virome reads
In May 2016, six surface water samples were collected from the Han River, South Korea (Fig. S1). Viral particles obtained from 10 L samples were subjected to metagenome sequencing using the Illumina MiSeq platform, producing 3.6 to 6.6 million reads from each site (Table  S1). Among virome reads, 77.3−87.9% were annotated via the MG-RAST pipeline [28]. Only 12.3−13.8% of the annotated reads were referred to as viral reads, most of which belonged to Myoviridae, Podoviridae, and Siphoviridae of Caudovirales ( Fig. 1), all commonly found in the environment. The majority of virome reads (82.5 −84.9%) were annotated as bacterial genes that predominantly belonged to the classes Alpha-, Beta-, and Gammaproteobacteria. Since only 0.00007−0.0003% of the virome reads were found to be 16S rRNA gene sequences and proportions of single-copy bacterial marker genes were also less than 0.016% (Table S2), bacterial contamination in the virome data was negligible. Functionally annotated reads based on the SEED subsystems protein database [29] were generally annotated as phageand prophage-related functional genes (55.6−66.7%), including terminases, capsid proteins, tail proteins, and phage DNA polymerases (Fig. S2). Of the virome reads annotated as phage-related genes, 38.6−45.0% were taxonomically assigned to bacterial genes, largely derived from Proteobacteria (Fig. S3). Along with phage-and prophage-related functions, virome reads were also annotated as diverse auxiliary metabolic genes (AMGs) [30], such as enzymes for carbohydrate metabolism, respiration, photosynthesis, virulence, and defenses against foreign materials (Fig. S2).
ARGs in the virome reads included genes for lactamases, multidrug transport proteins, polymyxin resistance proteins, and vancomycin resistance proteins ( Table 1). Polymyxin resistance proteins were the most frequently annotated ARG, including ArnA, ArnC, and ArnT, which are involved in the lipid A-Ara4N (4amino-4-deoxy-L-arabinose) pathway that modifies bacterial lipid cell walls to prevent the binding of cationic antibiotics [31]. Multidrug transport proteins were largely composed of ABC-type multidrug efflux pump components. Both types of ARGs, cell wall modification and multidrug efflux pump, are relatively frequently found ARGs in diverse viromes [19]. Genes for lactamases and vancomycin resistance proteins were also detected in the Han River virome at lower frequencies than other ARGs detected. Collectively, the Han River viral communities were shown to harbor ARGs at proportions of approximately 0.1%. However, the taxonomy of ARG-carrier organisms was inconclusive due to their short sequence lengths (300 bp). Short fragments of virome reads were also insufficient to anticipate whether phage-associated ARGs were incorporated into phage capsids through generalized transduction or into phage genomes through specialized transduction. Taxonomic assignment of ARG-harboring sequences was improved Fig. 1 Taxonomic distribution of viral metagenome reads collected from six study sites on the Han River in South Korea. Viral taxonomy distribution is described at the family level, and bacterial taxonomy distribution is described at the class level by assembling raw virome reads into contigs prior to further searches.

Antibiotic resistance genes in assembled viral contigs
From the six virome datasets of the Han River, raw virome reads were assembled into contigs (Materials and Methods; Table S1). Contigs carrying at least one phage-related gene were selected so that they can confidently be considered as phages genomes, resulting in 5295 viral contigs. Predicted open reading frames (ORFs) of the viral contigs were used to search for ARGs in the ARGspecific databases CARD (Comprehensive Antibiotic Resistance Database) [32], Resfams [33], and MEGARes [34], and a total of 25 ARGs were found therein ( Table  2). ARG-harboring virome contigs were considered to be of viral origin (Figs. S4-S7); however, the contig sequences were not matched to any known phage genomes.
Although all 25 phage-associated ARGs were selected by threshold (see Materials and Methods), it was difficult to determine the ARG functionality based solely on these sequences because most genes lacked conserved active sites or signature secondary structures and had low sequence similarities to known ARGs. The ORFs H6-C148-ORF17 and H7-C136-ORF18 encoded the vatA gene, and the ORF N1-C442-ORF2 encoded the vatB gene, the virginiamycin O-acetyltranferase enzymes  The % identity, e value, and query coverage of β-lactamases have been retrieved from BLASTp against NCBI nr database ( Table 2). The vat proteins share seven well conserved antibiotic interaction sites [35]; however, the vatA gene encoded by the virome contigs did not harbor any of the conserved sequences. The vatB gene encoded by the ORF N1-C442-ORF2 appeared to carry only four out of seven conserved residues (Fig. S8) [35], and the functionalities of VatA and VatB in the virome contigs were not able to be verified. Acetyl-CoA-dependent N-acetyltransferases (AAC(6')) were found in 11 viral contigs ( Table 2). The AAC(6') in the GCN5-related N-acetyltransferase superfamily [36] possesses a signature secondary structure essential for acetylation of aminoglycoside antibiotics [37]. The AAC(6') encoding genes found in 11 virome contigs had incomplete secondary structures (Fig. S9), and these enzymes were predicted to be functionally impaired. The virome contig, H1-C267, was found to encode the qnrA3 gene that confers resistance to quinolone antibiotics. Sequences of QnrA family proteins have only a few amino acid differences among them [38], while the qnrA3 sequence found in the viral contig had only 28.8% sequence similarity to qnrA3 (AAZ04782.1). Additionally, it was also difficult to confirm whether this gene confers antibiotic resistance without phenotypic characterization. Some ARGs confer antibiotic resistance through antibiotic target protein alteration, functions of which can be utilized for different purposes in bacteriophages, such as machinery for replication or infection. The dihydrofolate reductase (DHFR), dfrB2, was detected in four viral contigs ( Table 2). The DHFR enzymes, targeted by trimethoprim, confer antibiotic resistance by providing extra DHFR gene cassette, including the dfrB2 genes [39,40]. Two virome contigs were found to harbor vanY genes (Table 2) encoding the D-alanyl-D-alanine carboxypeptidase that cleaves D-Ala from peptidoglycan precursors to prevent the binding of vancomycin [41] and also known to function as a bacteriophage endolysin [42]. Thereby, antibiotic resistance function of dfrB2 and vanY genes in bacteriophages could not be definitively determined. On the contrary, β-lactamases encoded by four virome contigs ( Table 2) had clear indication of antibiotic resistance function due to well-conserved active sites (presented below). We thereby focused our detailed functional analysis on these four β-lactamase genes.
In addition, bacterial metagenome datasets were generated in parallel at identical sampling stations. Homologous sequences to HRV-1 and HRVM-1 were searched within the metagenome-assembled contigs to observe their occurrence in bacterial fractions. Three metagenomic ORFs, each with 100% amino acid sequence Fig. 4 Maximum-likelihood phylogenetic tree of HRVM-1 with representative enzymes of subclasses B1, B2, and B3. Numbers at the nodes represent bootstrap values based on 100 re-samplings, and values higher than 80% are shown. Metagenome sequences used for the tree construction, marked with blue circles, were selected from the Han River bacterial metagenome. Accession numbers for the representative sequence types of β-lactamases used for the phylogenetic analyses are listed in Table S3 identity, were found for both HRV-1 and HRVM-1, (Figs. 3 and 4; Tables S6 and S7). Although these six contigs were discovered within bacterial metagenome sequences, they were predicted to be viral genomes (scores 0.954-0.999 from DeepVirFinder [48]). Furthermore, the metagenome contigs with these ORFs showed high synteny to the viral contigs containing bla HRV-1 and bla HRVM-1 (Fig. 6), suggesting the presence of infectious phages or prophages carrying bla HRV-1 and bla HRVM-1 in the Han River bacterial communities [49]. Public viral and bacterial metagenomes were additionally searched to assess the global distribution of HRV-1 and HRVM-1. Sequences similar to HRV-1 (amino acid sequence similarity up to 69%) were frequently found in metagenomes prepared from the Colombia River estuary in the USA, while HRVM-1-like sequences (amino acid sequence similarity up to 71%) were mainly found in freshwater viromes from Singapore (Tables S6 and S7 and Fig. S12), suggesting that HRV-1 and HRVM-1 are globally distributed. In particular, bla HRVM-1 showed a sequence similarity of4 0% to metallo-hydrolases found in Clostridium botulinum and Clostridioides difficile (Table S7), imposing that HRVM-1 and its homologous sequences are possibly carried by phages that infect these pathogens.

Discussion
Diverse ARGs have been detected in phage DNA fractions or virome reads recovered from various environments including freshwater [50], wastewater treatment plants [8], ocean [18], and animal feces [26]. Despite continued reports of phage-associated ARGs, only few of those genes have been evaluated for antibiotic resistance capacity, and of these, most were shown to be nonfunctional [21]. Therefore, whether ARGs recovered from different viromes are truly functional or not has been continuously questioned. In this study, we conducted an in-depth study on ARGs excavated from urban river viromes and discovered phage-associated ARGs, including two novel and functional β-lactamases, providing the solid evidence that viromes encompass genuine functional ARGs. This new finding alerts us of the existence of numerous unexplored functional ARGs in environmental viral populations that have thus far been overlooked due to an absence of functionality validation.
Although the Han River viromes were specifically targeted for phage particles, most of the reads were predicted to be bacterial (Fig. 1). The annotation of virome reads as bacterial genes is a common phenomenon due to dearth of viral gene database as most of phages in the environment are not yet cultured. Hence often, virome reads encoding bona fide viral proteins are improperly represented as bacterial genes. As infecting uncultured phages or prophages get sequenced along with their host bacterial genomes, their genes can be reported as phage functional gene with bacterial taxonomy, also leading to misinterpretation of virome reads. Although most of the Han River virome reads were taxonomically predicted to be bacterial sequences, approximately 62% of them were functionally annotated to encode phage-related proteins (Fig. S2). Therefore, we can speculate that the Han River virome reads that were assigned to bacterial taxonomy are in fact, viral sequences. In addition, bacterial contamination evaluation revealed that the virome data harbored very low percentage of bacterial marker genes such as SSU and LSU rRNA genes; hence, the viromes were unlikely to be contaminated with bacterial cells (Table S2). Therefore, virome reads predicted to have bacterial function, such as ARGs, were considered as viral origin.
ARGs developed in bacterial cells as a defense mechanism against antibiotic-producing microorganisms or antibiotics and phages have coincidently acquired these genes through sporadic recombination with host genomes during infection [23]. Therefore, only a small number of ARGs are expected to be found in viral sequences. Within the six viromes generated in our present study, 0.07-0.12% of the reads were predicted to be ARG sequences ( Table 1). The proportions of ARGs in the viromes were comparable with ARG levels detected from other study sites. River viromes collected Fig. 6 Genomic maps of the Han River bacterial metagenome contigs that harbor homologous ORFs to HRV-1 or HRVM-1. Red, HRV-1 or HRVM-1; blue, ORFs with virus-related functions; green, ORFs with general metabolic function; grey, hypothetical proteins. The boxed areas show a simplified depiction of the sequence synteny between viral and bacterial metagenome contigs that carry HRV-1 or HRVM-1 from the Minnesota, USA, appeared to contain approximately 0.13% of the virome reads as ARGs when analyzed using the MG-RAST annotation pipeline [50]. When multiple freshwater virome reads were evaluated for ARGs using the ARG-specific databases PATRIC and Resfams, only about 0.001% of the reads were predicted to encode ARGs [24]. In other environmental viromes, such as those prepared from soil or ocean samples, a more variable but still low proportion of virome reads were annotated as ARGs, ranging from 0.001 to 0.440% [18,24,50]. As previously claimed [21], phages, as well as viromes, are likely to carry and encode ARGs only at rare events.
The Han river flows from more pristine area (N1 and N3) to an urbanized area (H1, H4, H6, and H7) and as anthropogenic influences on the river increased, ARG levels in viral populations were expected to increase concomitantly. However, we observed no significant differences in ARG levels among the six samples from varying areas (Table 1). Similar trend was observed in the Lambro River, Italy [19]. Virome samples collected from a pristine area on the Lambro River appeared to have the highest level of ARGs (1.92%), while that collected from an urbanized site showed the lowest level of ARGs (0.48%). ARGs have been also continuously detected at low frequency (0.08-0.22%) in marine viromes generated from open oceans, where anthropogenic influences are at its minimum [18]. These findings suggest that phageassociated ARGs are a rare natural phenomenon rather than human-activity attributed event.
When ARGs are horizontally transferred through phages, they can be either encapsulated in phage particles as gene fragments or incorporated into phage genomes. Although ARGs within phage particles can be fully detected through screening of virome reads, whether these ARGs are encoded by phage genomes or not is difficult to determine. Therefore, the screening of ARGs in long viral contigs, which resemble putative phage genomes, was necessary to search for ARGs specifically encoded by phages. Among 5295 viral contigs constructed in this study, only 25 contigs were found to encode ARGs ( Table 2). As seen in ARG screening analyses of the raw virome reads, only a small number of viral genomes were shown to carry ARGs. Enault and colleagues [21] have reported that phage-associated ARGs are highly rare, and even if detected, the ARGs are unlikely to confer antibiotic resistance, concluding that most of the ARGs observed in virome reads might be caused by bacterial gene contamination. For the Han River virome data, the bacterial contamination was negligible in virome reads-level (Table S2). In addition, virome contigs used for the ARG analyses in this study were selected based on the presence of virus-specific genes and characteristics. We can thereby confidently claim that the small number of ARGs detected in this study is of viral origin without bacterial contamination.
We found four viral ORFs encoding β-lactamases that confer antibiotic resistance against β-lactam antibiotics, including carbapenems (Figs. 2 and 5). These four βlactamase genes were identified by the Resfams database but not by the CARD or MEGARes databases. The Resfams database includes most of the representative ARGs and especially focuses on β-lactamase variants [33]. Therefore, the combined use of the Resfams, CARD, and MEGARes databases to screen for ARGs allowed us to detect diversified and novel β-lactamase genes from viral contigs. Compared to other previous researches to screen for ARGs within virome data [15,21,22,51,52], we have employed lenient cutoffs (e value of ≤ 1 × e −5 and query coverage of ≥ 60% or gathering cutoff) in refining ARG search results. When conservative thresholds that have been used in the previous researches were applied (amino acid sequence identity ≥ 80% and query coverage ≥ 85%), no ARG was detected within the Han River virome contigs (Table S8). With relaxed thresholds used in this study, we discovered multiple virome ORFs showing homology to ARGs with conserved active sites including two novel β-lactamases.
To the best of our knowledge, the two novel β-lactamase genes discovered in the present study, bla HRV-1 and bla HRVM-1 , are the first ARGs discovered from viral metagenome contigs that confer true antibiotic resistance. β-Lactamases are bacterial enzymes that hydrolyze the β-lactam ring of β-lactam antibiotics and have been implicated in the emergence of antibiotic-resistant pathogenic bacteria [53,54]. Novel β-lactamase genes are continuously being found in clinical and environmental bacterial populations [55][56][57], threatening patients with potentially untreatable pathogenic infections. The presence of these two novel β-lactamase genes in freshwater viromes suggests another threat to public health. Phages are small, abundant, ubiquitous, diverse, and polyvalent, and are difficult to track, making it challenging to predict the dissemination routes of these phage-borne ARGs. Phage-associated ARGs pose a significant threat also because phage capsids are more persistent against wastewater treatments [58], allowing safe transportation into bacterial hosts. The presence of two novel β-lactamases in the freshwater virome indicates that phages are ARG carriers along with bacterial cells, and that phage populations could be significant ARG reservoirs. Vast amounts of virome data that were bypassed, or evaluated for encompassing ARG sequences only, might contain a broad range of ARGs and should be revisited for further evaluations.
Metagenome-assembled genomes (MAGs) of bacteria and viruses have contributed to expanding our knowledge of microbial populations that have yet to be cultured [59]. Genes and genomes predicted from metagenome data often have highly variable sequences with low sequence similarity to previously identified reference sequences. A variety of novel ARGs with distinctive sequences have been discovered from bacterial metagenomes [60]. The novel β-lactamase genes, bla RSA , bla RSC , and bla RSD , were all recovered from soil bacterial metagenomes [61]. Especially, the genes bla RSD1 and bla RSD3 had significantly low sequence similarities (≤ 45%) to previously reported β-lactamases, raising questions about their functionality, but were ultimately shown to confer antibiotic resistance with high MICs. Novel quinolone resistance genes (qnr) recovered from Tara ocean metagenomes [62] also had low sequence similarity (≤ 37%) to their best matches, but most of them (24 out of 27) showed increased MICs of ciprofloxacin. In the present study, we also discovered novel ARGs from viral metagenomes with high sequence variability. In addition to the four ORFs encoding bla HRV-1 and bla HRVM-1 , we also found 21 ORFs encoding other ARGs, such as vatA, aac, qnr, dfrB, and vanY ( Table 2). Since requisite active sites or core secondary structures of these enzymes were not predictable by sequence alignment analysis only, the ORFs were excluded from functional analyses in this study. However, as shown by the sequences of bla HRV-1 and bla HRVM-1 which showed antibiotic resistance even with low sequence similarity to the best matches (37%), the ARGs predicted by 21 ORFs may also confer antibiotic resistance when their functions are experimentally tested. Therefore, all ARGs predicted from this study require further functional studies to determine their resistance capacities.
Phages are generally known to have small and highly diversified genomes caused by fast recombination and mutation events [63,64]. After frequent exposure to genetic alterations such as HGTs, genes essential and beneficial to phage particles are likely to be maintained in phage genomes. The β-lactamases HRV-1 and HRVM-1 that we discovered are not considered as temporarily carried genes in phage capsids after general transduction but stably incorporated transduced genes through specialized transduction. Observation of HRV-1 and HRVM-1 sequences in different virome data indicated that these genes could be carried across and maintained within phage populations. The findings of βlactamases in the phage genomes suggest that these genes may be utilized as AMGs during phage infection. AMGs are bacterial metabolic genes carried by phage genomes and expressed during infection to the benefit of both phages and hosts [30]. One of the most widely known phage AMGs is the psaA gene of photosystems I and II, which is expressed during infection to complement the hosts' photosystem and enhance host cell metabolism to increase phage genome replication [65].
Likewise, the ARGs, HRV-1 and HRVM-1, could be carried by short phage genomes to be utilized as AMGs to contribute to host cell survival under antibiotic stress.
All 25 virome contigs harboring ARGs were predicted to be of viral origin, but neither viral nor host taxonomy of these contigs could be determined due to the uniqueness of the sequences. Most of the viral contigs that we discovered from the virome data are thus considered to originate from the genomes of uncultured and undiscovered phages. The HRVM-1 protein sequence showed homology with other metallo-hydrolases encoded by the human pathogens Clostridium botulinum and Clostridioides difficile. Although taxonomic and host information for the HRVM-1-harboring viral contig is not definable, the potentials for HRVM-1 to be transferred between pathogens and even cause antibiotic-resistant pathogenic infection are not negligible. This discovery of two new functional and novel β-lactamases, HRV-1 and HRVM-1, from the freshwater virome suggests the presence of an unknown environmental bacteriophage community that could pose an additional threat to antibiotic treatments.

Conclusion
This study was performed in natural environments where anthropogenic influences were minimal or moderate and nonetheless discovered novel β-lactamases that confer resistance against diverse β-lactam antibiotics. More ARGs with higher sequence diversity are expected to be found in environmental regimes subject to greater antibiotic stress. The discovery of HRV-1 and HRVM-1 from virome data shows that despite being rarely encountered, phage-encoded ARGs can confer antibiotic resistance and possess unique sequences that could be introduced into host bacterial cells. Our results suggest that viromes are important reservoirs of ARGs and hitherto unrecognized β-lactamases and show how phages contribute to the dissemination of antibiotic resistance and present a potential threat to human health. Therefore, we recommend that the vast repository of virome sequences accumulated from diverse environments be studied more extensively for the excavation and identification of functional and novel ARGs that may have been overlooked in the past.

Sequencing of the Han River viral metagenomes
In May 2016, 10 L of surface water was collected from each of six stations on the Han River, South Korea (Fig.  S1). The samples were filtered through a GF/A glass microfiber filter (Whatman, Maidstone, UK) followed by a 0.2-μm Supor® PES Membrane filter (Pall Corporation, New York, New York, US) to remove prokaryotic cells. The viral particles in the filtrates were flocculated by adding 0.01 g of FeCl 3 ·6H 2 O per 10 L of filtered sample, incubated at room temperature for 12 h, and collected on 0.8-μm Isopore polycarbonate filters (Merck Millipore, Burlington, MA, USA) [66]. The filters were dissolved in 0.1 M EDTA-0.2 M MgCl 2 -0.2 M ascorbate buffer and treated with DNase I and RNase A at final concentrations of 10 U/mL and 1 U/mL (Sigma-Aldrich, St. Louis, MO, USA), respectively, to remove any external nucleic acids [67]. The concentrated viral particles were purified through CsCl step-gradient ultracentrifugation [68] and subjected to viral DNA extraction using the DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany). The DNA samples were used for TruSeq library construction and sequenced using the Illumina MiSeq platform, with 2 × 300-bp paired-end reads (ChunLab Inc., Seoul, Korea). The raw sequencing reads were screened using Trimmomatic [69] based on quality score and length (LEADING:10 TRAILING:10 SLI-DINGWINDOW:4:16 MINLEN:100). The virome reads were submitted to the MG-RAST metagenome analysis server (www.metagenomics.anl.gov) for analysis [28]. Taxonomic and functional annotations of virome reads were performed with the MG-RAST analysis pipeline using a non-redundant M5nr database [70], with a threshold of e value ≤ 1 × e− 5 and percent identity ≥ 60% over a minimum sequence length of 15 amino acids. The presence of bacterial contamination in the virome samples was evaluated by calculating the proportion of bacterial 16S rRNA genes and single-copy bacterial genes using the METAXA [71] and ViromeQC programs (-w environmental) [72]. All virome reads were assembled into contigs using SPAdes version 3.8.2 [73] with the metaspades.py option (-k 27,47,67,87,107,127). Among the assembled contigs, only those longer than 10 kbp and those containing viral protein-coding sequences (CDS) were used for further analyses (Table S1).

Antibiotic resistance gene search and sequence analysis
The finalized contigs were predicted to be of viral (herein refers to phage or archaeal virus) or bacterial origin based on the presence of viral hallmark genes or high proportion of hypothetical genes, which are the typical characteristics of environmental viral contigs, using the VirSorter program with the "Virome" database and "virome decontamination" option [74]. All virome contigs classified as VirSorter categories 1 ("most confident"), 2 ("likely"), and 3 ("possible" phage genomes) were accepted for further analyses. The CDS of virome ORFs were predicted using the VirSorter program, and those were used as queries to search for ARGs using BLASTp [75] against the CARD or MEGA-Res with thresholds of e value ≤ 1 × e −5 and query coverage ≥ 60% and hmmscan [76] against the Resfams ARG database [33], with a cutoff of e value ≤ 1 × e −5 and a gathering threshold score ≥ 40. The results were manually curated to remove ARGs that confer antibiotic resistance by single nucleotide mutations. Curated searches revealed 25 CDSs encoding an ARG. Multiple sequence alignments of the CDSs were performed using Clustal Omega [77] with respective representative ARGs. Maximum-likelihood phylogenic trees with JTT-model were generated with MEGA 7 [78], and the tree topologies were evaluated by bootstrap analyses based on 100 re-samplings.

Expression and purification of recombinant HRV-1 and HRVM-1
The bla HRV-1 and bla HRVM-1 genes were expressed by the addition of IPTG, and their proteins were purified by running the cell lysate through a His-Bind column (Novagen), and the His 6 tag was then removed by the addition of enterokinase according to the manufacturer's instructions (Novagen). The mixture was re-purified using Fast Desalting and Mono S columns (Amersham Biosciences, Little Chalfont, UK). Protein purity was examined by SDS-PAGE. Soluble forms of homogeneous HRV-1 and HRVM-1 were obtained with yields of 4.4 and 3.4 mg per liter of culture, and their molecular weights were estimated to be 26.2 and 27.6 kDa, respectively.

Analysis of HRV-1 and HRVM-1 genes in bacterial metagenomes
A bacterial metagenome concurrent with the viral metagenome was constructed with the same water samples used for virome generation. The water samples were pre-filtered through a 10 μm pore-sized nylon membrane (Merck Millipore) followed by the collection of microbial biomasses onto a 0.2μm pore-sized mixed cellulose ester membrane (Advantec, Taipei, Taiwan). Metagenomic DNA was extracted from the membranes using the DNeasy PowerWater DNA Isolation Kit (Qiagen). Shotgun metagenome sequencing was performed using the Illumina HiSeq 4000 platform (ChunLab Inc.) which yielded approximately 21 Gb of 2 × 150-bp paired-end reads from each sample, followed by sequence quality control by FaQCs [81]. The metagenome assembly was performed using the IDBA-UD v1.1 [82]. CDSs in the metagenome contigs were predicted by prodigal v2.6 [83] with the -p meta option. A BLASTp search was performed against the database constructed using the predicted CDSs of the bacterial metagenome contigs (cutoffs of sequence identity ≥ 50%, reference coverage ≥ 80%, and e value ≤ 1 × e −4 ) to search for genes related to HRV-1 and HRVM-1 in the bacterial metagenomes. HRV-1 and HRVM-1-like genes were further searched in metagenome databases using the HMMER search engine (www.ebi.ac.uk/ metagenomics/) with an e-value cutoff of ≤ 1 × e −30 . The bacterial metagenome contigs found to be carrying either HRV-1 or HRVM-1 sequences were determined whether virus-related using the DeepVirFinder program [48] with default parameters.