Establishing clean surface-decontamination procedures with mock contaminants
In the field, no special procedures were used to avoid microbial contamination during ice core drilling, handling, and transport. Therefore, ice core surfaces likely contained microbial contaminants that impeded the identification of microbial communities archived in the ice [52, 55]. To develop a clean surface-decontamination procedure for removing possible microbial contaminants on the ice core surfaces and for collecting clean ice for microbial investigations, we constructed sterile artificial ice core sections and covered them with a known bacterium (Cellulophaga baltica strain 18, CBA 18), a known virus (Pseudoalteromonas phage PSA-HP1), and free DNA (from lambda phage), according to established protocols [52] (see “Materials and methods” and Fig. 1a). The decontamination procedure involved three sequential steps to remove a total of ~1.5 cm of the core radius, and the decontamination efficiency was evaluated (see “Materials and methods” and Fig. 1a).
The bacterial and viral contamination in each sample was quantified using strain-specific primers and qPCR (see “Materials and methods”). After processing with the surface-decontamination procedure described above (Fig. 1a and Additional file 2: Fig. S1), the contaminant bacteria and viruses were reduced by several orders of magnitude to background levels (Fig. 1b). Even with an extremely sensitive method (nested PCR), contaminant lambda phage DNA was not detected in the resulting inner ice (Fig. 1c). These results indicate that the decontamination procedure removed contaminants such as bacteria, viruses, and free DNA from the surface ice and left clean inner ice that was free of detectable contaminants for microbial and viral analysis. Earlier studies [51,52,53,54] laid the foundation for clean ice methods that remove microbial contaminants; here, we constructed different decontamination systems (e.g., different washing facilities with three sequential steps; Additional file 2: Fig. S1) and expanded the clean procedures to also decontaminate viruses from glacier ice core surfaces.
Decontamination method provides clean ice from glacier core sections
After we established that the surface-decontamination procedure removed surface contaminants, we used authentic ice core sections to further evaluate the procedure. Two sections (samples D13.3 and D13.5, from 13.34 to 13.50 and 13.50 to 13.67 m depth, respectively) obtained from a plateau shallow ice core (PS ice core) drilled in 1992 from the plateau of the Guliya ice cap (Fig. 2a, b, c) were decontaminated using the procedures described above (Fig. 1a). The ice removed during saw cutting and water washing steps (cut: saw-scraped ice; wash: H2O-washed ice), along with the inner ice (inner) for each section, was collected as described above (Fig. 1a). Microbial profiles of six samples (three samples, i.e., cut, wash, and inner, from each of the two ice sections) were examined using Illumina MiSeq 16S rRNA gene amplicon sequencing.
The 30 most abundant bacterial genera, each accounting for ≥0.5% of the sequences in at least one sample, comprised 94.7% of the total 72,000 sequences in the six samples (12,000 sequences per sample). These groups were designated as “major genera” and were selected to compare the microbial communities of all cut, wash, and inner samples for both ice sections (Additional file 2: Fig. S2A). Within each ice section, the most abundant genera were shared across the cut, wash, and inner samples (Additional file 2: Fig. S2A). For example, the 11 most abundant genera (i.e., an unclassified genus within Microbacteriaceae, an unclassified genus within Comamonadaceae, Flavobacterium, Hymenobacter, an unclassified genus within Sphingobacteriaceae, an unclassified genus within Sporichthyaceae, Polaromonas, an unclassified genus within Actinomycetales, Nocardioides, Janthinobacterium, and an unclassified genus within Rhizobiales; ordered by relative abundance) were represented in all three (i.e., inner, wash, and cut) D13.3 samples; these genera comprised 93.4%, 92.8%, and 89.1% of the microbial communities in the inner, wash, and cut samples, respectively (Additional file 2: Fig. S2A). In addition, results from a two-tailed paired t-test showed that the microbial communities did not change significantly across inner, wash, and cut samples of the same ice section (p values were 0.70–0.96 for all pairs of samples, i.e., cut versus wash, cut versus inner, and wash versus inner of each section). To further evaluate these results, we next compared the microbial communities at species level using the most abundant OTUs (n = 33), each of which accounted for ≥1.0% of the sequences in at least one sample. The summed relative abundance of these OTUs ranged from 71.6 to 78.6% in these samples (Additional file 1: Table S1). Similar to the comparisons at genus level, the inner, wash, and cut samples of the same ice section shared most of the most abundant OTUs (Additional file 2: Fig. S2B).
Specifically, 29 of 31 and 29 of 32 OTUs were shared between the inner and the other two removed ice samples (i.e., cut and wash) for D13.3 and D13.5, respectively. These comparisons at both genus and species levels suggest that the contaminants on the ice core surface were not abundant or diverse enough to alter the overall microbial community composition of glacier ice based on the most abundant microbial groups in these ice core sections. Notably, the PS ice core was drilled in 1992 using an electromechanical drill with no drilling fluid [56]; in general, the surfaces of such cores are less contaminated than those of ice cores extracted using a fluid in the borehole [55].
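The selection of abundant taxa by a relative-abundance cutoff, and the check for which of them are shared among the cut, wash, and inner samples, can be illustrated with a minimal Python sketch. The counts below are hypothetical toy values, the function names are ours, and "shared" is assumed here to mean detected (count > 0) in every sample; none of this is taken from the study's actual pipeline.

```python
from typing import Dict, List

# Hypothetical toy data: sample name -> {taxon: read count}.
Counts = Dict[str, Dict[str, int]]


def relative_abundance(sample: Dict[str, int]) -> Dict[str, float]:
    """Convert raw read counts for one sample into fractions that sum to 1."""
    total = sum(sample.values())
    return {taxon: n / total for taxon, n in sample.items()}


def major_taxa(counts: Counts, threshold: float) -> List[str]:
    """Return taxa reaching `threshold` (a fraction) in at least one sample."""
    keep = set()
    for sample in counts.values():
        for taxon, frac in relative_abundance(sample).items():
            if frac >= threshold:
                keep.add(taxon)
    return sorted(keep)


def shared_taxa(counts: Counts, taxa: List[str]) -> List[str]:
    """Return the subset of `taxa` detected (count > 0) in every sample."""
    return [t for t in taxa
            if all(sample.get(t, 0) > 0 for sample in counts.values())]


toy = {
    "cut": {"Flavobacterium": 600, "Polaromonas": 390, "Acinetobacter": 10},
    "wash": {"Flavobacterium": 550, "Polaromonas": 448, "Acinetobacter": 2},
    "inner": {"Flavobacterium": 620, "Polaromonas": 380, "Acinetobacter": 0},
}

major = major_taxa(toy, threshold=0.005)  # 0.5% cutoff, as used for genera
shared = shared_taxa(toy, major)
print(major)
print(shared)
```

In this toy case, the taxon that passes the cutoff but is missing from the inner ice is the kind of candidate that would be flagged as a possible surface contaminant.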
Several OTUs were unique to the removed samples, including one OTU belonging to the genus Acinetobacter for sample D13.3, as well as two OTUs within the genus Hymenobacter and one unclassified bacterial OTU for sample D13.5 (Additional file 1: Table S1). We posit that these OTUs (<1.0%) might be contaminants removed from the ice core surface. We also note that microbial communities may vary naturally across the same cross section of an ice core (represented here by cut, wash, and inner samples from the same depth), as uneven horizontal distribution of dust, nutrients, and microbes in an ice core is not unexpected and may reflect variation in deposition.
Microbial profiles potentially differ between the PS and S3 ice cores
Once a clean decontamination procedure was established with both artificial ice cores and authentic ice core sections, we investigated the microbial and viral communities of two ice cores from the Guliya ice cap (Fig. 2a, b, c, d). We first focused on microbial communities from five different depths (i.e., 13.3, 13.5, 24.1, 33.3, and 34.4 m) in the 1992 PS ice core, and compared them with the communities of three samples (i.e., D25, D41, and D49) from the 2015 summit core 3 (S3) (Fig. 2a, b, c, d). These three S3 samples were processed at the same time, and the 16S rRNA gene data for two of them (i.e., D41 and D49) were published previously to establish an in silico decontamination method [17] and are cited in this study to compare microbial communities across eight depths of two ice cores from the same glacier. Four background controls were co-processed with the glacier ice samples to trace the background microbial profiles, which were then proportionally removed in silico from amplicon data of the ice core samples (see “Materials and methods”), according to our previously published method [17].
After in silico decontamination, we compared the microbial community composition at genus level between and within ice cores. Reads were rarefied to 24,000 sequences in each sample, and collectively, the samples contained 254 bacterial genera, 118 of which were taxonomically identified to the genus level (Additional file 1: Table S2). The 26 most abundant genera, defined as those comprising at least 1.0% of sequences in at least one ice sample, represented >95.1% of each community (Fig. 3a). Bacterial genera including Janthinobacterium (relative abundance 1.0–23.8%), Polaromonas (2.6–4.1%), Flavobacterium (2.3–23.6%), and unknown genera within the families Comamonadaceae (15.5–24.3%) and Microbacteriaceae (7.1–48.5%) were abundant and present in all five PS samples (Fig. 3a). This indicates that members belonging to these lineages subsist over long periods of time in the environments before being frozen permanently, although their relative abundances vary across ice core depths (ages). These genera and families have also been reported as abundant groups in glacier ice cores by many previous studies (e.g., [4, 15, 17, 57,58,59]). The detection of bacterial sequences belonging to similar genera in ice core samples from different glaciers located around the world can be explained by the ubiquitous distribution of certain species in geographically distant environments [60]. The S3 and PS ice core samples shared some abundant genera, such as Janthinobacterium, Herminiimonas, and Flavobacterium (Fig. 3a); however, several abundant genera in the S3 samples were nearly absent in the PS samples, including Sphingomonas, Methylobacterium, and an unclassified genus in the family Methylobacteriaceae (Fig. 3a). Thus, there are potential differences in the microbial communities between the ice cores retrieved from the plateau (shallow part) and the summit of the Guliya ice cap.
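Rarefying reads to a common depth, as done above at 24,000 sequences per sample, amounts to subsampling each sample without replacement. The following minimal Python sketch shows the idea; the function name, the seed, and the toy counts are illustrative assumptions and do not reproduce the software used in the study.

```python
import random
from collections import Counter
from typing import Dict


def rarefy(counts: Dict[str, int], depth: int, seed: int = 0) -> Dict[str, int]:
    """Subsample `depth` reads without replacement from one sample's counts."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError("sample has fewer reads than the requested depth")
    # Expand counts into one label per read, then draw without replacement.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(reads, depth)))


# Hypothetical sample with 36,000 reads across three genera.
sample = {"genus_A": 30000, "genus_B": 5000, "genus_C": 1000}
rarefied = rarefy(sample, depth=24000)
print(sum(rarefied.values()))
```

Because every sample ends up with the same number of reads, relative abundances and richness can be compared across samples without a sequencing-depth bias, at the cost of discarding reads from deeper libraries.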
We next used Principal Coordinates Analysis (PCoA) to compare microbial community compositions at OTU (~species; 97% identity) level among all eight samples and found that the communities clustered primarily by ice core (Fig. 3b), separating along the first principal coordinate (which accounted for 68.2% of community variability; the second axis accounted for 13.4%). Analysis of similarities (ANOSIM) confirmed that the microbial communities of samples from the plateau core were significantly different from summit core samples (p = 0.02). Because of the differences in elevation-related factors such as wind power and temperature, the process from deposition to accumulation could differ between plateau and summit surfaces, which may further contribute to the variations in their microbial communities. In addition, all PS core samples were from the shallower part of the ice cap (top 34.5 m of the ~310-m thick ice field) [56] and were much younger than the three samples from the S3 core (~70–300 years versus ~355–14,400 years old; Additional file 1: Table S3), which were collected near the bottom of the summit ice core (~51-m length; Fig. 2). Therefore, the ice samples from the two different ice cores represent very different climate conditions at the time of deposition. This is further illustrated by variations in several environmental parameters (e.g., concentration of insoluble dust and ions such as sulfate and sodium) measured in the two ice cores (Additional file 1: Table S3). To further identify the environmental parameters potentially influencing these microbial communities, two-tailed Mantel tests were performed to examine the relationships between environmental properties (Additional file 1: Table S3) and microbial community compositions.
Parameters including elevation, ice age, and concentrations of dust, chloride, sulfate, and sodium correlated significantly (p ≤ 0.05) with microbial community compositions (Additional file 1: Table S4). This further supports the above discussion of the potential differences between the microbial communities of the two ice cores, and is consistent with many previous reports that the microbial communities archived in glacier ice often reflect differences in physicochemical parameters such as dust concentration [10,11,12] and some ion concentrations [13, 14]. The significant correlations between microbial community compositions and environmental parameters of ice samples indicate that the ice core microbial communities may reflect climate conditions at the time they were deposited. We note that other factors might also influence the microbial communities, such as the deposition-to-accumulation process discussed above and potential post-deposition microbial activity on glacier surfaces.
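A two-tailed Mantel test of the kind mentioned above correlates the pairwise distances in two matrices (for example, community dissimilarity and the difference in an environmental parameter) and assesses significance by permuting sample labels. The sketch below is a generic pure-Python illustration under our own assumptions (Pearson correlation, 999 permutations, toy distances); it is not the implementation used in the study.

```python
import math
import random
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def upper_triangle(m: Matrix, order: Sequence[int]) -> List[float]:
    """Flatten entries above the diagonal after reordering rows and columns."""
    n = len(order)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(d1: Matrix, d2: Matrix, permutations: int = 999,
           seed: int = 0) -> Tuple[float, float]:
    """Return (r, two-tailed permutation p value) for two distance matrices."""
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    observed = pearson(x, upper_triangle(d2, identity))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute sample labels of the second matrix
        if abs(pearson(x, upper_triangle(d2, order))) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the observed value is counted.
    return observed, (extreme + 1) / (permutations + 1)


# Toy example: eight hypothetical samples placed along one gradient.
positions = [0.0, 1.0, 3.0, 7.0, 12.0, 13.0, 20.0, 22.0]
dist = [[abs(a - b) for b in positions] for a in positions]
r, p = mantel(dist, dist)
print(r, p)
```

With only eight samples, as in the comparison above, the number of distinct permutations is limited, so the smallest attainable p value is bounded and results are best read alongside the ordination.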
Ice-archived viruses
We focused on the virus communities in two ice samples (D25 and D49) from the S3 ice core. The samples were selected based on their difference in ice age (~355 versus ~14,400 years old), climate conditions (colder versus warmer based on the δ¹⁸O data, not shown), and dust concentrations, which are up to 10 times higher in the D49 sample (Additional file 1: Table S3). Viruses were concentrated from 0.22-μm-pore-sized filtrate, which excluded intracellular viruses including temperate viruses [61], and then treated with DNase to remove free DNA. Counts of VLPs in the two samples were below the detection limit using a wet-mount method (<10⁶ VLPs ml⁻¹ [62]). Thus, we applied the low-input quantitative viral metagenomic sequencing that was previously established to study seawater viral communities [46, 47, 63, 64] to the viral concentrates in our low-biomass glacier ice samples. After sequencing, quality control, and de novo assembly, we obtained 1849 contigs with a length of ≥10 kb (Additional file 1: Table S5). Overall, VirSorter predicted 43 “confident” viral contigs (≥10 kb in size and categories 1, 2, 4, or 5; Additional file 1: Table S5 [65]), which were grouped into 33 vOTUs (viral OTUs) using currently accepted cutoffs that approximate species-level taxonomy [35, 50, 66]. This is a small number of viral species compared to well-studied and relatively easy-to-process sample types (e.g., global ocean samples [35, 37, 66]), and may not represent the entirety of dsDNA viral diversity in the glacier ice environments. However, it is on par with recent reports in other more challenging systems such as soils where, for example, 1.4% of assembled contigs were predicted as “confident” viruses and 53 long (≥10 kb) viral genome fragments were recovered from eight viromes [67]. On average, 1.4% (2.2 and 0.6% for D25 and D49, respectively) of the quality-controlled reads were recruited to these vOTUs (Additional file 1: Table S5).
A low percentage of reads recruited to predicted viral sequences is not unusual for low-input viromes and is consistent with previous studies of more diverse communities (e.g., as low as 0.98% [35, 67]).
While previous studies have detected tomato mosaic tobamovirus RNA and estimated VLP concentrations in ancient glacier ice [3, 27], this is the first report of viral genome fragments assembled de novo from such an environment. Rarefaction curves were constructed (see “Materials and methods”) and showed that both viromes approached saturation of long vOTUs (≥10 kb) at the sequencing depth used in this study (Additional file 2: Fig. S3), though we note that this analysis may underestimate the total viral diversity in these samples because (i) these rarefaction curves missed any potential virus whose genome was not extracted, sequenced, or assembled from the samples, and (ii) low-input libraries have to be PCR-amplified prior to sequencing (15 PCR cycles in this study), and this can underestimate the total diversity within a library due to PCR duplicates and skew the shape of rarefaction curves [68].
Ice viral communities consist of mostly novel genera and differ between depths
With 33 vOTUs (length ≥10 kb) obtained from the two S3 ice samples, we then evaluated how viruses in this unexplored extreme environment compared to known viruses. Because viruses lack a single, universally shared gene, taxonomies of new viruses are now commonly established using gene-sharing analysis from viral sequences [69]. In our dataset, that meant comparing shared gene sets from 33 vOTUs with genomes from 2304 known viruses in the NCBI RefSeq database (version 85; Additional file 1: Table S6) using vConTACT version 2 [69]. Such gene-sharing analyses produce viral clusters (VCs), which represent approximately genus-level taxonomic assignments [37, 69, 70]. Of the 33 vOTUs, four were clustered into four separate VCs containing RefSeq viral genomes, two formed a VC with only ice vOTUs, and the other 27 vOTUs remained isolated as singletons or outlier vOTUs (Fig. 4a; Additional file 1: Table S6). Therefore, only four vOTUs (12%) could be assigned a formal taxonomy: they belonged to four different genera in the families Siphoviridae (three genera) and Myoviridae (one genus) within the order Caudovirales (Additional file 1: Table S6). These taxonomic results indicate that glacier ice has a diversity of unique viruses, consistent with, but much higher than, other environmental studies in oceans (52% unique genera) [37] and soils (61% unique genera) [71].
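Gene-sharing approaches ask whether two genomes have more protein clusters in common than expected by chance. One common way to express this is a hypergeometric tail probability, sketched below in Python. This is an illustrative stand-in under our own assumptions (function name, toy numbers) and should not be read as the exact scoring used by vConTACT version 2.

```python
from math import comb


def prob_sharing_at_least(c: int, a: int, b: int, n: int) -> float:
    """Probability that two genomes share at least `c` protein clusters.

    Assumes genome A carries `a` clusters, genome B carries `b` clusters,
    and B's clusters are drawn at random from `n` clusters in total.
    """
    upper = min(a, b)
    favourable = sum(comb(a, i) * comb(n - a, b - i)
                     for i in range(c, upper + 1))
    return favourable / comb(n, b)


# Toy example: two genomes with 3 clusters each out of 10 possible clusters.
p_all_shared = prob_sharing_at_least(3, a=3, b=3, n=10)
p_any_overlap = prob_sharing_at_least(1, a=3, b=3, n=10)
print(p_all_shared, p_any_overlap)
```

A small probability indicates more sharing than random expectation; genome pairs with such significant sharing would then be linked in a network and grouped into clusters, while genomes with no significant links remain singletons.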
We then explored the environmental distribution of these 33 glacier viruses by recruiting metagenomic reads from a range of environments including global ocean [66], Arctic sea ice and ancient permafrost brine (cryopeg) [42], soils [72, 73], lakes [74, 75], deserts [76,77,78,79], air [80, 81], cryoconite [40], and Greenland ice sheet [40] (225 metagenomes total). None of our 33 glacier vOTUs was detected in any of the tested metagenomes, indicating that the glacier ice archived unique viral communities compared to other environments, at least based on the viral populations recovered here. This may be because the glacier viruses were “frozen” several thousands of years ago, because these ancient glacier viruses are distinct from the viruses in modern environments that have probably been evolving for a long time, or because these preserved glacier viruses were not transported from the regions where the tested metagenomes were sampled. Unfortunately, the lack of viromes from ancient glacier ice limits worldwide glacier habitat analyses. However, it is promising that the “black box” of archived ancient viruses in glacier ice can now be gradually opened as the technologies to generate and study clean and low-biomass viromes, including a modern viromic toolkit [36], are becoming available [46, 47, 63, 64].
Next, we looked more closely at the vOTU (~species) level to compare viral communities obtained from the archive of two depths of the S3 ice core. With standard read-mapping to 33 vOTUs (see “Materials and methods”), we found that the glacier ice from the two depths contained a mix of shared and depth-unique vOTUs (Fig. 4b; Additional file 1: Table S7). A mix of shared and depth-unique microbes was also observed for these samples (Fig. 3a; Additional file 1: Table S2). Previous studies have also reported different microbial community structures in ice samples collected from different depths of the same ice core, which probably reflects differences in the environmental conditions at the time the ice was deposited [11, 82]. Interestingly, three vOTUs were abundant (relative abundance >10%) among the recovered vOTUs in both depths: D49_170_39214, D49_576_17121, and D25_155_24088 (vOTU names, Fig. 4b; Additional file 1: Table S7). This suggests that these viruses may be active in these ice cores or that a large number of virus particles were initially deposited so that a sufficient amount was still intact for DNA extraction and sequencing after being frozen for potentially 15,000 years.
Glacier ice viruses are predicted to infect dominant glacier ice microbes
Microbial analysis found that both the D25 and D49 samples were dominated by the bacterial genus Methylobacterium, an unclassified genus within the family Methylobacteriaceae, and genus Sphingomonas, with relative abundances of 18.2–67.5%, 5.0–8.3%, and 1.4–75.3%, respectively (Fig. 3a). In addition, the genera Janthinobacterium (7.1%) and Herminiimonas (6.6%) were also abundant in D25, but were absent or rare (<0.01%) in D49 (Fig. 3a). All of these genera are common abundant microbial groups in glaciers [4, 15, 17, 57,58,59]. In addition, many members belonging to these genera are psychrophilic bacteria and have been revived and isolated from glacier ice, such as Sphingomonas glacialis C16y, Sphingomonas sp. V1, Methylobacterium sp. V23, Janthinobacterium svalbardensis JA-1, and Herminiimonas glaciei UMB49 [18, 57, 83,84,85]. These results indicate that the ice serves as an archive for abundant taxa that are likely equipped with genomic adaptations to cold conditions and might revive and be introduced into ecosystems after the glaciers melt in the future.
We then explored the potential impacts of viruses on these abundant microbes by linking viruses to their hosts in silico. Hosts for the 33 vOTUs were predicted using three in silico methods: similarities in viral and bacterial nucleotide sequences [37, 86], sequence composition [87], and CRISPR spacer matches [37]. The sequence similarity method (Blastn) predicted hosts for 14 of the 33 vOTUs (Additional file 1: Table S8), whereas the sequence composition method (VirHostMatcher) linked nine vOTUs to microbial hosts (Additional file 1: Table S9; see “Materials and methods”). The CRISPR method matched hosts for two vOTUs (Additional file 1: Table S10), one of which was also linked to the same host at genus level by the sequence similarity method; neither was matched by the sequence composition method (Additional file 1: Tables S7, S8 & S9). Although only about half (18 of 33) of the vOTUs were linked to a host by at least one of the three methods, these host predictions indicate that viruses in glacier ice were infectious to microbes at some time (whether before and/or after ice formation) in these extremely cold and high-elevation environments, and that they probably played an important role in modulating microbial communities.
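Because the three methods overlap, the number of vOTUs linked to a host by at least one method is the size of the union of the per-method sets, not the sum of their sizes. The minimal Python sketch below shows this bookkeeping; the vOTU identifiers and method keys are hypothetical placeholders, not the study's data.

```python
from typing import Dict, List, Set

# Hypothetical vOTU identifiers per prediction method, for illustration only.
predictions: Dict[str, Set[str]] = {
    "sequence_similarity": {"vOTU_01", "vOTU_02", "vOTU_03"},
    "sequence_composition": {"vOTU_03", "vOTU_04"},
    "crispr_spacer": {"vOTU_01"},
}


def linked_by_any(preds: Dict[str, Set[str]]) -> Set[str]:
    """vOTUs with a host predicted by at least one method (set union)."""
    return set().union(*preds.values())


def support(preds: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each linked vOTU to the sorted list of methods that predicted a host."""
    return {votu: sorted(m for m, hits in preds.items() if votu in hits)
            for votu in sorted(linked_by_any(preds))}


linked = linked_by_any(predictions)
by_method = support(predictions)
print(len(linked))
print(by_method)
```

Keeping track of which methods support each link also makes it easy to separate predictions backed by several lines of evidence from those resting on a single method.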
The predicted host genera that were most abundant in the same ice cores included Methylobacterium, Sphingomonas, and Janthinobacterium (Fig. 3a; Additional file 1: Table S2). Many members of these genera are psychrophilic bacteria as mentioned above. The relative abundance of Methylobacterium-associated vOTUs was high in both D25 (67.5%) and D49 (18.2%), which was consistent with the dominance (48.2% and 44.0%, respectively) of this bacterial genus in the microbial communities of both samples (Fig. 4c, d). Similarly, Janthinobacterium-linked viruses were detected with a high relative abundance of 7.1% in the D25 sample, where microbial community was found to be dominated by the genus Janthinobacterium with 4.5% relative abundance (Fig. 4e); Sphingomonas-associated viruses represented 3.1% of communities in the D49 sample, while members of Sphingomonas accounted for 75.3% of the microbial profiles in this sample (Fig. 4f). The relatively high abundance of these genera and their associated viruses suggests that the recovered viruses infected abundant microbial groups and thus might play a major role in this extreme ecosystem by influencing their hosts when they are active, although it is still uncertain when the infections occurred. Notably, no host could be predicted for about half of the vOTUs, partly due to the limitations of available reference databases and techniques used for host prediction [86]. As methods improve and host databases expand (e.g., Genome Taxonomy Database [88] and metagenome-assembled genomes from glacier ice), continued studies will likely provide more complete understanding of the relationship between viruses and their microbial hosts in the ice cores.
Temperate viruses likely dominate glacier ice environment
Having investigated virus-host pairs, we then explored the lifestyle (i.e., temperate or virulent) of the 33 vOTUs we were able to recover here. Interestingly, 14 (42.4%) vOTUs were identified as putative temperate viruses (see “Materials and methods”; Additional file 1: Table S11). Though this is a small dataset, the percentage of identifiably temperate phages in glacier ice was 3.2-, 8.4-, and 14.1-fold higher than that in gut (13% [89]), soil (5% [67, 71]), and marine (3% [66]) viruses, respectively, detected by the same method. Several features of glacier ice habitats may explain such a high percentage of temperate phages. Glacier ice is an extreme habitat for microbes and viruses with low temperature, high UV, and low nutrient concentration, in which microbes are usually under poor growth conditions, and microbial density is very low (i.e., 10²–10⁴ cells ml⁻¹ [4]) compared to most other environments (e.g., seawater contains 10⁴–10⁶ cells ml⁻¹ [7]). Previous reports highlighted how the frequency of temperate viruses is influenced by environmental conditions (reviewed in [39, 90]) and that temperate viruses tend to be more abundant than virulent viruses under extreme conditions of low temperature [91, 92], high latitude [93], low nutrients [94], and low host concentrations [95]. We hypothesize that, similar to other extreme and low-nutrient environments, temperate phages are selected for and favored before being frozen in glacier ice. Mechanistically, this selection process likely happened on the glacier ice surface, as microbes on the surface snow of the glacier are exposed to nutrients, light, and possible melt water when temperature is high in the summer, and they may still be active and undergo selection on glacier surfaces (reviewed in [9]). This process may lead to substantial size fluctuation of microbial populations and bottleneck events, which have been shown to favor temperate viruses [90, 96].
Overall, our data suggest that temperate phages likely dominate the glacier ice environment and highlight the importance of specifically targeting these viruses (e.g., intracellular viruses) in future studies of viruses archived in glacier ice.
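The fold differences quoted above follow directly from the reported percentages. The short Python check below recomputes them; the variable names are ours, and the use of truncation to one decimal is an assumption made because it reproduces the reported values (rounding would give slightly different figures for gut and soil).

```python
import math

temperate, total = 14, 33
glacier_pct = 100 * temperate / total  # about 42.4%

# Percentages of temperate viruses reported for other environments in the text.
reference_pct = {"gut": 13, "soil": 5, "marine": 3}


def truncate(x: float, decimals: int = 1) -> float:
    """Cut a positive number off after `decimals` decimal places (no rounding)."""
    factor = 10 ** decimals
    return math.floor(x * factor) / factor


fold = {env: truncate(glacier_pct / pct) for env, pct in reference_pct.items()}
print(round(glacier_pct, 1))
print(fold)
```

With only 33 vOTUs, each additional temperate virus shifts the glacier percentage by about three points, so these ratios are best read as indicative rather than precise.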
Insights into the gene content and genome organization of viruses infecting Methylobacterium
Microbial analyses and viral host predictions found that both microbial members within the genus Methylobacterium and their associated viruses were abundant in the two studied glacier ice samples. Members of the genus Methylobacterium were reported to dominate the microbial community in ancient ice cores from many previous studies (e.g., [4, 12, 16, 25]) including several microbial investigations of the Guliya ice cap ice cores using culture-dependent methods about two decades ago [18, 23, 57], and they are widely distributed in natural environments. For example, the genus Methylobacterium contains 47 validly published isolates at the time of writing (https://www.bacterio.net/genus/methylobacterium) from environments including air, aquatic sediments, fermented products, freshwater, plants, and soil (summarized in [97]). The broad distribution indicates their ability to live in a wide range of environments. The viruses infecting Methylobacterium may also have significant ecological roles, so next we evaluated the environmental distribution of viruses infecting Methylobacterium and the genome features of Methylobacterium-linked glacier viruses and their closely related viruses from other environments.
Methylobacterium-associated viruses were obtained from environmental viromes including global oceans [35], Arctic sea ice and ancient permafrost brine (cryopeg) [42], soils [72, 73], lakes [74, 75], deserts [76,77,78,79], air [80, 81], cryoconite [40], and Greenland ice sheet [40], by the same method as for glacier-ice viruses. In addition, prophages were extracted from 131 Methylobacterium genomes from the RefSeq database (release v99). Only six Methylobacterium viruses were obtained from the environmental metagenomes, including three from global oceans [35], two from lake water [75], and one from a desert salt pan [77], while 478 prophages were detected from 127 out of 131 Methylobacterium genomes that were from diverse environments such as plant, soil, freshwater lake, drinking water, ocean water, salt lake, air, and ice (Additional file 1: Table S12).
A genome content–based network was built to evaluate the relationship of five glacier-ice viruses with 484 viruses from other environments, all predicted to infect Methylobacterium (Fig. 5a). In the network, two glacier viruses (D25_155_13915 and D49_576_17121) were separate from any other viruses (i.e., they were singletons), whereas the other three glacier viruses formed three VCs with eight prophages (i.e., VC0_0, VC8_0, and VC11_0; assessed with confidence scores by vConTACT v2 [69]). The vOTU D49_418_13568 was associated with viruses from air and drinking water (VC11_0), vOTU D49_170_39214 (VC8_0) was clustered with viruses from plants, while D25_14_65719 (VC0_0) was clustered with plant, air, and soil viruses (Fig. 5a and Additional file 1: Table S12). Notably, most of the associated prophages within the three VCs were from plant, soil, or air, which might be the habitats from which the glacier Methylobacterium hosts and viruses originated.
We next evaluated the genome content and organization of the above-clustered Methylobacterium viruses using two glacier-ice viruses and six prophages that were longer than 15 kb (Fig. 5b, c). The glacier viruses shared a similar genomic content and arrangement with the prophages in the same VC, especially for the phage structure genes including the portal, capsid, tail, and baseplate genes (Fig. 5b, c). Notably, all these viruses contained two copies of Mu N genes that were located near the tail and baseplate genes (Fig. 5b, c and Additional file 1: Table S13). The N gene product (i.e., DNA circularization protein) has been reported as a multifunctional protein that is injected into the host cell along with the infecting phage DNA and is involved in tail assembly, as well as the protection and circularization of the infecting DNA [98,99,100]. Phylogenetic analysis of the 16 N genes (two copies in each of eight viruses) showed that these genes formed two clusters, and each cluster included one of the two copies of N genes from all eight Methylobacterium viruses (Additional file 2: Fig. S4). These results indicate that the two copies of N genes likely evolved independently in the same virus, though this remains unclear given the limited information presented in this study. In agreement with the genome-based network analysis, the viruses from the same VC clustered together based on either copy of the N genes (Fig. 5a; Additional file 2: Fig. S4), indicating strong conservation of N genes in the Methylobacterium viruses.
Taken together, the viruses infecting Methylobacterium appear to be abundant in the glacier ice and are related to viruses infecting Methylobacterium strains in plant and soil habitats. This is consistent with a previous report that the main source of dust deposited on Guliya ice cap likely originates from the soils [101]. This points to a potential long-standing association between phages and their host in the Methylobacterium genus, possibly over more than tens of thousands of years, and highlights how some bacteria and phages can seemingly stably coexist in the environment, as argued in other studies (e.g., [102, 103]).
Glacier ice viruses unravel novel auxiliary metabolic genes (AMGs) potentially influencing host chemotaxis
Virus-encoded auxiliary metabolic genes (AMGs) are microbial-derived genes that can modulate host metabolism during infection and have been reported in viruses from diverse ecosystems such as marine water [37], soil [67, 71], animal hosts (e.g., rumen [104]), and some extreme environments (e.g., Arctic cryopeg brine and sea ice [42]). Here, we begin to explore the AMGs of viruses archived in glacier ice. Briefly, 1466 predicted genes from the 33 vOTUs (length ≥ 10 kb) were queried against functional databases by DRAM-v (see “Materials and methods”), which resulted in about half of the genes (n = 779) matching annotated sequences in KEGG or PFAM databases (Additional file 1: Table S13). These annotations will potentially make the dataset a valuable public resource of ancient viral genes.
Four putative AMGs were identified from these annotated genes (Additional file 1: Table S14). Two of them were previously reported, including concanavalin A-like lectin/glucanases superfamily and sulfotransferase [37, 71]. The former one was associated with virus-encoded glycoside hydrolase that was potentially involved in pectin cleavage, thus, further potentially facilitating microbial carbon degradation and utilization through cleaving polymers into monomers and influencing the carbon cycling [71]. The later one was associated with sulfation that contributes to the transfer reaction of the sulfate group from the donor (e.g., 3′-phosphoadenosine 5′-phosphosulfate) to an acceptor that can be a number of substrates, and can potentially play a key role in biological processes such as cell communication and growth [105]. The other two AMGs, motA and motB, that were potentially relevant to cell flagella assembly (Additional file 1: Table S14), were never reported previously as AMGs in viral contigs, though our screening of 848,507 viral contigs in the Global Ocean Viromes 2.0 dataset (GOV 2.0 [66]) identified motA or motB genes from 70 high-quality viral contigs in 52 viromes including 23, 15, and 14 viromes from surface water, mesopelagic water layers, and deep chlorophyll maximum layers, respectively (Additional file 1: Table S15), indicating their broad distribution in the ocean environment. These AMGs can potentially offer new insights into how viruses manipulate microbial metabolisms when they might have been active ~14,400 years ago before being frozen. Here, we further focused on the two novel AMGs and discussed how they potentially influence the metabolisms of microbial hosts in glacier ice. These two novel genes were motility genes (motA and motB) from the same vOTU D25_22_20338 (Additional file 1: Table S14; Additional file 2: Fig. S5a).
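As an illustration of how hits from such a screening could be tallied per depth layer, the following is a minimal Python sketch. It is not the pipeline used in this study: the record layout, the function name `summarize_hits`, and all contig and virome identifiers are hypothetical and were invented for this example.

```python
from collections import defaultdict

# Hypothetical hit records: (contig_id, virome_id, depth_layer, gene).
# Every identifier below is invented for illustration only.
hits = [
    ("contig_001", "virome_A", "surface", "motA"),
    ("contig_001", "virome_A", "surface", "motB"),
    ("contig_002", "virome_A", "surface", "motB"),
    ("contig_003", "virome_B", "mesopelagic", "motA"),
    ("contig_004", "virome_C", "deep_chlorophyll_maximum", "motB"),
    ("contig_005", "virome_D", "surface", "motA"),
]


def summarize_hits(records, target_genes=("motA", "motB")):
    """Count distinct contigs carrying a target gene and distinct viromes per layer."""
    contigs = set()
    viromes_by_layer = defaultdict(set)
    for contig_id, virome_id, layer, gene in records:
        if gene not in target_genes:
            continue
        contigs.add(contig_id)
        # A virome is counted once per layer, however many hits it contains.
        viromes_by_layer[layer].add(virome_id)
    layer_counts = {layer: len(ids) for layer, ids in viromes_by_layer.items()}
    return len(contigs), layer_counts


n_contigs, layer_counts = summarize_hits(hits)
print(n_contigs, layer_counts)
```

Because each virome belongs to a single depth layer, the per-layer virome counts sum to the total number of viromes with hits, which is the same consistency that holds for the reported values (23 + 15 + 14 = 52).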
Fueled by ion flow, bacterial flagella are turned by rotary motors that consist of a stator and a rotor [106]. Analyses of AMGs in glacier-ice viruses revealed that the vOTU D25_22_20338 encoded two membrane-embedded proteins, MotA and MotB (Additional file 2: Fig. S5a), which compose the stator of a flagellar motor. In bacteria, MotA/MotB protein complexes conduct protons across the membrane, thereby harnessing the proton motive force as the energy source to rotate the rotor [107]. Chemotaxis plays a central role in controlling the rotational direction of flagellar motors, which allows bacteria to respond to environmental stimuli [108]. Considering the harsh, nutrient-deficient environment of glacier ice [109], we speculate that viruses potentially hijacked these motility genes (i.e., motA and motB) to facilitate nutrient acquisition by their hosts.
We then explored the functionality and evolution of the two novel AMGs (i.e., motA and motB). The protein sequences of the two novel AMGs were structurally modeled using Phyre2 [110], and the results showed that both had 100% confidence scores linking them to their closest template proteins (Additional file 2: Fig. S5b, c). MotB uses a conserved peptidoglycan-binding motif to anchor the stator complex to the peptidoglycan layer around the rotor [111], and this motif was identified in the virus-encoded MotB (Additional file 2: Fig. S5e). Though MotA lacks a conserved motif (Additional file 2: Fig. S5d), it functions in a complex with MotB, with which it is co-transcribed and translated [112]. Together, these in silico analyses suggest that these AMGs are likely functional. Evolutionarily, both AMGs branched deeply and were separated from all clades of their closest microbial homologs (Additional file 2: Fig. S6a, b). These phylogenetic results prevented us from identifying potential horizontal gene transfer events of these AMGs from hosts to viruses, but they suggest that the genes found in the ancient glacier-ice viruses recovered in this study are very distinct from known microbial sequences in modern environments.
In summary, these findings about AMGs can potentially provide a glimpse into how glacier-ice viruses in the Guliya ice cap manipulated host metabolism, and hence likely affected biogeochemical cycles, when they were active before being frozen. We note that all of these speculations are based on in silico analyses; future experiments are necessary to validate the activity and function of these potential virus-encoded proteins.
Many studies have demonstrated microbial activity on glacier surfaces, especially in cryoconite holes in summer (e.g., [113, 114]), including on glaciers from the Tibetan Plateau region [9, 82]. However, surface activity may vary among glaciers that differ in location, elevation, radiation, and surface temperature. The Guliya ice cap is located at a middle latitude (35.25°N; 81.48°E) on the Tibetan Plateau, and its summit elevation is about 6710 m above sea level. The surface temperature at the summit is below the freezing point of water (0°C) for most of the year, while in summer it can approach or exceed 0°C for short periods under strong sunlight; this likely produces some meltwater on the glacier surface, as supported by the observation of melt layers (i.e., clean and transparent ice) in the ice core (data not shown). Therefore, there was likely microbial activity on the surface of the Guliya ice cap before microbes were “permanently” frozen. In addition to the glacier surface, some studies have hinted at the possibility of microbial activity in frozen glacier ice based on the detection of excess gases (e.g., CO2, CH4, and N2O) at some depths, which may be produced by post-depositional microbial metabolism [24,25,26]. However, without direct observational measurements, it remains controversial whether there is in situ microbial activity in glacier ice after it has frozen. We anticipate that future studies could better characterize the potential microbial activity in glacier environments, including the surface and englacial ice (i.e., after being frozen). Here, we propose next-step experiments to explore the “activity” questions described above.
Ideally, in the field, we could sample time-series snow before deposition (i.e., from the air) and after deposition (i.e., from different depths of the glacier ice surface) and compare the microbial communities of matched snow samples from before and after deposition. The results of this comparison would help us understand whether there is activity and how communities change on glacier surfaces. In addition, some specific microbial groups (e.g., Cyanobacteria and Chloroflexia) may be used as indicators of surface growth, as they need light to grow and may “bloom” on the glacier surface [82]. In the lab, microbial activity in glacier ice could be measured using the BONCAT-FACS method [115] by comparing the potential changes in the microbial communities of sample replicates after incubation at various temperatures (< 0°C) and for various durations.
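One simple way to quantify the proposed before-and-after comparison is a community dissimilarity index. The following is a minimal Python sketch using Bray-Curtis dissimilarity, which is our choice for illustration and is not specified in the experiments proposed above. The function name, the taxon labels, and all read counts are hypothetical.

```python
def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples given as {taxon: count} dicts.

    Returns 0.0 for identical communities and 1.0 for communities sharing no taxa.
    """
    taxa = set(counts_a) | set(counts_b)
    total = sum(counts_a.get(t, 0) + counts_b.get(t, 0) for t in taxa)
    if total == 0:
        raise ValueError("both samples are empty")
    diff = sum(abs(counts_a.get(t, 0) - counts_b.get(t, 0)) for t in taxa)
    return diff / total


# Invented counts for one matched pair of snow samples, each with 100 reads.
before_deposition = {"Methylobacterium": 40, "Cyanobacteria": 10, "Other": 50}
after_deposition = {"Methylobacterium": 30, "Cyanobacteria": 40, "Other": 30}

dissimilarity = bray_curtis(before_deposition, after_deposition)
print(f"Bray-Curtis dissimilarity: {dissimilarity:.2f}")
```

In practice, samples would need to be rarefied or otherwise normalized to comparable sequencing depth before such a comparison, and a shift in light-dependent groups such as Cyanobacteria would be only suggestive of surface growth without independent activity measurements.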