Overview of Ace Lake metagenomes and Synechococcus-like MAGs
A total of 120 time-series Ace Lake metagenomes from summer (Jan, Feb, Dec), winter (Jul, Aug) and spring (Oct, Nov), sampled from seven lake depths (surface, oxic 1, 2 and 3, interface, and anoxic 1, 2 and 3) and 11 time periods (spanning 2006, 2008 and 2013–2015), were used for analyses (Fig. 1a; Additional file 1: Table S1). For fragment recruitment (FR) analysis, 60 metagenomes (302 Gb), in which Synechococcus-like OTU relative abundance was ≥ 1% [25], were used to generate 30 merged metagenomes by pooling reads from 3–20 and 0.8–3 μm filter metagenomes that represented specific depths and time periods (Additional file 1: Table S1). For viral analysis, 39,287 Ace Lake viral contigs (724 Mb) from the Antarctic virus catalogue [25, 44] were used to identify potential viruses associated with the Synechococcus-like species.
A total of 59 high- or medium-quality MAGs generated from Ace Lake metagenomes (one MAG per metagenome) were analysed (Additional file 2: Dataset S1), of which 25 MAGs were ≥ 99% complete. IMG (Integrated Microbial Genomes) taxonomy classified all the MAGs as SynAce01. Together, the Synechococcus-like MAGs consisted of 6681 contigs that encoded 176,198 genes. For FR analysis, two Synechococcus-like MAGs were used: one from the Jul 2014, 5 m depth (oxic 1), 3–20 μm filter metagenome (MAG-AL1), and one from the Dec 2014, 14 m depth (anoxic 1), 3–20 μm filter metagenome (MAG-AL2). MAG-AL1 contained 64 contigs and 2929 genes, was 99.7% complete (2,644,322 bp) with 0.09% contamination, and was selected for its high bin completeness and lowest bin contamination. MAG-AL2 contained 120 contigs and 2956 genes, was 97% complete (2,654,228 bp) with 0.63% contamination, and was selected because it contained a distinct 16S rRNA gene sequence (Additional file 1: Table S2; Additional file 2: Dataset S1).
Synechococcus-like species phylotypes in Ace Lake
A total of 18 full-length (1489 bp) 16S rRNA genes were identified in Synechococcus-like MAGs; they were identical except for the MAG-AL2 gene which distinguished it as a separate phylotype from MAG-AL1 (and all other MAGs) by having two SNPs: 217 A-T and 231 G-T (Fig. 2; Additional file 1: Fig. S1a). By recruiting reads from the merged metagenomes to the two reference MAGs (MAG-AL1 and MAG-AL2), the two SNPs at positions 217 and 231 (i.e. the MAG-AL2 transversions) were determined to be present in all Ace Lake merged metagenomes (Additional file 1: Table S3).
The IMG genome of SynAce01 [30] contains two full-length 16S rRNA genes. MAG-AL1 and MAG-AL2 each contain one full-length 16S rRNA gene, and MAG-AL2 contains an additional incomplete 16S rRNA gene (see supplementary text). The ratio between 16S rRNA SNPs median read depth (from 100% identity FR) and the read depth of its respective MAG was ~ 2 (Additional file 1: Fig. S2), indicating that each MAG contained two very similar 16S rRNA genes, similar to SynAce01. By comparison, the read depth ratio for Ca. Chlorobium antarcticum, which contains one 16S rRNA gene [29], was ~ 1 (Additional file 1: Fig. S2).
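The copy-number inference above reduces to a ratio of read depths. The following is a minimal Python sketch of that arithmetic, not the authors' pipeline: the function name and the read depth values are hypothetical and chosen only for illustration.

```python
from statistics import median


def copy_number_ratio(marker_depths, genome_depth):
    """Ratio of the median read depth at marker positions to the genome read depth."""
    if genome_depth <= 0:
        raise ValueError("genome_depth must be positive")
    return median(marker_depths) / genome_depth


# Hypothetical values for illustration only (not taken from the study).
snp_depths = [410, 398, 402]  # read depths at the 16S rRNA SNP positions
mag_depth = 200               # read depth of the MAG in the same metagenome

ratio = copy_number_ratio(snp_depths, mag_depth)
estimated_copies = round(ratio)  # a ratio near 2 suggests two gene copies
```

A ratio near 1 would instead point to a single 16S rRNA gene, as reported for Ca. Chlorobium antarcticum.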
The two 16S rRNAs from SynAce01 are not identical. Both genes had 217 T (as for MAG-AL2), while ‘gene 1’ had 231 G (as for MAG-AL1) and ‘gene 2’ had a base missing at position 231 (possibly an assembly error) (Fig. 2; Additional file 1: Fig. S1a). The read depths for the two SNP markers for gene 1 were different (Additional file 1: Fig. S3f), and only one read in all the metagenome data matched both SNP markers of gene 1. Due to the difficulties of isolating an axenic and clonal strain (see above and Ref. [23]), it is possible that the SynAce01 genome represents two or more closely related Synechococcus-like strains. In support of this, FR analyses of the original NCBI SRA SynAce01 reads to SynAce01 16S rRNA genes revealed read sequences with either 217 A plus 231 G (as for MAG-AL1) or 217 T plus 231 T (as for MAG-AL2). For all these reasons, the SynAce01 genome was not used for assessments of population variation (further discussion is provided in supplementary text).
The SNP markers (217 A-T plus 231 G-T) were used to evaluate the contributions of the MAG-AL1 and MAG-AL2 phylotypes to the total Synechococcus-like species population (Fig. 3a; Additional file 1: Table S4). The relative contribution of the two phylotypes varied with depth and season. In all metagenomes, the MAG-AL1 phylotype contributed the most to the Synechococcus-like species population, with highest representation in the oxic zone. The MAG-AL2 phylotype had highest representation (almost 50% of the total Synechococcus-like species population) in the oxic-anoxic interface or anoxic 1 depth (Fig. 3a).
The relative contributions of MAG-AL1 and MAG-AL2 were used to calculate their abundances in Ace Lake, highlighting their distribution throughout the water column (Fig. 3b). The depth at which MAG-AL2 abundance was highest differed between time periods, with the phylotype being prevalent at the oxic-anoxic interface and surrounding depths. This suggested that MAG-AL2 signatures from the anoxic depths might not be from dead cells.
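The partitioning of the population by SNP markers can be illustrated with a short Python sketch. It assumes that reads carrying 217 A plus 231 G are counted for MAG-AL1, that reads carrying 217 T plus 231 T are counted for MAG-AL2, and that phylotype abundance is the total Synechococcus-like abundance scaled by each fraction. That scaling is an assumption made for illustration, and the function names and counts are hypothetical.

```python
def phylotype_contributions(al1_reads, al2_reads):
    """Fractions of marker-carrying reads assigned to MAG-AL1 and MAG-AL2."""
    total = al1_reads + al2_reads
    if total == 0:
        raise ValueError("no reads carry either marker combination")
    return al1_reads / total, al2_reads / total


def phylotype_abundances(total_abundance, al1_reads, al2_reads):
    """Split a total abundance between the phylotypes (assumed proportional scaling)."""
    al1_fraction, al2_fraction = phylotype_contributions(al1_reads, al2_reads)
    return total_abundance * al1_fraction, total_abundance * al2_fraction


# Hypothetical counts and abundance for illustration only.
al1_frac, al2_frac = phylotype_contributions(300, 100)
al1_abund, al2_abund = phylotype_abundances(8.0, 300, 100)
```

With these illustrative inputs, three quarters of the marker-carrying reads are assigned to MAG-AL1 and one quarter to MAG-AL2.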
In addition to the two SNPs that define MAG-AL1 and MAG-AL2, seven additional SNPs were identified in three oxic zone metagenomes (Additional file 1: Table S3). These seven SNPs might be indicative of other Synechococcus-like species phylotypes (distinct from MAG-AL1 and MAG-AL2) in the oxic zone, particularly in surface waters, where SNP frequencies were higher (21–25%) than in oxic 1 (16–19%) and oxic 2 (3–5%) (Additional file 1: Table S3).
Phylogeny and global representation of Ace Lake Synechococcus-like species
In addition to the high identity between 16S rRNA genes from MAG-AL1 and MAG-AL2 (99.9% identity), the average nucleotide identity (ANI: 99.6% over 92% alignment fraction), average amino acid identity (AAI: 99.3% over 90% alignment fraction) and digital DNA-DNA hybridisation (dDDH: 97%) were high (Additional file 1: Figs. S1 and S4), indicating that these two phylotypes belonged to the same species and subspecies.
Phylogenetic tree construction based on 16S rRNA genes and whole proteome content (Fig. 4), and ANI and AAI analyses of closely related MAGs and reference genomes obtained from GTDB (Additional file 3: Dataset S2) demonstrated that Ace Lake cyanobacterial genomes (SynAce01, MAG-AL1, MAG-AL2) formed a tight clade with ≤ 82% ANI and ≤ 79% AAI to other taxa. The MAG-AL1 and MAG-AL2 16S rRNA genes had ≤ 98% identity to 16S rRNA genes available in databases from IMG publicly assembled metagenomes and public isolates (Additional file 3: Dataset S2). The closest non-Antarctic species (≤ 98% identity) included Synechococcus sp. 1G10 (Nahuel Huapi Lake, Argentina), Synechococcus sp. MW101C3 (Lake Mondsee, Austria) and Synechococcus sp. WH5701 (Long Island Sound, New York; Ref. [45]). Each of these species, along with SynAce01 from Ace Lake, has recently been placed in the novel genus Regnicoccus [5]. The 16S rRNA genes of Synechococcus from Lake Abraxas and Pendant Lake (Vestfold Hills; Ref. [30]) were ≤ 98% identical to the MAG-AL1 and MAG-AL2 sequences. Based upon these data, the Ace Lake cyanobacterium appears to represent a distinct species that is possibly confined to this water body. This contrasts with the green sulphur bacterium, Ca. Chlorobium antarcticum, which has an identical 16S rRNA gene sequence from Ace Lake, Taynaya Bay and Ellis Fjord [29]. In view of the phylogenetic characteristics of the Ace Lake cyanobacterium, we name a new Ace Lake species: Candidatus Regnicoccus frigidus sp. nov. (from fri’gi.dus. L. masc. adj. frigidum cold, referring to the cold environment) (type MAG MAG-AL1: GenBank accession ID = JAOANE000000000; IMG bin ID = 3300023237_10; 99.7% complete; 0.09% contamination) (Additional file 2: Dataset S1).
Ca. Regnicoccus frigidus population variation
SNPs (variant frequency ≥ 0.9), indels (read depth ≥ 20) and VCRs (significance of gene coverage variation assessed using DESeq2) were identified from FR of reads that represented different lake depths and time periods to MAG-AL1 and MAG-AL2 (Fig. 5; Additional file 4: Dataset S3). Variation was lower in MAG-AL1 (75 SNPs and 17 indels from 45 genes) than in MAG-AL2 (572 SNPs and 27 indels from 157 genes) (Additional file 4: Dataset S3).
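The two thresholds stated above (variant frequency ≥ 0.9 for SNPs, read depth ≥ 20 for indels) can be expressed as a simple filter. This Python sketch is illustrative only: the `Variant` record, its field names and the example values are hypothetical, and any further criteria applied in the actual analysis are not represented.

```python
from dataclasses import dataclass

SNP_MIN_FREQUENCY = 0.9  # threshold stated in the text for SNPs
INDEL_MIN_DEPTH = 20     # threshold stated in the text for indels


@dataclass
class Variant:
    kind: str         # "snp" or "indel"
    position: int     # position on the contig
    frequency: float  # fraction of reads supporting the variant
    depth: int        # read depth at the position


def passes_filter(variant):
    """Apply the SNP frequency threshold or the indel depth threshold."""
    if variant.kind == "snp":
        return variant.frequency >= SNP_MIN_FREQUENCY
    if variant.kind == "indel":
        return variant.depth >= INDEL_MIN_DEPTH
    return False


# Hypothetical variant calls for illustration only.
calls = [
    Variant("snp", 217, 0.95, 150),
    Variant("snp", 231, 0.40, 150),
    Variant("indel", 1020, 1.00, 12),
    Variant("indel", 2045, 0.80, 35),
]
kept_positions = [v.position for v in calls if passes_filter(v)]
```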
SNPs and indels were identified in genes involved in cell wall or membrane biosynthesis and modification, transport, translation, carbohydrate metabolism, amino acid biosynthesis and other metabolic processes, as well as some hypothetical genes (Additional file 4: Dataset S3). Only a few SNPs and indels from each MAG were consistently represented across metagenomes from different time periods of the same depth (Additional file 4: Dataset S3; further discussed in supplementary text). This indicated that most Ca. Regnicoccus frigidus mutations were not stable, with the temporal variation observed being indicative of a relatively dynamic population. The mutations in MAG-AL1 and MAG-AL2 genes were observed mainly in the anoxic and oxic depth metagenomes, respectively (Additional file 4: Dataset S3). MAG-AL1 genes containing stable mutations included the following: a glycosyltransferase (cell wall modification) and 2-oxoisovalerate dehydrogenase (branched-chain amino acids degradation) (Additional file 4: Dataset S3). MAG-AL2 genes with stable mutations encoded the following: carboxysome shell carbonic anhydrase (carbon dioxide fixation); a vitamin K epoxide reductase family protein (post-translational modification); glycerol-3-phosphate acyltransferase (glycerolipid synthesis); N-acetylglucosamine-6-phosphate deacetylase (cell wall synthesis and glycolysis); a glycosyltransferase (cell wall modification); UDP-glucuronate decarboxylase (cell wall modification); and four hypothetical proteins (Additional file 4: Dataset S3).
Most variable coverage genes (VCGs) with significant coverage variation were of unknown function (i.e. hypothetical or uncharacterized proteins), were poorly characterized or coded for mobile elements (Additional file 4: Dataset S3). The remainder were genes involved in the following: cell wall or membrane biosynthesis and modification, transport, stress response, cell defence, cyanide assimilation and other metabolic functions (Additional file 4: Dataset S3). Significant gene variations were only identified for comparisons by depth, specifically oxic vs anoxic and oxic vs oxic-anoxic interface (Fig. 6; Additional file 4: Dataset S3). Most VCGs with similar function had distinct sequences in MAG-AL1 and MAG-AL2, with the depth-dependent variation specific to each MAG: the VCGs had higher coverage in the oxic zone for MAG-AL1 and higher coverage in the oxic-anoxic interface and anoxic zone for MAG-AL2 (Figs. 5 and 6; Additional file 4: Dataset S3). This pattern of coverage matched phylotype abundance, with MAG-AL1 more prevalent in the oxic zone and MAG-AL2 more so in the oxic-anoxic interface and surrounding depths (Fig. 3b; Additional file 1: Table S4). These data would be consistent with niche adaptation, with MAG-AL1 and MAG-AL2 possessing genetic capacities tailored to growth and survival in the oxic and anoxic zones, respectively (also see below in ‘Niche adaptation in Ace Lake’).
Alignments of Ca. Regnicoccus frigidus MAGs revealed that contigs that did not align or had poor alignment tended to contain VCGs or putative viral genes (Additional file 5: Dataset S4). However, MAG-AL2 contigs 118–120 did not match any other MAGs. These contigs had low read depth in all metagenomes, and their gene relative coverages showed depth-dependent variation: oxic-anoxic interface and anoxic zone, < 24%, and oxic zone, ≤ 0.2% (Additional file 4: Dataset S3). Some of the genes on these contigs (e.g. glycine hydroxymethyltransferase, ATP adenylyltransferase, bifunctional demethylmenaquinone methyltransferase/2-methoxy-6-polyprenyl-1,4-benzoquinol methylase UbiE, murein DD-endopeptidase MepM/murein hydrolase activator NlpD, MFS family permeases, selenophosphate synthetase) were present with normal coverage elsewhere in MAG-AL2, suggesting that all Ca. Regnicoccus frigidus populations studied possessed these functional traits.
The remaining low coverage genes represented Ca. Regnicoccus frigidus populations at the oxic-anoxic interface and in the anoxic zone that possessed a genetic capacity not present in the Ca. Regnicoccus frigidus populations in the oxic zone. Two genes annotated as a carbon monoxide dehydrogenase (CODH) maturation factor and a predicted RNA-binding protein contained CooC and CooT domains, respectively; these are domains found in accessory proteins involved in the maturation of anaerobic CODH that occurs by the insertion of nickel into the active site [46, 47]. No CODH genes were identified in Ca. Regnicoccus frigidus MAGs, suggesting that the CooC and CooT domain-containing enzymes may function in anaerobic process(es) involving nickel-dependent pathways.
GC content of contigs was plotted against read depth to assess the presence of contig clusters representative of divergent (< 95% sequence similarity) Ca. Regnicoccus frigidus phylotypes (Additional file 1: Fig. S5). Taxonomic analysis of 51,971 metagenome contigs adjacent to or overlapping the Ca. Regnicoccus frigidus MAG contigs (i.e. metagenome contigs with 45–80% GC content) revealed that only 297 metagenome contigs were classified as Cyanobacteria, and many of these had ≥ 99% identity matches to assembled Ca. Regnicoccus frigidus MAGs (Additional file 1: Table S5; also see supplementary text). The analyses indicate that Ca. Regnicoccus frigidus phylotypes with a high level of divergence (< 95% sequence similarity) were not abundant and, given the large size of the Ace Lake dataset, are not typical of the lake ecosystem.
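The GC-based selection of metagenome contigs described above can be sketched in Python as follows. The function names are hypothetical, the 45–80% window is taken from the text, and excluding ambiguous bases (such as N) from the denominator is an assumption made for this example.

```python
def gc_content(sequence):
    """Percentage of G and C among unambiguous (A, C, G, T) bases."""
    sequence = sequence.upper()
    counted = sum(sequence.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (sequence.count("G") + sequence.count("C")) / counted


def in_gc_window(sequence, low=45.0, high=80.0):
    """True if the GC content falls within the 45-80% window used in the text."""
    return low <= gc_content(sequence) <= high


# Toy sequences for illustration only.
example_gc = gc_content("GGCCAATT")
selected = [s for s in ["GGCCAATT", "AATTAATT", "GCGCGCAT"] if in_gc_window(s)]
```

In practice each contig's GC content would be paired with its read depth before plotting, which is not shown here.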
Ca. Regnicoccus frigidus viruses
Ca. Regnicoccus frigidus viral contigs were identified in several ways (Additional file 6: Dataset S5):
A) 31 viral contigs in IMG/VR v3 from Ace Lake metagenomes: vOTU_081954 (22), vOTU_248451 (7), Sg_292136 (1) and Sg_613705 (1).
B) 22 viral contigs aligned to the 59 Ca. Regnicoccus frigidus MAGs: vOTU_022592 (16) and Sg_256402 (1) from IMG/VR v3 and cl_2442 (2), cl_463 (1), sg_14817 (1) and sg_14822 (1) from the Antarctic virus catalogue.
C) 11 previously identified viral contigs based on matches to a cyanophage assembled from Ace Lake metagenome data: cl_6580 (2), cl_6727 (2), cl_9495 (1), cl_9892 (1), sg_14929 (1), sg_14949 (1), sg_14969 (1), sg_14971 (1) and sg_15003 (1) from the Antarctic virus catalogue [25].
The set of 22 and the set of 11 viral contigs contained genes that were taxonomically classified to a variety of microorganisms, indicating the viral contigs might represent generalist viruses that prey on multiple hosts. Five of the 22 viral contigs were present in two Ca. Regnicoccus frigidus MAGs, two verrucomicrobial MAGs and one actinobacterial MAG (Additional file 6: Dataset S5), indicating that these viral contigs likely represented prophages in the respective MAGs. The set of 31 viral contigs included three predicted prophages in IMG/VR v3 (Additional file 6: Dataset S5). Two of these prophages plus 10 other viral contigs aligned to some Ca. Regnicoccus frigidus MAGs (Additional file 6: Dataset S5). The prediction of prophages is consistent with two prophage regions (phiSynAce1 and phiSynAce2) reported for the SynAce01 genome [30].
Overall, the data for these three sets of viral contigs suggest that (i) vOTU_081954, vOTU_248451, Sg_292136 and Sg_613705 represent Ca. Regnicoccus frigidus specialist viruses, some of which are prophages; (ii) Sg_717548 and Sg_723842 (and nine viral contigs from the Antarctic virus catalogue) represent generalist viruses that prey on cyanobacteria; and (iii) Sg_256402 and vOTU_022592 (and five viral contigs from the Antarctic virus catalogue) likely represent generalist viruses that prey on an even broader range of hosts (Additional file 6: Dataset S5).
MAG-AL2 contained more predicted prophages than MAG-AL1, although the total viral gene composition for each MAG was similar (Additional file 6: Dataset S5). Gene coverage of the predicted MAG prophages was high (MAG-AL1 ≤ 7000 read depth; MAG-AL2 ≤ 8000) compared to MAG read depths (both MAGs < 700) (Additional file 6: Dataset S5). The set of 31 Ca. Regnicoccus frigidus viral contigs had high coverage (< 6500), some of which belonged to Ca. Regnicoccus frigidus MAGs. The high coverage contigs are likely to represent viral progeny of integrated (i.e. MAG prophage) or nonintegrated viruses associated with cells.
Searches for additional prophages in MAG-AL1 and MAG-AL2 were performed based on read coverage (i.e. high), gene composition and/or proximity to already predicted prophages. All prophages identified by this process were ≤ 19 kb in length, which is short by comparison to known cyanophages; they therefore likely represent remnants of previous prophages (Additional file 6: Dataset S5). The MAG-AL2 prophage genes on contigs 17, 106 and 111–116 had very low relative coverages in surface and oxic 1 (except Nov 2008) metagenomes compared to metagenomes from deeper depths (Additional file 6: Dataset S5), possibly reflecting a greater loss of these prophage genes from the Ca. Regnicoccus frigidus population in the upper waters of the lake.
Host defence against viruses
Restriction-modification (RM) systems can be encoded by hosts and/or viruses and can impact host-virus interactions in a variety of ways [41, 48–50]. The prophages within MAG-AL1 and MAG-AL2 contained a type 2 RM DNA methylase, with MAG-AL2 prophages also containing two type 1 RM DNA methylases (Additional file 6: Dataset S5). Type 2 RM methyltransferases have been associated with lysogenic lifestyles [49], which would be consistent with the prophage remnants arising from an integrated temperate virus. All subunit genes of a type 1 RM system, and additional genes associated with RM systems (e.g. type 3 Res subunit domain), were also present outside of the prophages in MAG-AL1 and MAG-AL2 and were therefore host-specific RM genes (Additional file 7: Dataset S6).
CRISPR-Cas system genes were not identified in Ca. Regnicoccus frigidus MAGs, consistent with previous findings for Ace Lake Synechococcus-like OTUs [25] and other marine cyanobacteria [51]. However, systems potentially involved in host-virus interactions included the DISARM (defence island system associated with restriction-modification) and retron systems. DISARM genes identified in Ca. Regnicoccus frigidus MAGs were drmMII (DNA [cytosine-5]-methyltransferase) and drmD (SNF2 family DNA/RNA helicase), although genes constituting a complete system were not identified (Additional file 7: Dataset S6).
Retrons are often found near defence systems such as RM genes, afford viral defence via an abortive infection (ABI) mechanism and have previously been identified in cyanobacteria [52]. Bacterial retrons consist of a reverse transcriptase gene, a noncoding RNA and an effector gene that encodes a DNA-binding, HNH endonuclease, ribosyltransferase or two-transmembrane (2TM) domain [52]. A reverse transcriptase gene containing a bacterial retron domain was identified close to a type 1 RM system in MAG-AL1 (Additional file 1: Fig. S6; Additional file 7: Dataset S6). Most genes near the retron homologue were hypothetical genes and did not match known retron effector domains (Additional file 1: Fig. S6), although genes encoding an exostosin family protein domain (TM domain) and a HicB antitoxin (which contains an HTH domain; Refs. [53, 54]) were identified adjacent to retron homologues in Ca. Regnicoccus frigidus MAGs and could possibly function as effector genes to constitute a functional retron anti-phage system.
Depth-dependent variation was observed for some of the above Ca. Regnicoccus frigidus defence genes (Additional file 7: Dataset S6). The MAG-AL1 retron homologue, drmMII, type 1 RM system (two S, one R and one M subunit) and three putative RM (two Uma2 family endonucleases and one HNH family endonuclease) genes had higher coverage in the oxic zone than in the oxic-anoxic interface or anoxic zone (Fig. 7; Additional file 4: Dataset S3; Additional file 7: Dataset S6). A similar pattern of variation occurred for MAG-AL2 drmMII, type 1 RM S subunit and putative RM (HNH family endonuclease) genes. Viruses are prevalent throughout Ace Lake, but abundance is highest in the oxic zone [25]. These aforementioned systems that are overrepresented in the oxic zone may reflect a functionality particularly suited to responding to the specific viral population. In contrast, two type 1 RM R subunit genes and a putative RM gene (Uma2 family endonuclease) that were specific to MAG-AL2 had 2 to 3 times higher coverage in the oxic-anoxic interface and anoxic zone than the oxic zone (Additional file 4: Dataset S3; Additional file 7: Dataset S6). As MAG-AL2 is prevalent in the anoxic zone, the higher coverage for these defence systems suggests they are more specific to viruses enriched in the anoxic zone (Fig. 3; Additional file 7: Dataset S6).
Bacteriophage exclusion (BREX) type 1 system genes (brxC, brxB, brxA and truncated pglX and brxL) and additional BREX genes (brxHI, brxHII, pglW) were identified in both MAG-AL1 and MAG-AL2, and pglZ and complete brxL (often together) were present in some other Ca. Regnicoccus frigidus MAGs (Additional file 1: Fig. S6; Additional file 7: Dataset S6). The pglX gene was truncated in MAG-AL1 and MAG-AL2, and some Ca. Regnicoccus frigidus MAGs contained two truncated pglX genes that together constituted the full-length gene (Additional file 1: Fig. S6; Additional file 7: Dataset S6; further discussed in supplementary text). A complete pglX cyanobacterial gene was also identified in Ace Lake contigs. Similar observations were made for Antarctic haloarchaea resident in Deep Lake (Vestfold Hills) [41]. Interruption of the pglX gene occurs in a diverse range of microorganisms, with acquisition of the gene by gene transfer enabling the BREX system to be functional [56]. In addition to the variation in the integrity of the pglX gene, only a subset of Ca. Regnicoccus frigidus MAGs contained complete sequences of brxL and pglZ (core BREX gene) (Additional file 7: Dataset S6). Using a MAG that contained brxL and pglZ (99.7% bin completeness, 3.6% bin contamination, Dec 2014, 12 m depth, 0.8-μm filter metagenome), both genes were found to have low relative coverages (≤ 25%) in all metagenomes, with coverage significantly higher in the oxic-anoxic interface and anoxic zones than in the oxic zone (Additional file 4: Dataset S3). These data show depth-dependent variation for BREX genes, with no more than a quarter of the Ca. Regnicoccus frigidus population possessing brxL and pglZ; that subpopulation would also need to possess (vertical inheritance) or acquire (gene transfer) a functional pglX gene in order to perform BREX-mediated viral resistance (Fig. 7). It therefore appears that the Ca. Regnicoccus frigidus population can mount, at best, a transient BREX response.
Host evasion of viruses
Variation (VCRs, SNPs, indels) was a feature of a variety of genes encoding cell surface proteins (e.g. TolC and porins) or genes involved in cell wall biosynthesis (e.g. lipopolysaccharides) and modification (e.g. glycosyltransferases) (Additional file 4: Dataset S3). Viruses attach to cell surface components, including lipopolysaccharides, TolC and porins [57]. The coverages of these VCGs involved in cell surface structures in MAG-AL1 were significantly higher in the oxic zone than in the oxic-anoxic interface or anoxic zone, while the opposite trend occurred for MAG-AL2 (with the exception of a few glycosyltransferases). Moreover, the specific VCGs in MAG-AL1 differed from those in MAG-AL2, suggesting that the cell wall composition of the Ca. Regnicoccus frigidus represented by the two MAGs was likely to differ (Additional file 4: Dataset S3).
SNPs and indels identified in some of the Ca. Regnicoccus frigidus glycosyltransferases (Additional file 4: Dataset S3) may impact cell wall composition by influencing substrate specificity of the enzymes and the type of sugar they incorporate during glycosylation [58, 59]. Variation in glycosyltransferases and other cell surface proteins has been speculated to mediate viral evasion in Antarctic haloarchaea [41, 43] and marine Prochlorococcus [60], and lipopolysaccharide modification has been shown to perturb viral infection of Anabaena sp. PCC7120 [61]. The types of genetic variation observed for Ca. Regnicoccus frigidus are therefore likely a response to interactions with viruses, particularly as a mechanism of evasion of specialist viruses that target specific epitopes during viral attachment (Fig. 7).
Niche adaptation in Ace Lake
Specific relationships were evident between Ca. Regnicoccus frigidus phylotype abundances and physicochemical data (Fig. 3; Additional file 1: Table S4). Significant correlations occurred between changes in MAG-AL1 abundance and depth (Spearman’s rank correlation coefficient ρ = −0.6, P = 0.003), DO (ρ = 0.5, P = 0.008) and salinity (ρ = −0.6, P = 0.003), but not lake temperature (ρ = −0.4, P = 0.1). In contrast, no significant correlations occurred with MAG-AL2 abundance. Of these lake factors, salinity has previously been associated with the evolution of Synechococcus and Prochlorococcus ecotypes in the South China Sea [17].
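Spearman's rank correlation coefficient, as reported above, is the Pearson correlation computed on ranks. The following self-contained Python sketch shows the calculation with tied values given their average rank. The depth and abundance values are hypothetical, the function names are illustrative, and no P value is computed here (a statistics library such as SciPy's `scipy.stats.spearmanr` would normally be used for that).

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical depth (m) and abundance values for illustration only.
depth_m = [0, 5, 10, 12, 14, 18]
abundance = [9.0, 7.5, 8.0, 3.0, 2.0, 0.5]
rho = spearman_rho(depth_m, abundance)  # negative: abundance falls with depth
```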
Significant depth-dependent variation in MAG-AL1 and MAG-AL2 gene coverages was observed for oxic vs anoxic and oxic vs oxic-anoxic interface metagenomes (Fig. 6; Additional file 4: Dataset S3). The functional properties of the VCGs encoding metabolic functions were examined to assess what ecophysiological impact they may confer.
Cyanide assimilation
A Nit1C gene cluster (nitHBCDEFG; contig 33) was identified as a VCR in MAG-AL1, but not in MAG-AL2 (Fig. 8a; Additional file 4: Dataset S3). This locus, which has previously been reported in cyanobacteria, belongs to branch 1 nitrilases that can function during nitrogen starvation to assimilate nitriles by hydrolysing them to ammonia (plus a carboxylic acid) [62,63,64,65,66,67]. Nit1C gene expression can be highly induced by cyanide and repressed by ammonium and is essential for growth when cyanide is the sole source of nitrogen [64, 65, 67].
The Nit1C cluster had significantly higher coverage (P ≤ 0.0002) in the oxic zone (81%), compared to the oxic-anoxic interface (41%) or anoxic zone (36%) (Fig. 8b and c; Additional file 4: Dataset S3). The littoral mats in Ace Lake contain diverse cyanobacteria as well as predatory ciliates and rotifers [22, 26], and the ability to produce cyanide is widespread among phylogenetically diverse cyanobacteria, possibly as a defence mechanism against grazers (e.g. ciliates, rotifers) [68]. It is therefore possible that the Ace Lake cyanobacteria in the littoral mats generate relatively high levels of free cyanide (HCN and CN−) in the oxic zone, with cyanate generated by abiotic cyanide oxidation. The lower Nit1C cluster coverage in the anoxic zone is consistent with this zone having relatively high levels of ammonium (which represses gene expression) [22, 64, 69, 70]. Cyanate transporter genes were not identified in MAG-AL1, but nitrate and nitrite transporters which were encoded could possibly function in the uptake of cyanate and cyanide [71, 72]. These data would be consistent with MAG-AL1 Nit1C genes being induced during nitrogen starvation and/or in the presence of cyanide, allowing Ca. Regnicoccus frigidus to assimilate free cyanide and nitriles as nitrogen sources. As bioavailable nitrogen is limiting in the oxic zone [22, 70], the Nit1C gene cluster would be expected to enhance the competitiveness of the Ca. Regnicoccus frigidus population that possesses it.
Asparagine synthesis
The conversion of aspartate to asparagine can be catalysed by AsnB (glutamine-hydrolysing asparagine synthetase) using glutamine as the preferred substrate or ammonium [73]. An asnB gene (IMG gene ID: Ga0222690_10005105) was identified in 18 Ca. Regnicoccus frigidus MAGs (but not in MAG-AL1 or MAG-AL2). While the ammonium-dependent asparagine synthetase gene (asnA) was not identified in Ca. Regnicoccus frigidus MAGs, the capacity to use nitrate, nitrite and ammonia for glutamine production via the GS-GOGAT (glutamine synthetase-glutamate synthase) cycle was evident in Ca. Regnicoccus frigidus (Fig. 7).
Using one MAG that contained asnB (99.7% bin completeness, 3.6% bin contamination, Dec 2014, 12 m depth, 0.8-μm filter metagenome), significant coverage variation was found between the anoxic zone (specifically anoxic 2 and 3; average 22%) and the oxic-anoxic interface (6%) or the oxic zone (5%) (Fig. 9; Additional file 4: Dataset S3). In ammonium-rich environments, AsnB can catalyse the formation of asparagine [74], and may therefore enable the anoxic zone Ca. Regnicoccus frigidus population (where ammonium levels are high; Refs. [22, 69, 70]) to benefit by being able to assimilate ammonia using AsnB (Fig. 7). Conversely, in the nitrogen-limited oxic zone, by having a capacity to perform glutamine-dependent asparagine synthesis [75], the relatively small asnB population would be expected to have an improved capacity to utilise bioavailable nitrogen (Fig. 7). While less than half of the Ca. Regnicoccus frigidus population possessed asnB, the gene was consistently identified in metagenomes representing all lake strata (oxic, oxic-anoxic interface, anoxic) and time (2008 to 2014), indicating it was a stable feature of the population (Fig. 9).