- Open Access
Insights into the ecology, evolution, and metabolism of the widespread Woesearchaeotal lineages
Microbiome volume 6, Article number: 102 (2018)
As a recently discovered member of the DPANN superphylum, Woesearchaeota account for a wide diversity of 16S rRNA gene sequences, but their ecology, evolution, and metabolism remain largely unknown.
Here, we assembled 133 global clone libraries/studies and 19 publicly available genomes to profile these patterns for Woesearchaeota. Phylogenetic analysis shows a high diversity with 26 proposed subgroups for this recently discovered archaeal phylum, which are widely distributed in different biotopes but primarily in inland anoxic environments. Ecological patterns analysis and ancestor state reconstruction for specific subgroups reveal that oxic status of the environments is the key factor driving the distribution and evolutionary diversity of Woesearchaeota. A selective distribution to different biotopes and an adaptive colonization from anoxic to oxic environments can be proposed and supported by evidence of the presence of ferredoxin-dependent pathways in the genomes only from anoxic biotopes but not from oxic biotopes. Metabolic reconstructions support an anaerobic heterotrophic lifestyle with conspicuous metabolic deficiencies, suggesting the requirement for metabolic complementarity with other microbes. Both lineage abundance distribution and co-occurrence network analyses across diverse biotopes confirmed metabolic complementation and revealed a potential syntrophic relationship between Woesearchaeota and methanogens, which is supported by metabolic modeling. If correct, Woesearchaeota may impact methanogenesis in inland ecosystems.
The findings provide an ecological and evolutionary framework for Woesearchaeota at a global scale and indicate their potential ecological roles, especially in methanogenesis.
Archaea constitute a significant fraction of microbial biomass in the earth biosphere , playing crucial roles in global biogeochemical cycles [2,3,4]. The archaeal domain consisted originally only of cultural organisms (e.g., methanogens , hyperthermophiles , and halophiles ) that belong to two phyla, the Euryarchaeota and Crenarchaeota . However, over the past few decades, the new addition of numerous 16S rRNA gene sequences and archaeal genomes has dramatically expanded the archaeal tree, and at least 26 new archaeal phyla have been proposed . DNA sequencing techniques have greatly enriched the archaeal gene inventories  and uncovered important functions for Archaea in biogeochemical cycles , such as aerobic ammonia oxidation by Thaumarchaeota [12, 13].
Woesearchaeota (formerly named Deep-sea Hydrothermal Vent Euryarchaeota Group 6, DHVEG-6 ) were named to recognize the pioneering contribution of Carl Woese to archaeal phylogeny . In combination with Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanohaloarchaeota, and Nanoarchaeota, they form a monophyletic super-phylum proposed as DPANN [15, 16]. Currently, Woesearchaeota are widely found in diverse environments, such as groundwater , surface water , inland soil , marine sediments [19, 20], freshwater sediments , activated sludge , wetland , hypersaline lakes , estuaries , and deep-sea hydrothermal vents . Several Woesearchaeotal genomes with different completeness have also been detected from groundwater [15, 27], sediment [28, 29], and freshwater . Based on metabolic predications, Castelle et al. reported that Woesearchaeota can enable carbon and hydrogen metabolism under anoxic conditions, which might associate with symbiotic and/or fermentation-based lifestyles . Probst et al.  found that both the Calvin-Benson-Bassham cycle and the Wood-Ljungdahl pathway occurred most frequently, whereas the TCA cycle was little used in Woesearchaeota. They also noted that hydrogenase encoded in Woesearchaeota supports H2 as an important interspecies energy currency under high CO2 concentrations. Despite the abovementioned progress in Woesearchaeota, the distribution, biodiversity, and metabolism of this phylum still remain largely unknown. To date, there are more than 5000 high-quality Woesearchaeotal 16S rRNA gene sequences deposited in the SILVA SSU 128 database  and also some available metagenomic assembled genomes (MAGs) in the GenBank database . Therefore, these datasets enable the exploration of general ecological patterns, and the updated genomes help better understanding of the potential metabolic functions of different Woesearchaeotal lineages in the global biogeochemical cycles.
Here, we retrieved current publicly available archaeal 16S rRNA gene sequences to explore the ecological patterns of Woesearchaeota within archaeal communities in different biotopes worldwide and to identify potential subgroups based on the phylogenetic and evolutionary relationships of different Woesearchaeotal lineages. A combination of bioinformatics, co-occurrence network, and metabolic analysis was used to reveal the possible syntrophic and/or mutualistic interactions of Woesearchaeota and their potential ecological roles in the biogeochemical cycles.
Woesearchaeota are widely distributed in different biotopes with a high diversity
Woesearchaeotal 16S rRNA gene sequences from 133 public clone libraries (> 600 bp, 3584 sequences) were grouped into seven types of biotopes across the globe (Fig. 1a and Additional file 1: Table S1). Most of the samples (more than 80%) were from inland biotopes that occur in mid-latitude regions; fewer were from marine biotopes or high-latitude regions. The relative abundance of Woesearchaeota in inland biotopes (freshwater, freshwater sediment, and soil, in particular) was significantly higher than in marine biotopes (P < 0.05; Additional file 1: Figure S1a).
To further determine the distribution patterns of Woesearchaeota, natural samples from the 133 libraries were sorted into an ordination diagram based on the similarity of communities (Fig. 1b). Oxic biotopes were clearly separated from anoxic ones (R2 = 0.22, P < 0.001; Additional file 1: Figure S1b), suggesting that oxygen might be a significant factor shaping Woesearchaeotal communities.
Phylogenetic diversity (PD) indices (Fig. 1c) and rarefaction curves (Additional file 1: Figure S1c) were calculated for Woesearchaeota in each type of biotope. The rarefaction curves suggest that the Woesearchaeotal diversity is far from exhaustively sampled in anoxic biotopes, especially in freshwater sediments and extreme environments (i.e., hydrothermal vents and hypersaline biotopes). The PD value was much higher in freshwater sediment (Fsed) and hydrothermal vent (Hdv) biotopes, whereas the marine water column (Mwc) biotope held the lowest value (Fig. 1c). The mean phylogenetic species variability (PSV) value (0.45) was significantly lower than the null distribution for module 1 (0.63, P < 0.05) and for model 2 (0.58, P < 0.05; Material and Methods). Null model 1 suggests a phylogenetic clustering, indicating nonrandom relationships between phylotypes with biotopes containing more closely related phylotypes than forecast by chance [31, 32]. In contrast, null model 2 suggests a significant pattern in phylotype prevalence, where the composition of phylotypes represented nonrandom samples of the phylotype pool [31, 32]. Therefore, Fsed, Hdv, and Msed biotopes held the highest PSV values, indicating more over-dispersed phylotypes, whereas the Mwc and soil (S) biotopes had the lowest values, showing more phylogenetically clustered phylotypes. Altogether, these findings demonstrate a high biodiversity of Woesearchaeota.
Phylogenetic analysis reveals 26 potential subgroups within the Woesearchaeota
A total of 26 subgroups of Woesearchaeota (Woese-1 to Woese-26) were supported by both maximum likelihood (Fig. 2a and Additional file 2: Dataset S1) and Neighbor Joining distance (Additional file 1: Figure S2 and Additional file 3: Dataset S2) inferences. These subgroups covered up to 83.5% of the 3584 Woesearchaeotal 16S rRNA gene sequences, leaving 592 sequences ungrouped due to the currently incomplete topology of the phylogenetic tree (Fig. 2b and Additional file 1: Table S4). Generally, there were high intra-subgroup diversities, with the majority of subgroups holding ≤ 90% intra-group minimum similarities that ranged from 80 to 92%. The biggest sub-cluster Woese-5 split into three sister groups (Woese-5a, 5b, and 5c), each with an 80% intra-group similarity in the maximum likelihood tree (Fig. 2a). Notably, despite a stable topology of Woese-5 (bootstrap value: 92% in ML tree in Fig. 2a and 97.8% in NJ tree in Additional file 1: Figure S2), its internal branching order varied between treeing methods, so we proposed this group as a multi-furcation  that would be further updated after new sequences are available. Furthermore, other three subgroups (Woese-1, Woese-22, and Woese-23) split further into two sister groups.
Oxic status drives the distribution and evolutionary diversity of organisms from different Woesearchaeotal lineages
To better summarize the range of environmental factors shaping the distribution of Woesearchaeota, we retrieved and plotted significant environmental information (that is, oxic status and salinity which are the potential forces shaping Woesearchaeotal community) around the tree (Fig. 2a). In total, there were almost 80% representative sequences originating from anoxic environments and 20% from oxic biotopes (mostly like surface freshwater and marine water). This indicates that the oxic status of an environment would be one of the major factors shaping the Woesearchaeotal communities. Moreover, the representative sequences of many subgroups (i.e., Woese-1b, 2, 9, 12, 15, 17, 18, 22a, 25) were only observed in anoxic environments, and therefore, they may be indicator species for such biotopes, but further confirmation will be required after more Woesearchaeotal sequences are added from environmental samples. Interestingly, Woese-16 was limited to sequences retrieved from oxic and non-saline biotopes (e.g., surface freshwater). For salinity, representative sequences from non-saline biotopes account for 44.5% of representative sequences, leaving 31.2 and 24.3% from saline and hypersaline environments, respectively. Two small subgroups Woese-15 and Woese-1b were only composed of sequences from hypersaline biotopes, suggesting two potential hypersaline Woesearchaeotal subgroups. Notably, Woese-2 was only observed in saline or hypersaline environments. The smallest sub-cluster Woese-22a was restricted to sequences obtained from marine sediment. All others were mixed and therefore composed of sequences from oxic, anoxic, nonsaline, saline, or hypersaline biotopes. These findings reveal that the Woesearchaeotal subgroups might be selectively distributed in local biotopes.
To test the above hypotheses, we investigated the relative abundance of each subgroup throughout the 95 representative libraries (Fig. 3a and Additional file 1: Table S1). Ten of 26 Woesearchaeotal subgroups showed significantly different abundance patterns in relation to oxic status (1 oxic and 9 anoxic; P < 0.01). Particularly, Woese-1b, Woese-3, Woese-9 Woese-12, Woese-15, Woese-18, and Woese-22a were only observed in anoxic biotopes, whereas Woese-4 and Woese-16 appeared mainly in oxic environments (Fig. 3a), such as surface freshwater and marine water. Notably, the ungrouped Woesearchaeotal lineages tend to occur in both oxic freshwater and anoxic extreme environments, such as hypersaline sites and hydrothermal vents. Other subgroups appeared in oxic or anoxic environments, but mainly in anoxic biotopes. Overall, the analysis shows that not all 26 groups and subgroups are widely distributed across the seven biotopes, and patterns related to oxic status suggest that most Woesearchaeota are anaerobes or facultative anaerobes as previously described .
To further confirm the above conclusion, we further compared the influences of some primary environmental factors (e.g., temperature, salinity, and oxic status) on the Woesearchaeotal assemblages by sorting them into an ordination diagram based on the phylogenetic community similarity (Fig. 3b). The principal component analysis (PCA) provided evidence for oxic status to be the first strongest factor (R2 = 0.642, P < 0.01) that explained most of the assemblages of Woesearchaeota across the subsamples (95 representative libraries), compared with other environmental factors collected. Most of Woesearchaeotal assemblages in the representative libraries were affected by both “anoxic status” and “lifestyle” (that is, living in water column, soil, or sediment; R2 = 0.406, P < 0.01), instead of “temperature” or “salinity” (R2 < 0.004, P > 0.01 for both parameters).
Given the high phylogenetic diversities and selective distribution patterns of the 26 groups and subgroups, we wondered whether evolutionary distinct Woesearchaeota lineages occur in oxic and anoxic habitats. To test this hypothesis, the ancestral state of oxygen (see the “Methods” section) was reconstructed for Woesearchaeota by using the oxic status (oxic or anoxic state) of biotope of each OTU at a species level (Fig. 3c). The ancestor state reconstruction (ASR) analysis indicated a significant relationship between differentiation patterns and oxic or anoxic biotopes where Woesearchaeota occurred. Supposing an anoxic representative as the most potential last common ancestor, this analysis supported the hypothesis of an evolutionary progression for Woesearchaeotal lineages from anoxic to oxic biotopes. After the colonization of oxic biotopes in the first transition, subsequent diversifications resulted in an increase in the number of oxic Woesearchaeotal lineages, such as Woese-16 and Woese-22b. Meanwhile, the selective distribution of Woesearchaeotal subgroups in local biotopes further supports the existence of adaptive evolution to specific environmental features, especially oxic status (oxic or anoxic state, Fig. 3a).
Metabolic reconstruction indicates an anaerobic syntrophic lifestyle with conspicuous metabolic deficiencies
To explore metabolic capabilities, we analyzed 19 high-quality genomes of Woesearchaeota reconstructed from previous studies (Additional file 1: Table S2). This work expands on prior metabolic predictions of Castelle et al. for nine of these genomes (e.g., AR20 in Woese-3 and AR15 in Woese-18 etc.) . We assigned the 19 Woesearchaeotal members into subgroups based on their 16S rRNA gene sequences when they were available (marked in Fig. 2a and Additional file 1: Table S2). Generally, the comparative genomic analysis reveals that none of the 19 genomes appear to have the potential for a complete electron-transport chain, glycolysis, and tricarboxylic acid (TCA) cycle (Fig. 4a and Additional file 5: Dataset S4), consistent with the prior metabolic reconstructions for nine of these Woesearchaeota . Identification of genes that encode flagella proteins, such as flaI, flaJ, and flaK (Additional file 1: Figure S4), expands the finding of potential motility and/or adhesion previously noted in Woesearchaeota (AR20, a member of Woese-3), a capacity also predicted for certain Diapherotrites .
In terms of carbon metabolism, some members of Woese-3 and Woese-14 tend to utilize starch (e.g., alpha amylase) to generate glucose for glycolysis (Figs. 4a and 5a). All Woesearchaeotal genomes lacked the gene (pfk) encoding phosphofructokinase to catalyze the conversion of fructose-6P to fructose-1,6P (Figs. 4a and 5a). However, they have a complete pentose phosphate pathway that enables the transformation of fructose-6P to glyceraldehyde-3P. Most can achieve the subsequent conversion from glyceraldehyde-3P to phosphoenolpyruvate (PEP), except Woese-17 and Woese-21 (Figs. 4a and 5a). Nevertheless, members in Woese-14 and Woese-18 show the potential to convert glucose to fructose-6P and subsequently complete glycolysis, with the support of a complete pentose phosphate pathway.
Notably, all 19 genomes appear to lack the gene (porA/B) encoding pyruvate/2-oxoacid-ferredoxin oxidoreductase, but four (members of Woese-5a, 14, 17, and 18, respectively) likely use pyruvate dehydrogenase to produce acetyl-CoA, as previously noted by Castelle et al. . Members of subgroups Woese-14 and Woese-18, and Woese-18 particularly, appear to have the capacity of catalyzing some reactions of the TCA cycle, amino-acid synthesis, and folate C1 metabolism. However, after converting pyruvate to acetyl-CoA, members of these two subgroups appear not to ferment acetyl-CoA to ethanol or acetate, presumably because of the incompleteness of genomes. This process may be performed by some members of other subgroups, such as Woese-3 and the ungrouped Woesearchaeota. Generally, members of Woese-18 and Woese-21 have the gene (atoB) encoding acetoacetyl-CoA thiolase that converts acetyl-CoA to acetoacetyl-CoA. Apart from Woese-5a, members of other six subgroups (including the ungrouped Woesearchaeota) can provide intermediates (oxaloacetate and malate) for use in the TCA cycle or other biosynthetic pathways (e.g., amino acids conversion). For example, members in Woese-3, 14, and 17 appear to encode malate dehydrogenase to convert pyruvate to malate by fixing CO2 (Figs. 4a and 5a). Among them, only members in Woese-14 showed the potential to transform oxaloacetate to succinyl-CoA. Others may only metabolize α-ketoglutarate into succinyl-CoA or may participate in the branch of amino-acid metabolism from glutamic acid to glutamine.
Interestingly, genomes of three subgroups (i.e., Woese-5a, 14, and 21) encode potential D-3-phosphoglycerate dehydrogenase for the conversion of glycerate-3P into 3-phosphonooxypyruvate for amino-acid synthesis (Figs. 4a and 5a). However, this pathway appears to be blocked at the subsequent step because of the absence of phosphoserine aminotransferase in all 19 members of Woesearchaeota. Despite this, Woesearchaeota can continue the subsequent pathways to complete the biosynthesis of serine, glycine, and cysteine, and then provide a series of intermediates for methane metabolism. For example, members of three subgroups (Woese-3, 18, 21) show the potential to convert phosphoserine into serine which might be used to produce numerous intermediates for methane metabolism, such as potential 5-Methyl-THF and 5-Methyl-THMPT . We did not identify the gene (mcrA) encoding methyl-CoM reductase, indicating that Woesearchaeota are not involved in methanogenesis. Only one member (AR20)  in Woese-3 and another in the ungrouped lineages might perform partial assimilatory sulfate reduction of sulfate to 3′-Phosphoadenosine-5′-phosphosulfate (PAPS) in sulfur metabolism. A few genes, such as nirK and nosZ genes, involved in nitrogen cycling are determined among the 19 genomes of Woesearchaeota, which might possibly for detoxification as proposed in previous study .
Consistent with the previous predictions for the first nine Woesearchaeota , most have anaerobic metabolisms. Interestingly, we find that members of Woesearchaeota from anoxic biotopes possess genes involved in ferredoxin-dependent pathways that are not detected in Woesearchaeota from oxic biotopes (Fig. 4b). These results further provide potential metabolic evidence for the distribution diversification of Woesearchaeotal lineages on oxic status. On the other hand, significant metabolic deficiencies and some membrane-bound hydogenases for hydrogen metabolism [15, 27] indicate that Woesearchaeota, like most other DPANN archaea , might perform nutritional complementation with other organisms.
High co-occurrence with methanogenic archaea suggests the possibility of a syntrophic relationship
The archaeal lineage abundance distribution (LAD) showed a positive relationship between frequency of occurrence (i.e., number of libraries/studies where a specific archaeal lineage was detected) and mean relative abundance, indicating that cosmopolitan archaeal lineages were more abundant than those lineages that rarely occurred, except for Haloarchaea that appear in hypersaline ecosystems (Fig. 6a). In agreement with previous studies [35,36,37], archaeal lineages fall into two groups by differential occurrence: one group containing six anoxic core lineages (i.e., more frequent/abundant lineages observed in more than 90 libraries) that included Woesearchaeota that had already shown to be a core lineage in sediments , and another group composed of 40 rarer/less abundant satellite lineages (i.e., lineages observed in less than 50 libraries). Most frequent archaeal lineages were two anaerobic methanogenic archaeal lineages with occurrence rates of 87% (116 libraries) for Methanomicrobia and 83% (110 libraries) for Methanobacteria across the 133 libraries/studies where Woesearchaeota were detected. By mean relative abundance, Woesearchaeota were the most abundant archaeal lineage as they represented 25% of the OTUs in each library and the second most abundant were Methanomicrobia representing 20% of the OTUs in each library, where the latter were detected (Additional file 1: Table S5). The dispersion index against occurrence diagram (Fig. 6b) further supported that Woesearchaeota and other five core lineages were not randomly distributed through the libraries. Conversely, 37 of 40 satellite lineages were confirmed as randomly distributing lineages as their dispersion indices fell below the confidence of 2.5%.
After we confirmed that the LAD patterns of core archaeal lineages are certainly non-random, we further explored potential syntrophy for Woesearchaeota using co-occurrence network inference based on strong and significant Spearman correlations . The network (Fig. 6c, d) was composed of 302 nodes (i.e., OTUs) and 692 edges, which presented a typical topology for a microbial network [40,41,42]. Within this network, Woesearchaeota were the most frequent archaeal lineages occupying 38% of the nodes and 46% of the edges, followed by methanogens (that is, Methanomicrobia and Methanobacteria with nodes: 26% and edges: 30%, in total). The network indices for the first ten highest nodes also confirmed the critical role of Woesearchaeota and methanogens in the whole network structure (Additional file 1: Table S3). Moreover, the multivariate regression tree (MRT) showed that Methanomicrobia and Methanobacteria were the most frequent core lineages to be co-indicators together with Woesearchaeota for most anoxic biotopes in paddy soil, anoxic water, and cold sediments (Additional file 1: Figure S3).
Previous studies claimed that modules could be used as ecological niches when evaluating potential functions in biological communities [43, 44]. Thus, modularity analysis was carried out, and five different modules were individually generated by assembling the OTUs that had high connections within the corresponding module (Fig. 6e) . Woesearchaeota occurred in four of the five modules where they notably co-occurred more frequently with Methanomicrobia and Methanobacteria as indicated by higher degree values (that is, high level of interconnections, Fig. 6f). As suggested in the “Methods” section, each different module represented a different ecological niche [45, 46]. Therefore, the presence of Woesearchaeota-methanogens association in four different modules further supported a possible syntrophy rather than an overlap of ecological niche.
A survey of currently available data shows that the Woesearchaeota phylum is diverse, with at least 26 proposed subgroups distributed in different biotopes, but mainly in inland anoxic environments. Their abundance in inland over marine biotopes is in sharp contrast to current data that suggest that Bathyarchaeota and Marine Benthic Group-D are most common in marine environments [33, 47]. Our analyses suggest that the distribution of specific Woesearchaeotal subgroups associates strongly with specific biotopes, of which oxic status is the force driving the diversification and distribution of Woesearchaeota. Ferredoxin-dependent pathways consistently absent in organisms from oxic biotopes but present in organisms from anoxic biotopes metabolically support the properties of selective distribution in different environments or the corresponding adaptive strategies to oxic status of diverse biotopes (Fig. 4b).
Despite the wide distribution in seven distinct biotopes, metabolic potentials of the 19 Woesearchaeotal organisms indicate that they appear to mainly perform an anaerobic or fermentation-based lifestyle, because Woesearchaeota are found to lack a complete electron transport chain, a complete TCA cycle, and a continuous glycolysis while they do have a complete pentose phosphate pathway and some fermentation-based metabolisms. This feature would account for the ecological distribution patterns of Woesearchaeota mainly in the anoxic biotopes on Earth . On the one hand, we do not deny the possibility that the absence of a complete TCA cycle or other pathways is a bias of completeness of genomes. On the other hand, we have to acknowledge the fact that the available 19 genomes cover only a small fraction of the actual diversity of Woesearcheota. But the possibility would be very little that all of the 19 genomes (including two complete genomes CP010426 and CP010425 in Table S2) commonly miss the TCA cycle or other pathways (e.g., electron transport chain or the gene porA/B encoding pyruvate/2-oxoacid-ferredoxin oxidoreductase). Conversely, all of these genomes commonly have a complete pentose phosphate pathway that enables the transformation of fructose-6P to glyceraldehyde-3P, which compensates for the absence of pathways in the glycolysis. Therefore, these metabolic deficiencies common in the 19 Woesearchaeotal organisms might indicate a potential syntrophic and/or mutualistic partnership with other organisms. Consistently, Paul et al. reported that retroelement-guided protein diversification abounding in vast lineages of uncultivated DPANN superphylum (Woesearchaeota included) may provide them with a versatile tool used for adaptation to a dynamic, host-dependent existence . This nutritional dependency on other organisms would contribute to be the major challenges in the enrichment or pure cultivation of these lineages. Therefore, we suggest that the symbiosis-based approaches would benefit the success of further enrichment or pure cultivation of Woesearchaeota. This concept may also give some hints to future studies in uncultured microbes.
Co-occurrence patterns in microbial communities are typically explored to show ecological interactions between species [49, 50]. Therefore, one may wonder whether the Woesearchaeotal lineages tend to co-occur with other organisms. However, current knowledge about this is very limited. Therefore, a clear ecological relationship between Woesearchaeota and other organisms is extremely essential before assigning niches to the organisms of Woesearchaeotal lineages. For this purpose, the co-occurrence analysis has demonstrated the existence of potential consortia between Woesearchaeota and anaerobic methanogenic archaea as the core anoxic lineages in most biotopes (Fig. 6).
Given the potential of acetate fermentation and hydrogen metabolism, we therefore hypothesized the syntrophic metabolic model for a Woesearchaeota-methanogens consortium. We propose that Woesearchaeota probably support the growth of H2/CO2-using and acetate-using methanogens (Fig. 5b), enabling them to complete with hydrogenotrophic and acetotrophic methanogens [51, 52], respectively. In turn, Woesearchaeota may receive amino-acids and other compounds to compensate for their metabolic deficiencies with the help of transporters encoded in the genomes (Additional file 1: Figure S2). In addition to acetate and hydrogen, Woesearchaeota might provide methanogens with other byproducts. For example, methyl compounds in the folate C1 metabolism (Fig. 4a) might provide potential substrates for the methyl-reducing methanogens . Therefore, we speculate that Woesearchaeota as partners of methanogens might directly impact the type of methanogens present in the environment and the rate of methane formation. However, further confirmation of this hypothesis could be achieved using the FISH technique or stable isotopic labeling [53, 54].
Bacteria might also play a key in the metabolic partnership with Woesearchaeota or Woesearchaeota-methanogens consortia, but this information is restricted in our study. An ideal strategy is to collectively analyze the co-occurrence of Woesearchaeota with both other archaea and bacteria across the libraries collected, which may obtain a more complete partnership involving bacteria. However, we had difficulty in acquiring the full source from the GenBank database mainly because many early studies focused only on the single communities lacking the information of either bacteria or archaea. Therefore, future complex studies will help to explore the cross-kingdom (e.g., bacteria, fungi, and protists) interactions of Woesearchaeota to better our understanding of the ecological roles of Woesearchaeota. In addition, although our study has provided ecological and genetic evidence for Woesearchaeotal metabolisms, we acknowledge that more molecular tests for specific pathways should be continued to confirm the ecological roles of Woesearchaeota in the biogeochemical cycles in the future. Therefore, all such implications will give rise to a clear orientation to exploring the ecology, evolution, and metabolic potential of Woesearchaeota in further studies.
The Esearch Utility was introduced to retrieve and capture archaeal 16S rRNA gene sequences from the GenBank NCBI-nr database (by January 2017) matching the following string “16S AND 600:2000[Sequence Length] AND archaea[Organism] AND rrna[Feature key] AND isolation_source[All fields] NOT genome OR chromosome OR plasmid.” As a result, 129,622 archaeal 16S rRNA gene sequences were retrieved. Sequences that had low quality (e.g., not ribosomal or contained N in the base sequence) or lacked isolation source tags were re-checked and discarded. To retain only studies or libraries that contain Woesearchaeotal sequences, the remaining 122,559 archaeal sequences were BLASTed against SILVA SSU 128 [30, 55]. Libraries or studies with less than ten 16S rRNA gene sequences were discarded. We ended with 15,012 archaeal sequences (including 3584 Woesearchaeotal sequences longer than 600 bp) from 133 studies or libraries: 30 from freshwater (Fwc), 32 from freshwater sediment (Fsed), 22 from soil (S), 19 from marine sediment (Msed), 7 from marine water column (Mwc), 11 from hypersaline environment (Hsal), and 12 from hydrothermal vent (Hdv) (Additional file 1: Table S1). These sequences were clustered into 11,323 operational taxonomic units (OTUs) at 97% identity including 2180 OTUs belonging to Woesearchaeota (Additional file 1: Table S1).
Both neighbor-joining (see Additional file 6: Supplementary Methods) and RAxML analyses  were employed to construct the phylogenetic tree. Given both the topology of the tree and a good coverage of all Woesearchaeotal lineages, we determined to use 670 Woesearchaeotal representative OTU sequences that were longer than 800 bp (at 97% cutoff) to establish the phylogenetic tree. OTU representative sequences were aligned using SINA Alignment Service (https://www.arb-silva.de/aligner/) and imported into ARB software loaded  with the Greengenes database (Version gg_13_5, http://greengenes.secondgenome.com/downloads/database/13_5). To exclude highly variable positions, base frequency filtration was implemented by the parsimony quick add marked tool installed in ARB software before adding sequences to the maximum parsimony backbone tree. Partial sequences were discarded due to poor alignment quality or added to the tree by parsimony criteria without allowing any changes in the general topology of tree. The tree was finally constructed using 663 representative 16S rRNA gene sequences (Additional file 7: Dataset S5), including 14 sequences from the genomes (marked by stars in the tree or see Additional file 1: Table S2). Woesearchaeotal subgroup designations were made when one subgroup with > 10 representative sequences were monophyletic with both maximum likelihood and distance approaches . Environmental parameters (i.e., salinity and oxygen conditions) of each sequence in the tree were collected from the “isolation sources” information on the GenBank or the corresponding publications and drawn with iTOL .
In this study, the 16S rRNA gene dataset was first treated by QIIME before the downstream analyses . All the sequences were treated using two approaches: phylogenetic analysis (based on the evolutionary distances computed by the RAxML tree) and taxon-based method (where taxa were captured at a given level and subsequently treated as equally divergent).
Environmental parameters of each biotope were defined mainly by referring to the corresponding literatures if any (Additional file 1: Table S1). For oxygen, if not mentioned in original references, oxic biotopes mainly refer to oxic water or surface sediments while anoxic biotopes mainly indicate anoxic groundwater, deep-sea sediments or water, soil, and hydrothermal vents. To compare the effects of these parameters on the communities of Woesearchaeota, distance matrices were constructed based on UniFrac (a beta diversity metric quantifying community similarity) for principal coordinate analysis (PCoA) of the source variations of samples  and Bray Curtis (a metric balancing both membership and abundance of community) for principal component analysis (PCA) of different environmental parameters , respectively. The permutational Manova based on 1000 permutations with the function Adonis in R package vegan was used to examine the variation source in the UniFrac matrix . For PCA, Bray-Curtis dissimilarities were introduced to avoid the potential double nulls in Euclidean distances . Phylogenetic diversity (PD) for each of the seven kinds of biotopes was computed as the sum of the branch length associated with the Woesearchaeotal 16S rRNA gene sequences within the corresponding biotope . To reduce the bias for unequal number of sequences, the mean PD of 1000 randomized subsamples in each biotope was calculated and the SD was also shown . Moreover, the phylogenetic structure for each biotope was computed using the phylogenetic species variability (PSV) index . This index quantifies how phylogenetic closeness reduces the variance of a hypothetic neutral trait . The value 1 indicates all species are unrelated phylogenetically while 0 shows they become more related. In order to statistically test whether biotopes harbored Woesearchaeotal species that are more or less related to each other than expected, the mean PSV with distributions of mean null values (with an iteration of 1000) was compared based on two randomization procedures: null model 1 retains species occurrence and 2 retains biotope species abundance . Data were analyzed with the R package picante .
Taxon-based analysis was performed using QIIME based on the taxonomy of 16S rRNA reference database SILVA SSU 128 with Woesearchaeotal taxonomy updated by Additional file 8: Dataset S6. Subsequently, 11,323 archaeal OTUs (97% cutoff) belonging to 46 archaeal lineages (at a class level except Woesearchaeota at a phylum level) and from the 133 global studies/libraries were detected. A table of lineage relative abundance was reconstructed by considering the clusters subordinate to the main archaeal phyla and provided by default in the Greengenes tree . This table was then used to show the lineage abundance distribution (LAD) patterns of each lineage and further determine the ecological importance of the Woesearchaeotal lineage in archaeal communities. To find out whether lineages were distributed randomly (that is, Poisson distribution), the index of dispersion for each archaeal lineage was determined as the ratio of the variance to the average relative abundance multiplied by the occurrence . Lineages whose dispersion index falls between the 2.5 and 97.5% confidence interval of the χ2 distribution will follow a Poisson distribution .
A multivariate regression tree (MRT) was determined and generated by the R package mvpart to illustrate the relationship between the environmental matrix (Additional file 1: Table S1) and the table of lineage relative abundances . Meanwhile, we introduced the indicator value (IndVal) index, which combines relative occurrence frequency and relative abundance to identify archaeal lineages, like the concept of “indicator species.”
To determine the distribution patterns of each Woesearchaeotal subgroup throughout the libraries, the OTU table of Woesearchaeota was extracted by QIIME. The OTU relative abundance table of each subgroup was reconstructed, and the relative abundance of each subgroup was calculated as the sum of the abundance of Woesearchaeotal OTU subordinate to the corresponding subgroup. To ensure a good representativeness of subject, libraries with less than ten Woesearchaeotal representative sequences were discarded here. The distribution pattern of each subgroup (including ungrouped Woesearchaeota) based on relative abundance was finally visualized on a Heatmap plot by the R statistical package ggplot2. Clustering of libraries was based on the maximum likelihood module.
An ancestral state reconstruction (ASR) was performed to test the hypothesis of a relationship between biodiversity and oxygen condition in Woesearchaeota. For each Woesearchaeotal OTU (97% cutoff), character state for oxygen was coded as 1 = oxic and 2 = anoxic. ASR was implemented using Mesquite 2.75 with the maximum likelihood module .
Construction of OTU network
To identify the associations between Woesearchaeotal OTUs and other archaeal OTUs, the co-occurrence network was constructed based on the OTU occurrence frequency through the libraries. Pairwise score between OTUs represented by more than five sequences (1628 OTUs) was calculated using Spearman’s rank correlations. To ensure reliable networks, only co-occurrences satisfying a correlation with rho > 0.6 and P < 0.01 were considered for downstream analyses. According to random matrix theory (RMT) algorithm , the transition of the nearest neighbor spacing distribution (NNSD) of eigenvalues from Gaussian orthogonal ensemble (GOE) to Poisson distributions can be used as a threshold to test random co-occurrence patterns and remove random noises. A threshold of 0.90 was determined in this correlation network analysis. The network plots were visualized with the Cytoscape 2.8.3 software . Nodes indicated archaeal OTUs at 97% identity while edges indicated the significant correlations between OTUs. To profile the modularity, each network was divided into individual modules. The overall topological features of different networks were described using a set of indices including degree, closeness centrality, betweenness, node degree distribution, average node connectivity, average path length, diameter, clustering coefficient, and modularity (Additional file 1: Table S3).
Metabolic potential analysis
To explore the potential metabolic capacity of Woesearchaeota, we retrieved all their publicly available genomes from IMG/NCBI/JGI/EMBL-EBIG and finally obtained 19 genomes of Woesearchaeota for further analysis (by January 2017, Additional file 1: Table S2). Open reading frames (ORFs) were predicted on their genomic scaffolds using the metagenome mode of Prodigal and were annotated by the NCBI Prokaryotic Genome Annotation Pipeline and by the comparison with KEGG (Additional file 5: Dataset S4). Genome-based metabolic potential was determined by searching all predicted ORFs in a genome with the presence of corresponding enzymes. Metabolic processes that necessitate more than one enzyme are inferred from the presence of all corresponding enzymes (e.g., glycolysis). To infer the potential metabolic capacities of proposed subgroups of Woesearchaeota, we assigned the genomes to the corresponding groups based on their 16S rRNA gene sequences if available (marked with stars in the tree). To further identify the potential syntrophic interactions in which Woesearchaeota may be involved and, by association, in the potential metabolisms they may harbor, the key enzymes relating to metabolic processes (e.g., methanogenesis) in other high co-occurrence archaea were also explored.
We investigated the ecology, evolution, and metabolism of the widespread Woesearchaeotal lineages. The ecological and phylogenetic patterns showed that members of Woesearchaeota are widely distributed in different biotopes, with a high biodiversity that reveals 26 potential subgroups, and oxic status drives their distribution and evolutionary diversity. Metabolic reconstruction for Woesearchaeota indicated an anaerobic syntrophic lifestyle with conspicuous metabolic deficiencies. Meanwhile, high co-occurrence with methanogenic archaea suggested the possibility of a syntrophic relationship between these organisms. Our findings provide an ecological and evolutionary framework for Woesearchaeota at a global scale and indicate their potential ecological roles, especially in methanogenesis.
Lipp JS, Morono Y, Inagaki F, Hinrichs K-U. Significant contribution of Archaea to extant biomass in marine subsurface sediments. Nature. 2008;454:991–4.
Offre P, Spang A, Schleper C. Archaea in biogeochemical cycles. Annu Rev Microbiol. 2013;67:437–57.
Falkowski PG, Fenchel T, Delong EF. The microbial engines that drive Earth’s biogeochemical cycles. Science. 2008;320:1034–9.
Madsen EL. Microorganisms and their roles in fundamental biogeochemical cycles. Curr Opin Biotechnol. 2011;22:456–64.
Garcia J-L, Patel BK, Ollivier B. Taxonomic, phylogenetic, and ecological diversity of methanogenic Archaea. Anaerobe. 2000;6:205–26.
Stetter K, Huber R, Blöchl E, Kurr M. Hyperthermophilic archaea are thriving in deep North Sea and Alaskan oil reservoirs. Nature. 1993;365:743–5.
Oren A. The ecology of the extremely halophilic archaea. FEMS Microbiol Rev. 1994;13:415–39.
Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci U S A. 1990;87:4576–9.
Hug LA, Baker BJ, Anantharaman K, Brown CT, Probst AJ, Castelle CJ, et al. A new view of the tree of life. Nat Microbiol. 2016;1:16048.
Mardis ER. Next-generation DNA sequencing methods. Annu Rev Genom Hum Genet. 2008;9:387–402.
Tringe SG, Rubin EM. Metagenomics: DNA sequencing of environmental samples. Nat Rev Genet. 2005;6:805–14.
Könneke M, Bernhard AE, José R, Walker CB, Waterbury JB, Stahl DA. Isolation of an autotrophic ammonia-oxidizing marine archaeon. Nature. 2005;437:543–6.
Treusch AH, Leininger S, Kletzin A, Schuster SC, Klenk HP, Schleper C. Novel genes for nitrite reductase and Amo-related proteins indicate a role of uncultivated mesophilic crenarchaeota in nitrogen cycling. Environ Microbiol. 2005;7:1985–95.
Teske A, Sørensen KB. Uncultured archaea in deep marine subsurface sediments: have we caught them all? ISME J. 2008;2:3–18.
Castelle CJ, Wrighton KC, Thomas BC, Hug LA, Brown CT, Wilkins MJ, et al. Genomic expansion of domain archaea highlights roles for organisms from new phyla in anaerobic carbon cycling. Curr Biol. 2015;25:690–701.
Rinke C, Schwientek P, Sczyrba A, Ivanova NN, Anderson IJ, Cheng J-F, et al. Insights into the phylogeny and coding potential of microbial dark matter. Nature. 2013;499:431–7.
Ortiz-Alvarez R, Casamayor EO. High occurrence of Pacearchaeota and Woesearchaeota (Archaea superphylum DPANN) in the surface waters of oligotrophic high-altitude lakes. Environ Microbiol Rep. 2016;8:210–7.
Wanga H, Liub J, Wangc J, Yud W, Xieb H, Wangd S, et al. Different microbial distributions in the Yellow River delta. Desalin Water Treat. 2017;75:70–8.
Choi H, Koh H-W, Kim H, Chae J-C, Park S-J. Microbial community composition in the marine sediments of Jeju Island: next-generation sequencing surveys. J Microbiol Biotechnol. 2016;26:883–90.
Lipsewers YA, Hopmans EC, Sinninghe Damsté JS, Villanueva L. Potential recycling of thaumarchaeotal lipids by DPANN Archaea in seasonally hypoxic surface marine sediments. Org Geochem. 2018;119:101–09.
Chen Y, Liu Y, Wang X. Spatiotemporal variation of bacterial and archaeal communities in sediments of a drinking reservoir, Beijing, China. Appl Microbiol Biotechnol. 2017;101:3379–91.
Xu D, Liu S, Chen Q, Ni J. Microbial community compositions in different functional zones of Carrousel oxidation ditch system for domestic wastewater treatment. AMB Express. 2017;7:40.
Long Y, Yi H, Chen S, Zhang Z, Cui K, Bing Y, et al. Influences of plant type on bacterial and archaeal communities in constructed wetland treating polluted river water. Environ Sci Pollut Res. 2016;23:19570–9.
Han R, Zhang X, Liu J, Long Q, Chen L, Liu D, et al. Microbial community structure and diversity within hypersaline Keke Salt Lake environments. Can J Microbiol. 2017;63:895–908.
Liu X, Pan J, Liu Y, Li M, Gu J-D. Diversity and distribution of Archaea in global estuarine ecosystems. Sci Total Environ. 2018;637-638:349–58.
Ding J, Zhang Y, Wang H, Jian H, Leng H, Xiao X. Microbial community structure of deep-sea hydrothermal vents on the ultraslow spreading Southwest Indian ridge. Front Microbio. 2017;8:1012.
Probst AJ, Castelle CJ, Singh A, Brown CT, Anantharaman K, Sharon I, et al. Genomic resolution of a cold subsurface aquifer community provides metabolic insights for novel microbes adapted to high CO2 concentrations. Environ Microbiol. 2017;19:459–74.
Anantharaman K, Breier JA, Dick GJ. Metagenomic resolution of microbial functions in deep-sea hydrothermal plumes across the Eastern Lau Spreading Center. ISME J. 2016;10:225–39.
Brown CT, Hug LA, Thomas BC, Sharon I, Castelle CJ, Singh A, et al. Unusual biology across a group comprising more than 15% of domain Bacteria. Nature. 2015;523:208–U173.
Glöckner FO, Yilmaz P, Quast C, Gerken J, Beccati A, Ciuprina A, et al. 25 years of serving the community with ribosomal RNA gene reference databases and tools. J Biotechnol. 2017;261:169–76.
Auguet JC, Barberan A, Casamayor EO. Global ecological patterns in uncultured Archaea. ISME J. 2010;4:182–90.
Faith DP. Conservation evaluation and phylogenetic diversity. Biol Conserv. 1992;61:1–10.
Kubo K, Lloyd KG, J FB, Amann R, Teske A, Knittel K. Archaea of the Miscellaneous Crenarchaeotal Group are abundant, diverse and widespread in marine sediments. ISME J. 2012;6:1949–65.
Sorokin DY, Makarova KS, Abbas B, Ferrer M, Golyshin PN, Galinski EA, et al. Discovery of extremely halophilic, methyl-reducing euryarchaea provides insights into the evolutionary origin of methanogenesis. Nat Microbiol. 2017;2:17081.
Verberk W, van der Velde G, Esselink H. Explaining abundance-occupancy relationships in specialists and generalists: a case study on aquatic macroinvertebrates in standing waters. J Anim Ecol. 2010;79:589–601.
van der Gast CJ, Walker AW, Stressmann FA, Rogers GB, Scott P, Daniels TW, et al. Partitioning core and satellite taxa from within cystic fibrosis lung bacterial communities. ISME J. 2011;5:780–91.
Magurran AE, Henderson PA. Explaining the excess of rare species in natural species abundance distributions. Nature. 2003;422:714–6.
Fillol M, Auguet JC, Casamayor EO, Borrego CM. Insights in the ecology and evolutionary history of the Miscellaneous Crenarchaeotic Group lineage. ISME J. 2016;10:665–77.
Junker B, Schreiber F. Correlation networks. In: Analysis of biological networks; 2008.
Comte J, Lovejoy C, Crevecoeur S, Vincent WF. Co-occurrence patterns in aquatic bacterial communities across changing permafrost landscapes. Biogeosciences. 2016;13:175–90.
Deng Y, Jiang YH, Yang YF, He ZL, Luo F, Zhou JZ. Molecular ecological network analyses. BMC Bioinformatics. 2012;13:113.
Barberan A, Bates ST, Casamayor EO, Fierer N. Using network analysis to explore co-occurrence patterns in soil microbial communities. ISME J. 2012;6:343–51.
Freilich S, Kreimer A, Meilijson I, Gophna U, Sharan R, Ruppin E. The large-scale organization of the bacterial network of ecological co-occurrence interactions. Nucleic Acids Res. 2010;38:3857–68.
Faust K, Raes J. Microbial interactions: from networks to models. Nat Rev Microbiol. 2012;10:538–50.
Edelman A, Rao NR. Random matrix theory. Acta Numer. 2005;14:233–97.
Tang B, Wang Q, Yang M, Xie F, Zhu Y, Zhuo Y, et al. ContigScape: a Cytoscape plugin facilitating microbial genome gap closing. BMC Genomics. 2013;14:289.
Lloyd KG, Schreiber L, Petersen DG, Kjeldsen KU, Lever MA, Steen AD, et al. Predominant archaea in marine sediments degrade detrital proteins. Nature. 2013;496:215–8.
Paul BG, Burstein D, Castelle CJ, Handa S, Arambula D, Czornyj E, et al. Retroelement-guided protein diversification abounds in vast lineages of Bacteria and Archaea. Nat Microbiol. 2017;2:17045.
Horner-Devine MC, Silver JM, Leibold MA, Bohannan BJM, Colwell RK, Fuhrman JA, et al. A comparison of taxon co-occurrence patterns for macro- and microorganisms. Ecology. 2007;88:1345–53.
Steele JA, Countway PD, Xia L, Vigil PD, Beman JM, Kim DY, et al. Marine bacterial, archaeal and protistan association networks reveal ecological linkages. ISME J. 2011;5:1414–25.
Mayumi D, Dolfing J, Sakata S, Maeda H, Miyagawa Y, Ikarashi M, et al. Carbon dioxide concentration dictates alternative methanogenic pathways in oil reservoirs. Nat Commun. 2013;4:1998.
Morris BE, Herbst FA, Bastida F, Seifert J, von Bergen M, Richnow HH, et al. Microbial interactions during residual oil and n-fatty acid metabolism by a methanogenic consortium. Environ Microbiol Rep. 2012;4:297–306.
McInerney MJ, Sieber JR, Gunsalus RP. Syntrophy in anaerobic global carbon cycles. Curr Opin Biotechnol. 2009;20:623–32.
Morris BE, Henneberger R, Huber H, Moissl-Eichinger C. Microbial syntrophy: interaction for the common good. FEMS Microbiol Rev. 2013;37:384–406.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–3.
Ludwig W, Strunk O, Westram R, Richter L, Meier H, Yadhukumar, et al. ARB: a software environment for sequence data. Nucleic Acids Res. 2004;32:1363–71.
Letunic I, Bork P. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res. 2016;44:W242–W45.
Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7:335–6.
Baselga A. Partitioning abundance-based multiple-site dissimilarity into components: balanced variation in abundance and abundance gradients. Methods Ecol Evol. 2017;8:799–808.
McArdle BH, Anderson MJ. Fitting multivariate models to community data: a comment on distance-based redundancy analysis. Ecology. 2001;82:290–7.
Helmus MR, Bland TJ, Williams CK, Ives AR. Phylogenetic measures of biodiversity. Am Nat. 2007;169:E68–83.
Kembel SW, Cowan PD, Helmus MR, Cornwell WK, Morlon H, Ackerly DD, et al. Picante: R tools for integrating phylogenies and ecology. Bioinformatics. 2010;26:1463–4.
McDonald D, Price MN, Goodrich J, Nawrocki EP, DeSantis TZ, Probst A, et al. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J. 2012;6:610–8.
Krebs CJ. Ecological methodology 2nd edn. Addison-Wesley Educational Publishers, Inc: New York, NY, USA Harper & Row New York; 1989.
De'Ath G. Multivariate regression trees: a new technique for modeling species–environment relationships. Ecology. 2002;83:1105–17.
Maddison WP, Maddison DR. Mesquite: a modular system for evolutionary analysis. 2001.
We greatly acknowledge all the authors who provided valuable data for this study. We are also thankful to editors and anonymous reviewers for valuable feedbacks and constructive comments.
This work was supported by the Ph.D. Scholarship of The University of Hong Kong (XL), the National Natural Science Foundation of China (grant no. 31622002, 41506163, 31600093, and 31700430), the Science and Technology Innovation Committee of Shenzhen (grant no. JCYJ20170818091727570), and the Key Project of Department of Education of Guangdong Province (No.2017KZDXM071) (ML).
Availability of data and materials
Ethics approval and consent to participate
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Figures 1-4 and Tables 1-5. (PDF 18534 kb)
Supplementary Methods. (DOCX 23 kb)
Dataset S6 Subgroup-assigned taxonomy of 663 representative 16S rRNA sequences (Woese_97_taxonomy_7_levels) used for determining the distribution patterns of each subgroup in the seven types of habitats throughout 133 libraries/studies in main text Fig. 3a and Additional file 1: Figure S3. (TXT 106 kb)