Skip to main content


  • Research
  • Open Access

The chemodiversity of paddy soil dissolved organic matter correlates with microbial community at continental scales

  • 1,
  • 2,
  • 3, 4,
  • 5,
  • 1,
  • 6,
  • 1,
  • 1, 7,
  • 1, 7,
  • 1,
  • 3, 4,
  • 3, 4,
  • 8,
  • 5,
  • 3, 4Email author and
  • 1, 7, 9Email author
Contributed equally

  • Received: 28 March 2018
  • Accepted: 20 September 2018
  • Published:



Paddy soil dissolved organic matter (DOM) represents a major hotspot for soil biogeochemistry, yet we know little about its chemodiversity let alone the microbial community that shapes it. Here, we leveraged ultrahigh-resolution mass spectrometry, amplicon, and metagenomic sequencing to characterize the molecular distribution of DOM and the taxonomic and functional microbial diversity in paddy soils across China. We hypothesized that variances in microbial community significantly associate with changes in soil DOM molecular composition.


We report that both microbial and DOM profiles revealed geographic patterns that were associated with variation in mean monthly precipitation, mean annual temperature, and pH. DOM molecular diversity was significantly correlated with microbial taxonomic diversity. An increase in DOM molecules categorized as peptides, carbohydrates, and unsaturated aliphatics, and a decrease in those belonging to polyphenolics and polycyclic aromatics, significantly correlated with proportional changes in some of the microbial taxa, such as Syntrophobacterales, Thermoleophilia, Geobacter, Spirochaeta, Gaiella, and Defluviicoccus. DOM composition was also associated with the relative abundances of the microbial metabolic pathways, such as anaerobic carbon fixation, glycolysis, lignolysis, fermentation, and methanogenesis.


Our study demonstrates the continental-scale distribution of DOM is significantly correlated with the taxonomic profile and metabolic potential of the rice paddy microbiome. Abiotic factors that have a distinct effect on community structure can also influence the chemodiversity of DOM and vice versa. Deciphering these associations and the underlying mechanisms can precipitate understanding of the complex ecology of paddy soils, as well as help assess the effects of human activities on biogeochemistry and greenhouse gas emissions in paddy soils.


  • Dissolved organic matter
  • Paddy soil
  • Chemodiversity
  • Microbial diversity


Paddy fields, 90% of which are in Asia, feed more than half of the world’s population [1]. The continuous flooding in bunded fields of cultivated rice (Oryza sativa) utilizes 24–30% of the world’s developed freshwater resources and represents one of the major sources of inland aquatic dissolved organic matter (DOM) [1, 2]. High concentrations and fluxes of DOM from plant debris during flooding seasons trigger microbial activity, while anaerobic conditions stabilize DOM against microbial decay via interactions with clay minerals and iron oxides [1, 3]. DOM plays a central role in biogeochemical processes in both flooded and unflooded paddy soils, as well as an active role in the global carbon cycle [1, 4].

Recently, the evidence-based soil continuum model questioned the secondary synthesis of “humic substances,” or the “humification,” and interpreted organic debris as a unique source of soil organic matter (SOM) and DOM [5]. This theory emphasized the inherent association between soil microbial metabolism and DOM heterogeneity. Microbes play an important role in carbon and nitrogen cycling as well as methane production and consumption in soils [1, 6, 7], which influences carbon balance, greenhouse gas production, crop productivity, and water eutrophication [8]. Therefore, a growing body of research has focused on the biogeography of microbial communities [911]. However, despite efforts to characterize the drivers of DOM concentration dynamics [8, 1215], no attempts have been made to combine these data so as to understand the associated properties of each and the environmental factors that drive them. Ultrahigh-resolution Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) enables detailed characterization of DOM molecular distribution [13, 16, 17]. This approach has been applied to marine [18, 19] and inland water [4, 20, 21], and a handful of comparative experiments have examined soil DOM at the molecular level [3, 14, 15, 22]; however, the microbial taxonomic and metabolic structures that influence the molecular distribution of soil DOM remain unknown.

To our knowledge, no comprehensive study has yet been performed to elucidate the natural relationship between microbial metabolisms and DOM molecular distribution in paddy soils on continental scales. We apply FT-ICR-MS plus amplicon and metagenomic sequencing to characterize the association between microbial community structure and function with DOM molecular composition in flooded paddy soils. We hypothesized that taxonomic and functional composition of soil microbial communities is significantly associated with DOM molecular composition in paddy soils and, moreover, that geographic and edaphic factors significantly affect this interdependence.


Microbial and DOM biogeography in paddy soils

Across four rice-growing regions in China, we collected soil samples from 88 flooded paddy fields, wherein most rice plants were at the tillering phase (Fig. 1 and Table 1). Based on 16S rRNA gene sequencing, Anaeromyxobacter (1.9% ± 0.9%), Geobacter (1.0% ± 0.5%), Anaerolinea (0.9% ± 0.6%), and Haliangium (0.8% ± 0.3%) were the most abundant genera. Bacterial richness (Chao1 and observed species) and diversity (Shannon and PD whole tree) were significantly different between regions (Additional file 1: Figure S1A), with the lowest diversity and richness observed in samples from Sanjiang Plain (P < 0.05, Dunn’s test). Microbial β-diversity (variance adjusted weighted UniFrac; VAW-UniFrac) was significantly correlated to the distance between sites (Mantel r = 0.52, P < 0.001; Fig. 2a). The proportions of the dominant taxa at each phylogenetic level differed significantly by region (Additional file 1: Figure S2). Among the 20 most abundant genera, only Gemmatimonas and Pseudolabrys were stably abundant across regions, while the other genera were significantly different between regions (Fig. 2b).
Fig. 1
Fig. 1

Map of sampling sites (dark-red dots) across four typical Chinese rice-growing regions (red dotted ellipses). It shows broad patterns of rice field. The map was colored to depict circa the year 2000 area (harvested) and yield of rice crops of China [90]

Table 1

Various characteristics of the four typical Chinese rice-growing regions


Hani Terrace

Taihu Plain

Sanjiang Plain

Lianghu Plain

Administrative region

Yunnan Province.

Jiangsu, Zhejiang, Anhui Provinces, and Shanghai Municipality

Heilongjiang Province

Hunan and Hubei Provinces

Longitude range





Latitude range






3-D climate, (sub)tropics




Elevation (m)





Rice cultivation history

~ 1200 years

~ 5000 years

~ 60 years

~ 4000 years

Rice cultivars (Oryza sativa L.) [89]

Most are japonica

Half are indica and half are japonica

All are japonica

Most are indica

Cultivation method


modern and small-scale

modern and large-scale

modern and medium-scale

Mean pH

6.07 ± 0.60

6.60 ± 0.70

6.32 ± 0.32

6.80 ± 0.68

Mean annual temperature (°C)

16.21 ± 1.77

15.66 ± 0.41

2.85 ± 0.78

16.68 ± 0.35

Mean annual precipitation (mm/d)

3.20 ± 0.19

3.74 ± 0.19

1.86 ± 0.13

3.45 ± 0.29

Dissolved organic carbon (mg/ml)

23.51 ± 6.84

31.68 ± 11.41

29.28 ± 8.39

31.61 ± 14.40

Fig. 2
Fig. 2

Distributions of DOM molecules and bacteria across 88 paddy soils from four regions. a Non-metric multidimensional scaling of bacterial OTUs (matrix: variance adjusted weighted UniFrac, stress = 0.1918). b Boxplots showing regional differences of top 20 genera. Genera with significant differences (Kruskal-Wallis test) were labeled with "*" (P < 0.05), "**" (P < 0.01), and "***" (P < 0.001). For each genus, paired boxes containing no same letter are considered to be significantly different (P < 0.05, Dunn’s test). c Non-metric multidimensional scaling of DOM molecules (matrix: Bray-Curtis, stress = 0.1479). d Kernel density plots of DOM formula in van Krevelen diagrams. Darker color indicates a higher molecular density. Accumulated areas α, β, ɣ, δ, and ε are considered to be clusters of highly unsaturated/aromatic hydrocarbon, phenolics, polycyclic aromatics, polyphenols, and unsaturated aliphatics or peptides, respectively

DOM analysis revealed 81,759 compounds in total with an average of 8262 ± 1187 compounds at each site (Additional file 2: Table S1). A core group of 18,538 molecules was observed in at least 10 sites; 12,791 of these compounds could be assigned putative molecular formulae. Chao1 of DOM molecules was significantly different between the four regions (Additional file 1: Figure S1B), while Observed species and Shannon diversity were not. β-diversity (Bray-Curtis) of DOM was still significantly associated with distance between sites (Mantel r = 0.24, P < 0.001; Fig. 2c). Weighted density plots of DOM components in van Krevelen diagrams (Fig. 2d) visualized the significant influence of geographic region on the abundance of DOM molecular groups (permutational multivariate analysis of variance, PERMANOVA r2 = 0.17, P < 0.001). For example, unsaturated/aromatic hydrocarbons were enriched in Hani Terrace and Lianghu Plain samples (area α), phenolics were enriched in the Hani Terrace samples (area β), polycyclic aromatics were enriched in Taihu Plain samples (area ɣ), and polyphenols were enriched in Sanjiang Plain and Taihu Plain samples (area δ). Unsaturated aliphatics and peptides were enriched in samples of Hani Terrace (area ε).

Canonical correspondence analysis (CCA) and partial CCA were performed to estimate the contribution of environmental factors to the variance in microbial and DOM diversity across sites. The variance in microbial community was best explained by the variance in DOM composition (27%; Fig. 3a and Additional file 3: Table S2), with the converse being true as well (26.8%; Fig. 3b and Additional file 4: Table S3). This was confirmed using Procrustes analysis, which demonstrated that the dissimilarity of DOM and microbial communities between samples was significantly and strongly correlated (m122 = 0.66, PMonte Carlo < 0.001). More specifically, the second principal coordinate (PCo2) of the microbial Bray-Curtis (5.5%) and VAW-UniFrac distances (5.9%) as well as functional potentials (3.0%–4.5%) were the main contributors of DOM variance; PCo1 of DOM Bray-Curtis distance (4.3%) was the main contributor of microbial variance, followed by DOM alpha diversity indicies and PCo1–2 of phenolics, peptides, polyphenols, and polycyclic aromatic (in the column of V.E. CCA and V.E. pCCA, Additional file 3: Table S2 and Additional file 4: Table S3). Individual edaphic factors of pH (3.5%), conductivity (3.7%), real-time air temperature (RAT; 3.5%), real-time soil temperature (RST; 4.6%), tiller number (4.3%), and all geographic factors (4.4–6.9%) also described microbial variance (Additional file 3: Table S2). It should be noted that conductivity was used to quantify water-soluble ions [23]; tiller number and plant height were used to roughly indicate the rice growth stage and the size of plants, and they were classified into edaphic factors for simplicity. Meanwhile, edaphic factors of pH (3.4%), dissolved organic carbon (DOC; 3.3%), and geographic factors of elevation (4.4%), latitude (4.3%), longitude (4.1%), and mean monthly precipitation (MMP; 4.3%) and mean annual temperature (MAT; 3.2%) also described the variance in DOM (Additional file 4: Table S3).
Fig. 3
Fig. 3

Associations between DOM composition, bacterial community, and the environmental drivers. a The influences of geographical factors, edaphic factors, and DOM composition features on the bacterial community structure estimated via canonical correspondence analyses (CCA). b The influences of geographical factors, edaphic factors, and bacteria community features on the DOM composition estimated via CCA. The percentages represent the variance explained. c, d Multivariate analysis of microbial or molecular data and drivers using non-metric multidimensional scaling (NMDS). Ordinations are based on Bray-Curtis (c, stress = 0.1656; d, stress = 0.1479). Geographical factors, edaphic factors, and DOM composition (c) or bacteria (d) community were fit to the ordination using envfit function, respectively. Only factors with significance level ≤ 0.001 were shown. e Spearman’s rank correlations between DOM diversity features (y axis) and edaphic, geographical, and bacterial factors (x axis), with color coded in blue, red, and yellow, respectively. All factors imported and their influences are listed in Additional file 3: Table S2 and Additional file 4: Table S3. MAP, mean annual precipitation; MAT, mean annual temperature; MMP, mean monthly precipitation; MMT, mean annual temperature; RST, real-time soil temperature; RAT, real-time air temperature

To confirm these findings, we fitted these factors to unconstrained non-metric multidimensional scaling (NMDS) ordination (Fig. 3c, d, Additional file 3: Table S2 and Additional file 4: Table S3). Significant correlations (P ≤ 0.001) were observed between the variances of microbial/DOM composition and MMP (r2 = 0.637/0.355), MAT (r2 = 0.742/0.200), pH (r2 = 0.407/0.409), and elevation (r2 = 0.342/0.347), as well as latitude (r2 = 0.613/0.358) and longitude (r2 = 0.553/0.305; Additional file 3: Table S2 and Additional file 4: Table S3). The associations between edaphic, geographical, bacterial (x-axis), and DOM (y-axis) factors were calculated using Spearman’s rank correlation (Fig. 3e). The PCo2 of the microbial Bray-Curtis and VAW-UniFrac distances, latitude, longitude, MMP, MAT, pH, and predicted functional potential were all strongly correlated with the PCo1 of DOM (P < 0.001; Fig. 3e and Additional file 4: Table S3). MMP, as well as the second coordinates (PCo2) of microbial Bray-Curtis and VAW-UniFrac distances, described the geographic variance in DOM (PCo1) and DOM features, while pH, and the PCo1 of microbial distances, described another pattern of DOM variance (PCo2) (Fig. 3e). Precipitation was positively correlated with the relative abundance of carbohydrates and peptides while negatively correlated with the relative abundance of polycyclic aromatic compounds (Fig. 3e and Additional file 1: Figure S3).

Characterizing the association between DOM molecular and bacterial taxonomic composition

We performed canonical correlation analysis (CCorA) to characterize the association between bacterial and DOM composition (Fig. 4a, b). Using the PCo1–5 axes of DOM and microbial Bray-Curtis matrices, we observed four out of five canonical axes with significant correlation coefficients (P < 0.01, chi-square test), which confirmed the multi-dimensional associations between bacterial community and DOM composition (P = 0.0001; Fig. 4a). Each pair of ordinations along canonical axis represents two correlated dimensions of the multivariate matrices of microbial community and DOM composition (e.g., Spearman’s rank correlation ρ = 0.89, P = 2.2 × 10−16 for the first pair; Fig. 4b).
Fig. 4
Fig. 4

Associations between DOM composition and bacterial community. Canonical correlation analysis was conducted using first five principal coordinates (PCo1–5). a Combined score plots of the corresponding ordinations pairs of DOM composition and microbial community along the first and second canonical axes. The lengths of connecting lines represent the dissimilarities of DOM composition and microbial community. b Plot showing loading coefficients of the five pairs of principal coordinates imported on the corresponding first two pairs of canonical axes. c, d van Krevelen plots of DOM molecules showing positive and negative Spearman’s rank correlations with the DOM ordinations along the first (c) and second (d) canonical axes, indicating their association with microbial community dynamics. e Taxonomic cladogram showing positive and negative correlations with microbial ordinations along the first two canonical axes, indicating their association with DOM compositional changes. Rings of the cladogram provide a heatmap of the genera (> 0.01%) among the sampling regions with red and blue meaning more and less accumulated, respectively. Relative abundances of the taxa or genera are shown as the clade marker size or level 5 ring height. The color gradient bars in c and d, “purple to green” (for the first canonical axis) and “orange red to blue” (for the second canonical axis), indicate the values of coefficients. These color scales are also applied to the associations between taxa and microbial ordinations along the first and second canonical axes (e). Compounds category m: polycyclic aromatics; n: polyphenols and polycyclic aromatics with aliphatic chains; o: phenolic and highly unsaturated compounds; p: unsaturated aliphatics and aromatics with aliphatic chains; q: saturated fatty, sulfonic acids, and carbohydrates; r: N-containing compounds, i.e., peptides

DOM molecules and microbial taxa that were responsible for the first two pairs of correlated ordinations of DOM and microbial community were figured out using Spearman’s rank correlation test (Fig. 4c–e). Bacterial taxa showing significant correlation in the first canonical axis were enriched in Hani Terrace samples and attenuated in Sanjiang Plain samples. These included organisms (e.g., Geobacter, Syntrophorhabdus, and Spirochaeta) (Fig. 4e) that were positively correlated with highly unsaturated/aromatic hydrocarbon (Fig. 4c; area α in Fig. 2d) and phenolics (area β) and negatively correlated with polycyclic aromatics (area ɣ) and polyphenols (area δ). However, some taxa (e.g., Gaiella and Defluviicoccus) demonstrated an opposite trend (Fig. 4e). Also, other taxa and DOM compounds were correlated in the second canonical axis (Fig. 4d, e).

Factors of the covariation between microbial taxa and DOM molecules

To determine the factors of covariation between the microbiome and DOM, we correlated MMP, MAT, and pH against the DOM and microbial community ordinations along the first two canonical axes using Spearman’s rank correlation test. For covariation along the first axis (Fig. 4b), MMP and pH showed strong and significant correlations (ρ = − 0.68/− 0.65, P = 3.85 × 10−13/9.03 × 10−12 for pH with DOM/microbial ordination; ρ = 0.52/0.46, P = 2.17 × 10−7/6.48 × 10−6 for MMP with DOM/microbial ordination), while MAT was not correlated (P > 0.1). For covariation along the second axis (Fig. 4b), MAT, MMP, and pH showed strong and significant correlations (ρ = − 0.40/− 0.59, P = 1.01 × 10−4/1.33 × 10−9 for pH with DOM/microbial ordination; ρ = − 0.51/− 0.50, P = 3.93 × 10−7/8.01 × 10−7 for MMP with DOM/microbial ordination; ρ = − 0.71/− 0.72, P = 9.50 × 10−15/2.30 × 10−15 for MAT with DOM/microbial ordination). In our study, since MMP and MAT changed in opposite trends along the increased elevation in Hani Terrace sites, MMP and MAT, as well as pH, were not correlated with each other (P > 0.05).

DOM composition correlates with microbial functional potential

To characterize the microbial functional potential, four samples from each of the four regions were selected for deep shotgun metagenomic sequencing. The most abundant genera were Streptomyces (3.8%), Anaeromyxobacter (3.2%), Bradyrhizobium (3.0%), Mycobacterium (1.7%), Solibacter (1.3%), and Geobacter (1.2%). The abundance of this particular collection of taxa roughly paralleled what was previously observed in the 16S rRNA analysis (for details see Additional file 5 and Additional file 1: Figure S4).

The relationship between DOM composition and the relative abundance of metagenomic functional genes was determined by CCorA using PCo1–5 axes of DOM and PCo1–2 axes of FOAM (Functional Ontology Assignments for Metagenomes, a functional gene database) orthologs (explained 70% and 65% variances, respectively). We observed two canonical axes with significant correlation coefficients (P < 0.01, chi-square test), which confirmed the association between microbial function potential and DOM composition (P = 0.012, r1 = 0.89, r2 = 0.76). The association between DOM and FOAM orthologs (Fig. 5a) followed a similar trend to the association between DOM and microbial taxonomy (Fig. 4c, d). Redundancy analysis (built-in function of CCorA) indicated that the functional potential distribution could be predicted by the DOM distribution with an adjusted R2 of 0.70, while the distribution of DOM was correlated to functional potentials with an adjusted R2 of 0.35. Consistently, 109 out of 129 observed FOAM level 2 functional pathways were significantly (|r| > 0.5, P < 0.05) correlated with the variances of DOM in the first two canonical axes (Additional file 6: Table S4).
Fig. 5
Fig. 5

Associations between soil DOM composition and microbial functional capabilities estimated by metagenomics. a van Krevelen plot showing the positive and negative Spearman’s rank correlations of DOM molecules with the first canonical axis of canonical correlation analysis which indicates the correlation between DOM composition and FOAM orthologs. “Blue to red” gradient colors indicate the values of the coefficients. b Pearson’s correlation coefficient between relative abundances of biogeochemical functions (y axis) and DOM diversity features (x axis). Significant correlations (P < 0.05) are indicated by the black boxes. The first column of b shows the positive and negative Pearson’s correlation coefficients between biogeochemical functions and the first canonical axis. c FOAM orthologs correlated with DOM distributions, regarding methanogenesis in KEGG module M00567 and M00357. The protein catalog for quantification was restricted to methanogens (Additional file 9: Table S7) using Kaiju [70]. Moreover, it was further restricted to Methanothrix and Methanosarcina for aceticlastic methanogenesis. The red, blue, and yellow colors indicate significantly (P < 0.05) positive, negative, and nonsignificant (P > 0.05) Pearson’s correlation coefficients between FOAM orthologs and the DOM variance along the first canonical axis, respectively. Gray color indicates orthologs detected in less than eight samples, whereas white indicates orthologs undetected

To avoid risk of false correlations, we further used marker genes of certain biogeochemical functions to qualitatively evaluate the associations (Fig. 5b). Consistently, functional genes involved in glycolysis (Embden-Meyerhof-Parnos, EMP), anaerobic C-fixation (light-independent), pyruvate fermentation (to butyrate), methanogenesis, methane oxidation (soluble methane monooxygenase), and hydrogen metabolism, as well as H4MPT-linked C1 transfer pathway (hydrogenotrophic methanogenesis associated), showed positive correlation with compositional changes in DOM along the first canonical axis, whereas those related to pyruvate fermentation (to lactate), carbon monoxide oxidation, urea degradation, nitrite oxidation, dissimilatory nitrite reduction (nirK), and ligninase, as well as the caa3-type and bo-type cytochrome oxidases, were negatively correlated (Fig. 5b). Compositional changes of DOM along the first canonical axis were characterized by increases in highly unsaturated/aromatic hydrocarbon and phenolics, peptides, and unsaturated aliphatics (O/C < 0.5), as well as decreases in polycyclic aromatics and polyphenols (Fig. 5a, upper panel). FOAM function categories associated with the tricarboxylic acid (TCA) cycle and homoacetogenesis were positively correlated with compositional change of DOM along the first canonical axis, whereas hydrocarbon degradation, cellular response to oxidative stress, and fatty acid oxidation pathways were negatively correlated (Additional file 6: Table S4).

To confirm the findings from this limited metagenomic dataset, Tax4fun [24] was used to predict the abundance of functional genes based on the 16S rRNA amplicon data. The correlation between predicted functional potential and DOM was mostly consistent with the correlation observed for the real metagenomic data (Additional file 5 and Additional file 7: Table S5). The metagenomic taxa that correlated with DOM along the first canonical axis of Fig. 4c also showed significant correlation with DOM along the first canonical axis of Fig. 5a. These include Geobacter (Pearson’s correlation coefficient, r = 0.83, P = 6.32 × 10−5), Syntrophobacterales (r = 0.94, P = 5.98 × 10−8), Spirochaeta (r = 0.92, P = 3.23 × 10−7), Thermoleophilia (r = − 0.62, P = 1.01 × 10−2), and Gaiella (r = − 0.72, P = 1.53 × 10−3).

The relative abundance of methanogens (0.43%) as predicted from 16S rRNA analysis had the same correlation with the DOM variance along the first canonical axis (Pearson’s correlation coefficient r = 0.93, P = 2.16 × 10−7), as the genes encoding for methanogenesis (Fig. 5c). Despite their distinctly different substrate range, we found that all methanogenic genera correlated with DOM variance along the first canonical axis (Additional file 8: Table S6). Consistently, both functional ortholog groups (FOAM orthologs) of hydrogenotrophic and aceticlastic methanogenesis showed significant correlation with DOM variance along the first canonical axis (Fig. 5c and Additional file 6: Table S4). Strikingly, methanogenesis via CO2 reduction (TMP = 4.54, K00320) had a greater functional potential than aceticlastic methanogenesis (TMP = 1.93, K00193, K00194, K00197) in all paddy sites. Methanogenesis can be performed by syntrophic methanogenic consortia via direct interspecies electron transfer (DIET), mediated by electrically conductive pili (e-pili) or biochar and other conductive materials [25, 26]. So we calculated the Pearson’s correlation coefficient between Fe(III)-reducing bacteria with/without e-pilin encoding genes and the first canonical axis of DOM (Additional file 9: Table S7). Seventeen out of 19 bacteria encoding e-pilin were significantly correlated with DOM variance along the first canonical axis, while 35 out of 70 bacteria without e-pili were also significantly correlated. Consistently, electrically conductive pilin (e-pilin, 46.6%) was significantly positively correlated to DOM compositional change, while long type IVa pilin (53.4%) was not (r = 0.61, P = 0.013 for e-pilin; P > 0.05 for long type IVa pilin).


In this study, we demonstrated that DOM molecular distribution correlates with microbial community structure, taxonomy, and functional potential in paddy soils from sites representing gradients of temperature, precipitation, pH, and human activity. Both molecular distribution of DOM and microbial community structure exhibited significant biogeographic patterns. We considered how biotic and abiotic factors, like geographic distance, MMP, pH, MAT, elevation, and the interactions between microbial communities and DOM molecules, might drive the biogeography of microbial communities and the chemogeography of DOM molecules. These results expand our knowledge of how microbes and DOM molecules are distributed throughout the rice paddy ecosystem.

While variations in the type and abundance of paddy soil DOM molecules have typically been chalked-up to temperature, moisture, pH, and mineralogy [15, 2729], here, we demonstrated that the microbiome describes DOM variance to a greater extent than any other edaphic or geographic factor investigated. This result emphasizes the overriding potential impact that biotic function factors can have on constructing the DOM heterogeneity when compared to other abiotic functions in paddy soil. The heterogeneous characteristic of DOM is partly attributed to the chemistry of plant-derived compounds and their decomposition byproducts [15, 30, 31]. For instance, genes associated with ligninase or hydrocarbon (mostly aromatics) degradation were positively correlated with polyphenolic and phenolic compounds and polycyclic aromatics in DOM of flooded paddy soils. This indicated biodegradation processes from large biopolymers (e.g., lignin) towards small molecules with concurrent increases in polar and ionizable groups and thus with increase in solubility [5]. Meanwhile, these compounds (known as terrestrial DOM) characterized by high cyclization were selectively coprecipitated with iron at the redox interface during their upward diffusion together with the ferrous ion [3]. Dissimilatory Fe(III)-reducing microbes, e.g., Geobacter, were responsible for the production of ferrous ion and thus the coprecipitation of terrestrial DOM with Fe(III) in paddy soil [32, 33]. Many of the DOM compounds may also be derived from microbial residues, e.g., cellular materials and extracellular secretions [34, 35]. Recently, SOM chemistry study in model soils also demonstrated that the accumulation of chemically diverse SOM was driven by distinct microbial communities, per se, rather than the substrates they utilized [36].

Conversely, we also realized that the abundance and distribution of DOM explained variations in the paddy soil microbiome better than any other examined edaphic or geographic factors. The chemical nature of SOM was reported to affect the structure and functioning of paddy soil microbiota [6, 37]. In our study, more biodegradable substances, e.g., peptides and carbohydrates, favored microbes (e.g., Geobacter) utilizing mainly simple C forms and functions regarding glycolysis (EMP) and TCA cycle in the flooded paddy. This heightened concentration of biodegradable substances also facilitated pyruvate fermentation, and associated metabolic processes, e.g. methanogenesis and homoacetogenesis [38]. Microbial consortia that cooperatively exchange electrons were pivotal in the anaerobic processing of SOM [25, 26]. Therefore, the increase of Fe(III)-reducing bacteria encoding e-pilin may promote the propagation of hydrogenotrophic methanogens. Consistently, we found that gene encoding e-pilin, as a potential indicator of the DIET [39, 40], was significantly correlated with DOM variance. Moreover, a recent study proved that the aceticlastic methanogen Methanothrix spp. receives electrons for the reduction of CO2 via DIET from microorganisms expressing e-pilin genes, e.g., Geobacter [39]. Decreased abundance of DOM compounds with quinone moieties, which were likely derived from polyphenolic and phenolic compounds and polycyclic aromatics, reduced the possibility that electrons are being transferred to these DOM and used in Fe(III) reduction [41, 42], thus increasing the electrons available for methanogenesis.

Of the abiotic factors, precipitation (MMP), temperature (MAT), and pH explained the majority of the main variance in DOM composition across the continental scale paddy fields. pH and temperature have also been reported to drive soil microbial community composition at continental scales [10, 11]. Here, we have found that temperature (MAT) and pH were significantly correlated with the covariations between DOM and microbial community. Therefore, the influences of pH and temperature on DOM diversity may be mediated by microbial community. Consistently, pH was not strongly correlated with the majority of individual DOM molecules (Additional file 1: Figure S3), but it could explain the DOM component variance, implying an indirect effect on DOM molecular distribution. Elevated temperature may stimulate the biodegradation of plant residues, but the consequences for microbial-derived residues are less clear [11, 36].

Precipitation shapes DOM chemodiversity as it enhances upward movement of ferrous ions and DOM molecules from deep soils, and influences the influx of selective DOM from the surface soil. Precipitation events dilute not only the ion and DOM concentrations of the standing water per se, but also that of the irrigation water sources (i.e., the nearby bodies of water), enhancing the upward movement of Fe(II) and DOM from deep soils [43]. Meanwhile, terrestrial DOM would be selectively trapped at the Ap horizon (oxic and partly oxic) via coagulation with the precipitating Fe(III) during the upward diffusion [1, 3, 44], which then accelerates the upward diffusion of these compounds. Moreover, terrestrial DOM is relatively harder to be regenerated when compared to carbohydrate and peptide [45]. Besides, abundant rainfall preserves the gradient in reduction potential across depth, which increases not only the mineral reduction but also the opportunity of DOM reduction in paddy fields [1, 6, 43]. As evidence, significant correlations were found between precipitation (MMP) and genes encoding caa3-type and bo-type cytochrome oxidases (Fig. 5b) and cellular response to oxidative stress (Additional file 7: Table S5). The caa3-type and bo-type cytochrome oxidases are mostly found in aerobes, while cbb3-type and bd-type cytochrome oxidases have been reported to be utilized by anaerobes or microaerophiles in microaerobic energy metabolism [46, 47]. In this case, precipitation may also shape microbial community.

Factors like parential metrial, redox state, fertilization level, pesticide application, mineralogy, rice cultivar, and growth stage may also influence the geography of the microbial community and DOM [29, 4853]. In submerged paddy soil, rice aerenchyma enables the transport of atmospheric O2 to the roots [54], influencing soil redox states [55]; moreover, rice straw and stubble are assumed to provide substrates for microbial activity in the early growth stage, while exudates become more important during late tillering and ripening [56, 57]. Therefore, the effect of the rice plant on the soil microbial community largely depends on the plants growth stage. In this study, the tiller number of the rice (indicating the growth stage) was significantly correlated with DOM variance (PCo1) and DOM features (Fig. 3e). Recently, a relatively comprehensive study on the DOM chemodiversity of paddy soil and factors, including mineral elements of Fe, Mn, Al, Mg, Ca, and Si, demonstrated that the iron complexing index (Fep/FeR), together with pH and C/N ratio, were key factors controlling DOM profiles [29]. Another study on agricultural, meadow, and forest soils revealed that pH and nitrate significantly affect the chemical composition of DOM molecules [14]. These factors were also separately found to significantly correlate with the microbiome in paddy soils [48, 58, 59]. Although rice cultivation management dominated the microbial community assembly in paddy soils [60], the soil parent material was also influential [58], and hence, the influence on DOM distribution should be further investigated. Although these previous studies and the current one presented here principally focus on the spatial distribution of the microbiome and DOM molecules, how these patterns change over time is also important [61]. Redox potential is one of the key temporal factors and is controlled by soil ventilation [1]. Temporal changes of irrigation management, precipitation, and even light intensity may quickly change the redox condition in soils. Researchers have revealed that redox potential could significantly shift microbial community composition [6, 53]. Retention of certain DOM molecules by soil minerals and their subsequent stabilization against microbial decay were also largely dependent on the redox state [3, 5]. However, it remains unknown whether the temporal dynamics of microbial communities correlates shifts in DOM composition. Therefore, there is a continued need for new, well-controlled studies to further elucidate DOM chemogeography and microbial biogeography.


Understanding the relationship between soil DOM and microbial community structure and function remains a research goal for biogeochemists, especially at the molecular level [8, 52]. Here, we integrated mass spectra and genomics data to characterize the association between DOM molecular distribution and microbial diversity and applied gene-centric analysis to elucidate the microbial metabolic potential that responds and shapes DOM heterogeneity. DOM chemodiversity was significantly and broadly correlated with the taxonomy and functional potential of the microbial community in paddy soil. Besides pH and temperature, precipitation was also found to be a potential factor of microbial community and DOM chemical distribution. These findings are foundational, but could be of great importance for environmental and agricultural management in paddy soils.


Site selection and soil sampling

Soil samples were collected from 88 flooded paddy sites across four typical Chinese rice-growing regions in 2014 and 2015 (Fig. 1). Most of the soils were sampled during the tillering phase of rice plants in the paddy fields. Among 88 sampling sites, there were 23 from Hani Terrace, 24 from Sanjiang Plain, 18 from Lianghu Plain, and 23 from Taihu Plain (Table 1). At each site, soil cores (2.5 cm diameter by 15 cm depth) were sieved (2 mm) and homogenized, and plant materials were removed immediately before sealing and transportation. For more details about soil sampling and characteristics measurements (Additional file 10: Table S8), please see supplementary methods in Additional file 11.

FT-ICR-MS data analysis

FT-ICR-MS samples were prepared and measured according to Kellerman et al. [4] with some modifications (for details see supplementary methods in Additional file 11). Detected peaks were considered if the signal-to-noise ratio was greater than five. After calibration, different spectral peaks were clustered into operational units within a mass tolerance with m/z difference ratios less than 1 × 10−6. The detailed methods for calibration and clustering are described in Additional file 11. Clusters with fewer than ten peaks were not considered for further annotation.

Based on the two mandatory and two optional steps for peak annotation by Koch et al. [16], we introduced a carbon isotope ratio-based molecular annotation approach (Additional file 1: Figure S5), in which molecular formulae are assigned to peaks according to stringent criteria with elemental combinations of C1-100H1-150O0-50N0-4P0-1S0-1. The isotope-based approach first tries to find the carbon isotope peak of a certain large peak according to mass differences and then calculates the potential C number in the molecular formula based on the relatively stable ratio of naturally occurring 13C-isotope to 12C (i.e., 1.07%) and the intensity ratio of the two peaks. At the same time, several alternative formulas are calculated according to an a priori definition of elements and unequivocal exclusion criteria. Then, the carbon numbers of these formulas are subtracted by the potential C number for carbon number differences (Cdev), and the molecular formula with the smallest Cdev is chosen. In this study, Cdev was defined as acceptable when the Cdev was (− 3, 1) [16]. Annotated formulas were then used as a scaffold, and a “chemical building block” approach was adopted to annotate the rest of the peaks. Meanwhile, the relevant 13C, 15N, 34S, 33S, 18O, 17O, 2H, 13C2, 13C3, and 13C34S isotope molecules were all determined, if detected.

The annotated molecules were assigned to compound categories based on the stoichiometry of their molecular formulas [45]: polycyclic aromatics (aromaticity index, AI > 0.66) [62], polyphenols and polycyclic aromatics with aliphatic chains (0.66 ≥ AI > 0.50), phenolic and highly unsaturated/aromatic compounds (AI ≤ 0.50 and H/C ≤ 1.5), unsaturated aliphatics and aromatics with aliphatic chains (N = 0, AI ≤ 0.50, 2.0 ≥ H/C ≥ 1.5), saturated fatty, sulfonic acids and carbohydrates (H/C ≥ 2.0 or O/C ≥ 0.9), and N-containing compounds, i.e., peptides (N ≥ 1, AI ≤0.50, 2.0 ≥ H/C ≥ 1.5).

16S rRNA sequencing and analysis

DNA was extracted using the MoBio PowerSoil DNA extraction kit following the manufacturer’s protocol. The concentration and qualification of the total DNA were examined by electrophoresis on 1% agarose gels, which was diluted 1 ng/μL using sterile water before downstream processing. PCR procedure was carried out as described previously [63]. Briefly, the V4-V5 region of the 16S rRNA gene from each soil sample was amplified using the F515/R907 primer set with a unique 6-nt barcode at the 5′ of the forward primer [64]. All PCR reactions were carried out with Phusion® High-Fidelity PCR Master Mix (New England Biolabs). After electrophoresis on 2% agarose gel, PCR products with bright main strip between 400 and 450 bp were mixed in equidensity ratios and purified with Qiagen Gel Extraction Kit (Qiagen, Germany). Sequencing libraries were generated using the TruSeq® DNA PCR-Free Sample Preparation Kit (Illumina, USA) following the manufacturer’s recommendations, and index codes were added. After quality assessing, the libraries were sequenced on an Illumina HiSeq2500 (Novogene, China), and 250 bp paired-end reads were generated.

The raw sequence data were processed using the QIIME v1.9.1 pipeline [65]. Firstly, the forward and reverse Illumina reads were joined using the default setting. Then, the multi-lane fastq data were demultiplexed and quality filtered (Q30 ≥ 75% and Q20 = 100%). Chimeras were identified using “” with “-m usearch61” and then removed. A total of 15,339,665 reads were kept with a number over 43,000 for each sample. Filtered sequences were clustered into operational taxonomic units (OTUs) using the function “” against the SILVA 119 database [66], based on a 97% consensus threshold. Then, the singletons were removed and taxonomy was assigned using the RDP classifier against the SILVA database. R package Tax4fun [24] with UProC long read mode was used to predict the functional capabilities of microbial community in each sample based on assigning OTUs of 16S rRNA gene to the reference sequences in the SILVA (version 119, 97 set) database via SortMeRNA [67]. HUMAnN2 [68] was used to map the resulted orthologs to functional ontology of FOAM database [69].

Shotgun sequencing, metagenome assembly, and annotation

A sum of 16 samples (4 samples from each region) was randomly selected for shotgun metagenome sequencing. Total DNA was extracted using the same method as that of 16S rRNA sequencing. DNA concentration was measured using Qubit® dsDNA Assay Kit in Qubit® 2.0 Flurometer (Life Technologies, USA). Using 1 μg DNA per sample as input material, sequencing libraries were generated using NEB Next® Ultra™ DNA Library Prep Kit (NEB, USA), and index codes were added to attribute sequences to each sample. Briefly, the DNA was broken into 350 bp fragments using sonication, polished and extracted using the AMPure XP system. Libraries were prepared on a cBot Cluster Generation System following the manufacturer’s instructions. After cluster generation, the library preparations were sequenced on an Illumina HiSeq platform (Novogene, China) and 150 bp paired-end reads were generated.

Paired-end reads were quality controlled using Readfq v8 ( sequences with more than 40 bases, with quality score lower than 38, or with N bases more than 10 were filtered. The adapter was also removed from the sequences. Metagenome sequencing yielded about 12.8 G clean bases per sample (Additional file 12: Table S9). Taxonomical classification of the sequencing reads of each sample was performed using Kaiju [70] with greedy-5 mode against an nr-derived database including proteins from archaea, bacteria, viruses, fungi, and microbial eukaryotes. Based on the classification result, relative abundances of Fe(III)-reducing bacteria with or without e-pili were calculated (Additional file 9: Table S7). The 93 known Fe(III)-reducing microorganisms for which genomes are available were obtained from a previous study [39]. Metagenome assemblies were conducted for each sample using MEGAHIT v1.1.1 [71] with “--presets meta-large.” Metagenome assemblies yielded a total of 41.8 M contigs over 200 bases length and 10.1 M contigs over 500 bases length (average length was 458 bases; Additional file 13: Table S10). A total of 53.5 M nucleotide sequences or protein translations of genes were predicted from the contigs (≥ 200 bp) of each sample using prodigal [72] with “-p meta.” These genes or proteins were then pooled together to form a gene catalog and a protein catalog, respectively. The protein catalog was annotated using the FOAM database [69] and hmmer3.1 [73] to obtain function orthologs as defined in KEGG Orthology [74]. We chose profile’s trusted cutoffs to set all thresholding. The resulted orthologs were then mapped to an associated functional ontology of FOAM database [69] to describe the functional groups and hierarchy. Metagenomic contigs were annotated to gene and enzyme using prokka pipeline [75]. The marker genes used in the analyses of biogeochemical functions were selected from a hidden markov model database [76] with a few modifications. Pfam [77] and TIGRfam [78] protein families were assigned using hmmer3.1 [79]. Profile’s trusted cutoffs were used as thresholds. The presence of type IV pilA genes was estimated by assigning the gene catalog to nucleotide sequence database with 33 e-pilin genes and 27 long pilin genes [40] using Diamond [80] with parameters set as following: --more-sensitive, -e 0.00001, -l 20. For each gene, the best blast hit (one HSP > 60 bits) result was selected for downstream analyses. To quantify the annotated genes in each sample, we mapped the paired-end reads back to the assemblies according to the pipeline described here:, together with the other tools, i.e., bowtie2 [81], samtools [82], and htseq [83]. As suggested by the pipeline, we used the TPM (Transcripts Per Kilobase Million) method [84] to normalize abundance values in metagenomes. The gene encoding for acetyl-CoA synthetase (K01895) is normally multi-copy, so was not chosen for the comparison.

DOM and bacterial diversity calculations and multivariate analysis

Accumulation and rank abundance curves of DOM were calculated with the sum-normalized intensity of non-singleton data using R package Biodiversity R [85] and vegan [86]. Bray-Curtis dissimilarity was used to compute the sparse matrices of DOM molecules and bacterial community. The molecular and bacterial alpha diversities and beta diversities were calculated using QIIME v1.9.1 [65]. VAW-UniFrac dissimilarity was calculated using R package GUniFrac [87]. Modified R function kde2d weighted from kde2d in MASS package [88] was used to perform two-dimensional kernel density estimation with an axis-aligned bivariate normal kernel in van Krevelen diagram for the density of DOM molecules in each sampling region. Median values of molecular abundances were defined as the weights parameter. Mantel test was conducted to determine whether two distance matrixes were significantly correlated. PERMANOVA test was conducted to determine whether DOM molecular Bray-Curtis dissimilarity was significantly different between regions. CCA and NMDS were performed using rounded intensities rarefied at the depth of 43,000 for both DOM and bacterial communities (or their dissimilarity matrices). Partial CCA was used to calculate the independent influences of different categories or parameters on DOM or microbial variance. Principal coordinates analysis (PCoA) was used to calculate the gradient in compositional changes of microbial community (based on Bray-Curtis or VAW-UniFrac), DOM (based on Bray-Curtis), and different DOM categories (based on Bray-Curtis). Procrustes rotation and Monte Carlo permutation test (permutation = 9999) were performed using the two coordinate matrices (output of PCoA) based on their Bray-Curtis dissimilarities. We performed the test using the first ten axes of DOM and bacterial coordinate matrices (explained 67% and 65% variations, respectively). CCorA was conducted using PCo1–5 axes of DOM composition and PCo1–5 axes of microbial community composition or PCo1–2 axes of functional orthologs distance matrices (Bray-Curtis dissimilarities) as imports following the procedure described by Osterholz et al. [19]. These analyses were all conducted using vegan package [86].




Aromaticity index


Canonical correspondence analysis


Canonical correlation analysis


Direct interspecies electron transfer


Dissolved organic matter




Electrically conductive pili


Electrically conductive pilin


Functional Ontology Assignments for Metagenomes


Fourier transform ion cyclotron resonance mass spectrometry


Mean annual temperature


Mean monthly precipitation


Non-metric multidimensional scaling


Operational taxonomic unit


Principal coordinate


Principal coordinate analysis


Permutational multivariate analysis of variance


Soil organic matter


Tricarboxylic acid


Transcripts Per Kilobase Million


Variance adjusted weighted UniFrac



We thank L.J. Zhang, F. Ping, X. Z. Sun, and S. Y. Feng for their assistance in sampling and physiological data analysis.


This work was supported by the National Natural Science Foundation of China (41673081, 41373074, 41877346 and 31500409) and Zhejiang Science and Technology Innovation Program (2013C33001). This work was supported in part by the U.S. Department of Energy under Contract DE-AC02-06CH11357.

Availability of data and materials

The raw Illumine sequence data of 16S rRNA have been deposited in the sequence read archive (SRA accession: SRP5500567–654) at NCBI under Bioproject accession #PRJNA385062. The raw Illumine sequence data of metagenomic data are also available (SRA accession SRX2786101–16) in the NCBI under Bioproject accession #PRJNA385547.

Authors’ contributions

ZZ, JAG, YZ, and HL conceived and designed the project. HL, Hang W, CT, CJ, and ZZ collected and processed all field samples. XP and JC conducted FT-ICR-MS. HL analyzed the data and generated the first draft of the manuscript before JAG, Haitao W, ZZ, Hang W, WA, LC and other co-authors contributed by commenting on and revising it. All authors read and approved the final manuscript.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Authors’ Affiliations

College of Environment and Natural Resource Sciences, Zhejiang University, 866 Yuhangtang Ave, Hangzhou, 310058, China
National Plateau Wetlands Research Center, Southwest Forestry University, 300 Bailongsi, Kunming, 650224, China
The Microbiome Center, Biosciences Division, Argonne National Laboratory, Lemont, IL 60439, USA
Department of Surgery, University of Chicago, 5640 South Ellis Avenue, Chicago, IL 60637, USA
National Center of Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, West Beichen Road, Chaoyang District, Beijing, 100101, China
College of Biological and Environmental Engineering, Zhejiang University of Technology, 18 Chaowang Ave, Hangzhou, 310014, China
Hangzhou Gusheng Agricultural Technology Company Limited, Chongxian Innovation Industrial Park, Chongxian Ave, Hangzhou, 311108, China
Key Lab of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, 1799 Jimei Ave, Xiamen, 361021, China
China Academy of West Region Development, Zhejiang University, 866 Yuhangtang Ave, Hangzhou, 310058, China


  1. Kögel-Knabner I, Amelung W, Cao Z, Fiedler S, Frenzel P, Jahn R, et al. Biogeochemistry of paddy soils. Geoderma. 2010;157:1–14.View ArticleGoogle Scholar
  2. Bouman M, Lampayan RM, Tuong TP. Water management in irrigated rice: coping with water scarcity. Los Baños: International Rice Research Institute; 2013. p. 3–4.Google Scholar
  3. Riedel T, Zak D, Biester H, Dittmar T. Iron traps terrestrially derived dissolved organic matter at redox interfaces. Proc Natl Acad Sci U S A. 2013;110:10101–5.View ArticleGoogle Scholar
  4. Kellerman AM, Dittmar T, Kothawala DN, Tranvik LJ. Chemodiversity of dissolved organic matter in lakes driven by climate and hydrology. Nat Commun. 2014;5:3804.View ArticleGoogle Scholar
  5. Lehmann J, Kleber M. The contentious nature of soil organic matter. Nature. 2015;528:60.View ArticleGoogle Scholar
  6. Liesack W, Schnell S, Revsbech NP. Microbiology of flooded rice paddies. FEMS Microbiol Rev. 2000;24:625.View ArticleGoogle Scholar
  7. Žifčáková L, Větrovský T, Lombard V, Henrissat B, Howe A, Baldrian P. Feed in summer, rest in winter: microbial carbon utilization in forest topsoil. Microbiome. 2017;5:122.View ArticleGoogle Scholar
  8. Kalbitz K, Solinger S, Park JH, Michalzik B, Matzner E. Controls on the dynamics of dissolved organic matter in soils: a review. Soil Sci. 2000;165:277–304.View ArticleGoogle Scholar
  9. Hanson CA, Fuhrman JA, Hornerdevine MC, Martiny JB. Beyond biogeographic patterns: processes shaping the microbial landscape. Nat Rev Microbiol. 2012;10:497–506.View ArticleGoogle Scholar
  10. Fierer N, Jackson RB. The diversity and biogeography of soil bacterial communities. Proc Natl Acad Sci U S A. 2006;103:626–31.View ArticleGoogle Scholar
  11. Zhou J, Ye D, Shen L, Wen C, Yan Q, Ning D, et al. Temperature mediates continental-scale diversity of microbes in forest soils. Nat Commun. 2016;7:12083.View ArticleGoogle Scholar
  12. Frank S, Tiemeyer B, Bechtold M, Lücke A, Bol R. Effect of past peat cultivation practices on present dynamics of dissolved organic carbon. Sci Total Environ. 2016;574:1243–53.View ArticleGoogle Scholar
  13. Nebbioso A, Piccolo A. Molecular characterization of dissolved organic matter (DOM): a critical review. Anal Bioanal Chem. 2013;405:109–24.View ArticleGoogle Scholar
  14. Seifert AG, Roth VN, Dittmar T, Gleixner G, Breuer L, Houska T, et al. Comparing molecular composition of dissolved organic matter in soil and stream water: influence of land use and chemical characteristics. Sci Total Environ. 2016;571:142–52.View ArticleGoogle Scholar
  15. Roth VN, Dittmar T, Gaupp R, Gleixner G. The molecular composition of dissolved organic matter in forest soils as a function of ph and temperature. PLoS One. 2015;10:e0119188.View ArticleGoogle Scholar
  16. Koch BP, Dittmar T, Witt M, Kattner G. Fundamentals of molecular formula assignment to ultrahigh resolution mass data of natural organic matter. Anal Chem. 2007;79:1758–63.View ArticleGoogle Scholar
  17. Kujawinski EB, Vecchio RD, Blough NV, Klein GC, Marshall AG. Probing molecular-level transformations of dissolved organic matter: insights on photochemical degradation and protozoan modification of DOM from electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry. Mar Chem. 2004;92:23–37.View ArticleGoogle Scholar
  18. Kujawinski EB, Longnecker K, Barott KL, Weber RJM, Soule MCK. Microbial community structure affects marine dissolved organic matter composition. Front Mar Sci. 2016;3:45.View ArticleGoogle Scholar
  19. Osterholz H, Singer G, Wemheuer B, Daniel R, Simon M, Niggemann J, et al. Deciphering associations between dissolved organic molecules and bacterial communities in a pelagic marine system. ISME J. 2016;10:1717.View ArticleGoogle Scholar
  20. Gonsior M, Valle J, Schmitt-Kopplin P, Hertkorn N, Bastviken D, Luek J, et al. Chemodiversity of dissolved organic matter in the amazon basin. Biogeosci Discuss. 2016;13:1–21.View ArticleGoogle Scholar
  21. Stegen JC, Fredrickson JK, Wilkins MJ, Konopka AE, Nelson WC, Arntzen EV, et al. Groundwater-surface water mixing shifts ecological assembly processes and stimulates organic carbon turnover. Nat Commun. 2016;7:11237.View ArticleGoogle Scholar
  22. Ohno T, Sleighter RL, Hatcher PG. Comparative study of organic matter chemical characterization using negative and positive mode electrospray ionization ultrahigh-resolution mass spectrometry. Anal Bioanal Chem. 2016;408:2497.View ArticleGoogle Scholar
  23. Bao SD. Agro-chemical analysis of soil. 3rd ed. Beijing: China Agricultural Press; 2000. p. 71–87.Google Scholar
  24. Aßhauer KP, Wemheuer B, Daniel R, Meinicke P. Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data. Bioinformatics. 2015;31:2882–4.View ArticleGoogle Scholar
  25. Reguera G, McCarthy KD, Mehta T, Nicoll JS, Tuominen MT, Lovley DR. Extracellular electron transfer via microbial nanowires. Nature. 2005;71:1098–101.View ArticleGoogle Scholar
  26. Summers ZM, Fogarty HE, Leang C, Franks AE, Malvankar NS, Lovley DR. Direct exchange of electrons within aggregates of an evolved syntrophic coculture of anaerobic bacteria. Science. 2010;330:1413–5.View ArticleGoogle Scholar
  27. Wang H, Holden J, Zhang ZJ, Li M, Li X. Concentration dynamics and biodegradability of dissolved organic matter in wetland soils subjected to experimental warming. Sci Total Environ. 2014;470-471:907–16.View ArticleGoogle Scholar
  28. Schiff S, Aravena R, Mewhinney E, Elgood R, Warner B, Dillon P, et al. Precambrian shield wetlands: hydrologic control of the sources and export of dissolved organic matter. Clim Chang. 1998;40:167–88.View ArticleGoogle Scholar
  29. Li X, Sun GX, Chen S, Fang Z, Yuan HY, Shi Q, et al. Molecular chemodiversity of dissolved organic matter in paddy soils. Environ Sci Technol. 2018;52:963–71.View ArticleGoogle Scholar
  30. Monreal CM, Schulten HR, Kodama H. Age, turnover and molecular diversity of soil organic matter in aggregates of a Gleysol. Can J Soil Sci. 1997;77:379–88.View ArticleGoogle Scholar
  31. Schmidt SK, Reed SC, Nemergut DR, Grandy AS, Cleveland CC, Weintraub MN, et al. The earliest stages of ecosystem succession in high-elevation (5000 metres above sea level), recently deglaciated soils. Proc Biol Sci. 2008;275:2793–802.View ArticleGoogle Scholar
  32. Hori T, Müller A, Igarashi Y, Conrad R, Friedrich MW. Identification of iron-reducing microorganisms in anoxic rice paddy soil by 13C-acetate probing. ISME J. 2010;4:267–78.View ArticleGoogle Scholar
  33. Lovley DR, Ueki T, Zhang T, Malvankar NS, Shrestha PM, Flanagan KA. Geobacter: the microbe electric’s physiology, ecology, and practical applications. Adv Microb Physiol. 2011;59:1–100.View ArticleGoogle Scholar
  34. Kogel-Knabner I. The macromolecular organic composition of plant and microbial residues as inputs to soil organic matter. Soil Biol Biochem. 2002;34:139–62.View ArticleGoogle Scholar
  35. Golchin A, Clarke P, Oades JM. The heterogeneous nature of microbial products as shown by solid-state 13C CP/MAS NMR spectroscopy. Biogeochemistry. 1996;34:71–97.View ArticleGoogle Scholar
  36. Kallenbach CM, Frey SD, Grandy AS. Direct evidence for microbial-derived soil organic matter formation and its ecophysiological controls. Nat Commun. 2016;7:13630.View ArticleGoogle Scholar
  37. Ng EL, Patti AF, Rose MT, Schefe CR, Wilkinson K, Smernik RJ, et al. Does the chemical nature of soil carbon drive the structure and functioning of soil microbial communities? Soil Biol Biochem. 2014;70:54–61.View ArticleGoogle Scholar
  38. Conrad R, Klose M. Dynamics of the methanogenic archaeal community in anoxic rice soil upon addition of straw. Eur J Soil Sci. 2006;57:476–84.View ArticleGoogle Scholar
  39. Holmes DE, Dang Y, Walker DJ, Lovley DR. The electrically conductive pili of Geobacter species are a recently evolved feature for extracellular electron transfer. Microb Genom. 2016;2:e000072.PubMedPubMed CentralGoogle Scholar
  40. Holmes DE, Shrestha PM, Walker DJ, Dang Y, Nevin KP, Woodard TL, et al. Metatranscriptomic evidence for direct interspecies electron transfer between Geobacter and Methanothrix species in methanogenic rice paddy soils. Appl Environ Microbiol. 2017;83:AEM.00223–17.View ArticleGoogle Scholar
  41. Lovley D, Woodard JC, Philips EJP, Bluntharris EL, Coates JD. Humic substances as electron acceptors for microbial respiration. Nature. 1996;382:445–8.View ArticleGoogle Scholar
  42. Tan Y, Adhikari RY, Malvankar NS, Ward JE, Nevin KP, Woodard TL, et al. The low conductivity of geobacter uraniireducens pili suggests a diversity of extracellular electron transfer mechanisms in the genus Geobacter. Front Microbiol. 2016;7:1.Google Scholar
  43. Ratering S, Schnell S. Localization of iron-reducing activity in paddy soil by profile studies. Biogeochemistry. 1998;3:341–57.Google Scholar
  44. Jahn R, Blume HP, Asio VB, Spaargaren O, Schad P. Guidelines for soil description. 4th ed. Rome: Food and Agriculture Organization of the United Nations; 2006. p. 67–77.Google Scholar
  45. Šantltemkiv T, Kai F, Dittmar T, Hansen BM, Thyrhaug R, Nielsen NW, et al. Hailstones: a window into the microbial and chemical inventory of a storm cloud. PLoS One. 2013;8:e53550.View ArticleGoogle Scholar
  46. Baughn AD, Malamy MH. The strict anaerobe Bacteroides fragilis grows in and benefits from nanomolar concentrations of oxygen. Nature. 2004;427:441–4.View ArticleGoogle Scholar
  47. Pitcher RS, Brittain T, Watmough NJ. Cytochrome cbb(3) oxidase and bacterial microaerobic metabolism. Biochem Soc Trans. 2002;30:653–8.View ArticleGoogle Scholar
  48. Dong WY, Zhang XY, Dai XQ, Fu XL, Yang FT, Liu XY, et al. Changes in soil microbial community composition in response to fertilization of paddy soils in subtropical China. Appl Soil Ecol. 2014;84:140–7.View ArticleGoogle Scholar
  49. Johnsen K, Jacobsen CS, Torsvik V, Sorenson AJ. Pesticide effects of bacterial diversity in agricultural soils--a review. Biol Fert Soils. 2001;33:443–53.View ArticleGoogle Scholar
  50. Breidenbach B, Conrad R. Seasonal dynamics of bacterial and archaeal methanogenic communities in flooded rice fields and effect of drainage. Front Microbiol. 2014;5:752.PubMedGoogle Scholar
  51. Sun Y, Huang S, Yu X, Zhang W. Differences in fertilization impacts on organic carbon content and stability in a paddy and an upland soil in subtropical China. Plant Soil. 2015;397:1–12.View ArticleGoogle Scholar
  52. Said-Pullicino D, Miniotti EF, Sodano M, Bertora C, Lerda C, Chiaradia EA, et al. Linking dissolved organic carbon cycling to organic carbon fluxes in rice paddies under different water management practices. Plant Soil. 2016;401:273–90.View ArticleGoogle Scholar
  53. Noll M, Matthies D, Frenzel P, Derakshani M, Liesack W. Succession of bacterial community structure and diversity in a paddy soil oxygen gradient. Environ Microbiol. 2005;7:382–95.View ArticleGoogle Scholar
  54. Begg CBM, Kirk GJD, Mackenzie AF, Neue H. Root-induced iron oxidation and pH changes in the lowland rice rhizosphere. New Phytol. 2010;128:469–77.View ArticleGoogle Scholar
  55. Tyagi L, Verma A, Singh SN. Investigation on temporal variation in methane emission from different rice cultivars under the influence of weeds. Environ Monit Assess. 2004;93:91–101.View ArticleGoogle Scholar
  56. Lynch JM, Whipps JM. Substrate flow in the rhizosphere. Plant Soil. 1990;129:1–10.View ArticleGoogle Scholar
  57. Neue HU, Wassmann R, Lantin RS, Ma CRA, Aduna JB, Javellana AM. Factors affecting methane emission from rice fields. Atmos Environ. 1996;30:1751–4.View ArticleGoogle Scholar
  58. Sheng R, Qin H, O’Donnell AG, Huang S, Wu J, Wei W. Bacterial succession in paddy soils derived from different parent materials. J Soils Sediments. 2015;15:982–92.View ArticleGoogle Scholar
  59. Sun W, Xiao E, Pu Z, Krumins V, Dong Y, Li B, et al. Paddy soil microbial communities driven by environment- and microbe-microbe interactions: a case study of elevation-resolved microbial communities in a rice terrace. Sci Total Environ. 2018;612:884–93.View ArticleGoogle Scholar
  60. Rokunuzzaman M, Ueda Y, Chen L, Tanaka S, Ohnishi K. Effects of land use changes from paddy fields on soil bacterial communities in a hilly and mountainous area. Microbes Environ. 2016;31:160–4.View ArticleGoogle Scholar
  61. Fierer N. Microbial biogeography: patterns in microbial diversity across space and time. In: Zengler K, editor. Accessing uncultivated microorganisms: from the environment to organisms and genomes and Back. Washington DC: ASM Press; 2008. p. 95–115.Google Scholar
  62. Koch BP, Dittmar T. From mass to structure: an aromaticity index for high-resolution mass data of natural organic matter. Rapid Commun Mass Spectrom. 2010;20:926–32.View ArticleGoogle Scholar
  63. Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc Natl Acad Sci U S A. 2011;108:4516–22.View ArticleGoogle Scholar
  64. Jing X, Sanders NJ, Shi Y, Chu H, Classen AT, Zhao K, et al. The links between ecosystem multifunctionality and above- and belowground biodiversity are mediated by climate. Nat Commun. 2015;6:8159.View ArticleGoogle Scholar
  65. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7:335–6.View ArticleGoogle Scholar
  66. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013;41:590–6.View ArticleGoogle Scholar
  67. Kopylova E. SortMeRNA: fast and accurate filtering of ribosomal RNAs in metatranscriptomic data. Bioinformatics. 2012;28:3211–7.View ArticleGoogle Scholar
  68. Guio L. Metabolic reconstruction for metagenomic data and its application to the human microbiome. PLoS Comput Biol. 2012;8:e1002358.View ArticleGoogle Scholar
  69. Prestat E, David MM, Hultman J, Taş N, Lamendella R, Dvornik J, et al. FOAM (Functional Ontology Assignments for Metagenomes): a hidden Markov model (HMM) database with environmental focus. Nucleic Acids Res. 2014;42:e145.View ArticleGoogle Scholar
  70. Menzel P, Ng KL, Krogh A. Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nat Commun. 2016;7:11257.View ArticleGoogle Scholar
  71. Li D, Luo R, Liu CM, Leung CM, Ting HF, Sadakane K, et al. MEGAHIT v1.0: a fast and scalable metagenome assembler driven by advanced methodologies and community practices. Methods. 2016;102:3–11.View ArticleGoogle Scholar
  72. Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119.View ArticleGoogle Scholar
  73. Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44:D279.View ArticleGoogle Scholar
  74. Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;27:29–34.View ArticleGoogle Scholar
  75. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30:2068.View ArticleGoogle Scholar
  76. Anantharaman K, Brown CT, Hug LA, Sharon I, Castelle CJ, Probst AJ, et al. Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system. Nat Commun. 2016;7:13219.View ArticleGoogle Scholar
  77. Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, et al. Pfam: the protein families database. Nucleic Acids Res. 2014;42:222–30.View ArticleGoogle Scholar
  78. Haft DH, Selengut JD, Richter RA, Harkins D, Basu MK, Beck E. TIGRFAMs and genome properties in 2013. Nucleic Acids Res. 2013;41:387–95.View ArticleGoogle Scholar
  79. Mistry J, Finn RD, Eddy SR, Bateman A, Punta M. Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res. 2013;41:e121.View ArticleGoogle Scholar
  80. Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 2014;12:59.View ArticleGoogle Scholar
  81. Langmead B. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357.View ArticleGoogle Scholar
  82. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–9.View ArticleGoogle Scholar
  83. Anders S, Pyl PT, Huber W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31:166.View ArticleGoogle Scholar
  84. Li B, Ruotti V, Stewart RM, Thomson JA, Dewey CN. RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics. 2010;26:493–500.View ArticleGoogle Scholar
  85. Kindt R, Coe R. Tree diversity analysis. A manual and software for common statistical methods of ecological and biodiversity studies. J Am Vet Med Assoc. 2005;235:68–374.Google Scholar
  86. Oksanen J, Blanchet FG, Kindt R, Legendre P, Minchin PR, O'Hara RB, et al. Vegan: community ecology package. R package version 2.3-0; 2015.Google Scholar
  87. Chen J, Bittinger K, Charlson ES, Hoffmann C, Lewis J, Wu GD, et al. Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics. 2012;28:2106–13.View ArticleGoogle Scholar
  88. Venables W, Ripley B. Modern applied statistics with S-PLUS. Stat Comput. 2002;52:704–5.Google Scholar
  89. Cheng SH, Li J. Modern Chinese rice. 3rd ed. Beijing: JinDun Press; 2007. p. 54–7.Google Scholar
  90. Monfreda C, Ramankutty N, Foley JA. Farming the planet: 2. Geographic distribution of crop areas, yields, physiological types, and net primary production in the year 2000. Glob Biogeochem Cycles. 2008;22:GB1022.View ArticleGoogle Scholar


© The Author(s). 2018