
The allometry of cellular DNA and ribosomal gene content among microbes and its use for the assessment of microbiome community structure



The determination of the taxon-specific composition of microbiomes by combining high-throughput sequencing of ribosomal genes with phyloinformatic analyses has become routine in microbiology and allied sciences. Systematic biases in this approach, based on the demonstrable variability of ribosomal operon copy number per genome, were recognized early. The more recent realization that polyploidy is probably the norm, rather than the exception, among microbes from all domains of life points to an even larger source of bias.


We found that the number of 16S or 18S RNA genes per cell, a combined result of the number of RNA gene loci per genome and ploidy level, follows an allometric power law of cell volume with an exponent of 2/3 across 6 orders of magnitude in small subunit copy number per cell and 9 orders of magnitude in cell size. This stands in contrast to cell DNA content, which follows a power law with an exponent of ¾.


In practical terms, that relationship allows for a single, simple correction for variations in both copy number per genome and ploidy level in ribosomal gene analyses of taxa-specific abundance. In biological terms, it points to the uniqueness of ribosomal gene content among microbial properties that scale with size.



The rRNA gene approach to microbiome analyses, either based on amplicon or metagenomic sequencing, relies on the tacit assumption that the counts of this marker gene translate into a robust measure or proxy for microbial abundance. However, this assumption is often violated. Sources of error in gene abundance determination can come from analytical procedures such as DNA extraction, PCR amplification, and sequencing itself [1]. But likely as important, systematic biases can be caused by the varying abundance of ribosomal genes in the genomes of microbes [2]. The concern is evident in the dedicated databases that document the variability in ribosomal gene copy number per genome (Rg) among microbes [3]. Interestingly, Rg seems to correlate with a microbe’s life history traits, where fast growth is associated with higher values [4,5,6]. There is also evidence for a certain degree of conservation in Rg within bacterial phylogenetic clades [7]. On this basis, bioinformatic tools have been developed to automatically correct ribosomal gene surveys for Rg [8]. The phylogenetic conservation of Rg, however, seems only conspicuous among closely related microbes [9] and can explain only some 10% of its variability in complex, diverse communities [10]. In some eukaryotes like Saccharomyces cerevisiae, Rg is unstable and can vary widely among strains or individuals [11]. Importantly, such corrections would only lead us to a description of community composition in terms of relative abundance of taxon-specific genome copies. But more useful metrics in microbiome community composition analyses are either cell number [7] (i.e., individuals) or biomass contributions by each taxon. Given the close to 9 orders of magnitude spanned by microbial cell biomass, it can be argued that taxon-specific biomass rather than cell number would be a better descriptor of a taxon’s contribution to a community. However, there are still instances where number of cells would be preferred (for example, to gauge dispersal potential, culturability, or susceptibility to deleterious agents like predators or toxicants). In any case, to translate genome numbers to cell numbers, one needs to take into account the level of ploidy, P, the number of copies of the genome present in a cell, where the number of ribosomal operons per cell, Rc, is the product PRg. Surprisingly, P is not typically taken into account, perhaps under the assumption that most microbes, like Escherichia coli, are monoploid [12, 13]. And yet, in bacteria and archaea, P varies far more than Rg [12, 14], and most species examined are oligo- or polyploid, with some containing in excess of 200 genome copies per cell [15]. If one includes unicellular eukaryotes, the variation can be 4 orders of magnitude [16]. Clearly, ploidy constitutes a very important source of bias for community counts in itself [17], affecting estimates from both amplicon sequencing and shotgun metagenomics. The variable nature of P could potentially either compound or diminish the effect of Rg variability in determining a cell’s Rc, as it is not known whether P and Rg correlate or vary independently among species; a high P could be associated with low Rg, and vice versa. Studies on marine protists intended to estimate biomass from 18S counts have shown that Rc correlated linearly with cell volume (Vc) [18] or cell length [19] when plotted on double log scales, indicating an Rc dependence on size.

Here, we posited that perhaps there is constancy among microbes in the need for ribosomal gene content in relation to their cell biomass. In other words, microbial species would be under selection to contain a sufficient but not excessive Rc to support the production of their typical cell biomass, Bc, so that Rc would be proportional to Bc. Assuming cell density to be invariant (around 1.008 g ml−1) [20], Rc would also be proportional to cell volume (Vc).



Values for all parameters were gathered or derived from the literature. In place of Bc, we used cellular volume, Vc, assuming cellular density to be constant (around 1.008 g ml−1 [20]). Cell volumes were either taken from reported direct determinations or derived from literature photomicrographs assuming simple formulae for a variety of fitting three-dimensional shapes (i.e., sphere, cylinder) or combinations thereof as given in Table S1 (see Additional file 1). When a range of volume values was available, we used the average. For Rg, we used values given in rrnDB [3] for the same species or strain. If they were not available, we used literature values or determined it by examination of the strain’s publicly available genome through BLAST. Ploidy was either taken directly from reported values or estimated if cellular DNA content and genome size were known. If P was variable within a species or strain, we used the average level of the range given. Rc values were then derived as the product of P and Rg, although for many protists, Rc was taken directly from experimentally determined values. The annotated input data are gathered in Table S1 (see Additional file 1). The limiting factor to the size of the database was the availability of P determinations, which are quite uncommon. In all, we could analyze 107 cases.
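The derivations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the conversion factor of about 0.978 × 10^9 base pairs per picogram of DNA is a commonly used value that is not given in the text, and the example organism is invented.

```python
import math

# Assumed conversion factor (not from the article): ~0.978e9 base pairs per pg of DNA.
BP_PER_PG = 0.978e9

def sphere_volume(diameter_um):
    """Volume (cubic micrometers) of a spherical cell of the given diameter."""
    return math.pi / 6.0 * diameter_um ** 3

def cylinder_volume(diameter_um, length_um):
    """Volume (cubic micrometers) of a cylindrical cell (flat ends)."""
    return math.pi * (diameter_um / 2.0) ** 2 * length_um

def ploidy_from_dna(dna_fg_per_cell, genome_size_bp):
    """Estimate ploidy P as cellular DNA content divided by genome size."""
    dna_bp = dna_fg_per_cell * 1e-3 * BP_PER_PG  # fg -> pg -> base pairs
    return dna_bp / genome_size_bp

def rc_from_ploidy(ploidy, rg):
    """Ribosomal gene copies per cell, Rc, as the product of P and Rg."""
    return ploidy * rg

# Hypothetical rod-shaped cell: 1 x 2 micrometers, 20 fg DNA, 4.6 Mbp genome, Rg = 7.
vc_example = cylinder_volume(1.0, 2.0)
p_example = ploidy_from_dna(20.0, 4.6e6)
rc_example = rc_from_ploidy(p_example, 7)
```

More complex shapes would be handled by summing such simple solids, as the text indicates for Table S1.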


Power fits of data were run in Excel as linear regressions of the ln-transformed data pairs using a least-squares model. Statistics are given in Table S2 (see Additional file 2). To test the significance of exponent differences in two separate datasets, we used T tests for the slopes of the linear fits.
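For readers who prefer a scripted equivalent of the spreadsheet procedure, the following Python sketch fits a power function by ordinary least squares on ln-transformed data. It is not the authors' workflow: the data are synthetic, the function names are hypothetical, and the slope comparison shown is one common form of a t statistic for two independent regression slopes, which may differ from the exact test used in the study.

```python
import numpy as np

def power_fit(x, y):
    """Fit y = a * x**b by least squares on (ln x, ln y).

    Returns (a, b, r_squared, standard_error_of_b).
    """
    lx, ly = np.log(x), np.log(y)
    n = len(lx)
    b, ln_a = np.polyfit(lx, ly, 1)  # slope first, then intercept
    resid = ly - (ln_a + b * lx)
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((ly - ly.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    se_b = np.sqrt(ss_res / (n - 2) / np.sum((lx - lx.mean()) ** 2))
    return float(np.exp(ln_a)), float(b), r2, float(se_b)

def slope_difference_t(b1, se1, b2, se2):
    """t statistic for the difference between two independent slopes (assumed form)."""
    return (b1 - b2) / np.sqrt(se1 ** 2 + se2 ** 2)

# Synthetic data only: volumes over 8 orders of magnitude with log-normal scatter.
rng = np.random.default_rng(0)
vc = 10.0 ** rng.uniform(-2, 6, size=200)
rc = 9.58 * vc ** 0.66 * np.exp(rng.normal(0.0, 0.5, size=200))
a, b, r2, se_b = power_fit(vc, rc)
```

The exponent of the power function is the slope of the ln-ln regression, and the normalization constant is the exponential of its intercept.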

Estimation of taxon-specific cell numbers and biovolumes from 16S rRNA counts

In a dataset of rRNA gene taxon-specific frequencies, Fr, assigned to i taxa whose cell volumes, Vc(i), are known, one can directly estimate Rc (i) from Eq. 1 (see the “Results” section). The relative contribution to number of cells by taxon i, Fc(i), is computed as:

$${F}_{c }\left(i\right)=\frac{{F}_{r}(i)}{{R}_{c}(i)\sum \frac{{F}_{r}\left(i\right)}{{R}_{c}\left(i\right)}}$$

And the relative contribution to biovolume, Fv (i), as:

$${F}_{v }\left(i\right)=\frac{{F}_{c}(i){ V}_{c}(i)}{\sum {F}_{c}(i){ V}_{c}(i)}$$

If a determination of the absolute abundance of the total copies of the ribosomal gene for all taxa considered in the sample of origin, Rs, is available (from qPCR, for example, in units of copies per mass, volume or surface sampled), then absolute taxon-specific assignments R(i) can be obtained as the product \({F}_{r}(i){R}_{s}.\) From R(i), one can derive absolute values for cells C(i) and biovolume V(i) attributable to each taxon: C(i) = R(i)/Rc(i) and V(i) = C(i)Vc(i). The sums \(\sum C(i)\) and \(\sum V(i)\) estimate the absolute number of cells or biovolume (in µm3), respectively, of the entire set of taxa under consideration.
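The steps above can be sketched in Python as follows. This is a minimal illustration, assuming the normalization constant (9.58) and exponent (0.66) reported for Eq. 1 in the "Results" section; the function names and the three-taxon demo values are hypothetical and do not come from the article's dataset.

```python
import numpy as np

A_NORM = 9.58  # normalization constant reported for Eq. 1 (Vc in cubic micrometers)
B_EXP = 0.66   # exponent reported for Eq. 1

def rc_from_volume(vc_um3):
    """Estimate ribosomal gene copies per cell, Rc(i), from cell volume via Eq. 1."""
    return A_NORM * np.asarray(vc_um3, dtype=float) ** B_EXP

def correct_counts(fr, vc_um3, rs_total=None):
    """Convert rRNA gene frequencies Fr(i) into cell-number and biovolume shares.

    If rs_total (total gene copies in the sample, e.g. from qPCR) is given,
    absolute cells C(i) and biovolume V(i) are returned as well.
    """
    fr = np.asarray(fr, dtype=float)
    fr = fr / fr.sum()                    # make sure frequencies sum to 1
    vc = np.asarray(vc_um3, dtype=float)
    rc = rc_from_volume(vc)
    fc = (fr / rc) / np.sum(fr / rc)      # relative contribution to cell number
    fv = (fc * vc) / np.sum(fc * vc)      # relative contribution to biovolume
    result = {"Fc": fc, "Fv": fv}
    if rs_total is not None:
        r_abs = fr * rs_total             # R(i) = Fr(i) * Rs
        cells = r_abs / rc                # C(i) = R(i) / Rc(i)
        result["C"] = cells
        result["V"] = cells * vc          # V(i) = C(i) * Vc(i)
    return result

# Hypothetical demo: a small-, a medium-, and a large-celled taxon.
fr_demo = [0.5, 0.3, 0.2]
vc_demo = [0.1, 1.0, 100.0]
demo = correct_counts(fr_demo, vc_demo, rs_total=1.0e6)
```

In this invented example the small-celled taxon gains share when expressed as cell number, and the large-celled taxon gains share when expressed as biovolume, which is the direction of correction the text describes.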

An alternative to using Vc(i), if those are not exactly known, is to assign rough discrete size ranges to taxa, and to use mean Vc (and Rc) values of the range’s maximum and minimum. We found it advisable to set variable-width size ranges in such a way that within-range variation in resulting Rc values was kept moderate. We used the following cell diameter ranges (in µm): 0.2–0.3, 0.3–0.4, 0.4–0.6, 0.6–0.9, 0.9–1.2, 1.2–1.5, 1.5–2.1, 2.1–2.9, 2.9–4.1, 4.1–5.8, 5.8–8.2, 8.2–11.6, and 11.6–16.4. This set provides within-range variation in Rc of less than 8% in all cases, which is smaller than the uncertainty of our estimates for the normalization constant in Eq. 1 of the “Results” section.


Traits that span orders of magnitude are best evaluated as double logarithmic plots, which can be analyzed by power function fits. In this approach, the hypothesis of proportionality between Vc and Rc we posed should have resulted in a power function fit with an exponent close to unity. Our analysis (Fig. 1) readily dispelled that contention. The fit instead revealed that Rc follows well (R2 = 0.86) a power function of Vc with an exponent significantly lower than unity, and indistinguishable from 2/3 (0.66 ± 0.03; ± SE) across nine orders of magnitude in cell volume. For volumes expressed in µm3,

$${R}_{c}=9.58\,{V}_{c}^{0.66}$$

Fig. 1

Relationship between cellular ribosomal gene content (Rc) and cell volume (Vc) in microbes (n = 107), plotted as a log/log graph. The grey line is a power fit with the equation displayed in red type (fit statistics are in Table S2, Additional file 2). Data points belonging to eukaryotes are in orange, those for archaea in yellow, and bacteria in green. For three species, we plotted datasets to highlight intraspecies variability: Synechococcus elongatus (light blue symbols) [28], Colozoum pelagicum (light purple) [19], and Sphaerozoum fuscum [19] (light yellow)


where 9.58 ± 1.21 is the estimated normalization constant. One could envision that the scaling relationship may have been artifactually distorted at the low range of Rc, since it cannot physiologically take values < 1. But a reanalysis of the dataset excluding data pairs with Rc \(\le 2\) did not change the fit significantly in exponent or normalization constant (see Additional file 2). We also tested the hypothesis that exponents for a fit of data pairs from eukaryotes (exponent = 0.72 ± 0.05) vs. prokaryotes (0.62 ± 0.05) could be different, but this did not find strong statistical support in a T test comparison (p = 0.20).

Equation 1 can be rewritten as a function of linear cell dimensions using a spherical-equivalent cell diameter, \({D}_{c}^{0}=2\sqrt[3]{\frac{3{V}_{c}}{4\pi }}\), so that Rc becomes a power function of \({D}_{c}^{0}\) with an exponent close to 2 (Eq. 2).
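As a sketch of the algebra, assuming the fitted values quoted in the text (normalization constant 9.58, exponent 0.66) and a spherical cell, the substitution can be written out as below. The rounded prefactor is computed here from those two values and is not quoted from the original Eq. 2.

$${V}_{c}=\frac{\pi }{6}{\left({D}_{c}^{0}\right)}^{3}\quad \Rightarrow \quad {R}_{c}=9.58\,{V}_{c}^{0.66}=9.58{\left(\frac{\pi }{6}\right)}^{0.66}{\left({D}_{c}^{0}\right)}^{1.98}\approx 6.25{\left({D}_{c}^{0}\right)}^{2}$$

Here \({D}_{c}^{0}\) is in µm, and the exponent 1.98 is rounded to 2 in the last step.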


Thus, Rc scales generally not with the volume but with the surface area of a microbial cell, which for the purpose of this study means that the bias associated with ribosomal gene counts will be size-dependent regardless of our choice of abundance estimator. Ribosomal counts will overestimate large-celled microbes over small-celled ones if one is interested in number of cells, the bias increasing with the square of linear cell size (Eq. 2), a prediction that finds experimental support for specific cases in the literature [21]. In terms of biomass, ribosomal counts will underestimate the contribution of large microorganisms, the bias increasing with the 2/3 power of cellular biovolume (Eq. 1). Whichever the desired measure of abundance, however, the explicit relationship in Fig. 1 provides a means for bias correction in tallies of ribosomal genes, as long as cell biovolume is known from ancillary data for the taxa detected in the microbiome of interest. The correction requires knowledge of neither P nor Rg.

A procedural explanation is given under the “Methods” section, and we provide an example application in Fig. 2 using a dataset of phototrophic bacteria from endolithic microbiomes within intertidal hard carbonate rocks [22], responsible for their micritization and bioerosion [23], and useful here because typical cell volumes could be assigned to all taxa. The differential outcomes are obvious: 16S rRNA counts of large-celled cyanobacterial genera severely underestimate their contribution to biomass but overestimate their contribution in terms of number of cells (see for example, Hyella sp.). The opposite is true for alphaproteobacterial phototrophs (see for example Rhodomicrobium sp.), most of which are small-celled [24]. The distortion is less intense for the Chloroflexi, with intermediate cell size (see Roseiflexus castenholzii, for example).

Fig. 2

Estimation of microbial community structure based on experimental ribosomal counts (central column), estimated cell number (left column) and estimated biovolume (right column) in a single, exemplary dataset using allometric corrections based on Eq. 1. The dataset is from Roush et al. [22] and includes the subset of taxonomically assignable phototrophic bacteria from an endolithic microbiome on coastal marine carbonate rocks. Only three exemplary phototrophs are labeled, but full, taxonomically explicit distributional data are in Table S3 (see Additional file 3). For ease of comparison, results are graphically presented as relative frequencies, but absolute scales of areal abundance are indicated on the arrow to the right

We have presented the issue of bias having in mind relative abundance tallies of microbiome members, but proportional tallies have methodological constraints in themselves, because the individual proportions must add up to 1, and thus the relative abundances of taxa are necessarily not independent of each other. There is clear evidence of severely diverging analytical outcomes when both relative and absolute abundance are compared in the same datasets [25, 26]. Commonly, relative proportions or taxa-specific ribosomal copies are converted to absolute abundances with parallel quantification of rRNA gene copies by qPCR, either total copies in the community analyzed or those of particular taxa [16]. We note here that, in view of our results, the latter would require allometric correction, whereas the former would not (as done in the dataset presented in Fig. 2) and is thus a preferable approach. However, we also note that the total number of ribosomal gene copies in a sample is not a good absolute measure of the combined microbial biomass or number of cells present for comparisons among samples, as it will be dependent on their inherent cell-size distribution. Hence, comparisons among samples will only be meaningful if carried out after conversion to biomass or cell numbers, unless the microbial composition of the samples is unchanged.


The procedure outlined here requires knowledge of morphological metadata in addition to sequencing counts for each taxon. Unfortunately, cell volume data are not readily available for many taxa, at least in a compiled format, and gathering them requires intensive literature searches. In their absence, and as an approximation, using a few discrete cell-size classes instead of exact values yields useful corrected distributions (see Figure S1 in Additional file 4). Yet, an effort to bring microbial size data into a consolidated platform would be desirable in that it would enable the processing of large datasets in an automated, more manageable way.

An additional factor to take into account is the substantial data spread around the fit leading to Eq. 1, which can limit the precision of the correction. An expanded dataset should improve predictive accuracy and perhaps even precision, but some inherent limitations are also at play. P can vary in a single strain with cell cycle [15] and growth conditions [27]. We have included the range of intraspecies variability on the Vc/Rc space in Fig. 1 for the cases of a single strain of Synechococcus elongatus, and of single cells from natural populations of Sphaerozoum fuscum and Colozoum pelagicum. They suggest that a significant proportion of the spread can be attributed to biological intraspecies variability, tempering the prospects for improvement with future extended datasets. Studies on Synechococcus elongatus [28,29,30] and Saccharomyces cerevisiae [31] point to a regulatory interdependency of P with cell size, indicating that natural variations in ploidy may be met by commensurate variation in volume, making this much less of a problem. Additionally, because the data used here were arrived at through several approaches, a dedicated survey based on more consistent analytical procedures may result in tighter fits. Finally, part of the variability detected may have been due to neglecting contributions of organelle ribosomal genes in protists. This is expected to be negligible for large-celled eukaryotes, but perhaps not so much for the smallest of them, in which organelles take up a larger portion of their cell volume. Indeed, some of these pico-eukaryotes contribute disproportionately (by shortfall in Rc) to the regression’s sum of squares and may have contributed to the somewhat higher exponent in the eukaryote-only fit (Additional file 2). In support of this contention, a re-analysis excluding eukaryotes with Vc < 20 µm3 yields an exponent (0.66 ± 0.06; R2 = 0.68) more in line with that of Eq. 1, showing no trace of statistical difference (T test, p = 0.68) with that of the prokaryote-only fit (Additional file 2). While the dataset does not allow us to differentiate between bacteria and archaea because of the low number of archaeal cases in it, doing so may be an interesting future exercise, given the substantial biological differentiation between the two domains.

The preceding discussion on uncertainty in the correction approach should not be taken as grounds for inaction, given that the range of variation in Vc far exceeds that traceable to deviations from the fit, not only among microbes at large, but also in specific settings, and that the spectra of microbial size distribution in microbiomes seem to be dynamic. For example, the range of Vc of typical bacterioplankton (excluding phototrophs) in seawater spans 3 orders of magnitude and its spectrum can be modified significantly by factors like grazing [32]. Considering photosynthetic plankton would likely add another 4–5 orders of magnitude in Vc, and the size spectrum of this group is also affected by environmental parameters [33]. In the human gut microbiome, our initial assessments show that typical bacteria span at least 4 orders of magnitude in volume.

Beyond the pragmatic uses for community composition corrections, we see it as unlikely that the apparent scaling relationship with cell surface area has no biological meaning. It is tempting to speculate that Rc scales with size to maintain an increasing protein content need. Indeed, protein content scales as a function of cell volume with a similar exponent of 0.70 ± 0.06 (R2 = 0.87; 95% CI 0.64–0.75) [34]. Because the CI of the exponent for protein content per cell and that for Rc in the fit of Fig. 1 [0.72 and 0.61; see (Additional file 2)] overlap, the possibility of a connection to cellular need for proteins cannot be rejected solely on this basis. Indeed, in Synechococcus elongatus in the laboratory, protein content and P (hence also Rc) strongly co-vary with cell volume [30].

Alternatively, and perhaps more trivially, the scaling relationship of Rc with Vc may simply be a reflection of the size scaling of DNA content per cell. In other words, ribosomal genes would follow the trends of DNA content as a whole, just like any other universal gene would. The allometric relationship between DNA content and cell size, however, has not been addressed in the literature or has been addressed incorrectly by neglecting ploidy [34, 35]. We know that genome size scales among bacteria with reported exponents between 0.21 (R2 = 0.60) [34] and 0.35 (R2 = 0.45) [36]. In our dataset, which includes eukaryotes, it does so with an exponent of 0.18 (R2 = 0.34; see Additional file 5). Even when these coefficients of correlation are rather poor, genome size clearly increases much more weakly with Vc than does Rc. Again, this does not take into account P variations to yield actual DNA content per cell; it is the size of one copy of the genome. A portion of our dataset can be used to explore the scaling of DNA content per cell for prokaryotes (n = 60). To this subset, we can add the measured or slightly derived values reported by Shuter et al. [35] (n = 39), excluding those that relied on assumptions of monoploidy. This combined set yields a power scaling fit with R2 = 0.89 and estimated exponent of ¾ (0.75 ± 0.03; Fig. 3).

Fig. 3

DNA content scales with cell volume as a power function with an exponent of ¾. Entries are from a subset of those in Table S1 (n = 60, see Additional file 1), and determinations by Shuter et al. [35] (n = 39). Orange points belong to eukaryotic microbes, yellow points belong to archaea, and green points to bacteria. Full statistics for the fit (in red type) are given in Table S2 (Additional file 2)

The difference in scaling exponent between genome size and cell DNA content (0.18–0.35 vs. 0.75) gauges the importance of P. In fact, in our dataset, P seems to scale with Vc as a power law with an exponent of 0.54 (R2 = 0.69; Additional file 6). This is consistent with the fact that the product of genome size and ploidy yields the cell DNA content, as the exponents of the multipliers (0.18 and 0.54, respectively) roughly add up to the estimated exponent of the product (0.75). That the exponents for DNA content per cell (3/4) and Rc (2/3) are significantly different (T test, p = 0.02) speaks for respective mechanistic drivers that are fundamentally decoupled. In fact, most known allometric laws found in nature scale with exponents that are simple multiples of 1/4 [37]. It would seem that ribosomal genes are, in that sense, unique.
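The additivity of exponents invoked above follows directly from multiplying two power functions of the same variable. As a worked line, with G denoting genome size (a symbol introduced here only for illustration) and using the exponents quoted in the text:

$${\mathrm{DNA}}_{c}=G\cdot P\propto {V}_{c}^{0.18}\cdot {V}_{c}^{0.54}={V}_{c}^{0.72}$$

The sum, 0.72, lies within one standard error of the directly fitted exponent for cell DNA content (0.75 ± 0.03).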


The results presented here uncover surprising basic rules on the composition of microbes, rules that tie them all together and that, far from being self-evident, pose an intellectual challenge to elucidate. In practical terms, this discovery also provides a rather simple approach to deal with biases affecting the use of current omics methodologies for the assessment of microbiome composition, which, given their extensive use in many areas of microbiology and allied sciences, has a large potential for applicability.

Availability of data and materials

All data generated or analyzed during this study are included in this published article and its supplementary information files.


Rg: Ribosomal gene copy number per genome

Rc: Ribosomal gene copy number per cell

Rs: Total copies of the ribosomal gene for all taxa considered in a sample

Vc: Cell volume

Bc: Cell biomass

Dc: Cell diameter




  1. Brooks JP, et al. The truth about metagenomics: quantifying and counteracting bias in 16S rRNA studies. BMC Microbiol. 2015;15(1):1–14.


  2. Lavrinienko A, Jernfors T, Koskimäki JJ, Pirttilä AM, Watts PC. Does Intraspecific Variation in rDNA Copy Number Affect Analysis of Microbial Communities?. Trends Microbiol. 2021;29(1):19–27.

  3. Stoddard SF, Smith BJ, Hein R, Roller BRK, Schmidt TM. rrnDB: improved tools for interpreting rRNA gene abundance in bacteria and archaea and a new foundation for future development. Nucleic Acids Res. 2014;43(D1):D593–8.


  4. Klappenbach JA, Dunbar JM, Schmidt TM. rRNA operon copy number reflects ecological strategies of bacteria. Appl Environ Microbiol. 2000;66(4):1328–33.


  5. Roller BR, Stoddard SF, Schmidt TM. Exploiting rRNA operon copy number to investigate bacterial reproductive strategies. Nat Microbiol. 2016;1(11):1–7.


  6. Vieira-Silva S, Rocha EP. The systemic imprint of growth and its uses in ecological (meta)genomics. PLoS Genet. 2010;6(1):e1000808.

  7. Kembel SW, Wu M, Eisen JA, Green JL. Incorporating 16S gene copy number information improves estimates of microbial diversity and abundance. PLoS Comput Biol. 2012;8(10):e1002743.

  8. Angly FE, et al. CopyRighter: a rapid tool for improving the accuracy of microbial community profiles through lineage-specific gene copy number correction. Microbiome. 2014;2(1):1–13.


  9. Lofgren LA, et al. Genome-based estimates of fungal rDNA copy number variation across phylogenetic scales and ecological lifestyles. Mol Ecol. 2019;28(4):721–30.


  10. Louca S, Doebeli M, Parfrey LW. Correcting for 16S rRNA gene copy numbers in microbiome surveys remains an unsolved problem. Microbiome. 2018;6(1):41.


  11. Kwan EX, Wang XS, Amemiya HM, Brewer BJ, Raghuraman MK. rDNA Copy Number Variants Are Frequent Passenger Mutations in Saccharomyces cerevisiae Deletion Collections and de Novo Transformants. G3 (Bethesda, Md.). 2016;6(9):2829–38.

  12. Pecoraro V, Zerulla K, Lange C, Soppa J. Quantification of ploidy in proteobacteria revealed the existence of monoploid, (mero-)oligoploid and polyploid species. PloS one. 2011;6(1):e16392.

  13. Trun NJ. Genome Ploidy. In: de Bruijn FJ, Lupski JR, Weinstock GM. (eds) Bacterial Genomes. Boston: Springer; 1998.

  14. Soppa J. Polyploidy in archaea and bacteria: about desiccation resistance, giant cell size, long-term survival, enforcement by a eukaryotic host and additional aspects. J Mol Microbiol Biotechnol. 2014;24(5–6):409–19.


  15. Maldonado R, Jiménez J, Casadesús J. Changes of ploidy during the Azotobacter vinelandii growth cycle. J Bacteriol. 1994;176(13):3911–9.


  16. Bonk F, Popp D, Harms H, Centler F. PCR-based quantification of taxa-specific abundances in microbial communities: quantifying and avoiding common pitfalls. J Microbiol Methods. 2018;153:139–47.


  17. Soppa J. Polyploidy and community structure. Nat Microbiol. 2017;2(2):1–2.


  18. Godhe A, et al. Quantification of diatom and dinoflagellate biomasses in coastal marine seawater samples by real-time PCR. Appl Environ Microbiol. 2008;74(23):7174–82.


  19. Biard T, et al. Biogeography and diversity of Collodaria (Radiolaria) in the global ocean. ISME J. 2017;11(6):1331–44.


  20. Guerrero R, Pedrós-Alió C, Schmidt TM, Mas J. A survey of buoyant density of microorganisms in pure cultures and natural samples. Microbiologia (Madrid, Spain). 1985;1(1–2):53–65.


  21. Jasso-Selles DE, De Martini F, Velenovsky JF, IV Mee ED, Montoya SJ, Hileman JT, Garcia MD, Su NY, Chouvenc T, Gile GH. The Complete Protist Symbiont Communities of Coptotermes formosanus and Coptotermes gestroi: Morphological and Molecular Characterization of Five New Species. J Eukaryot Microbiol. 2020;67:626-41.

  22. Roush D, Garcia-Pichel F. Succession and colonization dynamics of endolithic phototrophs within intertidal carbonates. Microorganisms. 2020;8(2):214.


  23. Chacón E, Berrendero E, Pichel FG. Biogeological signatures of microboring cyanobacterial communities in marine carbonates from Cabo Rojo, Puerto Rico. Sediment Geol. 2006;185(3–4):215–28.


  24. Overmann J, Garcia-Pichel F. The phototrophic way of life. The prokaryotes. 2006;2:32.


  25. Props R, et al. Absolute quantification of microbial taxon abundances. ISME J. 2017;11(2):584–7.


  26. Fernandes VM, et al. Exposure to predicted precipitation patterns decreases population size and alters community structure of cyanobacteria in biological soil crusts from the Chihuahuan Desert. Environ Microbiol. 2018;20(1):259–69.


  27. Paranjape SS, Shashidhar R. The ploidy of Vibrio cholerae is variable and is influenced by growth phase and nutrient levels. FEMS Microbiol Lett. 2017;364(19):10.1093/femsle/fnx190.

  28. Ohbayashi R, Nakamachi A, Hatakeyama TS, Watanabe S, Kanesaki Y, Chibazakura T, Yoshikawa H, Miyagishima SY. Coordination of Polyploid Chromosome Replication with Cell Size and Growth in a Cyanobacterium. mBio. 2019;10(2):e00510–19.

  29. Zheng X-y, O’Shea EK. Cyanobacteria maintain constant protein concentration despite genome copy-number variation. Cell Rep. 2017;19(3):497–504.


  30. Chen AH, Afonso B, Silver PA, Savage DF. Spatial and temporal organization of chromosome duplication and segregation in the cyanobacterium Synechococcus elongatus PCC 7942. PloS one. 2012;7(10):e47837.

  31. Mundkur BD. Interphase nuclei and cell sizes in a polyploid series of Saccharomyces. Experientia. 1953;9(10):373–4.


  32. Andersson A, Larsson U, Hagström Å. Size-selective grazing by a microflagellate on pelagic bacteria. Mar Ecol Prog Ser. 1986;33:51–57.

  33. Marañón E, et al. Patterns of phytoplankton size structure and productivity in contrasting open-ocean environments. Mar Ecol Prog Ser. 2001;216:43–56.


  34. Kempes CP, Wang L, Amend JP, Doyle J, Hoehler T. Evolutionary tradeoffs in cellular composition across diverse bacteria. ISME J. 2016;10(9):2145–57.


  35. Shuter BJ, Thomas J, Taylor WD, Zimmerman AM. Phenotypic correlates of genomic DNA content in unicellular eukaryotes and other cells. Am Nat. 1983;122(1):26–44.


  36. DeLong JP, Okie JG, Moses ME, Sibly RM, Brown JH. Shifts in metabolic scaling, production, and efficiency across major evolutionary transitions of life. Proc Natl Acad Sci. 2010;107(29):12941–5.


  37. West GB, Woodruff WH, Brown JH. Allometric scaling of metabolic rate from molecules and mitochondria to cells and mammals. Proc Natl Acad Sci. 2002;99(suppl 1):2473–8.



We thank Susanne Neuer, Gillian Gile, and Damien Finn for critically reading the manuscript and D. Roush for walking us through the data used in the exemplary corrections.


This work was supported in part by NSF grant #1449501.




FGP conceived the idea. LGS carried out the database research. LGS and FGP ran the analyses and wrote the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Ferran Garcia-Pichel.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

Table S1. Taxon-specific values for primary variables (cell volume, ribosomal gene copies per cell, cellular DNA content and ploidy) as well as source variables (cell shape, cell axial dimension, ribosomal gene copies per genome and genome size) as used in the analyses presented in Fig. 1.

Additional file 2:

Table S2. Statistics and estimated parameters for power fits against Vc.

Additional file 3:

Table S3. Explicit dataset used in Fig. 2. Original 16S rRNA gene amplicon sequencing data, taxonomic assignments, and qPCR 16S rRNA gene quantifications are from Roush et al. (2020). Estimations of taxon-specific cell number and taxon-specific biovolume according to Materials and Methods.

Additional file 4:

Figure S1. Differences in allometric estimation of microbial community structure as cell number or biovolume from 16S rRNA gene counts in the dataset of Fig. 2 by either assigning measured cell volume values to taxa or by assigning taxa to a set of discrete size ranges. Left: stack bar graphs for relative proportions of taxa. Right: frequency histograms for taxa-specific percentual differences between the two approaches.

Additional file 5:

Figure S2. Relationship between genome size and cell volume (Vc) in microbes (n = 56), plotted as a log/log graph. The grey line is a power fit with the equation displayed in red type (fit statistics are in Suppl. Table 2). Datapoints belonging to eukaryotes are in orange, those for prokaryotes in green.

Additional file 6:

Figure S3. Relationship between ploidy (P) and cell volume (Vc) in microbes (n = 56), plotted as a log/log graph. The grey line is a power fit with the equation displayed in red type (fit statistics are in Suppl. Table 2). Datapoints belonging to eukaryotes are in orange, those for prokaryotes in green.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.


Cite this article

Gonzalez-de-Salceda, L., Garcia-Pichel, F. The allometry of cellular DNA and ribosomal gene content among microbes and its use for the assessment of microbiome community structure. Microbiome 9, 173 (2021).
