Fig. 2 | Microbiome

Fig. 2

From: Validation and standardization of DNA extraction and library construction methods for metagenomics-based human fecal microbiome measurements


Comparison of protocols for DNA extraction.
a Compositional PCA ordination plot of measured cell mock community compositions, based on clr (centered log ratio) transformed abundances. The red bold letter T shows the expected composition (“ground truth”) projected onto the PCA ordination, and symbols represent individual replicates. Values in the axis labels represent the percentage of variance explained. For protocols N, L, and S, arrows show approximate trajectories of measurements for DNA extractions performed with increasing total bead-beating time.
b Relationship between the Gram-type cell walls of pairs of strains and their contribution to the metric variance across protocols.
c Cumulative relative abundance of Gram-positives (denoted as G+) as a function of bead-beating regime. The dashed horizontal line represents the expected proportion.
d Measured abundances of different strains, relative to E. coli strain NBRC 3301, as a function of bead-beating regime for protocol N. Colors represent different Gram-positives, and results for all Gram-negatives are shown as dotted gray lines.
e Ranking of protocols based on the closeness of agreement between the ground truth and measured compositions, expressed in terms of Aitchison distances.
f Effect of total bead-beating time on agreement between measured compositions and the ground truth, expressed in terms of Aitchison distances (left panel) and gmAFDs (right panel). Horizontal dashed lines represent corresponding values for protocol Q.
g Scatter plots showing quantitative agreement between community profiles measured with protocol Q (x-axis) and protocols L, N, and S (y-axis) for the cell mock community (upper plots) and fecal sample S01 (lower plots). For the fecal sample, relative abundances were calculated as the percentage of reads assigned to a given species by kraken2. Gray areas represent up to 1.5- or 2-fold differences for the upper and lower plots, respectively. Data represent the mean and standard deviation of two or three technical replicates, and corresponding gmAFDs calculated based on the means are indicated in the facet labels.
For panels c and d, results are shown as the mean (symbols or lines) and the standard deviation (error bars or ribbons, if visible) of two or three technical replicates. For panels e and f, values were computed based on the center (compositional mean) of two or three technical replicates. Across all panels, common symbol and line colors reflect DNA extraction protocols/kits, as shown in the legend of panel a.
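
The compositional quantities named in this legend (clr transform, Aitchison distance, compositional center of replicates) can be illustrated with a minimal sketch using their standard textbook definitions. This is not the authors' analysis code. The legend does not define gmAFD; the sketch assumes it denotes the geometric mean of absolute fold differences between measured and expected relative abundances. All function names and the toy numbers are hypothetical and are not taken from the study, and all abundances are assumed to be strictly positive (zero handling is not shown).

```python
import math

def closure(x):
    """Rescale a vector of positive abundances so that it sums to 1."""
    total = sum(x)
    return [v / total for v in x]

def clr(x):
    """Centered log ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr-transformed compositions."""
    cx, cy = clr(x), clr(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))

def compositional_center(replicates):
    """Compositional mean: per-part geometric mean across replicates, re-closed."""
    n = len(replicates)
    parts = len(replicates[0])
    geo_means = [
        math.exp(sum(math.log(rep[i]) for rep in replicates) / n)
        for i in range(parts)
    ]
    return closure(geo_means)

def gmafd(measured, expected):
    """ASSUMED definition: geometric mean of absolute fold differences,
    i.e. exp(mean(|log(measured_i / expected_i)|)) on closed compositions."""
    m, e = closure(measured), closure(expected)
    abs_logs = [abs(math.log(a / b)) for a, b in zip(m, e)]
    return math.exp(sum(abs_logs) / len(abs_logs))

# Toy data (hypothetical): a four-strain mock community with equal
# expected abundances and three made-up technical replicates.
expected = [0.25, 0.25, 0.25, 0.25]
replicates = [
    [0.30, 0.28, 0.22, 0.20],
    [0.31, 0.27, 0.23, 0.19],
    [0.29, 0.29, 0.21, 0.21],
]

center = compositional_center(replicates)
print("center:", center)
print("Aitchison distance to expected:", aitchison_distance(center, expected))
print("gmAFD to expected:", gmafd(center, expected))
```

Because the clr transform subtracts the mean log abundance, the Aitchison distance is unchanged when a profile is rescaled, so read counts and relative abundances give the same value. Under the assumed definition, a gmAFD of 1 indicates perfect agreement with the expected composition.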
