Fig. 1 | Microbiome

Fig. 1

From: Validation and standardization of DNA extraction and library construction methods for metagenomics-based human fecal microbiome measurements

Comparison of protocols for sequencing library construction. a Compositional PCA ordination plot of measured DNA mock community compositions, based on clr (centered log ratio) transformed abundances. The red bold letter T depicts the expected composition (“ground truth”) projected onto the PCA ordination and symbols show individual replicates. Values in the axis labels represent the percentage of variance explained. Protocol identifiers were overlaid with jitter to prevent overlapping labels. b Dependence of the metric variance of measured compositions on DNA input amount and corresponding PCR conditions for library amplification (X0, XL, and XH). c Relationship between differences in genomic GC content of pairs of genomes/strains and their contribution to the metric variance shown in panel b. d Protocol-dependent variation in quantification bias due to genomic GC content. The GC bias metric represents the slope of the intercept-free linear regression line of log2-transformed abundance ratios for all possible pairs of strains to their differences in genomic GC content (see Fig. S3). e Variation in proportion of PCR duplicates. Protocols are ordered along the y-axis as in panel d and both panels share a common y-axis. f, g Closeness of agreement between the ground truth and measured compositions, expressed in terms of Aitchison distances (f) and absolute fold-differences (g). Kits are ranked along the y-axis based on Aitchison distances, averaged across DNA input amounts for each of the kits. For panel g, colored symbols show the geometric mean of strain-wise absolute fold-differences to the ground truth (that is, gmAFD) and black circles represent fold-differences for individual strains. h Heatmap of pairwise Aitchison distances showing quantitative consistency of measured compositions among protocols.
i Variation in fragmentation bias, expressed as Aitchison distances between observed and expected base frequencies averaged across the first fifteen cycles of the forward read (see Fig. S4). j Variation in N50 values of the DNA mock community metagenome assemblies. For panels g–j, protocols are sorted as in panel f. For panels b, c, and f–h, values were computed based on the center (compositional mean) of three technical replicates. For panels d, e, i, and j, results are shown as the mean (symbols) and standard deviation (error bars), if visible, of three technical replicates. Across all panels, common symbol fill colors and shapes reflect kits and DNA input amounts, respectively, as shown in the legend of panel a.
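
The metrics named in this caption can be illustrated with a minimal Python sketch. It is not the authors' analysis code, and all function names are hypothetical. It assumes strictly positive abundances (zeros would need handling before taking logarithms), that the absolute fold-difference of a strain equals 2 raised to |log2(measured/expected)|, and that the GC bias regression uses abundances normalized to their expected values; the caption does not spell out these details.

```python
import math
from itertools import combinations


def clr(composition):
    """Centered log-ratio transform of strictly positive abundances."""
    logs = [math.log(v) for v in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def aitchison_distance(x, y):
    """Euclidean distance between two clr-transformed compositions."""
    return math.dist(clr(x), clr(y))


def gm_afd(measured, expected):
    """Geometric mean of strain-wise absolute fold-differences.

    Assumption: the absolute fold-difference of one strain is
    2 ** |log2(measured / expected)|, so it is never below 1.
    """
    abs_log2 = [abs(math.log2(m / e)) for m, e in zip(measured, expected)]
    return 2 ** (sum(abs_log2) / len(abs_log2))


def gc_bias_slope(measured, expected, gc_percent):
    """Slope of an intercept-free regression over all strain pairs.

    x: difference in genomic GC content of the pair.
    y: log2 ratio of the pair's abundances.
    Assumption: abundances are first divided by their expected values.
    """
    norm = [m / e for m, e in zip(measured, expected)]
    sxy = sxx = 0.0
    for i, j in combinations(range(len(norm)), 2):
        dx = gc_percent[i] - gc_percent[j]
        dy = math.log2(norm[i] / norm[j])
        sxy += dx * dy
        sxx += dx * dx
    # Least-squares slope for a line forced through the origin.
    return sxy / sxx


def n50(contig_lengths):
    """Length of the contig at which the sorted running total
    (longest first) reaches half of the total assembly length."""
    half = sum(contig_lengths) / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half:
            return length
```

Because the Aitchison distance depends only on ratios between components, rescaling a composition (for example, read counts versus relative abundances) leaves the distance unchanged, which is why it suits comparisons between measured and expected compositions.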
