Fig. 4 | Microbiome

Fig. 4

From: Construction of habitat-specific training sets to achieve species-level assignment in 16S rRNA gene datasets

Trimming the training set to the specific sequenced region further reduces the error rate. a The percentage of eHOMD-derived simulated reads classified at species level using the FL_Compilation_TS training set (orange) compared to the subsequently trimmed versions V1V3_Raw_TS (green) and V1V3_Curated_TS (red). b The percentage of classified reads that were misclassified with each of these three training sets. c This graph, which is specific to the eHOMD training set construction (V1V3_eHOMDSim_250N100 dataset), shows how researchers can choose the bootstrap value to use with the naïve Bayesian RDP Classifier by deciding on an acceptable percentage of misclassified reads (blue line; e.g., 0.5%) and/or of unclassified reads (red line). The naïve Bayesian RDP Classifier was run with bootstrap values ranging from 50 to 100.
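
The selection logic described for panel c can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' procedure: the `ThresholdResult` type, the `pick_bootstrap` helper, and every number in `example` are hypothetical placeholders, not values read from the figure. It assumes you already have, for each bootstrap value tested, the percentage of classified reads that were misclassified and the percentage of reads left unclassified.

```python
from typing import NamedTuple, Optional, Sequence


class ThresholdResult(NamedTuple):
    """One point on a panel-c style curve (hypothetical container)."""
    bootstrap: int            # bootstrap value given to the classifier
    pct_misclassified: float  # % of classified reads that were misclassified
    pct_unclassified: float   # % of reads left unclassified at species level


def pick_bootstrap(
    results: Sequence[ThresholdResult], max_misclassified: float
) -> Optional[ThresholdResult]:
    """Among bootstrap values whose misclassification rate is acceptable,
    return the one that leaves the fewest reads unclassified.
    Ties are broken in favour of the lower bootstrap value.
    Returns None when no tested value meets the limit."""
    acceptable = [r for r in results if r.pct_misclassified <= max_misclassified]
    if not acceptable:
        return None
    return min(acceptable, key=lambda r: (r.pct_unclassified, r.bootstrap))


# Made-up numbers for illustration only; they are NOT taken from Fig. 4.
example = [
    ThresholdResult(50, 1.2, 3.0),
    ThresholdResult(60, 0.9, 4.5),
    ThresholdResult(70, 0.6, 6.0),
    ThresholdResult(80, 0.4, 9.0),
    ThresholdResult(90, 0.2, 14.0),
    ThresholdResult(100, 0.0, 25.0),
]

# Accept at most 0.5% misclassified reads, the example level in the caption.
chosen = pick_bootstrap(example, max_misclassified=0.5)
print(chosen)
```

With real measurements in place of the placeholder rows, the same function makes the trade-off explicit: tightening `max_misclassified` can only keep the share of unclassified reads the same or raise it.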
