Fig. 1 | Microbiome

Fig. 1

From: IDTAXA: a novel approach for accurate taxonomic classification of microbiome sequences

The IDTAXA algorithm exhibits relatively low overclassification (OC) error rates. Each plot shows error rates versus the fraction of classifiable sequences that are classified as the confidence threshold is varied from 100% (left) to 0% (right). A better classifier exhibits lower error rates during leave-one-out cross-validation while classifying the same fraction of classifiable sequences, which shifts its curves downward. Misclassification (MC) error rates (dashed lines) are much lower than OC error rates (solid lines) on three different training sets: the RDP training set of full-length 16S rRNA gene sequences (a), the Contax training set (b), and the Warcup ITS training set (c). The IDTAXA algorithm consistently displays the lowest OC error rates across these training sets. MC and OC error rates are higher when testing the shorter V4 region (~251 nucleotides) of the RDP training set (d). Points indicate error rates at the default or recommended thresholds: ≥ 95% sequence identity for BLAST, ≥ 70% confidence for QIIME, ≥ 60% confidence for IDTAXA, ≥ 50% confidence for MAPSeq, and ≥ 80% confidence for all others.
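To make the axes concrete, the following is a minimal Python sketch of how one point on such a curve could be computed from leave-one-out results at a given confidence threshold. It is an illustration under stated assumptions, not the authors' implementation: the `Prediction` record, the `error_rates` function, and the exact denominators (MC errors counted over all classifiable sequences, OC errors counted over sequences whose group is absent from the training set once they are left out) are assumptions and are not taken from the caption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    """One leave-one-out result (hypothetical record, for illustration only)."""
    true_label: str
    predicted_label: str
    confidence: float   # percent, 0 to 100
    classifiable: bool  # assumed: True if the true group is still represented
                        # in the training set after this sequence is left out


def error_rates(preds: List[Prediction], threshold: float) -> Tuple[float, float, float]:
    """Return (fraction classified, MC error rate, OC error rate) at one threshold.

    Assumed definitions:
      fraction classified = classifiable sequences assigned at >= threshold,
                            divided by all classifiable sequences
      MC error rate       = classifiable sequences assigned to the wrong group,
                            divided by all classifiable sequences
      OC error rate       = unclassifiable sequences that were assigned anyway,
                            divided by all unclassifiable sequences
    """
    classifiable = [p for p in preds if p.classifiable]
    unclassifiable = [p for p in preds if not p.classifiable]

    assigned = [p for p in classifiable if p.confidence >= threshold]
    wrong = [p for p in assigned if p.predicted_label != p.true_label]
    over = [p for p in unclassifiable if p.confidence >= threshold]

    n_c = len(classifiable)
    n_u = len(unclassifiable)
    frac_classified = len(assigned) / n_c if n_c else 0.0
    mc_rate = len(wrong) / n_c if n_c else 0.0
    oc_rate = len(over) / n_u if n_u else 0.0
    return frac_classified, mc_rate, oc_rate


def curve(preds: List[Prediction], thresholds: List[float]) -> List[Tuple[float, float, float]]:
    """Sweep thresholds (for example 100 down to 0) to trace one classifier's curves."""
    return [error_rates(preds, t) for t in thresholds]
```

Sweeping `thresholds` from 100 down to 0 moves from the left edge of a panel to the right edge: fewer sequences are held back as unclassified, so the fraction classified rises and, under these assumed definitions, both error rates can only stay flat or increase.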
