Fig. 3From: Simple statistical identification and removal of contaminant sequences in marker-gene and metagenomics dataClassification accuracy of decontam on microbial sequences in an oral mucosa dataset. a The scores assigned by the frequency and prevalence methods are plotted for each ASV present in two or more samples in the oral mucosa dataset. Points are colored if their genus can be unambiguously classified as oral or contaminant by comparison to a compiled reference database. b ASVs with unambiguous reference-based classifications were used as the ground truth for evaluating the sensitivity and specificity of decontam. Receiver-operator-characteristic (ROC) curves are plotted for the frequency, prevalence, and combined methods at the level of ASVs and at the level of reads (i.e., weighting by ASV relative abundance). Points show the default classification threshold of P* = 0.1Back to article page