Table 2 Optimized methods configurations for standard operating conditions

From: Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin

| Target | Condition | Method | Parameters | Mock F | Mock P | Mock R | Cross-validated F | Cross-validated P | Cross-validated R | Novel taxa F | Novel taxa P | Novel taxa R | Threshold |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 16S rRNA gene | Balanced | NB-bespoke | [6,6]:0.9 | 0.705 | 0.98 | 0.582 | 0.827 | 0.931 | 0.744 | 0.165 | 0.243 | 0.125 | F = (0.49, 0.8, 0.1) |
| | | | [6,6]:0.92 | 0.705 | 0.98 | 0.581 | 0.825 | 0.936 | 0.737 | 0.165 | 0.251 | 0.123 | F = (0.7, 0.8, 0.15) |
| | | | [6,6]:0.94 | 0.703 | 0.98 | 0.579 | 0.822 | 0.942 | 0.729 | 0.162 | 0.259 | 0.118 | |
| | | | [7,7]:0.92 | 0.712 | 0.978 | 0.592 | 0.831 | 0.931 | 0.751 | 0.151 | 0.221 | 0.115 | |
| | | | [7,7]:0.94 | 0.708 | 0.978 | 0.586 | 0.829 | 0.936 | 0.743 | 0.157 | 0.239 | 0.117 | |
| | | Naive-Bayes | [7,7]:0.7 | 0.495 | 0.797 | 0.38 | 0.819 | 0.886 | 0.761 | 0.115 | 0.138 | 0.099 | |
| | | rdp | 0.6 | 0.564 | 0.798 | 0.457 | 0.815 | 0.868 | 0.768 | 0.102 | 0.128 | 0.084 | |
| | | | 0.7 | 0.55 | 0.799 | 0.438 | 0.812 | 0.892 | 0.746 | 0.124 | 0.173 | 0.096 | |
| | | Uclust | 0.51:0.9:3 | 0.498 | 0.746 | 0.392 | 0.846 | 0.876 | 0.817 | 0.154 | 0.201 | 0.126 | |
| | Precision | NB-bespoke | [6,6]:0.98 | 0.676 | 0.987 | 0.537 | 0.803 | 0.956 | 0.692 | 0.163 | 0.303 | 0.111 | P = (0.94, 0.95, 0.25) |
| | | | [7,7]:0.98 | 0.687 | 0.98 | 0.551 | 0.815 | 0.951 | 0.713 | 0.164 | 0.283 | 0.115 | |
| | | rdp | 1 | 0.239 | 0.941 | 0.16 | 0.632 | 0.968 | 0.469 | 0.12 | 0.457 | 0.069 | |
| | Recall | NB-bespoke | [12,12]:0.5 | 0.754 | 0.8 | 0.721 | 0.815 | 0.83 | 0.801 | 0.053 | 0.058 | 0.049 | R = (0.47, 0.75, 0.04) |
| | | | [14,14]:0.5 | 0.758 | 0.802 | 0.726 | 0.811 | 0.826 | 0.797 | 0.052 | 0.057 | 0.048 | R = (0.7, 0.75, 0.04) |
| | | | [16,16]:0.5 | 0.755 | 0.785 | 0.732 | 0.808 | 0.825 | 0.792 | 0.052 | 0.058 | 0.047 | |
| | | | [18,18]:0.5 | 0.772 | 0.803 | 0.748 | 0.805 | 0.823 | 0.789 | 0.055 | 0.061 | 0.05 | |
| | | | [32,32]:0.5 | 0.937 | 0.966 | 0.913 | 0.788 | 0.818 | 0.76 | 0.054 | 0.067 | 0.045 | |
| | | Naive-Bayes | [11,11]:0.5 | 0.567 | 0.77 | 0.479 | 0.793 | 0.82 | 0.768 | 0.059 | 0.065 | 0.055 | |
| | | | [12,12]:0.5 | 0.567 | 0.769 | 0.479 | 0.79 | 0.816 | 0.765 | 0.059 | 0.064 | 0.055 | |
| | | | [18,18]:0.5 | 0.564 | 0.764 | 0.477 | 0.779 | 0.807 | 0.753 | 0.057 | 0.063 | 0.051 | |
| | | rdp | 0.5 | 0.577 | 0.791 | 0.48 | 0.816 | 0.848 | 0.787 | 0.068 | 0.079 | 0.06 | |
| | Novel | Blast+ | 10:0.51:0.8 | 0.436 | 0.723 | 0.325 | 0.816 | 0.896 | 0.749 | 0.225 | 0.332 | 0.171 | F = (0.4, 0.8, 0.2) |
| | | Uclust | 0.76:0.9:5 | 0.467 | 0.775 | 0.348 | 0.84 | 0.938 | 0.76 | 0.219 | 0.358 | 0.158 | |
| | | VSEARCH | 10:0.51:0.8 | 0.45 | 0.74 | 0.342 | 0.814 | 0.891 | 0.75 | 0.226 | 0.333 | 0.171 | |
| | | | 10:0.51:0.9 | 0.45 | 0.74 | 0.342 | 0.82 | 0.896 | 0.755 | 0.219 | 0.338 | 0.162 | |
| Fungi | Balanced | Naive-Bayes | [6,6]:0.94 | 0.874 | 0.935 | 0.827 | 0.481 | 0.57 | 0.416 | 0.374 | 0.438 | 0.327 | F = (0.85, 0.45, 0.37) |
| | | | [6,6]:0.96 | 0.874 | 0.935 | 0.827 | 0.495 | 0.597 | 0.423 | 0.399 | 0.473 | 0.344 | |
| | | | [6,6]:0.98 | 0.874 | 0.935 | 0.827 | 0.505 | 0.629 | 0.423 | 0.426 | 0.52 | 0.361 | |
| | | | [7,7]:0.98 | 0.874 | 0.935 | 0.827 | 0.485 | 0.596 | 0.409 | 0.388 | 0.47 | 0.33 | |
| | | NB-bespoke | [6,6]:0.94 | 0.928 | 0.968 | 0.915 | 0.48 | 0.567 | 0.416 | 0.371 | 0.433 | 0.325 | |
| | | | [6,6]:0.96 | 0.928 | 0.968 | 0.915 | 0.491 | 0.59 | 0.42 | 0.393 | 0.466 | 0.34 | |
| | | | [6,6]:0.98 | 0.927 | 0.97 | 0.913 | 0.504 | 0.624 | 0.422 | 0.421 | 0.512 | 0.358 | |
| | | | [7,7]:0.98 | 0.935 | 0.97 | 0.921 | 0.487 | 0.596 | 0.412 | 0.386 | 0.466 | 0.329 | |
| | | rdp | 0.7 | 0.929 | 0.939 | 0.922 | 0.479 | 0.572 | 0.413 | 0.382 | 0.451 | 0.332 | |
| | | | 0.8 | 0.924 | 0.939 | 0.915 | 0.507 | 0.633 | 0.422 | 0.434 | 0.534 | 0.366 | |
| | | | 0.9 | 0.922 | 0.937 | 0.913 | 0.517 | 0.698 | 0.411 | 0.47 | 0.617 | 0.379 | |
| | Precision | Naive-Bayes | [6,6]:0.98 | 0.874 | 0.935 | 0.827 | 0.505 | 0.629 | 0.423 | 0.426 | 0.52 | 0.361 | P = (0.92, 0.6, 0.3) |
| | | NB-bespoke | [6,6]:0.98 | 0.927 | 0.97 | 0.913 | 0.504 | 0.624 | 0.422 | 0.421 | 0.512 | 0.358 | |
| | | rdp | 0.8 | 0.924 | 0.939 | 0.915 | 0.507 | 0.633 | 0.422 | 0.434 | 0.534 | 0.366 | |
| | | | 0.9 | 0.922 | 0.937 | 0.913 | 0.517 | 0.698 | 0.411 | 0.47 | 0.617 | 0.379 | |
| | | | 1 | 0.821 | 0.943 | 0.742 | 0.461 | 0.81 | 0.322 | 0.459 | 0.774 | 0.327 | |
| | Recall | NB-bespoke | [6,6]:0.92 | 0.938 | 0.971 | 0.924 | 0.467 | 0.544 | 0.409 | 0.353 | 0.407 | 0.312 | R = (0.9, 0.4, 0.3) |
| | | | [6,6]:0.94 | 0.928 | 0.968 | 0.915 | 0.48 | 0.567 | 0.416 | 0.371 | 0.433 | 0.325 | |
| | | | [6,6]:0.96 | 0.928 | 0.968 | 0.915 | 0.491 | 0.59 | 0.42 | 0.393 | 0.466 | 0.34 | |
| | | | [6,6]:0.98 | 0.927 | 0.97 | 0.913 | 0.504 | 0.624 | 0.422 | 0.421 | 0.512 | 0.358 | |
| | | | [7,7]:0.96 | 0.935 | 0.969 | 0.921 | 0.47 | 0.56 | 0.404 | 0.357 | 0.422 | 0.31 | |
| | | | [7,7]:0.98 | 0.935 | 0.97 | 0.921 | 0.487 | 0.596 | 0.412 | 0.386 | 0.466 | 0.329 | |
| | | rdp | 0.7 | 0.929 | 0.939 | 0.922 | 0.479 | 0.572 | 0.413 | 0.382 | 0.451 | 0.332 | |
| | | | 0.8 | 0.924 | 0.939 | 0.915 | 0.507 | 0.633 | 0.422 | 0.434 | 0.534 | 0.366 | |
| | | | 0.9 | 0.922 | 0.937 | 0.913 | 0.517 | 0.698 | 0.411 | 0.47 | 0.617 | 0.379 | |
| | Novel | Naive-Bayes | [6,6]:0.98 | 0.874 | 0.935 | 0.827 | 0.505 | 0.629 | 0.423 | 0.426 | 0.52 | 0.361 | F = (0.85, 0.45, 0.4) |
| | | NB-bespoke | [6,6]:0.98 | 0.927 | 0.97 | 0.913 | 0.504 | 0.624 | 0.422 | 0.421 | 0.512 | 0.358 | |
| | | rdp | 0.8 | 0.923 | 0.939 | 0.915 | 0.507 | 0.633 | 0.422 | 0.434 | 0.534 | 0.366 | |
| | | | 0.9 | 0.921 | 0.937 | 0.913 | 0.517 | 0.698 | 0.411 | 0.47 | 0.617 | 0.379 | |

  a. F, F-measure; P, precision; R, recall
  b. Naive Bayes parameters: k-mer range, confidence
  c. RDP parameters: confidence
  d. BLAST+/VSEARCH parameters: max accepts, minimum consensus, minimum percent identity
  e. UCLUST parameters: minimum consensus, similarity, max accepts
  f. Threshold gives the score cut-offs used to define optimal method ranges, in the format metric = (mock score, cross-validated score, novel-taxa score). Where two cut-offs are given, the second is a higher cut-off used to select parameters for the developmental NB-bespoke method. The configurations listed are then the union of those selected by the second cut-off (NB-bespoke only) and those selected by the first cut-off (all other methods)
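
The selection rule in the Threshold footnote can be sketched in Python. This is an illustration rather than the authors' code: the names `Config`, `meets`, and `select` are hypothetical, and it assumes a configuration qualifies when its score for the named metric is greater than or equal to the cut-off in each of the three evaluations. The scores are the F-measure values of five 16S rRNA gene rows from the table, filtered with the two Balanced cut-offs.

```python
from typing import List, NamedTuple, Optional, Tuple

# A cut-off is (mock score, cross-validated score, novel-taxa score).
Cutoff = Tuple[float, float, float]


class Config(NamedTuple):
    """One method configuration with its score for a single metric."""
    method: str
    params: str
    mock: float
    cross_validated: float
    novel_taxa: float


def meets(cfg: Config, cutoff: Cutoff) -> bool:
    # Assumption: "meets the cut-off" means >= in all three evaluations.
    return (cfg.mock >= cutoff[0]
            and cfg.cross_validated >= cutoff[1]
            and cfg.novel_taxa >= cutoff[2])


def select(configs: List[Config], cutoff: Cutoff,
           bespoke_cutoff: Optional[Cutoff] = None) -> List[Config]:
    """Union of NB-bespoke rows passing the second (higher) cut-off
    and all other rows passing the first cut-off."""
    chosen = []
    for cfg in configs:
        if cfg.method == "NB-bespoke" and bespoke_cutoff is not None:
            active = bespoke_cutoff
        else:
            active = cutoff
        if meets(cfg, active):
            chosen.append(cfg)
    return chosen


# F-measure scores (mock, cross-validated, novel taxa) copied from the table.
configs = [
    Config("NB-bespoke", "[6,6]:0.9", 0.705, 0.827, 0.165),
    Config("NB-bespoke", "[6,6]:0.98", 0.676, 0.803, 0.163),
    Config("Naive-Bayes", "[7,7]:0.7", 0.495, 0.819, 0.115),
    Config("rdp", "0.6", 0.564, 0.815, 0.102),
    Config("rdp", "1", 0.239, 0.632, 0.12),
]

# 16S rRNA gene, Balanced: F = (0.49, 0.8, 0.1) and F = (0.7, 0.8, 0.15).
selected = select(configs, (0.49, 0.8, 0.1), (0.7, 0.8, 0.15))
for cfg in selected:
    print(cfg.method, cfg.params)
```

Under these assumptions, NB-bespoke [6,6]:0.98 is excluded by the higher NB-bespoke cut-off (its mock F-measure is 0.676) and rdp at confidence 1 is excluded by the first cut-off. This agrees with the table, where both rows appear under Precision but not under Balanced.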