
Table 2 Comparison of HMM-GRASPx and MetaCLADE against a simulated 100-bp marine data set with uneven coverage

From: A multi-source domain annotation pipeline for quantitative metagenomic and metatranscriptomic functional profiling

| Pathway | HMM-GRASPx | | | | | | MetaCLADE/HMM-GRASPx-Assembly (c) | | | | | | MetaCLADE/GC-Assembly (d) | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | TP | FP | FN | Recall (%) | Prec (%) | F-score (%) | TP | FP | FN | Recall (%) | Prec (%) | F-score (%) | TP | FP | FN | Recall (%) | Prec (%) | F-score (%) |
| "Strict" domain annotation (a) | | | | | | | | | | | | | | | | | | |
| KO00010 | 159 243 | 2 216 | 39 938 | 79.9 | 98.6 | 88.3 | 171 168 | 5 384 | 24 845 | 87.3 | 97.0 | 91.9 | 179 972 | 7 397 | 14 028 | 92.8 | 96.1 | 94.4 |
| KO00020 | 182 706 | 11 691 | 74 704 | 71.0 | 94.0 | 80.9 | 201 808 | 24 786 | 42 507 | 82.6 | 89.1 | 85.7 | 223 671 | 20 452 | 24 978 | 90.0 | 91.6 | 90.8 |
| KO00030 | 101 449 | 589 | 17 761 | 85.1 | 99.4 | 91.7 | 107 065 | 833 | 11 901 | 90.0 | 99.2 | 94.4 | 113 446 | 859 | 5 494 | 95.4 | 99.2 | 97.3 |
| KO00051 | 436 432 | 17 970 | 192 918 | 69.3 | 96.0 | 80.5 | 482 994 | 50 714 | 113 612 | 81.0 | 90.5 | 85.5 | 553 728 | 41 566 | 52 026 | 91.4 | 93.0 | 92.2 |
| KO00620 | 237 743 | 11 756 | 100 903 | 70.2 | 95.3 | 80.8 | 263 011 | 27 306 | 60 085 | 81.4 | 90.6 | 85.8 | 297 162 | 21 038 | 32 202 | 90.2 | 93.4 | 91.8 |
| KO00680 | 595 304 | 24 101 | 229 802 | 72.1 | 96.1 | 82.4 | 633 158 | 74 709 | 141 340 | 81.8 | 89.4 | 85.4 | 695 625 | 70 569 | 83 013 | 89.3 | 90.8 | 90.1 |
| KO00910 | 219 501 | 16 171 | 86 939 | 71.6 | 93.1 | 81.0 | 238 683 | 30 435 | 53 493 | 81.7 | 88.7 | 85.0 | 264 438 | 25 270 | 32 903 | 88.9 | 91.3 | 90.1 |
| KO00920 | 74 851 | 138 | 37 662 | 66.5 | 99.8 | 79.8 | 82 977 | 1 098 | 28 576 | 74.4 | 98.7 | 84.8 | 105 794 | 603 | 6 254 | 94.4 | 99.4 | 96.9 |
| "Clan-based" domain annotation (b) | | | | | | | | | | | | | | | | | | |
| KO00010 | 161 459 | 0 | 39 938 | 80.2 | 100.0 | 89.0 | 176 369 | 183 | 24 845 | 87.7 | 99.9 | 93.4 | 187 308 | 61 | 14 028 | 93.0 | 100.0 | 96.4 |
| KO00020 | 194 397 | 0 | 74 704 | 72.2 | 100.0 | 83.9 | 226 585 | 9 | 42 507 | 84.2 | 100.0 | 91.4 | 244 114 | 9 | 24 978 | 90.7 | 100.0 | 95.1 |
| KO00030 | 102 038 | 0 | 17 761 | 85.2 | 100.0 | 92.0 | 107 867 | 31 | 11 901 | 90.1 | 100.0 | 94.8 | 114 275 | 30 | 5 494 | 95.4 | 100.0 | 97.6 |
| KO00051 | 454 402 | 0 | 192 918 | 70.2 | 100.0 | 82.5 | 533 583 | 125 | 113 612 | 82.4 | 100.0 | 90.4 | 594 999 | 295 | 52 026 | 92.0 | 100.0 | 95.8 |
| KO00620 | 249 492 | 7 | 100 903 | 71.2 | 100.0 | 83.2 | 288 167 | 2 150 | 60 085 | 82.7 | 99.3 | 90.3 | 317 646 | 554 | 32 202 | 90.8 | 99.8 | 95.1 |
| KO00680 | 618 689 | 716 | 229 802 | 72.9 | 99.9 | 84.3 | 701 356 | 6 511 | 141 340 | 83.2 | 99.1 | 90.5 | 756 885 | 9 309 | 83 013 | 90.1 | 98.8 | 94.3 |
| KO00910 | 235 260 | 412 | 86 939 | 73.0 | 99.8 | 84.3 | 267 624 | 1 494 | 53 493 | 83.3 | 99.4 | 90.7 | 288 068 | 1 640 | 32 903 | 89.7 | 99.4 | 94.3 |
| KO00920 | 74 989 | 0 | 37 662 | 66.6 | 100.0 | 79.9 | 84 053 | 22 | 28 576 | 74.6 | 100.0 | 85.5 | 106 392 | 5 | 6 254 | 94.4 | 100.0 | 97.1 |
(a) Only hits with the same Pfam domain as the ground truth are counted as true positives.
(b) Domain hits that belong to the same clan as the ground truth are counted as true positives.
(c) Annotation obtained by applying MetaCLADE to the contigs assembled by HMM-GRASPx.
(d) Annotation obtained by applying MetaCLADE to the gene catalog.
Largest values are reported in italics.
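
The Recall, Prec and F-score columns appear to follow the usual definitions in terms of TP, FP and FN. As a minimal sketch (the function name `prf` is hypothetical and not taken from the article), the following recomputes the three percentages for one row, here the "strict" HMM-GRASPx counts for KO00010, and rounds them to one decimal place as in the table:

```python
def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (recall, precision, F-score) as percentages.

    Assumes the standard definitions:
      recall    = TP / (TP + FN)
      precision = TP / (TP + FP)
      F-score   = harmonic mean of precision and recall
    """
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_score = 2 * precision * recall / (precision + recall)
    return 100 * recall, 100 * precision, 100 * f_score


# "Strict" annotation, KO00010, HMM-GRASPx columns of the table.
recall, precision, f_score = prf(tp=159_243, fp=2_216, fn=39_938)
print(round(recall, 1), round(precision, 1), round(f_score, 1))
# prints: 79.9 98.6 88.3
```

The same call can be repeated on any other TP/FP/FN triple to check a row, which is also a convenient way to confirm where the space-separated thousands groups begin and end.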