
Table 1. For each gene family studied, we report the KEGG orthology group, the number of reads assigned to that group by DIAMOND, the number of reference gene sequences present in the synthetic community, and the number of reference genes “detected” by each method: MEGAN, IDBA-UD, Ray, SOAPdenovo, and Xander.

From: Fast and simple protein-alignment-guided assembly of orthologous gene families from microbiome sequencing reads

| Gene family | KEGG | Reads | References | MEGAN | IDBA-UD | Ray | SOAPdenovo | Xander |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Acetyl-CoA C-acetyltransferase | K00626 | 58,135 | 64 | 31 | 16 | 12 | 12 | 23 |
| Archaeal rpoB1 | K03044 | 17,875 | 16 | 7 | 7 | 6 | 3 | 9 |
| Archaeal rpoB2 | K03045 | 12,025 | 16 | 8 | 8 | 6 | 5 | 5 |
| Cell division protein | K03531 | 45,881 | 48 | 37 | 39 | 12 | 7 | 12 |
| Bacterial rpoB | K03043 | 105,212 | 64 | 43 | 16 | 12 | 13 | 50 |
| Phenylalanyl-tRNA synthetase alpha subunit | K01889 | 44,779 | 64 | 57 | 56 | 47 | 51 | 48 |
| Phenylalanyl-tRNA synthetase beta subunit | K01890 | 73,072 | 64 | 53 | 50 | 42 | 38 | 35 |
| Phosphoribosylformylglycinamidine cyclo ligase | K01933 | 31,919 | 64 | 58 | 59 | 46 | 45 | 54 |
| Ribonuclease HII | K03470 | 18,707 | 64 | 54 | 55 | 53 | 45 | 48 |
| Ribosomal protein L1 | K02863 | 24,190 | 64 | 57 | 49 | 45 | 48 | 53 |
| Ribosomal protein L10 | K02864 | 23,970 | 64 | 58 | 48 | 55 | 57 | 55 |
| Ribosomal protein L11 | K02867 | 17,113 | 64 | 60 | 50 | 51 | 60 | 59 |
| Ribosomal protein L13 | K02871 | 17,642 | 64 | 58 | 54 | 53 | 57 | 45 |
| Ribosomal protein L14 | K02874 | 13,435 | 64 | 56 | 42 | 49 | 58 | 60 |
| Ribosomal protein L15 | K02876 | 13,087 | 64 | 59 | 56 | 50 | 55 | 55 |
| Ribosomal protein L16 | K02878 | 10,058 | 64 | 46 | 34 | 36 | 44 | 44 |
| Ribosomal protein L18 | K02881 | 14,856 | 64 | 57 | 48 | 56 | 57 | 55 |
| Ribosomal protein L2 | K02886 | 29,849 | 64 | 60 | 54 | 46 | 55 | 57 |
| Ribosomal protein L22 | K02890 | 15,875 | 64 | 59 | 54 | 55 | 57 | 51 |
| Ribosomal protein L24 | K02895 | 11,786 | 64 | 60 | 46 | 56 | 58 | 44 |
| Ribosomal protein L25 | K02897 | 12,941 | 64 | 41 | 41 | 39 | 42 | 41 |
| Ribosomal protein L29 | K02904 | 4,913 | 64 | 29 | 8 | 33 | 34 | 8 |
| Ribosomal protein L3 | K02906 | 30,192 | 64 | 59 | 47 | 51 | 57 | 51 |
| Ribosomal protein L4 | K02926 | 14,539 | 64 | 44 | 41 | 39 | 43 | 44 |
| Ribosomal protein L5 | K02931 | 20,533 | 64 | 60 | 58 | 55 | 59 | 58 |
| Ribosomal protein L6 | K02933 | 20,645 | 64 | 58 | 41 | 56 | 59 | 60 |
| Ribosomal protein S10 | K02946 | 11,327 | 64 | 56 | 42 | 48 | 56 | 54 |
| Ribosomal protein S11 | K02948 | 10,793 | 64 | 47 | 43 | 51 | 52 | 56 |
| Ribosomal protein S12 | K02950 | 14,199 | 64 | 61 | 41 | 48 | 60 | 58 |
| Ribosomal protein S13 | K02952 | 13,975 | 64 | 59 | 46 | 56 | 60 | 58 |
| Ribosomal protein S15 | K02956 | 10,795 | 64 | 54 | 16 | 43 | 55 | 50 |
| Ribosomal protein S17 | K02961 | 10,235 | 64 | 58 | 36 | 49 | 44 | 60 |
| Ribosomal protein S19 | K02965 | 12,479 | 64 | 59 | 39 | 51 | 59 | 58 |
| Ribosomal protein S2 | K02967 | 25,926 | 64 | 61 | 46 | 41 | 53 | 48 |
| Ribosomal protein S3 | K02982 | 25,722 | 64 | 59 | 46 | 48 | 57 | 57 |
| Ribosomal protein S5 | K02988 | 21,761 | 64 | 59 | 55 | 53 | 53 | 56 |
| Ribosomal protein S7 | K02992 | 20,520 | 64 | 60 | 42 | 54 | 60 | 61 |
| Ribosomal protein S8 | K02994 | 14,543 | 64 | 62 | 57 | 58 | 60 | 57 |
| Ribosomal protein S9 | K02996 | 12,927 | 64 | 59 | 52 | 52 | 58 | 61 |
| Signal recognition particle protein | K03110 | 27,386 | 64 | 35 | 48 | 36 | 19 | 46 |
| Two-component system | K03407 | 47,904 | 64 | 29 | 17 | 15 | 15 | 27 |
| Mean absolute deviation | | | | 9.34 | 19.73 | 18.24 | 15.41 | 14.17 |
Best results are shown in bold. The mean absolute deviation between the number of reference genes and the number detected by each method is reported as a summary statistic.
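As a minimal sketch of the summary statistic, the snippet below assumes it is the mean of |references − detected| taken over gene families; the source does not spell out the exact formula, so this reading is an assumption. The function name `mean_absolute_deviation` is hypothetical, and the input is only the first three MEGAN rows of the table, so the result is not expected to match the reported column value.

```python
from statistics import mean

# (references, detected) pairs for the first three rows of Table 1, MEGAN column.
# Only a small subset is used here for illustration.
rows = {
    "K00626": (64, 31),
    "K03044": (16, 7),
    "K03045": (16, 8),
}

def mean_absolute_deviation(pairs):
    """Mean of |references - detected| over the given gene families.

    Assumed definition; the paper may compute the statistic differently.
    """
    return mean(abs(ref - det) for ref, det in pairs)

mad_subset = mean_absolute_deviation(rows.values())
print(round(mad_subset, 2))  # 16.67 for these three rows only
```

Under this reading, a lower value means the method's detected count sits closer to the number of reference genes actually present.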