Table 1 Summary of results

From: Annotated bacterial chromosomes from frame-shift-corrected long-read metagenomic data

(a)                                    (b)        (c)    (d)      (e)       (f)           (g)            (h)     (i)   (j)
DIAMOND+MEGAN                          Unicycler  Total  Aligned  Average   CheckM                       Prokka
taxonomic bin                          contigs    (Mb)   (Mb)     coverage  completeness  contamination  rRNA    tRNA  CDS
High-quality draft genomes:
B1 Bacteroidetes bacterium OLB12       1          4.2    3.5      57.3      95%           0.1%           6       39    4,163
B2 Candidatus Accumulibacter SK-02     1          5.2    4.1      384.2     94%           0.6%           4       53    4,915
B3 Chlamydiia (class)                  1          2.8    1.8      48.8      94%           2%             6       39    3,387
B4 Gammaproteobacteria (class)         43         4.7    3.0                93%           2%             6       52    4,833
  -Longest contig                                 2.7    1.6      25.1      93%           0.2%           3       40    3,359
B5 Bacteroidetes bacterium OLB8        1          3.8    3.0      52.1      93%           1%             6       37    3,394
B6 Rhodospirillales (order)            1          4.4    3.0      29.5      92%           0.5%           3       47    4,015
B7 Chlorobi bacterium OLB5             1          3.5    2.5      38.7      88%           1%             3       41    4,131
Medium-quality draft genomes:
B8 Thauera (genus)                     25         4.6    4.0                89%           4%             12      64    4,040
  -Longest contig                                 0.8    0.7      32.7      14%           0%             0       5     672
B9 Sphingobacteriales bacterium 44-15  59         3.2    2.8                76%           1%             2       17    2,953
  -Longest contig                                 0.2    0.1      10.2      0%            0%             0       0     172
B10 Bacteroidetes (phylum)             43         3.9    2.6                72%           7%             1       12    1,997
  -Longest contig                                 1.2    0.8      14.1      32%           0%             0       3     807
B11 Candidatus Contendobacter B J11    39         2.5    2.0                59%           9%             2       37    2,668
  -Longest contig                                 0.3    0.3      15.4      19%           0%             0       7     295
Low-quality draft genomes:
B12 Betaproteobacteria (class)         111        6.6    5.5                89%           79%            6       71    4,655
  -Longest contig                                 0.4    0.3      37.1      10%           0%             0       1     372
B13 Nitrospira (genus)                 34         4.2    3.7                83%           13%            0       6     563
  -Longest contig                                 1.1    0.9      17.6      27%           0%             0       2     99
B14 Chloroflexi (phylum)               151        5.4    4.3                71%           29%            0       11    3,565
  -Longest contig                                 0.2    0.2      13.3      8%            0%             0       1     86
For each of the 14 taxonomic bins B1–B14 that CheckM deems ≥ 50% complete (a), listed in descending order of assembly quality, and, where a bin contains more than one contig, also for its longest contig, we report: (b) the number of contigs produced by Unicycler; (c) the total number of bases (in Mb); (d) the number of bases (in Mb) aligned by DIAMOND to some protein reference; (e) the average coverage by long reads (based on the longest contig); (f) the percent completeness and (g) the percent contamination reported by CheckM; and (h)–(j) the numbers of rRNAs, tRNAs and coding sequences (CDS) reported by Prokka, respectively.
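
As an illustration of how the columns relate, the following minimal Python sketch stores a few rows of the table, applies the ≥ 50% completeness cut-off mentioned in the note, and derives the fraction of bases aligned by DIAMOND as column (d) divided by column (c). The class name `BinSummary`, its field names, and the helper `reportable` are hypothetical and introduced only for this example; the aligned fraction is a derived quantity that is not itself a column of the table.

```python
from dataclasses import dataclass


@dataclass
class BinSummary:
    """One table row (hypothetical structure, for illustration only)."""
    bin_id: str
    contigs: int              # column (b)
    total_mb: float           # column (c)
    aligned_mb: float         # column (d)
    completeness_pct: float   # column (f)
    contamination_pct: float  # column (g)

    def aligned_fraction(self) -> float:
        # Derived value: share of the bin's bases aligned to a protein reference.
        return self.aligned_mb / self.total_mb


# Cut-off stated in the table note: bins that CheckM deems >= 50% complete.
MIN_COMPLETENESS_PCT = 50.0


def reportable(bins):
    """Keep only bins that meet the completeness cut-off."""
    return [b for b in bins if b.completeness_pct >= MIN_COMPLETENESS_PCT]


# Three rows copied from the table: B1, B11 and B14.
rows = [
    BinSummary("B1", 1, 4.2, 3.5, 95.0, 0.1),
    BinSummary("B11", 39, 2.5, 2.0, 59.0, 9.0),
    BinSummary("B14", 151, 5.4, 4.3, 71.0, 29.0),
]

for b in reportable(rows):
    print(f"{b.bin_id}: {b.aligned_fraction():.0%} of bases aligned")
```

All three example rows pass the cut-off, as expected, since the table lists only bins that are at least 50% complete.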