Figure 4
From: Comparison of assembly algorithms for improving rate of metatranscriptomic functional annotation

Identification and evaluation of misassembled contigs. (A) Strategy used to identify misassembled contigs with the potential to align to multiple bacterial proteins. First, we perform a database search to identify proteins aligning to the contig (1). Next, iterating from the start of the contig, we identify the set of highest-scoring non-overlapping alignments (2). The contig is then fragmented on the basis of these alignments (3). (B) Incidence of misassemblies, as defined by the heuristic presented in (A), for both the single-end and paired-end read datasets generated from the NOD503CecMN sample (left panel). Also shown is the proportion of intact contigs and fragments that align <90% of their length to a known protein (right panel).
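
The three steps in (A) can be illustrated with a minimal Python sketch. It is one possible reading of the heuristic, not the authors' implementation: the `Alignment` record, the function names, the left-to-right greedy tie-breaking, and the choice to cut fragments at the alignment boundaries are all assumptions made for illustration. The database search (step 1) is assumed to have been run already, with its hits passed in as contig coordinates and scores.

```python
from typing import List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one protein hit on a contig (step 1 output)."""
    start: int      # 0-based start on the contig, inclusive
    end: int        # end on the contig, exclusive
    score: float    # alignment score, e.g. a bit score
    protein: str    # identifier of the matched protein


def select_non_overlapping(alignments: List[Alignment]) -> List[Alignment]:
    """Step 2 (assumed greedy form): walk from the start of the contig and,
    among overlapping hits, keep the one with the highest score."""
    chosen: List[Alignment] = []
    for aln in sorted(alignments, key=lambda a: (a.start, a.end)):
        if chosen and aln.start < chosen[-1].end:
            # Overlaps the last kept hit: keep whichever scores higher.
            if aln.score > chosen[-1].score:
                chosen[-1] = aln
        else:
            chosen.append(aln)
    return chosen


def fragment_contig(contig: str, alignments: List[Alignment]) -> List[str]:
    """Step 3 (assumed form): a contig with two or more kept hits is treated
    as a candidate misassembly and split into one fragment per hit;
    otherwise it is returned intact."""
    chosen = select_non_overlapping(alignments)
    if len(chosen) < 2:
        return [contig]
    return [contig[a.start:a.end] for a in chosen]


# Toy example: two overlapping hits on the first half, one on the second.
contig = "A" * 30 + "C" * 30
hits = [
    Alignment(0, 28, 50.0, "P1"),
    Alignment(5, 25, 80.0, "P2"),
    Alignment(30, 60, 40.0, "P3"),
]
fragments = fragment_contig(contig, hits)
```

In this toy case the higher-scoring hit `P2` displaces `P1`, so the contig is split into two fragments, one per retained protein. A dynamic-programming selection that maximizes the total score would be an alternative to the greedy pass shown here.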