Identification and evaluation of misassembled contigs. (A) Strategy used to identify misassembled contigs with the potential to align to multiple bacterial proteins. First, we perform a database search to identify proteins aligning to the contig (1). Next, iterating from the start of the contig, we identify the set of highest-scoring non-overlapping alignments (2). Based on these alignments, the contig is then fragmented (3). (B) Incidence of misassemblies, as defined by the heuristic presented in (A), for both the single-end and paired-end read datasets generated from the NOD503CecMN sample (left panel). Also shown is the proportion of intact contigs and of fragments that align over <90% of their length to a known protein (right panel).
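
Steps (2) and (3) of panel (A) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Alignment` record, the function names, the tie-breaking rule (when two alignments overlap, the higher-scoring one is kept) and the choice to cut fragments at alignment boundaries are all assumptions made for illustration. The database search of step (1) is assumed to have already produced the list of alignments.

```python
from typing import List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one protein hit on a contig."""
    protein: str
    start: int    # 0-based, inclusive contig coordinate
    end: int      # exclusive contig coordinate
    score: float  # alignment score (e.g. a bit score)


def select_nonoverlapping(alignments: List[Alignment]) -> List[Alignment]:
    """Step (2): scan from the start of the contig and keep a set of
    non-overlapping alignments, preferring the higher score on overlap.
    This is an assumed greedy rule, not necessarily the published one."""
    kept: List[Alignment] = []
    for aln in sorted(alignments, key=lambda a: (a.start, -a.score)):
        if kept and aln.start < kept[-1].end:
            # Overlaps the most recently kept alignment: keep the better one.
            if aln.score > kept[-1].score:
                kept[-1] = aln
        else:
            kept.append(aln)
    return kept


def fragment_contig(sequence: str, alignments: List[Alignment]) -> List[str]:
    """Step (3): split the contig into one fragment per kept alignment.
    A contig with fewer than two kept alignments is returned intact."""
    kept = select_nonoverlapping(alignments)
    if len(kept) < 2:
        return [sequence]
    # Assumption: fragments are cut exactly at the alignment boundaries.
    return [sequence[a.start:a.end] for a in kept]


# Toy contig with three hits; protA and protB overlap.
contig = "A" * 30 + "C" * 30 + "G" * 30
hits = [
    Alignment("protA", 0, 30, 50.0),
    Alignment("protB", 25, 60, 80.0),
    Alignment("protC", 60, 90, 40.0),
]
fragments = fragment_contig(contig, hits)
```

Under these assumptions, a contig yielding two or more fragments would be counted as misassembled, while a contig with a single retained alignment remains intact.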