Accuracy of simulated metatranscriptome assemblies. For each simulated dataset, the accuracy of the reconstructed transcripts is evaluated by matching them against the original set of transcripts used to generate the dataset. (A) Ten-species dataset. (B) 72-species dataset. Shown is the percentage of contigs in each assembly that contain a region of at least one read length (76 bp) that does not align to an original transcript, at sequence identity cutoffs ranging from 97% to 100%. The gold standard assembly indicates the number of predicted misassemblies that result from sequence errors introduced during generation of the simulated datasets. Note that this number is higher for the ten-species dataset because its gold standard assembly includes a larger number of contigs than the assemblers generate (see text).
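
The metric described in this caption can be illustrated with a minimal sketch. It assumes that, for one identity cutoff, each contig's alignments to the original transcripts have already been reduced to a list of 0-based, half-open intervals on the contig. The function names, the input layout, and the example contigs are hypothetical and are not taken from the study's pipeline.

```python
READ_LENGTH = 76  # read length (bp) stated in the caption


def longest_unaligned_run(contig_length, aligned_intervals):
    """Return the longest stretch of the contig covered by no alignment.

    aligned_intervals: iterable of (start, end) pairs, 0-based and half-open,
    already filtered to one identity cutoff (an assumption of this sketch).
    """
    longest = 0
    covered_to = 0  # everything left of this position is accounted for
    for start, end in sorted(aligned_intervals):
        if start > covered_to:
            # Gap between the previous coverage and this alignment.
            longest = max(longest, start - covered_to)
        covered_to = max(covered_to, end)
    # Unaligned tail after the last alignment, if any.
    return max(longest, contig_length - covered_to)


def percent_flagged(contigs, min_unaligned=READ_LENGTH):
    """Percentage of contigs with an unaligned region >= min_unaligned bp.

    contigs: dict mapping contig name -> (length, aligned_intervals).
    """
    if not contigs:
        return 0.0
    flagged = sum(
        1
        for length, intervals in contigs.values()
        if longest_unaligned_run(length, intervals) >= min_unaligned
    )
    return 100.0 * flagged / len(contigs)


# Hypothetical contigs, for illustration only.
example_contigs = {
    "c1": (500, [(0, 500)]),              # fully aligned
    "c2": (500, [(0, 200), (300, 500)]),  # 100 bp internal gap
    "c3": (300, [(0, 250)]),              # 50 bp unaligned tail
    "c4": (400, []),                      # no alignment at all
}
example_percent = percent_flagged(example_contigs)
```

Repeating the calculation with alignments filtered at each identity cutoff (97% to 100%) would give one value per cutoff for each assembly, which is the quantity plotted in panels A and B.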