Table 1 A comparison of OTU-picking strategies

From: Context and the human microbiome

Strategy: Closed-reference

Pros:
• Is extremely parallelizable
• Computes reference assignments only once
• Is highly unlikely to retain non-16S sequences
• Supports and reads fragments from multiple loci
• Gets the phylogeny and taxonomy for free

Cons:
• Is limited to finding diversity present in OTU reference

Data combination bias:
• May show large bias if combining studies with differential representation in the reference

Strategy: De novo

Pros:
• Utilizes all of the sequences
• Requires no OTU database
• Can group organisms distinct from anything seen before

Cons:
• Must hold all sequence data in memory
• Is very complex to parallelize
• Produces spurious OTUs without pre-filtering
• May produce phylogenies sensitive to subtle differences in OTUs
• Is infeasible if data are from multiple loci
• Must redo OTU picking with all data being combined

Data combination bias:
• May generate spurious OTUs if combining studies with differential error profiles

Strategy: Open-reference

Pros:
• Leverages an OTU database but also utilizes sequences that do not match to that database
• Is modestly parallelizable

Cons:
• Produces spurious OTUs without pre-filtering
• Is infeasible if data are from multiple loci
• Must redo OTU picking with all data being combined

Data combination bias:
• Shows less bias due to differential diversity representation than closed-reference
• Shows less bias due to differential error profiles than de novo
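
The three strategies can be contrasted with a toy sketch. It is an illustration under simplifying assumptions, not the procedure of any real OTU-picking tool: the `identity` measure (fraction of matching positions between equal-length strings), the greedy centroid clustering, the 0.9 threshold, and all function names and sequences are hypothetical choices made for this example.

```python
def identity(a, b):
    """Toy similarity: fraction of matching positions (0.0 if lengths differ)."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closed_reference(reads, reference, threshold):
    """Assign each read to its best reference OTU; reads that fail to match are set aside."""
    assignments, unmatched = {}, []
    for read in reads:
        best_id, best_score = None, 0.0
        for otu_id, ref_seq in reference.items():
            score = identity(read, ref_seq)
            if score > best_score:
                best_id, best_score = otu_id, score
        if best_id is not None and best_score >= threshold:
            assignments.setdefault(best_id, []).append(read)
        else:
            unmatched.append(read)
    return assignments, unmatched


def de_novo(reads, threshold):
    """Greedy clustering: a read joins the first centroid it matches, else seeds a new OTU."""
    centroids, clusters = {}, {}
    for read in reads:
        for otu_id, centroid in centroids.items():
            if identity(read, centroid) >= threshold:
                clusters[otu_id].append(read)
                break
        else:
            otu_id = f"denovo{len(centroids)}"
            centroids[otu_id] = read
            clusters[otu_id] = [read]
    return clusters


def open_reference(reads, reference, threshold):
    """Closed-reference first, then de novo clustering of the reads that did not match."""
    assignments, unmatched = closed_reference(reads, reference, threshold)
    assignments.update(de_novo(unmatched, threshold))
    return assignments


# Hypothetical toy data: two reference OTUs and five reads.
reference = {"ref1": "AAAAAAAAAA", "ref2": "CCCCCCCCCC"}
reads = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC", "GGGGGGGGGG", "GGGGGGGGGT"]

closed, dropped = closed_reference(reads, reference, threshold=0.9)
novo = de_novo(reads, threshold=0.9)
opened = open_reference(reads, reference, threshold=0.9)

print("closed-reference:", closed, "dropped:", dropped)
print("de novo:", novo)
print("open-reference:", opened)
```

In this toy run the closed-reference pass sets aside the two G-rich reads because nothing in the reference resembles them, the de novo pass clusters every read but has to see all reads together, and the open-reference pass keeps the reference assignments while clustering only the leftover reads.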