Table 2 Values of parameters of the preprocessing methods [2, 11, 12]

From: A comprehensive evaluation of multicategory classification methods for microbiomic data

| Parameter | Value | Description |
| --- | --- | --- |
| otu picking method | uclust | uclust, creates ‘seeds’ of sequences which generate clusters based on percent identity. |
| clustering algorithm | furthest | Clustering algorithm for mothur otu picking method. Valid choices are: furthest, nearest, average. |
| max cdhit memory | 400 | Maximum available memory to cd-hit-est (via the program’s -M option) for cdhit OTU picking method (units of Mbyte) |
| refseqs fp | None | Path to reference sequences to search against when using -m blast, -m uclust_ref, or -m usearch_ref |
| blast db | None | Pre-existing database to blast against when using -m blast |
| similarity | 0.97 | Sequence similarity threshold (for cdhit, uclust, uclust_ref, or usearch) |
| max e value | 1.00E-10 | Max E-value when clustering with BLAST |
| prefix prefilter length | None | Prefilter data so seqs with identical first prefix_prefilter_length are automatically grouped into a single OTU |
| trie prefilter | FALSE | Prefilter data so seqs which are identical prefixes of a longer seq are automatically grouped into a single OTU |
| prefix length | 50 | Prefix length when using the prefix_suffix otu picker |
| suffix length | 50 | Suffix length when using the prefix_suffix otu picker |
| optimal uclust | FALSE | Pass the -optimal flag to uclust for uclust otu picking. |
| exact uclust | FALSE | Pass the -exact flag to uclust for uclust otu picking. |
| user sort | FALSE | Do not assume input is sorted by length |
| suppress presort by abundance uclust | FALSE | Suppress presorting of sequences by abundance when picking OTUs with uclust or uclust_ref |
| suppress new clusters | FALSE | Suppress creation of new clusters using seqs that don’t match reference when using -m uclust_ref or -m usearch_ref |
| suppress uclust stable sort | FALSE | Do not pass -stable-sort to uclust |
| max accepts | 20 | Max_accepts value to uclust and uclust_ref |
| max rejects | 500 | Max_rejects value to uclust and uclust_ref |
| word length | 12 | W value to usearch, uclust, and uclust_ref. Set to 64 for usearch. |
| stepwords | 20 | Stepwords value to uclust and uclust_ref |
| suppress uclust prefilter exact match | FALSE | Do not collapse exact matches before calling uclust |
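
The "prefix prefilter length" entry in the table describes grouping sequences whose leading bases are identical before OTU picking. The sketch below is a toy illustration of that idea only, not the implementation used by the OTU-picking software; the function name `prefix_prefilter`, the read IDs, and the example sequences are hypothetical. With the table's value of None, no prefix grouping takes place.

```python
from collections import defaultdict


def prefix_prefilter(seqs, prefix_prefilter_length=None):
    """Group sequence IDs whose first `prefix_prefilter_length` bases match.

    `seqs` maps a sequence ID to its sequence string. When the length is
    None (the value listed in the table), prefiltering is skipped and each
    sequence stays in its own group.
    """
    if prefix_prefilter_length is None:
        return [[seq_id] for seq_id in seqs]

    groups = defaultdict(list)
    for seq_id, seq in seqs.items():
        # Sequences sharing the same leading bases land in the same group.
        groups[seq[:prefix_prefilter_length]].append(seq_id)
    return list(groups.values())


# Hypothetical reads: r1 and r2 share their first 8 bases, r3 does not.
reads = {"r1": "ACGTACGTAA", "r2": "ACGTACGTCC", "r3": "TTGTACGTAA"}
grouped = prefix_prefilter(reads, prefix_prefilter_length=8)
ungrouped = prefix_prefilter(reads)
```

In this sketch `grouped` places r1 and r2 together and leaves r3 alone, while `ungrouped` keeps all three reads separate, which corresponds to the None setting reported in the table.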