
Table 2 Values of parameters of the preprocessing methods [2, 11, 12]

From: A comprehensive evaluation of multicategory classification methods for microbiomic data

| Parameter | Value | Description |
| --- | --- | --- |
| otu picking method | uclust | uclust creates ‘seeds’ of sequences, which generate clusters based on percent identity. |
| clustering algorithm | furthest | Clustering algorithm for the mothur OTU picking method. Valid choices are: furthest, nearest, average. |
| max cdhit memory | 400 | Maximum memory available to cd-hit-est (via the program’s -M option) for the cdhit OTU picking method, in MB. |
| refseqs fp | None | Path to reference sequences to search against when using -m blast, -m uclust_ref, or -m usearch_ref. |
| blast db | None | Pre-existing database to BLAST against when using -m blast. |
| similarity | 0.97 | Sequence similarity threshold (for cdhit, uclust, uclust_ref, or usearch). |
| max e value | 1.00E-10 | Maximum E-value when clustering with BLAST. |
| prefix prefilter length | None | Prefilter data so that seqs whose first prefix_prefilter_length bases are identical are automatically grouped into a single OTU. |
| trie prefilter | FALSE | Prefilter data so that seqs which are identical prefixes of a longer seq are automatically grouped into a single OTU. |
| prefix length | 50 | Prefix length when using the prefix_suffix OTU picker. |
| suffix length | 50 | Suffix length when using the prefix_suffix OTU picker. |
| optimal uclust | FALSE | Pass the -optimal flag to uclust for uclust OTU picking. |
| exact uclust | FALSE | Pass the -exact flag to uclust for uclust OTU picking. |
| user sort | FALSE | Do not assume input is sorted by length. |
| suppress presort by abundance uclust | FALSE | Suppress presorting of sequences by abundance when picking OTUs with uclust or uclust_ref. |
| suppress new clusters | FALSE | Suppress creation of new clusters from seqs that do not match the reference when using -m uclust_ref or -m usearch_ref. |
| suppress uclust stable sort | FALSE | Do not pass -stable-sort to uclust. |
| max accepts | 20 | max_accepts value passed to uclust and uclust_ref. |
| max rejects | 500 | max_rejects value passed to uclust and uclust_ref. |
| word length | 12 | W value passed to usearch, uclust, and uclust_ref. Set to 64 for usearch. |
| stepwords | 20 | stepwords value passed to uclust and uclust_ref. |
| suppress uclust prefilter exact match | FALSE | Do not collapse exact matches before calling uclust. |
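
For readers who want to reuse these settings, the table can be captured as a small configuration mapping. The sketch below is illustrative only: the dictionary transcribes the values listed in Table 2, while the helper `to_parameter_lines`, the `pick_otus` script name, and the `script:parameter value` line format are assumptions (modelled on the parameters-file convention used by QIIME 1) and are not stated in the table itself. Unset entries (`None`) are skipped rather than written out.

```python
# Values transcribed from Table 2; None means the parameter was left unset.
TABLE2_PARAMS = {
    "otu picking method": "uclust",
    "clustering algorithm": "furthest",
    "max cdhit memory": 400,          # MB
    "refseqs fp": None,
    "blast db": None,
    "similarity": 0.97,
    "max e value": 1e-10,
    "prefix prefilter length": None,
    "trie prefilter": False,
    "prefix length": 50,
    "suffix length": 50,
    "optimal uclust": False,
    "exact uclust": False,
    "user sort": False,
    "suppress presort by abundance uclust": False,
    "suppress new clusters": False,
    "suppress uclust stable sort": False,
    "max accepts": 20,
    "max rejects": 500,
    "word length": 12,
    "stepwords": 20,
    "suppress uclust prefilter exact match": False,
}


def to_parameter_lines(params, script="pick_otus"):
    """Render settings as 'script:name value' lines.

    Hypothetical helper: the line format and the default script name are
    assumptions, not something the table specifies.
    """
    lines = []
    for name, value in params.items():
        if value is None:
            continue  # unset parameters are left to the tool's own defaults
        key = name.replace(" ", "_")
        lines.append(f"{script}:{key} {value}")
    return lines


if __name__ == "__main__":
    print("\n".join(to_parameter_lines(TABLE2_PARAMS)))
```

Keeping the values in one mapping makes it easy to check a rerun against the published settings; the rendered lines should be verified against the documentation of whichever tool version is actually used.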