Table 4 Classification accuracy without feature/operational taxonomic unit (OTU) selection, measured by proportion of correct classifications (PCC)

From: A comprehensive evaluation of multicategory classification methods for microbiomic data

| Classifier | CBH | CS | CSS | FS | FSH | BP | PDX | PBS | Averages | P values |
|---|---|---|---|---|---|---|---|---|---|---|
| SVM, Linear C = 1 | 0.920 | 0.911 | 0.583 | 0.940 | 0.598 | 0.354 | 0.468 | 0.695 | 0.684 | 0.022 |
| SVM, Linear optimized C | 0.920 | 0.911 | 0.622 | 0.980 | 0.585 | 0.383 | 0.485 | 0.709 | 0.699 | 0.038 |
| SVM, Poly | 0.920 | 0.911 | 0.622 | 0.980 | 0.585 | 0.383 | 0.484 | 0.709 | 0.699 | 0.036 |
| SVM, RBF | 0.909 | 0.904 | 0.575 | 0.973 | 0.575 | 0.379 | 0.451 | 0.700 | 0.683 | 0.021 |
| KRR, Poly | 0.913 | 0.918 | 0.581 | 0.954 | 0.598 | 0.377 | 0.482 | 0.709 | 0.692 | 0.027 |
| KRR, RBF | 0.923 | 0.904 | 0.618 | 0.967 | 0.632 | 0.366 | 0.467 | 0.709 | 0.698 | 0.030 |
| KNN, K = 1 | 0.496 | 0.360 | 0.195 | 0.451 | 0.305 | 0.249 | 0.419 | 0.291 | 0.346 | 0.002 |
| KNN, K = 5 | 0.713 | 0.339 | 0.188 | 0.397 | 0.281 | 0.331 | 0.393 | 0.300 | 0.368 | 0.001 |
| KNN, optimized K | 0.714 | 0.377 | 0.192 | 0.325 | 0.273 | 0.340 | 0.409 | 0.379 | 0.376 | 0.001 |
| PNN | 0.743 | 0.321 | 0.216 | 0.522 | 0.332 | 0.325 | 0.167 | 0.247 | 0.359 | 0.000 |
| L2-LR, C = 1 | 0.934 | 0.939 | 0.628 | 0.982 | 0.628 | 0.380 | 0.515 | 0.725 | 0.716 | 0.084* |
| L2-LR, optimized C | 0.933 | 0.938 | 0.623 | 0.978 | 0.618 | 0.383 | 0.502 | 0.725 | 0.712 | 0.067* |
| L1-LR, C = 1 | 0.929 | 0.801 | 0.559 | 0.975 | 0.700 | 0.422 | 0.384 | 0.673 | 0.680 | 0.018* |
| L1-LR, optimized C | 0.928 | 0.903 | 0.561 | 0.981 | 0.690 | 0.445 | 0.412 | 0.692 | 0.702 | 0.039 |
| RF, default | 0.932 | 0.955 | 0.673 | 0.999 | 0.744 | 0.508 | 0.424 | 0.730 | 0.746 | 0.270* |
| RF, optimized | 0.938 | 0.956 | 0.689 | 0.994 | 0.760 | 0.523 | 0.423 | 0.735 | 0.752 | - |
| BLR, Laplace priors | 0.927 | 0.927 | 0.634 | 0.962 | 0.622 | 0.387 | 0.452 | 0.727 | 0.705 | 0.042 |
| BLR, Gaussian priors | 0.921 | 0.736 | 0.480 | 0.966 | 0.631 | 0.354 | 0.410 | 0.635 | 0.642 | 0.008 |

Note: The classifier with the nominally best average performance over all datasets is marked in bold. P values of methods whose performance cannot be deemed statistically worse than that of the nominally best performing method are marked with “*”. For each dataset, the accuracy of the nominally best performing method is underlined.
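
As a reading aid, the following Python sketch illustrates how the PCC metric and the "Averages" column appear to be computed. It is an illustration only, not the authors' code: the function name `pcc`, the toy labels, and the variable names are all hypothetical, and only three rows of the table are transcribed. For these rows, the reported average matches the arithmetic mean of the eight per-dataset accuracies, rounded to three decimals.

```python
from statistics import mean


def pcc(y_true, y_pred):
    """Proportion of correct classifications: matching labels / total labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (invented for illustration): 3 of 4 predictions are correct.
toy_pcc = pcc(["gut", "skin", "oral", "gut"], ["gut", "skin", "gut", "gut"])

# Dataset columns and three rows transcribed from the table above.
DATASETS = ["CBH", "CS", "CSS", "FS", "FSH", "BP", "PDX", "PBS"]
ROWS = {
    "L2-LR, C = 1": [0.934, 0.939, 0.628, 0.982, 0.628, 0.380, 0.515, 0.725],
    "RF, default": [0.932, 0.955, 0.673, 0.999, 0.744, 0.508, 0.424, 0.730],
    "RF, optimized": [0.938, 0.956, 0.689, 0.994, 0.760, 0.523, 0.423, 0.735],
}

# "Averages" column: mean PCC over the eight datasets, rounded to 3 decimals.
averages = {name: round(mean(values), 3) for name, values in ROWS.items()}

# Nominally best classifier on average (among the transcribed rows only).
best_on_average = max(averages, key=averages.get)

# Nominally best classifier per dataset (among the transcribed rows only).
best_per_dataset = {
    dataset: max(ROWS, key=lambda name: ROWS[name][i])
    for i, dataset in enumerate(DATASETS)
}

if __name__ == "__main__":
    print(toy_pcc)
    print(averages)
    print(best_on_average)
    print(best_per_dataset)
```

The sketch only reproduces the descriptive columns. The P values depend on a statistical comparison against the nominally best method that this table does not specify, so they are not recomputed here.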