Table 6 Classification accuracy with feature/operational taxonomic unit (OTU) selection, measured by proportion of correct classifications (PCC)

From: A comprehensive evaluation of multicategory classification methods for microbiomic data

| Classifier | Best FS Method | CBH | CS | CSS | FS | FSH | BP | PDX | PBS | Averages | P values |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM, Linear C = 1 | SVM-RFE | 0.900 | 0.941 | 0.610 | 0.965 | 0.719 | 0.524 | 0.558 | 0.759 | 0.747 | 0.319* |
| SVM, Linear optimized C | SVM-RFE | 0.952 | 0.935 | 0.631 | 0.985 | 0.754 | 0.534 | 0.553 | 0.761 | 0.763 | 0.535* |
| SVM, Poly | SVM-RFE | 0.950 | 0.929 | 0.633 | 0.987 | 0.742 | 0.528 | 0.551 | 0.754 | 0.759 | 0.460* |
| SVM, RBF | SVM-RFE | 0.941 | 0.918 | 0.617 | 0.987 | 0.693 | 0.518 | 0.523 | 0.727 | 0.741 | 0.179* |
| KRR, Poly | KW | 0.909 | 0.933 | 0.623 | 0.949 | 0.749 | 0.547 | 0.514 | 0.713 | 0.742 | 0.199* |
| KRR, RBF | KW | 0.929 | 0.939 | 0.634 | 0.970 | 0.737 | 0.537 | 0.504 | 0.714 | 0.745 | 0.248* |
| KNN, K = 1 | RFVS2 | 0.930 | 0.760 | 0.563 | 0.971 | 0.623 | 0.421 | 0.443 | 0.596 | 0.663 | 0.011 |
| KNN, K = 5 | RFVS2 | 0.930 | 0.724 | 0.529 | 0.943 | 0.656 | 0.434 | 0.434 | 0.609 | 0.657 | 0.009 |
| KNN, optimized K | RFVS2 | 0.935 | 0.754 | 0.552 | 0.963 | 0.648 | 0.422 | 0.432 | 0.620 | 0.666 | 0.011 |
| PNN | RFVS2 | 0.906 | 0.781 | 0.560 | 0.956 | 0.623 | 0.130 | 0.449 | 0.604 | 0.626 | 0.006 |
| L2-LR, C = 1 | ALL | 0.934 | 0.939 | 0.628 | 0.982 | 0.628 | 0.380 | 0.515 | 0.725 | 0.716 | 0.047 |
| L2-LR, optimized C | KW | 0.921 | 0.948 | 0.650 | 0.836 | 0.739 | 0.499 | 0.464 | 0.711 | 0.721 | 0.089* |
| L1-LR, C = 1 | RFVS1 | 0.922 | 0.818 | 0.589 | 0.968 | 0.706 | 0.449 | 0.395 | 0.687 | 0.692 | 0.020 |
| L1-LR, optimized C | RFVS1 | 0.934 | 0.909 | 0.611 | 0.993 | 0.710 | 0.442 | 0.418 | 0.697 | 0.714 | 0.048 |
| RF, default | RFVS1 | 0.954 | 0.950 | 0.704 | 0.991 | 0.745 | 0.550 | 0.479 | 0.746 | 0.765 | - |
| RF, optimized | RFVS1 | 0.954 | 0.950 | 0.695 | 0.996 | 0.746 | 0.548 | 0.479 | 0.741 | 0.764 | 0.498* |
| BLR, Laplace priors | SVM-RFE | 0.946 | 0.929 | 0.639 | 0.991 | 0.759 | 0.521 | 0.537 | 0.739 | 0.758 | 0.465* |
| BLR, Gaussian priors | KW | 0.926 | 0.856 | 0.557 | 0.980 | 0.728 | 0.525 | 0.426 | 0.701 | 0.713 | 0.043 |

  1. The nominally best performing classifier on average over all datasets is marked with bold, and P values of methods whose performance cannot be deemed statistically worse than the nominally best performing method are marked with “*”. The accuracy of the nominally best performing method for each dataset is underlined.
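
The marking rules in the footnote can be illustrated with a minimal Python sketch. It assumes that the Averages column is the unweighted mean of the eight per-dataset PCC values, which is consistent with the rows checked here but is not stated in the table itself. Only three rows are copied in, so the "best per dataset" result below is the best within this subset, not within the full table. The names `PCC`, `row_average`, and `best_per_dataset` are hypothetical and chosen for illustration only.

```python
# Dataset column order as it appears in Table 6.
DATASETS = ["CBH", "CS", "CSS", "FS", "FSH", "BP", "PDX", "PBS"]

# Per-dataset PCC values copied from three rows of Table 6 (a subset, for brevity).
PCC = {
    "SVM, Linear optimized C": [0.952, 0.935, 0.631, 0.985, 0.754, 0.534, 0.553, 0.761],
    "RF, default":             [0.954, 0.950, 0.704, 0.991, 0.745, 0.550, 0.479, 0.746],
    "BLR, Laplace priors":     [0.946, 0.929, 0.639, 0.991, 0.759, 0.521, 0.537, 0.739],
}


def row_average(values):
    """Unweighted mean over datasets, rounded to three decimals (assumed convention)."""
    return round(sum(values) / len(values), 3)


# Average PCC per classifier; the largest one is the "nominally best" method.
averages = {name: row_average(vals) for name, vals in PCC.items()}
best_on_average = max(averages, key=averages.get)

# Best classifier per dataset within this subset.
# On a tie, max() keeps the first classifier in dictionary order.
best_per_dataset = {
    ds: max(PCC, key=lambda name: PCC[name][i])
    for i, ds in enumerate(DATASETS)
}

print(averages)
print("Nominally best on average:", best_on_average)
print("Best per dataset (subset only):", best_per_dataset)
```

The significance marks ("*") in the P values column come from a statistical comparison against the nominally best method; that test is not described in this table and is not reproduced in the sketch.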