Table 1 Performance of GRAViTy as evaluated by threefold cross-validation analysis

From: The genomic underpinnings of eukaryotic virus taxonomy: creating a sequence-based framework for family-level virus classification

| Sub-pipeline | Fold | ‘Known’ viruses (1): n | ‘Known’: assigned to the correct group | ‘Known’: assigned to a wrong group | ‘Known’: assigned as ‘unknown’ | ‘Unknown’ viruses (2): n | ‘Unknown’: assigned as ‘unknown’ | ‘Unknown’: assigned to an existing group |
|---|---|---|---|---|---|---|---|---|
| Group I: dsDNA virus | CV1 | 192 | 189 (98.44%) | 0 (0.00%) | 3 (1.56%) | 1117 | 1117 (100.00%) | 0 (0.00%) |
| | CV2 | 194 | 188 (96.91%) | 1 (0.52%) | 5 (2.58%) | 1117 | 1117 (100.00%) | 0 (0.00%) |
| | CV3 | 192 | 190 (98.96%) | 0 (0.00%) | 2 (1.04%) | 1124 | 1124 (100.00%) | 0 (0.00%) |
| | Overall | – | 98.10% | 0.17% | 1.73% | – | 100.00% | 0.00% |
| Group II: ssDNA virus | CV1 | 369 | 369 (100.00%) | 0 (0.00%) | 0 (0.00%) | 940 | 939 (99.89%) | 1 (0.11%) |
| | CV2 | 371 | 369 (99.46%) | 0 (0.00%) | 2 (0.54%) | 940 | 939 (99.89%) | 1 (0.11%) |
| | CV3 | 370 | 370 (100.00%) | 0 (0.00%) | 0 (0.00%) | 946 | 945 (99.89%) | 1 (0.11%) |
| | Overall | – | 99.82% | 0.00% | 0.18% | – | 99.89% | 0.11% |
| Group III: dsRNA virus | CV1 | 69 | 68 (98.55%) | 0 (0.00%) | 1 (1.45%) | 1240 | 1233 (99.44%) | 7 (0.56%) |
| | CV2 | 70 | 67 (95.71%) | 0 (0.00%) | 3 (4.29%) | 1241 | 1232 (99.27%) | 9 (0.73%) |
| | CV3 | 69 | 67 (97.10%) | 0 (0.00%) | 2 (2.90%) | 1247 | 1239 (99.36%) | 8 (0.64%) |
| | Overall | – | 97.12% | 0.00% | 2.88% | – | 99.36% | 0.64% |
| Group IV: (+)ssRNA virus | CV1 | 415 | 415 (100.00%) | 0 (0.00%) | 0 (0.00%) | 894 | 891 (99.66%) | 3 (0.34%) |
| | CV2 | 412 | 411 (99.76%) | 1 (0.24%) | 0 (0.00%) | 899 | 897 (99.78%) | 2 (0.22%) |
| | CV3 | 415 | 412 (99.28%) | 1 (0.24%) | 2 (0.48%) | 901 | 896 (99.45%) | 5 (0.55%) |
| | Overall | – | 99.68% | 0.16% | 0.16% | – | 99.63% | 0.37% |
| Group V: (−)ssRNA virus | CV1 | 176 | 176 (100.00%) | 0 (0.00%) | 0 (0.00%) | 1133 | 1130 (99.74%) | 3 (0.26%) |
| | CV2 | 177 | 177 (100.00%) | 0 (0.00%) | 0 (0.00%) | 1134 | 1132 (99.82%) | 2 (0.18%) |
| | CV3 | 180 | 179 (99.44%) | 0 (0.00%) | 1 (0.56%) | 1136 | 1135 (99.91%) | 1 (0.09%) |
| | Overall | – | 99.81% | 0.00% | 0.19% | – | 99.82% | 0.18% |
| Groups VI and VII: RT virus | CV1 | 47 | 47 (100.00%) | 0 (0.00%) | 0 (0.00%) | 1262 | 1262 (100.00%) | 0 (0.00%) |
| | CV2 | 46 | 46 (100.00%) | 0 (0.00%) | 0 (0.00%) | 1265 | 1265 (100.00%) | 0 (0.00%) |
| | CV3 | 49 | 49 (100.00%) | 0 (0.00%) | 0 (0.00%) | 1267 | 1267 (100.00%) | 0 (0.00%) |
| | Overall | – | 100.00% | 0.00% | 0.00% | – | 100.00% | 0.00% |
| All groups | Overall | – | 99.09% | 0.06% | 0.86% | – | 99.78% | 0.22% |

  1. Known in the sense that members of the family were in the reference dataset and that viruses in the same family in the test dataset should be classifiable.
  2. Unknown in the sense that no members of the family were in the reference dataset, and therefore, viruses of that family in the test dataset should not be assigned.
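
The table does not say how its ‘Overall’ rows were derived. The sketch below is one reading that reproduces the reported figures for the ‘known’ viruses. It assumes that each group's overall value is the mean of its three per-fold percentages, and that the final overall value is the unweighted mean of the six group values. The per-fold counts are copied from the table; the function and variable names are illustrative and do not come from GRAViTy.

```python
# Per-fold counts for 'known' viruses, copied from Table 1.
# Each tuple is (n, correct, wrong, assigned_unknown) for CV1, CV2, CV3.
KNOWN = {
    "Group I: dsDNA": [(192, 189, 0, 3), (194, 188, 1, 5), (192, 190, 0, 2)],
    "Group II: ssDNA": [(369, 369, 0, 0), (371, 369, 0, 2), (370, 370, 0, 0)],
    "Group III: dsRNA": [(69, 68, 0, 1), (70, 67, 0, 3), (69, 67, 0, 2)],
    "Group IV: (+)ssRNA": [(415, 415, 0, 0), (412, 411, 1, 0), (415, 412, 1, 2)],
    "Group V: (-)ssRNA": [(176, 176, 0, 0), (177, 177, 0, 0), (180, 179, 0, 1)],
    "Groups VI and VII: RT": [(47, 47, 0, 0), (46, 46, 0, 0), (49, 49, 0, 0)],
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def group_correct_rate(folds):
    """Assumed rule: mean over folds of the per-fold percentage correct."""
    return mean(100.0 * correct / n for n, correct, _, _ in folds)


def grand_correct_rate(table):
    """Assumed rule: unweighted mean of the per-group rates."""
    return mean(group_correct_rate(folds) for folds in table.values())


def pooled_correct_rate(table):
    """Alternative for comparison: pool all counts before dividing."""
    rows = [row for folds in table.values() for row in folds]
    total_n = sum(n for n, _, _, _ in rows)
    total_correct = sum(correct for _, correct, _, _ in rows)
    return 100.0 * total_correct / total_n


if __name__ == "__main__":
    for name, folds in KNOWN.items():
        print(f"{name}: {group_correct_rate(folds):.2f}%")
    print(f"All groups (mean of groups): {grand_correct_rate(KNOWN):.2f}%")
    print(f"All groups (pooled counts):  {pooled_correct_rate(KNOWN):.2f}%")
```

Under these assumptions the unweighted mean of groups matches the 99.09% in the final row, whereas pooling all counts gives 99.37%. This suggests that each sub-pipeline contributes equally to the final figure regardless of its size. Within a single group the fold sizes are nearly equal, so the mean of folds and the pooled counts agree to two decimals and the table alone cannot distinguish them.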