
Table 2 Performance evaluation of different tools to predict prophage locations (start/end sites)

From: High throughput sequencing provides exact genomic locations of inducible prophages and accurate phage-to-host ratios in gut microbial strains

Column groups: Experimental data (Prophage); Prediction (Tool, Deviation from start, Deviation from end, Length); Evaluation (Precision, Recall, F-score).

| Prophage (range/length, including att sites) | Tool | Deviation from start | Deviation from end | Length | Precision | Recall | F-score |
|---|---|---|---|---|---|---|---|
| S. Tm LT2p22 p22 (1213987-1255756/41770) | PHASTER | 60 | 1126 | 42836 | 0.97 | 1.00 | 0.99 |
| | Prophage Hunter | −13077 | 8273 | 63120 | 0.66 | 1.00 | 0.80 |
| | VIBRANT | 60 | −334 | 41376 | 1.00 | 0.99 | 1.00 |
| | Virsorter | 10 | 1152 | 42912 | 0.97 | 1.00 | 0.99 |
| S. Tm LT2p22 Fels-1 (1849458-1892188/42731) | PHASTER | −1528 | −3284 | 40975 | 0.96 | 0.92 | 0.94 |
| | Prophage Hunter | −4528 | 7083 | 54342 | 0.79 | 1.00 | 0.88 |
| | VIBRANT | 41 | 1216 | 43906 | 0.97 | 1.00 | 0.99 |
| | Virsorter | −20661 | 22091 | 85483 | 0.50 | 1.00 | 0.67 |
| S. Tm LT2p22 Fels-2 (3731215-3764954/33740) | PHASTER | 0 | 1410 | 35150 | 0.96 | 1.00 | 0.98 |
| | Prophage Hunter | NA | NA | NA | NA | NA | NA |
| | VIBRANT | 449 | 10659 | 43950 | 0.76 | 0.99 | 0.86 |
| | Virsorter | −115370 | 79199 | 228309 | 0.15 | 1.00 | 0.26 |
| YL44 phage 1 (969482-1008291/38810) | PHASTER | NA | NA | NA | NA | NA | NA |
| | Prophage Hunter | −9199 | −2567 | 45442 | 0.80 | 0.93 | 0.86 |
| | Prophage Hunter | 23847 | 3354 | 18317 | 0.82 | 0.39 | 0.52 |
| | VIBRANT | 186 | −9968 | 28656 | 1.00 | 0.74 | 0.85 |
| | Virsorter | 1436 | 138 | 37512 | 1.00 | 0.96 | 0.98 |
| | Virsorter | 8087 | 138 | 30861 | 1.00 | 0.79 | 0.88 |
| YL44 phage 2 (2566025-2602918/36894) | PHASTER | NA | NA | NA | NA | NA | NA |
| | Prophage Hunter | −4041 | −23277 | 17658 | 0.77 | 0.37 | 0.50 |
| | Prophage Hunter | 704 | −2995 | 33195 | 1.00 | 0.90 | 0.95 |
| | Prophage Hunter | 7638 | 10625 | 39881 | 0.73 | 0.79 | 0.76 |
| | VIBRANT | −10070 | −3356 | 43608 | 0.77 | 0.91 | 0.83 |
| | Virsorter | −25233 | 44693 | 106820 | 0.35 | 1.00 | 0.51 |
| YL58 phage 1 (128601-174693/46093) | PHASTER | NA | NA | NA | NA | NA | NA |
| | Prophage Hunter | −2929 | −27047 | 21975 | 0.87 | 0.41 | 0.56 |
| | Prophage Hunter | 139 | −49 | 45905 | 1.00 | 1.00 | 1.00 |
| | VIBRANT | 153 | 3341 | 49281 | 0.93 | 1.00 | 0.96 |
| | Virsorter | −1242 | 33898 | 81233 | 0.57 | 1.00 | 0.72 |
| E. coli HS phage 1 (916812-948289/31478) | PHASTER | 78 | 3104 | 34504 | 0.91 | 1.00 | 0.95 |
| | Prophage Hunter | NA | NA | NA | NA | NA | NA |
| | VIBRANT | 461 | 4880 | 35897 | 0.86 | 0.99 | 0.92 |
| | Virsorter | 2787 | −41 | 28650 | 1.00 | 0.91 | 0.95 |
| YL31 phage 1 (1009760-1046747/36988) | PHASTER | −6510 | −17 | 43481 | 0.85 | 1.00 | 0.92 |
| | Prophage Hunter | −8493 | −9478 | 36003 | 0.76 | 0.74 | 0.75 |
| | VIBRANT | −13474 | −185 | 50277 | 0.73 | 0.99 | 0.84 |
| | Virsorter | −43160 | 4955 | 85103 | 0.43 | 1.00 | 0.61 |
| YL31 phage 2 (2984604-3026727/42124) | PHASTER | NA | NA | NA | NA | NA | NA |
| | Prophage Hunter | 0 | 0 | 42124 | 1.00 | 1.00 | 1.00 |
| | Prophage Hunter | 10596 | 4114 | 35642 | 0.88 | 0.75 | 0.81 |
| | VIBRANT | −828 | −914 | 42038 | 0.98 | 0.98 | 0.98 |
| | Virsorter | −1257 | 983 | 44364 | 0.95 | 1.00 | 0.97 |
| KB18 phage 1 (2602856-2642096/39241) | PHASTER | −5132 | 7663 | 52036 | 0.75 | 1.00 | 0.86 |
| | Prophage Hunter | 3156 | 15738 | 51823 | 0.70 | 0.92 | 0.79 |
| | VIBRANT | NA | NA | NA | NA | NA | NA |
| | Virsorter | −45270 | 5505 | 90016 | 0.44 | 1.00 | 0.61 |

  1. The accuracy of the prediction tools in locating prophages was evaluated by comparing the predicted start and end coordinates with the corresponding coordinates obtained by experimental validation. The assessment includes the recall, precision and F-score of correctly predicting prophage genomic coordinates. The first column reports the range and length of each prophage based on experimental data, as determined by analysis of clipped reads. Columns two to five summarise the results of the different prediction tools. Deviations from the validated start and end positions are negative if the predicted site was upstream of the true site and positive if it was downstream. For each prophage, the tool with the best F-score is highlighted in bold. In the case of a tie, both tools with the highest F-score are highlighted. Prophages that were not categorised as ‘active’ (Prophage Hunter), ‘complete’ (PHASTER), ‘category 1’ or ‘category 2’ (VirSorter), or that were not predicted at all, are reported as NA.
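The relationship between the deviations and the evaluation columns can be illustrated with a short Python sketch. It rests on an assumption that the table does not state explicitly: that precision and recall are computed per base pair, from the overlap between the predicted and the experimentally validated interval, with 1-based inclusive coordinates. Under that assumption the sketch reproduces the PHASTER row for prophage p22. The names `Interval`, `predicted_from_deviation`, `overlap_bp` and `evaluate` are hypothetical helpers for illustration, not part of any of the tools listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Genomic interval with 1-based, inclusive coordinates (assumed convention)."""
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def predicted_from_deviation(true: Interval, start_dev: int, end_dev: int) -> Interval:
    """Rebuild the predicted interval from the tabulated deviations.

    Negative deviations mean upstream, positive mean downstream of the true site.
    """
    return Interval(true.start + start_dev, true.end + end_dev)


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of base pairs shared by two intervals (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def evaluate(true: Interval, pred: Interval) -> tuple[float, float, float]:
    """Base-pair-level precision, recall and F-score (assumed definition)."""
    shared = overlap_bp(true, pred)
    precision = shared / pred.length if pred.length > 0 else 0.0
    recall = shared / true.length if true.length > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Example: S. Tm LT2p22 p22 with the PHASTER deviations from the table.
p22 = Interval(1213987, 1255756)
phaster = predicted_from_deviation(p22, start_dev=60, end_dev=1126)
print(phaster.length)                                   # 42836
print(tuple(round(v, 2) for v in evaluate(p22, phaster)))  # (0.97, 1.0, 0.99)
```

The predicted length follows from the true length minus the start deviation plus the end deviation (41770 − 60 + 1126 = 42836), which matches the Length column for this row.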