Fig. 5 | Microbiome

Fig. 5

From: Gammaproteobacteria, a core taxon in the guts of soil fauna, are potential responders to environmental concentrations of soil pollutants


Screening of gut microbiota that responded to pollutant pressure using machine-learning methods. Random forest (RF), support-vector machine (SVM), and logistic regression (LR) models were built using the OTU data for the bacterial community (A). The associated AUC values and ROC curves indicated that the RF model was the most accurate. The RF model was constructed using all samples at the phylum, class, order, family, genus, and OTU levels; dashed lines indicate the 95% confidence interval (B). Prediction using the test data in the RF model: black indicates the control and blue indicates the treated groups in the comparison between raw (O) and predicted (P) information (C). Each test used tenfold cross-validation to verify the accuracy of the model predictions, and the 18 most abundant bacterial genera were identified by applying the RF classification to the relative abundances of the control and treated samples (D). RF classification of the relative abundances of the control and treated samples, based on the data for all genera, was used to calculate the mean decrease in accuracy for each genus; combined with the frequency of each genus, this identified the most important indicator genus, Paraburkholderia, belonging to Gammaproteobacteria (E). The associated relative abundance of Gammaproteobacteria in the control and the treated groups (F). P values were determined using two-tailed Welch’s t tests.
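As an illustration of the test named for panel F, the following is a minimal sketch of how a Welch's t statistic and its Welch-Satterthwaite degrees of freedom can be computed for two groups with unequal variances. The function name `welch_t` and the abundance values are hypothetical and are not taken from the figure or from the authors' analysis code.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t test on two samples."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n)
    va = variance(a) / na
    vb = variance(b) / nb
    # Difference in means scaled by the combined standard error
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical relative abundances (%), NOT values from the figure
control = [10.0, 12.0, 14.0]
treated = [20.0, 25.0, 30.0, 25.0]

t_stat, df = welch_t(control, treated)
print(f"t = {t_stat:.3f}, df = {df:.3f}")
```

The two-tailed P value is then the probability that a t distribution with `df` degrees of freedom exceeds `|t|` in either tail. In practice this is usually obtained from a statistics library, for example `scipy.stats.ttest_ind(control, treated, equal_var=False)`.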
