To assess associations between immune markers and gut microbial dynamics in the context of immune reconstitution and clinical outcomes after HSCT, we monitored 37 children over time undergoing allogeneic HSCT (Fig. 1a, Additional file 1: Table S1). We gained insight into the patients’ immune reconstitution by determining T, B, NK, monocyte, and neutrophil cell counts in peripheral blood (Fig. 1a). We measured C-reactive protein (CRP) and plasma interleukin 6 (IL-6) as markers of inflammation, and human beta-defensin 2 and 3 (hBD2 and 3) as markers for potential innate immune activation and systemic infection (Fig. 1a, Additional file 1: Table S1). Plasma citrulline levels were measured as a marker of intestinal toxicity, being produced selectively by functioning enterocytes. To relate immune marker levels to members of the intestinal microbiota in patients undergoing HSCT, we characterized the longitudinal dynamics of the human intestinal microbiome in a subset of 30 patients by utilizing 16S rRNA gene profiling (Fig 1a).
Patient cohort and outcomes
At the time of HSCT, the 37 patients were on average 8.2 years old (age range 1.1–18.0 years). Twenty-five patients (68%) were diagnosed with at least one bacterial infection at median 75 days post-HSCT (range day − 19 to + 668). Twenty-six patients (70%) had no or mild aGvHD (grade 0–I) (Additional file 1: Table S1). Eleven patients (30%) developed moderate to severe aGvHD (grade II, III, or IV) at median 18 days (range day + 9 to + 45) after transplantation (Additional file 1: Table S1). Seven patients (19%) died during the follow-up period at median 266 days post-HSCT (range day + 9 to + 784) (five relapse-related and two treatment-related deaths) (Additional file 1: Table S1). In total, six patients (16%) relapsed, four of which underwent a re-transplantation. All patients received antibiotics pre- and post-transplantation (Additional file 1: Table S1). Prophylactic trimethoprim-sulfamethoxazole was administered to all patients from day − 7 until transplantation. During the period of neutropenia or latest from day − 1, patients received prophylactic intravenous ceftazidime. In case of infections indicated by fever or microbial culture, ceftazidime was substituted by intravenous meropenem, vancomycin, or other antibiotics, according to culture-based results.
Temporal dynamics of immune markers and the intestinal microbiota in HSCT patients
Prior to assessing the interplay between clinical variables (i.e., immune markers, immune reconstitution, clinical outcomes) and the intestinal microbiota, we characterized these components separately. In order to provide an overview of changes in immune markers and immune cell counts after HSCT in our cohort (n = 37), we assessed their temporal patterns (Fig. 1b, and Additional file 2: Figure S1). Of note, these supplemental univariate analyses mainly served the purpose of visually aiding our subsequent multivariate analyses approaches (Additional file 3: Figure S2). We characterized hBD2 for the first time in the context of HSCT by assessing plasma hBD2 concentrations over time from pre-HSCT to month + 3 post-HSCT in patients compared to healthy controls. The hBD2 concentration differed significantly between time points (P < 0.001, Kendall’s W = 0.6). It increased from pre-HSCT to the day of HSCT (P < 0.001), then slightly decreased in week + 1 (P = 0.038) before increasing again in week + 3 (P = 0.006) (Fig. 1b). HBD2 decreased again in month + 2 (P = 0.014) (Fig. 1b). CRP levels differed significantly between time points (P < 0.001, Kendall’s W = 0.33). They were high pre-HSCT and until week + 2, then decreasing significantly in week + 3 (P < 0.001) with the lowest levels in weeks + 4 to + 6 (P < 0.001) (Additional file 2: Figure S1A). Median plasma citrulline levels were significantly different between time points (P < 0.001, Kendall’s W = 0.32). They decreased from pre-HSCT to week + 1 (P < 0.001) and increased again in week + 3 (P < 0.001) (Additional file 2: Figure S1A). B cell counts (Kendall’s W = 0.5) as well as CD4+ T cell counts (Kendall’s W = 0.46) increased steadily from month + 1 to month + 6 (P < 0.001) (Additional file 2: Figure S1B).
To gain insight into intestinal microbial dynamics before, at the time of, and after HSCT, we obtained a total of 97 fecal samples from a subcohort of 30 patients. Using 16S rRNA gene sequence analysis, we identified 239 operational taxonomic units (OTUs) (see “Methods” section). Microbial alpha-diversity was lower at all time points post-HSCT compared to pre-HSCT (Fig. 1c). The median inverse Simpson index decreased from 3.27 (range 1.02–7.4) before HSCT to 2.89 (range 1.04–10.77) on the day of transplantation and further to 2.03 (range 1.0–16.51) post-HSCT (median of week + 1 to + 5) (Fig. 1c). When assessing the relative abundances of the eight most abundant taxonomic families over time, we observed a reduction in Lachnospiraceae and Ruminococcaceae abundances immediately after HSCT from 20.1 and 9.6%, respectively before HSCT to 6.1% and 1.7% on average in weeks + 1 to + 5 (Fig. 1d). In contrast, Enterococcaceae abundances rose from 17.4% before transplantation to 30.2% on average in weeks + 1 to + 5 (Fig. 1d).
Associations between immune markers and immune cell reconstitution in HSCT patients
In order to identify patient baseline parameters and clinical outcomes (e.g., aGvHD, relapse, overall survival), as well as immune markers and immune cell types that might be important determinants in HSCT in relation to the microbiome, we performed variable assessment by permutational multivariate analysis of variance (adonis). The variables that were found to be significant (P ≤ 0.05), i.e., those that explained most variation in the microbial community distance matrix, were selected for subsequent analyses (Additional file 3: Figure S2, Additional file 4: Table S2). Of note, the occurrence of relapse as an indication of transplantation outcome was assessed but not found to be significant in adonis, and was therefore not included in follow-up analyses. We then assessed associations of the selected immune markers and immune cells in the data set comprising 37 patients by determining Spearman’s rank correlations. The hBD2 concentrations pre-HSCT, on the day of HSCT and in weeks + 1 and + 2 post-HSCT were positively correlated with each other (ρ = 0.73–1, P < 0.001) (Fig. 2a). NK cell counts in month + 1 exhibited a positive correlation with total B cell counts (ρ = 0.64, P = 0.0046) and mature B cells counts (ρ = 0.62, P = 0.0114) in month + 2. When we related immune cell reconstitution to outcomes, we observed significantly higher NK cell counts and total B cell counts in month + 2 in patients with no or mild aGvHD (grade 0–I) compared with patients with moderate to severe aGvHD (grade II–IV) (Wilcoxon rank sum test, NK cells, P = 0.011; B cells, P < 0.001) (Fig. 2b).
High plasma hBD2 and monocytes prior to HSCT in patients with high Lactobacillaceae
To gain insight into how the selected immune markers and immune cell counts co-vary with gut microbial abundances in patients with distinct outcomes, we implemented two multivariate multi-table approaches for the subcohort (n = 30), namely sparse partial least squares (sPLS) regression and canonical correspondence analysis (CCpnA). The sPLS regression models OTU abundances as predictors and clinical variables as response variables and explains the latter in an asymmetric (i.e., unidirectional) way. In contrast, the CCpnA assesses relationships between parameters of the immune system and microbiota bidirectionally. In the following paragraphs, the results of these two analyses are reported for one observed cluster at a time, respectively.
First, we performed sPLS regression to reveal multivariate correlation structures between immune markers, immune cell counts, and OTU abundances, modeling the latter as explanatory variables. The sPLS regression and subsequent hierarchical clustering suggested that the data separated into three clusters (Fig. 3a). High monocyte counts and high plasma hBD2 concentration prior to HSCT, high patient age at the time of transplantation, and high abundances of Lactobacillaceae independent of time point contributed the most to the formation of cluster 1. Of note, monocyte counts and hBD2 were positively correlated with each other, in agreement with the correlation analysis above (Fig. 2a). The Lactobacillaceae were represented by microaerophilic Lactobacillus sp. OTUs (e.g., AF413523.1, GU451064.1, KF029502.1) (Fig. 3b, Additional file 5: Table S3). These OTUs exhibited high loading weights in sPLS dimension 2 (Fig. 3c), indicating that they contributed strongly to the separation of clusters in dimension 2 (Fig. 3a, b).
Second, we applied CCpnA to model the canonical relationships between OTU abundances and clinical variables through the construction of common “latent” variables. The CCpnA confirmed the separation of the data into three clusters as observed in the sPLS regression (Figs. 3a, 4, and Additional file 6: Figure S3A), including the clustering of OTUs. In addition, the CCpnA facilitated the inclusion of categorical variables, such as the patients’ baseline parameters (e.g., recipient sex, donor type) and clinical endpoints (aGvHD grade, overall survival). Because CCpnA is an unsupervised method and upheld the results of the sPLS regression, it provides confidence in the cluster findings.
Cluster 1 in the CCpnA seemed to include patients who developed moderate to severe aGvHD (grade II–IV) and who died. As suggested by both, sPLS and CCpnA, these patients exhibited high levels of plasma hBD2 before HSCT (and in week + 1 and + 2) and high monocyte counts before HSCT. OTUs within this CCpnA cluster predominantly included members of the family Lactobacillaceae and were most abundant in fecal samples of these patients (Fig. 4 and Additional file 7: Figure S4). OTUs that were assigned to cluster 1 by the sPLS-based hierarchical clustering were congruently associated with the same clinical variables in the CCpnA (Figs. 3b and 4).
Temporal patterns of the Lactobacillaceae-dominated community state type
Upon revealing an association between high plasma hBD2 concentrations and monocyte counts prior to HSCT, moderate to severe aGvHD, and high mortality with high abundances of Lactobacillaceae in multivariate analyses, we assessed in more detail the importance of longitudinal changes in these components. We implemented an additional approach to identify distinct bacterial community patterns by employing partitioning around medoid (PAM) clustering (see “Methods” section). In this analysis, we detected four community state types (CSTs). Dominating taxa, similar to those identified by the sPLS-based hierarchical clustering, were revealed in the CSTs, e.g., Lactobacillaceae members dominated CST 1 (Additional file 5: Table S3 and Additional file 8: Figure S5). We then used this information to examine temporal community state changes in individual patients, i.e., their transitions between CSTs over the course of time (Fig. 5). Based on clinical outcomes, the patients can be divided into four groups: (1) patients who developed no or mild aGvHD (aGvHD grade 0–I) and survived compared to (2) patients who died, and (3) patients who developed moderate to severe aGvHD (grade II–IV) and survived compared to (4) patients who died (Fig. 5). We observed that six out of eight patients (75%) with moderate to severe aGvHD (groups 3 and 4) harbored the Lactobacillaceae-dominated CST 1 at least once during the monitored period (1×: n = 2, 2×: n = 2, 3×: n = 2) (Fig. 5). In comparison, only five out of 22 patients (23%) with aGvHD grade 0–I (groups 1 and 2) carried CST 1 one or two times (Fig. 5). Interestingly, high abundances of Lactobacillaceae in patients with aGvHD grade II–IV (groups 3 and 4) occurred predominantly at late time points (week + 1 and later) (Fig. 5 and Additional file 7: Figure S4).
Temporal association of Lactobacillaceae with aGvHD and immune markers
To relate temporal changes in immune markers to those in the bacterial community composition in patients with aGvHD grade II–IV who died (group 4), we assessed their individual longitudinal profiles (Fig. 6). In all three patients (P24, P26, P30), a large expansion of Lactobacillaceae abundances after the onset of aGvHD was observed. The average relative abundance of Lactobacillaceae after aGvHD onset was 72.92% (range 0.22–97.04%), as compared to before aGvHD (average 9.88%, range 0.25–37.83%) (Fig. 6). Furthermore, bacterial alpha diversity was lower at the time point after aGvHD onset compared to the time point before (Fig. 6). All three patients were treated with antibiotics for different durations between these two time points and prior to the time point before aGvHD onset.
In agreement with the results of the multivariate analyses, two of the patients (P24 and P26) exhibited between 1.95- and 14.56-times higher plasma hBD2 concentrations for all measured time points compared with the average of the whole data set (average plasma hBD2 concentration: 10,983.9 pg/ml, range 0–177,400.28 pg/ml). The plasma hBD2 concentration was the highest at the time of HSCT in patient P26 and before HSCT in patient P24 (Fig. 6). Of note, patient P24 received corticosteroid-based GvHD prophylaxis prior to HSCT and both patients received anti-thymocyte globulin (ATG) as part of their conditioning regimen. These two patients had underlying malignant diseases and their pre-HSCT CRP levels, providing insight into underlying inflammation, were 1.6 times lower and 1.29 times higher, respectively, compared with the average of the whole data set on that time point (average plasma CRP concentration before HSCT: 19.21 mg/L, range 1–60.36 mg/L), respectively. Monocyte counts before HSCT, i.e., recipient-derived cells, were 6.25 times higher in patient P24 compared with the average of the whole data set (average monocyte count before HSCT: 0.45 × 109/L, range 0.08–2.83 × 109/L) and close to average in patient P26. In both patients, monocyte counts were higher before (i.e., recipient-derived cells) than after HSCT (i.e., donor-derived cells) (Fig. 6).
In contrast to the Lactobacillaceae expansion after aGvHD onset in group 4, we observed high Lactobacillaceae abundances already before aGvHD onset in three survivors (P16, P22, and P29) who developed aGvHD grade II–IV (group 3) (Additional file 9: Figure S6). The average relative abundance of Lactobacillaceae before aGvHD onset was 53.18% (range 0.35–98.86%), as compared to after aGvHD (average 18.25%, range 0–54.78%). However, these observations were limited by a small number of patients per outcome group (groups 3 and 4) (Fig. 5) and therefore cannot serve to provide statistical evidence.
High NK and B cells and no or mild aGvHD in patients with high obligate anaerobes such as Ruminococcaceae
The association between NK and total B cell counts after HSCT observed in the Spearman’s correlation analysis (Fig. 2) was supported by the multivariate analyses (Figs. 3 and 4). In the sPLS regression, cluster 2 included high NK cell counts in months + 1 and + 2, as well as high immature, mature, and total B cell counts in month + 2 (Fig. 3a). Furthermore, NK and B cell counts were positively correlated with OTUs found in cluster 2 (Fig. 3a, b), in particular with members of the family Ruminococcacea (Order: Clostridiales) (Fig. 3b, Additional file 5: Table S3). Strong positive correlations between mature and total B cell counts in month + 2 and NK cell counts in month + 1 were observed, for example, with Faecalibacterium sp. (DQ804549.1) (Fig. 3b, c, Additional file 5: Table S3). Additionally, we observed positive associations of these immune cells with Lachnospiraceae (Order: Clostridiales), among those two Blautia spp. (DQ800353.1 and DQ802363.1) (Fig. 3b, c, Additional file 5: Table S3).
In support of the sPLS regression, the CCpnA also revealed high NK and B cell counts in cluster 2 in months + 1 and + 2. Based on the CCpnA, we found that NK and B cell reconstitution was associated with certain disease outcomes. For example, patients who had no or mild aGvHD and survived were predominantly represented in cluster 2. Cluster 2 included OTUs mainly belonging to Ruminococcaceae and Lachnospiraceae, exhibiting their highest abundances in patient samples associated with this cluster. OTUs with the highest scores in dimension 2 predominantly belonged to the family of Ruminococcacea and matched the sPLS-based hierarchical cluster assignment (Fig. 4).
As all patients received antibiotic treatment prior to and post-transplantation, we examined the potential effect of antibiotics on the intestinal bacterial community composition. A trending influence of the vancomycin treatment was revealed by adonis analysis (P = 0.055). Using a CCpnA model that also included information about antibiotic treatments, we found that patients exhibited higher abundances of Ruminococcaceae and Lachnospiraceae and lower abundances of Enterobacteriaceae at time points without the vancomycin treatment compared to with vancomycin (Additional file 6: Figure S3B). The same pattern was observed for treatment with ciprofloxacin (Additional file 6: Figure S3B).
Community state typing also revealed a state type dominated by Ruminococcaceae and Lachnospiraceae family members (CST 2) (Additional file 8: Figure S5). A total of ten out of 22 patients (45%) with no or mild aGvHD were assigned to CST 2 at least at one time point (1×: n = 6, 2×: n = 3, 3×: n = 1) (Fig. 5). Half of the patients (four out of eight) with moderate or severe aGvHD were also assigned to CST 2 once, but only at the time point before transplantation (Fig. 5). That is, a Ruminococcaceae- and Lachnospiraceae-dominated community only persisted in patients with no or mild aGvHD.
Persistence of Enterococcaceae-dominated community state type
A subcluster of cluster 2 comprised two facultative anaerobe Enterococcus spp. (GQ1330038.1 and AJ272200.1), exhibiting positive correlations with high NK and B cell counts (Fig. 3b) and contributing to the separation of the clusters (Fig. 3c). Enterococcaceae was the most abundant family in the overall study population (Fig. 1d), and community state typing revealed a state type predominantly characterized by Enterococcaceae (CST 4, Additional file 8: Figure S5). A total of nine out of 22 patients (41%) with no or mild aGvHD were assigned to CST 4 at least at one time point (1×: n = 2, 2x: n = 2, ≥ 3×: n = 5) and this Enterococcaceae-dominated CST often persisted in the patient over time (Fig. 5). A quarter of the patients (two out of eight, both survivors) with moderate or severe aGvHD were assigned to CST 4 at least at one time point on the day of or post-HSCT (Fig. 5).
High inflammation in patients with high facultative anaerobic bacteria
In multivariate analyses, OTUs belonging to facultative anaerobic Enterobacteriaceae and Staphylococcaceae were characteristic for sPLS cluster 3 (Fig. 3a, b, Additional file 5: Table S3). Cluster 3 was further comprised of high plasma citrulline concentrations pre-HSCT (before conditioning) and week + 1, high monocyte counts in week + 3, high CD4+ T cell counts in month + 2, high CRP levels, particularly in weeks + 1, + 5, and + 6 (Fig. 3a, b). The projection of the variables in the sPLS suggested a weak positive correlation between these clinical variables in dimension 1 (Fig. 3a). However, a weak negative association between CRP levels and monocytes (week + 3) as well as citrulline and CD4+ T cell counts is indicated in dimension 2 (Fig. 3a), which is in agreement with some results of the Spearman’s rank correlation tests (Fig. 2).
The clinical variables in cluster 3, in particular high CRP levels post-HSCT, exhibited positive correlations with the OTUs predominantly affiliated with Proteobacteria (e.g., Enterobacteriaceae), Bacteroidetes, and Staphylococcus spp. (Fig. 3b, Additional file 5: Table S3). The strongest positive correlations occurred with several facultative anaerobe bacteria, e.g., Enterobacter sp. LCR81 (FJ976590.1), Escherichia coli (FJ950694.1), and Staphylococcus sp. (JF109069.1). The CCpnA supported observations from the sPLS regression regarding cluster 3, but indicated that they were only represented by a few patient samples (Fig. 4 and Additional file 6: Figure S3A). These samples were characterized by high CRP levels, especially in week + 1, + 5, and + 6. High CRP levels at these time points were not associated with aGvHD grade, i.e., they were not higher in patients with either aGvHD grade 0–I, or grade II–IV. OTUs of cluster 3, i.e., members of Enterobacteriaceae and Staphylococcaceae, exhibited their highest abundances in samples of this cluster (Fig. 4 and Additional file 6: Figure S3A). Of note, patients represented in cluster 3 were younger compared to those in the other two clusters (Fig. 4).