Strain characterization by whole-genome sequencing, pangenome, phylogenetic reconstruction, and GWAS
The selected 95 S. aureus BI isolates and the 69 GI isolates were analyzed by whole-genome sequencing, showing that these isolates belonged to various STs. The 15 most commonly observed STs were similar for the BI and GI isolates, showing no significant differences (Supplementary Figure S 1 A). Among the BI isolates, the STs 5, 8, and 30 (9.47% each) were the most abundant STs, whereas the STs 15 and 45 (11.59% each) were most abundant among the GI isolates.
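The reported ST percentages can be related back to isolate counts. The following is a minimal sketch, assuming each percentage is a simple fraction of the 95 BI or 69 GI isolates. The helper name `count_from_percent` is hypothetical, and the resulting counts are back-calculated from the rounded percentages rather than taken from the supplementary data.

```python
def count_from_percent(percent: float, total: int) -> int:
    """Back-calculate an isolate count from a rounded percentage."""
    return round(percent / 100 * total)


# STs 5, 8, and 30 among the 95 BI isolates (9.47% each)
bi_count = count_from_percent(9.47, 95)
# STs 15 and 45 among the 69 GI isolates (11.59% each)
gi_count = count_from_percent(11.59, 69)

print(bi_count, gi_count)
```

Under this assumption, each of the most abundant STs corresponds to fewer than ten isolates per group, which is worth keeping in mind when comparing ST frequencies between BI and GI isolates.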
The number of resistance genes per sequenced isolate was low (average 0.87) with blaZ being most frequently represented with 5 different variants (Supplementary Figure S 2 B). The average number of identified virulence genes per strain was 62 (Supplementary Figure S 2 C). Twenty-six isolates were tsst-1 positive, and only one strain carried the lukF-PV and lukS-PV genes for the Panton-Valentine leukocidin. No significant differences between carriage versus bacteremia isolates were observed. The pangenome of the entire dataset from 164 S. aureus isolates comprised 5535 unique genes. These included 1976 core genes represented in all genomes, consistent with prior core genome estimations [44]. As shown in Fig. 1, the pangenome analysis was in accordance with the core genome clustering, representing several well-defined clonal complexes (CC) of S. aureus. These lineages were also identified using the annotation- and alignment-free PopPunk methodology that relies on variable-length-k-mer comparisons ([42]; Supplementary Figure S 2 A). Furthermore, the topology of the tree in Fig. 1 reflects the known population structure of S. aureus [15, 46, 87]. The BI isolates were dispersed throughout the tree, which is in accordance with the notion that invasive staphylococcal disease can arise from multiple genetic backgrounds. Interestingly, we detected no significant association between traits and genes using Scoary and Pyseer. This supports the idea that the genetic content of each S. aureus isolate (at least the core) is sufficient for the carriage and infection traits [46]. To further assess variations in the entire pangenome content, including intergenic regions, we performed an analysis using IGRs, SNPs, and unitigs. However, based on the results summarized in Supplementary Figure S 1, we conclude that the few suggested associations are insignificant or related to assembly artifacts. No unitigs were found to be significantly associated with carriage or infection.
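The distinction between pangenome and core genome can be illustrated with set operations. This is a minimal sketch on toy data, assuming a strict definition in which a core gene must be present in every genome. The function name `pan_and_core` and the isolate and gene names are hypothetical, and the sketch does not reproduce the pangenome software used in this study.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pangenome, core genome) from per-isolate gene sets."""
    gene_sets = list(genomes.values())
    pangenome = set().union(*gene_sets)      # genes found in at least one genome
    core = set.intersection(*gene_sets)      # genes found in every genome
    return pangenome, core


# Toy data: names are illustrative only.
toy = {
    "isolate_A": {"gene_1", "gene_2", "gene_3"},
    "isolate_B": {"gene_1", "gene_2", "gene_4"},
    "isolate_C": {"gene_1", "gene_2"},
}
pan, core = pan_and_core(toy)
print(sorted(pan), sorted(core))

# Core fraction for the numbers reported above (1976 of 5535 genes).
core_fraction = 1976 / 5535
print(f"{core_fraction:.1%}")
```

With the reported numbers, roughly a third of the pangenome is core, so most unique genes belong to the accessory genome.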
Extracellular proteome analyses
Following the genome analysis of the strains, we asked whether the extracellular proteomes of representative GI and BI S. aureus isolates would show any informative variations that could be related to enteric carriage or infection, because the exoproteome is the main reservoir of staphylococcal virulence factors [72]. To this end, we selected isolates belonging to CC1 and CC5, because such isolates were frequently encountered among the characterized GI and BI isolates (Fig. 1; Supplementary Table 1A) and because of the global importance of the respective CCs. Among the CC1 and CC5 isolates, we selected isolates with the most common spa types, which were t127 for CC1 and t002 for CC5. The selected S. aureus strains were cultured in RPMI medium until stationary phase (OD600 of approximately 1.2), and the proteins in the growth medium were subsequently identified and quantified by MS. RPMI medium was used because we have previously shown that the global transcript profiles of bacteria grown in human plasma or RPMI are highly similar [48]. Furthermore, extracellular proteins were collected in the stationary phase because the majority of virulence factors are produced during this growth phase [57]. Interestingly, as shown by LDS-PAGE, the banding patterns of extracellular proteins and their relative intensities were distinct, even for isolates with the same sequence type (ST1 or ST5) (Supplementary Figure S 4 A).
Specifically, the MS analysis of the extracellular proteome identified a total number of 894 proteins, of which 234 proteins were shared between all strains (Supplementary Figure S 3 A). The ST5 isolates shared 264 extracellular proteins amongst each other, whereas the ST1 isolates shared 431 extracellular proteins (Supplementary Figure S 3 B). Furthermore, the BI isolates shared 283 extracellular proteins amongst each other, and the GI isolates 330 (Supplementary Figure S 3 C). The numbers of uniquely identified extracellular proteins also varied for the different strains, irrespective of the sequence type, or site of isolation (Supplementary Figure S 3).
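The shared-protein counts per group are set intersections over the proteins identified in each isolate. The sketch below illustrates this with toy data. The isolate labels follow the naming scheme used in this study (site, sequence type, number), whereas the protein identifiers `P1` to `P6` and the helpers `group_by` and `shared_proteins` are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def group_by(names: Iterable[str], field: int) -> Dict[str, List[str]]:
    """Group isolate labels such as 'BI-ST5-1' by site (field 0) or ST (field 1)."""
    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(name.split("-")[field], []).append(name)
    return groups


def shared_proteins(identified: Dict[str, Set[str]], isolates: Iterable[str]) -> Set[str]:
    """Proteins identified in every isolate of the given group."""
    return set.intersection(*(identified[name] for name in isolates))


# Toy identifications: protein IDs are illustrative, not measured data.
toy = {
    "BI-ST5-1": {"P1", "P2", "P3"},
    "GI-ST5-4": {"P1", "P2", "P4"},
    "GI-ST5-6": {"P1", "P2"},
    "GI-ST1-7": {"P1", "P5"},
    "BI-ST1-8": {"P1", "P3", "P5"},
    "GI-ST1-9": {"P1", "P5", "P6"},
}

shared_all = shared_proteins(toy, toy)
shared_by_st = {st: shared_proteins(toy, names) for st, names in group_by(toy, 1).items()}
shared_by_site = {site: shared_proteins(toy, names) for site, names in group_by(toy, 0).items()}
print(shared_all, shared_by_st, shared_by_site)
```

Because a group with fewer members imposes fewer constraints, groups of different sizes (two BI versus four GI isolates) are not directly comparable by their shared-protein counts alone.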
For all identified extracellular proteins, we verified the predicted subcellular localization using the “Gram Positive Protein Prediction Pipeline” GP4 [23]. This revealed that the largest number of identified proteins belong to the so-called class of extracellular cytoplasmic proteins (ECPs), which lack known targeting signals for export from the cytoplasm (Fig. 2A). The number of ECPs was lowest for the BI-ST5-1 isolate and highest for the GI-ST1-9 isolate. The number of extracellular proteins with predicted signal peptides for export via the general secretory (Sec) pathway was around 50 per isolate, with a total of 75 signal peptide-bearing extracellular proteins identified across all investigated isolates. These included 29 predicted lipoproteins and 12 predicted cell wall-associated proteins. Additionally, 4 proteins were predicted to reside at multiple subcellular locations. The high number of ECPs identified for the investigated S. aureus strains is in accordance with other extracellular proteome studies, which showed that the bacterial “exoproteome” may contain a large number of ECPs [22, 28, 88, 89]. Several identified ECPs belong to the class of “moonlighting proteins” with distinct roles at multiple cellular and extracellular locations, including motility, biofilm formation, host invasion, immunomodulation, and platelet aggregation (Supplementary Fig. 4 C) [22, 28].
A principal component analysis (PCA) based on the LFQ intensities of identified extracellular proteins was performed to elucidate the extracellular proteome relationships among the investigated isolates with different STs and sites of isolation (Fig. 2B). This revealed a high degree of heterogeneity between the isolates, even if they had the same sequence type or were collected from the same site of isolation. Furthermore, the isolates BI-ST5-1, BI-ST1-8, and GI-ST5-4 clustered together, as was the case for the GI-ST5-6 and GI-ST1-9 isolates, whereas the GI-ST1-7 isolate was the most distinct in terms of identified extracellular proteins. To further compare the strains, we generated a heatmap based on the LFQ intensities of the identified extracellular proteins. This heatmap further highlights the heterogeneity in protein expression, which was independent of the sequence type or isolation site (Fig. 2D).
Our proteome analysis identified 47 proteins that play known roles in the virulence of S. aureus. The relative abundance of these proteins per strain is presented in a heatmap (Fig. 2C), and this analysis revealed that the expression of these virulence factors is quite heterogeneous when considering the strains of the same sequence type or from the same isolation site. In fact, only sixteen core virulence factors are expressed by all the investigated strains. The remaining virulence factors show a heterogeneous expression pattern, which is strain-specific. Of note, the presence of the genes for all 47 identified virulence factors was verified using the whole-genome sequences of the six study isolates, which showed that 45 of these genes are present in all six isolates, while some isolates lack the cna (BI-ST5-1, GI-ST5-4, GI-ST5-6) and/or the chp genes (GI-ST5-6, GI-ST1-9) (Supplementary Fig. 2 D). The identified virulence factors include proteins with roles in iron acquisition and adhesion to human host cells (IsdB, IsdC, IsdE, IsdH), proteins belonging to the “microbial surface components recognizing adhesive matrix molecules” (MSCRAMM) family (ClfB, CNA, EbpS, Emp, Fib, FnbB, Map, and SpA) and sortase enzymes (SrtA and SrtB). Other identified virulence factors are secreted proteins that serve to disrupt host cells and promote spreading, including exoenzymes (Aur, Coa, Lip1, Lip2, Sak, vWbp), proteases (SspA and SspP), and exotoxins (EntA, Hla, Hld, HlgA, HlgB, HlgC, LukDv, LukEv, SA1812, SA1813, SEA, SElX, SSL11, SSL13, SSL4, SSL5, and SSL7). We also identified proteins which are implicated in the evasion of innate or adaptive immune responses of the host (CHIPS, FLIPr, IsaB, Nuc, Sbi, SAOUHSC_01115, SCIN) and a membrane-associated protein of the Type VII secretion system (EsaA).
Interestingly, one bloodstream isolate (BI-ST1-8) secreted more known virulence factors than the other isolates, while one gut isolate (GI-ST1-9) apparently secreted the lowest number of known virulence factors. When comparing the secreted virulence factors per sequence type of the isolates, we observed that some virulence factors (collagen-binding protein “CNA” and the von Willebrand factor-binding protein “vWbp”) were present only in the isolates with ST1. On the other hand, the ST5 isolates secreted the staphylococcal α-toxin Hla, which was not identified for the ST1 isolates. None of the extracellular proteins was identified exclusively in gut or bacteremia isolates.
To determine the overall extracellular proteome functions of the six investigated S. aureus isolates, we used the TIGRfam and Aureowiki annotations for the functional classification of the 894 identified proteins [16, 24]. To this end, the proteins were divided into seven top-level functional categories and sixteen sub-level functional categories, as shown in so-called Voronoi treemaps (Fig. 3A, B). In these treemaps, the functional categories are marked in color code and the size of the cells is proportional to the number of identified proteins belonging to the respective category. For the top-level functional categories, the most representative group is composed of proteins with roles in “metabolism” (31.1%), while the remaining groups include proteins involved in “genetic information and processing” (26.6%), “cellular processes” (6.8%), “cell structure” (4.8%), “signal transduction” (4.1%), “phages, prophages, transposable elements, and plasmids” (0.45%) and proteins with unknown function (26.1%) (Fig. 3A). Additionally, the identified extracellular proteins of the six investigated S. aureus isolates were compared either by grouping the strains per sequence type (ST5 versus ST1) or per isolation site (GI versus BI). In the respective Voronoi treemaps, each protein is represented by a polygon-shaped tile and its relative abundance is presented based on the log2-transformed LFQ intensities per sequence type (ST5 vs ST1) (Fig. 3C) or isolation site (BI vs GI) (Fig. 3D). Overall, when comparing the identified extracellular proteins per sequence type (ST5 versus ST1) or isolation site (GI versus BI), the proteins belonging to the top-level and the sub-level functional categories do not show significant differences or particular trends. Furthermore, we inspected the unique “ON/OFF” extracellular proteins, which were present (“ON”) in all replicates of one group and absent (“OFF”) from all replicates of the other group.
These proteins are Hla, Cna, CysS, GatD, HchA, PepT, PnpA, Vwbp, Ssl11, and SAOUHSC_ (01110, 00094, 00422, 00555, 00603, 01594, 01987, 02447) when comparing the identified proteins per sequence type (i.e., “ON” in ST5/ “OFF” in ST1). The “ON/OFF” proteins compared per isolation site were BlaR1 and BlaZ (“ON” in BI/ “OFF” in GI).
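The “ON/OFF” classification can be expressed as a simple filter over replicate-level identifications. The sketch below assumes that a protein counts as “ON/OFF” only when it is detected in all replicates of one group and in no replicate of the other, and that a missing LFQ intensity is encoded as `None`. The function `on_off_proteins`, the protein labels, and the intensity values are hypothetical toy data.

```python
from typing import Dict, List, Optional

# A missing LFQ intensity (None) means the protein was not identified in that replicate.
Replicates = Dict[str, List[Optional[float]]]


def on_off_proteins(group_a: Replicates, group_b: Replicates) -> List[str]:
    """Proteins detected in all replicates of group_a and in none of group_b."""
    hits = []
    for protein, values_a in group_a.items():
        values_b = group_b.get(protein, [])
        on_in_a = bool(values_a) and all(v is not None for v in values_a)
        off_in_b = all(v is None for v in values_b)
        if on_in_a and off_in_b:
            hits.append(protein)
    return sorted(hits)


# Toy log2 LFQ intensities for three replicates per group (illustrative only).
st5 = {
    "prot_X": [21.3, 20.9, 22.0],   # detected in every ST5 replicate
    "prot_Y": [19.0, None, 18.5],   # missing in one ST5 replicate
    "prot_Z": [25.1, 24.8, 25.0],
}
st1 = {
    "prot_X": [None, None, None],   # never detected in ST1
    "prot_Y": [None, None, None],
    "prot_Z": [24.0, None, None],   # detected once in ST1
}

on_st5_off_st1 = on_off_proteins(st5, st1)
on_st1_off_st5 = on_off_proteins(st1, st5)
print(on_st5_off_st1, on_st1_off_st5)
```

Running the filter in both directions separates proteins that are “ON” in the first group from those that are “ON” in the second, which a single combined list does not convey.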
Cellular proteome analyses
Following the genomic and extracellular proteome characterizations of the six selected S. aureus study isolates, we investigated to what extent their cellular proteomes differ. To this end, the strains were cultivated in RPMI medium until the stationary phase (OD600 of approximately 1.2). Subsequently, the cells were separated from the growth medium by centrifugation, and the cellular proteins were extracted and identified by MS. This resulted in the identification of 1235 proteins in total, with 610 proteins shared by all isolates (Fig. 4A and Supplementary Figure S 5 A). In particular, the ST5 isolates shared 735 cellular proteins, while the ST1 isolates shared 708 proteins (Supplementary Figure S 5 B). Further, the BI isolates shared 700 proteins, and the gut isolates shared 638 proteins (Supplementary Figure S 5 C). Different numbers of unique cellular proteins were identified for each isolate, irrespective of sequence type or isolation site (Supplementary Figure S 5).
A PCA based on the LFQ intensities of identified cellular proteins was performed to elucidate possible differences between isolates with different sequence types or isolation sites (Fig. 4B). This revealed heterogeneity between isolates with the same sequence type as the three respective ST1 or ST5 isolates did not cluster together. The same phenomenon was observed when the comparison was done per isolation site. In particular, the GI-ST5-4, GI-ST5-6, and BI-ST1-8 isolates cluster next to each other in the PCA, while the BI-ST5-1 and GI-ST1-9 isolates form another cluster and the GI-ST1-7 isolate is distantly positioned in the PCA space. The conclusion that the different isolates are heterogeneous irrespective of sequence type or isolation site is also evident from the heatmap with LFQ intensities of the identified cellular proteins (Fig. 4C).
The overall cellular proteome functions of the six investigated S. aureus isolates were evaluated using the TIGRfam and Aureowiki annotations for the 1235 identified proteins [16, 24]. This allowed the distinction of seven top-level and eighteen sub-level functional categories, as shown by Voronoi treemaps where they are marked in color-coded cells that are proportional in size to the number of identified proteins (Fig. 5A, B). The most representative top-level functional group includes proteins with roles in “metabolism” (37.8%), while the following groups include proteins involved in “genetic information and processing” (24.8%), “signal transduction” (6.0%), “cellular processes” (5.9%), “cell structure” (4.9%), “phages, prophages, transposable elements, and plasmids” (0.3%), and proteins with unknown function (20%) (Fig. 5A). Overall, when comparing identified cellular proteins per sequence type (ST5 versus ST1) or isolation site (GI versus BI), the proteins belonging to the top-level and the sub-level functional categories do not show significant differences or particular trends (Fig. 5C, D). Additionally, we compared the relative quantities of identified cellular proteins per sequence type (ST5 versus ST1), revealing merely 10 proteins with statistically significantly increased abundance in ST5 isolates (Dat, FarR, RecA, SA1975.1, RbsK, D7S40_10290, SAV2523, AroA, AroA_1, and GlyA). More importantly, a comparison per isolation site (BI versus GI) revealed no statistically significant differences in cellular protein abundance between GI and BI isolates. We also inspected the unique “ON/OFF” proteins, which included the SfaD, YutE, RplS, FumC, and Pgi proteins in comparisons per sequence type (“ON” in ST5/ “OFF” in ST1). On the other hand, in comparisons of identified cellular proteins per isolation site (BI versus GI), no unique “ON/OFF” proteins were identified.
S. aureus is well known for its great adaptability to different environments by responding to different stimuli. To this end, the bacterium makes use of a range of transcriptional regulators that determine the expression of virulence factors and/or particular metabolic pathways [33, 74]. Here, it should be noted that many staphylococcal genes are actually controlled by multiple regulators. To appreciate the observed differences in protein abundance, we also prepared Voronoi treemaps and heatmaps in which the charted proteins are attributed to the known staphylococcal regulons (Supplementary Figure S 4 and S 6) [55, 56]. In particular, to detect potential metabolic adaptations in the six S. aureus isolates in relation to their epidemiology, we categorized the identified proteins according to their involvement in different metabolic pathways, e.g., central carbon metabolism, amino acid metabolism, alternative carbon sources, and respiration, as previously described [51, 63]. The central carbon metabolism, gluconeogenesis, the pentose phosphate pathway, and the tricarboxylic acid (TCA) pathway are essential for S. aureus both outside and within host cells [19, 63]. Additionally, S. aureus may have to compete for nutrients when interacting with the human host, which forces S. aureus to use amino acids as carbon and nitrogen sources [25, 63], or alternative carbon sources, such as glycerol. Intracellularly, basic cellular functions of S. aureus, relating to oxidative phosphorylation, were previously shown to be adjusted based on the availability of oxygen, leading to the employment of alternative metabolic pathways like fermentation [17]. However, as evidenced by the heatmaps of log2-transformed LFQ intensities (Supplementary Figure S 6 B), we did not detect any significant distinctive metabolic adaptations by comparing the cellular proteins per sequence type (ST5 versus ST1) or isolation site (BI versus GI).
This implies that, from a metabolic perspective, the behavior of the different GI and BI isolates was comparable upon growth in the RPMI medium. Importantly, this conclusion is fully supported by analyzing the cellular protein abundances per regulon, revealing no statistically significant differences for particular regulons per sequence type or site of isolation (Supplementary Figure S 6 C and D). Likewise, no statistically significant differences for particular regulons were detectable per sequence type or site of isolation upon regulon-based stratification of the identified extracellular proteins (Supplementary Figure S 4 D).
Since no systematic differences could be observed between BI and GI isolates in terms of protein expression in vitro, we asked the question of whether some of these strains would display different virulence profiles upon infection of human cells. For this purpose, we compared their infectious behavior in human gut epithelial cells and blood cells.
Infection of Caco2 gut epithelial cells
To date, little is known about possible interactions between enteric S. aureus bacteria and gut epithelial cells, which are likely to be decisive for the transition from the gut-resident to the pathogenic state of this bacterium [67]. Therefore, we established an infection model that is based on monolayers of Caco2 enterocytes, which are simple columnar epithelial cells that line the inner surface of the small and large intestines. The Caco2 cells were seeded in 24-well plates at a density of 200,000 cells per well and cultured for 84 h prior to infection. This seeding condition was monitored over time by confocal fluorescence microscopy, showing the presence of a monolayer of cells with tight junctions as visualized with antibodies against the ZO-1 protein. Importantly, the Caco2 cells formed monolayers with tight cell–cell junctions at the cellular contact sites (Fig. 6A), mimicking a closed epithelial barrier. Caco2 monolayers were infected with the different S. aureus BI and GI isolates for 3 h at a MOI of 30 (Fig. 6C, D), followed by a 30-min incubation with lysostaphin to eliminate non-internalized bacteria. As shown by flow cytometry, only very few bacteria invaded the Caco2 cells, with GFP + Caco2 cells ranging between 4 and 10% (Fig. 6E). Since the Caco2 cell invasion appeared quite homogeneous, but low compared to previously investigated endothelial and lung epithelial cell infection models [62, 63, 68], we investigated whether these low infection levels could be due to a relatively high percentage of bacteria adhering to the cells. To this end, we performed infection experiments where the non-internalized bacteria were not eliminated with lysostaphin, revealing a high percentage of GFP + Caco2 cells that ranged from ∼35 to 60% (Fig. 6F). The isolates showing the highest Caco2 cell adherence and invasion were the bloodstream isolate BI-ST5-1 and the gut isolate GI-ST5-6.
The remaining four isolates showed comparable adhesion to and invasion of Caco2 cells, irrespective of their isolation site or sequence type. This suggested that the tight barrier formed by the Caco2 cells might set a limit to infection. To test this idea, a third Caco2 infection model was established, which mimics a disrupted and subsequently regenerated gut barrier. In this model, the cell–cell junctions of a Caco2 monolayer were temporarily disrupted by a 45-min treatment with EGTA in the absence of calcium (Fig. 6B). Importantly, following the removal of the EGTA, the tight junctions started to be restored after 1 h (Fig. 6C) and after 3 h, the tight junctions were almost completely restored (Fig. 6D). To evaluate the importance of an intact barrier for Caco2 cell infection, we treated the cells with EGTA for 45 min, removed the EGTA by washing, and performed an infection experiment for 3 h. Indeed, following this procedure, we observed a steep increase in the number of bacteria internalized by the Caco2 cells, with GFP + Caco2 cells ranging from ∼35 to 70% (Fig. 6G). The low percentage of GFP + infected Caco2 cells in the monolayer with intact tight junctions (Fig. 6H) compared to the disrupted monolayer after EGTA treatment (Fig. 6I) was also visualized by confocal fluorescence microscopy. Together, these findings demonstrate the importance of a tightly closed monolayer of Caco2 cells in preventing S. aureus infection, whereas disrupted cell–cell junctions in the monolayer permit the bacteria to readily enter these gut epithelial cells.
Leukocyte killing and intracellular survival
To gain further insights into the virulence of the six S. aureus study isolates once they have entered the bloodstream, we established a model that mimics the interactions between S. aureus and blood cells. To this end, we focused our analysis on leukocytes, which are among the first responders once infecting bacteria reach the bloodstream. Moreover, these cells are also present in different human tissues as, for example, the mucosal gut epithelium and the gut lamina propria. The six S. aureus isolates were cultured until the stationary phase (OD600 of approximately 1.2) and used to infect blood cells collected from healthy volunteers at a MOI of 15. Of note, prior to infection the red blood cells were lysed and removed. Specifically, the bacteria were allowed to interact with the blood cells for 30 min and, thereafter, non-internalized bacteria were eliminated by a 30-min incubation with lysostaphin. Then, the proportion of live blood cells was assessed by flow cytometry (Fig. 7 and Supplementary Figure S 7). As expected, compared to the uninfected control, the percentage of living blood cells decreased upon infection (Fig. 7A). Interestingly, in terms of blood cell killing (around 20 to 30% killing), we observed no difference between the two BI isolates (BI-ST5-1 and BI-ST1-8) and two of the GI isolates (GI-ST5-6 and GI-ST1-9). Moreover, the gut isolates GI-ST5-4 and GI-ST1-7 showed around 40 to 50% killing of blood cells, which means that they are more virulent than the two bacteremia isolates. Of note, these numbers take into account the killing of both monocytes and granulocytes by the infecting bacteria. To assess possible differences in bacterial internalization and intracellular survival in granulocytes, we also measured the percentage of GFP + granulocytes and GFP − granulocytes after infection (Fig. 7B). 
The granulocyte population is mainly composed of neutrophils which, in the human body, continuously migrate through the tissues and are the first responders to bacterial infection. Specifically, the GFP + granulocytes represent those granulocytes that contain intracellular GFP-expressing S. aureus, while the GFP-negative granulocytes do not contain bacteria. Infection with one of the BI isolates (BI-ST1-8) led to the highest percentage of GFP + granulocytes, whereas the lowest percentage of GFP + granulocytes was observed for the GI isolates GI-ST5-4 and GI-ST1-7. Altogether, we conclude that the investigated isolates show differing infectious behavior towards human leukocytes that cannot be correlated with their site of isolation or sequence type. Importantly, these observations show that the bacteremia isolates are not necessarily more virulent than the enteric isolates, and they imply that GI isolates may be more pathogenic than isolates that had actually caused an invasive infection in patients.
Infection of Galleria mellonella larvae
To complement the above infection experiments with human gut epithelial and blood cells with a small animal in vivo model, we infected larvae of the wax moth G. mellonella with the six S. aureus study isolates. Notably, the G. mellonella model adheres to the principles of replacement, reduction, and refinement (“3Rs”) and potentially reduces the numbers of vertebrates used for experimental infection studies. In this infection model, the bacteria are challenged primarily by the innate immune system of the larvae. The S. aureus isolates were cultivated until the stationary phase (OD600 of approximately 1.2), and 10 μL aliquots of each bacterial isolate (1 × 10^8 CFU/ml) were used to inoculate 45 larvae. The % mortality of G. mellonella was subsequently assessed at 24 h, 48 h, 72 h, and 96 h p.i. As shown in Fig. 7C, the different S. aureus isolates displayed heterogeneity in larval killing that could not be correlated with their site of isolation or sequence type. In fact, infection with the two BI isolates BI-ST5-1 and BI-ST1-8 resulted in larval killing comparable to that observed for the two enteric isolates GI-ST5-6 and GI-ST1-7, whereas the enteric isolates GI-ST5-4 and GI-ST1-9 were less virulent in larval infections. This finding was conserved over most time points p.i., although the bacteremia isolates tended to be slightly more virulent in the first 24 h p.i. (Fig. 7C). On the other hand, the isolate that caused the highest larval mortality at 96 h p.i. was the gut isolate GI-ST5-6. Therefore, we conclude that, also in the G. mellonella infection model, the virulence of our six study isolates cannot be correlated to their site of isolation or sequence type.
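The inoculum and the mortality readout of the larval model reduce to simple arithmetic. The sketch below assumes that each larva received a single 10 μL injection of a suspension at 1 × 10^8 CFU/ml and that mortality is expressed relative to the 45 larvae inoculated per isolate. The function names and the number of dead larvae in the example are hypothetical and do not represent measured data.

```python
def cfu_per_larva(volume_ul: float, cfu_per_ml: float) -> float:
    """Bacterial dose delivered by one injection (1 ml = 1000 μL)."""
    return volume_ul * cfu_per_ml / 1000


def percent_mortality(dead: int, total: int) -> float:
    """Share of dead larvae at a given time point, in percent."""
    return 100 * dead / total


dose = cfu_per_larva(10, 1e8)            # 10 μL of 1 x 10^8 CFU/ml
example = percent_mortality(9, 45)       # hypothetical: 9 of 45 larvae dead

print(f"{dose:.0e} CFU per larva, {example:.1f}% mortality")
```

Under these assumptions each larva receives about 10^6 CFU, and one dead larva out of 45 shifts the mortality by roughly 2.2 percentage points, which sets the resolution of the comparison between isolates.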