Perturbed human sub-networks by Fusobacterium nucleatum candidate virulence proteins
© The Author(s). 2017
Received: 28 October 2016
Accepted: 13 July 2017
Published: 10 August 2017
Fusobacterium nucleatum is a gram-negative anaerobic species residing in the oral cavity and implicated in several inflammatory processes in the human body. Although F. nucleatum abundance is increased in inflammatory bowel disease subjects and is prevalent in colorectal cancer patients, the causal role of the bacterium in gastrointestinal disorders and the mechanistic details of how it subverts host cell functions are not fully understood.
We devised a computational strategy to identify putative secreted F. nucleatum proteins (FusoSecretome) and to infer their interactions with human proteins based on the presence of host molecular mimicry elements. FusoSecretome proteins share similar features with known bacterial virulence factors thereby highlighting their pathogenic potential. We show that they interact with human proteins that participate in infection-related cellular processes and localize in established cellular districts of the host–pathogen interface. Our network-based analysis identified 31 functional modules in the human interactome preferentially targeted by 138 FusoSecretome proteins, among which we selected 26 as main candidate virulence proteins, representing both putative and known virulence proteins. Finally, six of the preferentially targeted functional modules are implicated in the onset and progression of inflammatory bowel diseases and colorectal cancer.
Overall, our computational analysis identified candidate virulence proteins potentially involved in the F. nucleatum—human cross-talk in the context of gastrointestinal diseases.
Keywords: Fusobacterium nucleatum, Secretome, Molecular mimicry, Short linear motifs, Bioinformatics, Interaction network, Colorectal cancer, Inflammatory bowel diseases, Virulence proteins
Fusobacterium nucleatum is a gram-negative anaerobic bacterium best known as a component of the oral plaque and a key pathogen in gingivitis and periodontitis . It has also been isolated in several inflammatory processes in distinct body sites (e.g., endocarditis, septic arthritis, liver and brain abscesses) and implicated in adverse pregnancy outcomes (reviewed in ). Moreover, it has been demonstrated that F. nucleatum can adhere to and invade a variety of cell types, thereby inducing a pro-inflammatory response [3–8]. Recent work showed that (i) F. nucleatum is prevalent in colorectal cancer (CRC) patients [9–11] and (ii) its abundance is increased in new-onset Crohn’s disease (CD) subjects . Interestingly, follow-up studies suggested a potential role of this bacterium in CRC tumorigenesis and tumor-immune evasion [13–16].
Despite these findings, a large fraction of F. nucleatum gene products are still uncharacterized. Moreover, to date, only a handful of pathogenic factors have been experimentally identified [17, 18], and protein interaction data between these factors and human proteins, which could shed light on the molecular details underlying host-cell subversion mechanisms, are sparse [4, 16, 19]. Altogether, this underlines that a comprehensive view of the molecular details of the F. nucleatum–human cross-talk is currently missing.
How could F. nucleatum hijack human cells? Pathogens employ a variety of molecular strategies to reach an advantageous niche for survival. One of them consists of subverting host protein interaction networks. Indeed, they secrete and deliver factors such as toxic compounds, small peptides, and even proteins to target the host molecular networks. To achieve this, virulence factors often display structures resembling host components in form and function [20–22] to interact with host proteins, thus providing a benefit to the pathogen . Such “molecular mimics” (e.g., targeting motifs, enzymatic activities, and protein–protein interaction elements) allow pathogens to enter the host cell and perturb cell pathways (e.g., [24–26]).
Over the years, several experimental approaches have been applied to identify protein–protein interactions (PPIs) between pathogens and their hosts providing new insights on the pathogen’s molecular invasion strategies. However, the vast majority of these systematic studies focused on viruses (e.g., [27–29]) and, to a lesser extent, on bacteria [30–33] and eukaryotic parasites [33, 34]. Indeed, as cellular pathogens have large genomes and complex life cycles, the experimental identification of virulence proteins and the large-scale mapping of host-pathogen PPIs require a lot of effort and time [35, 36]. In this context, computational approaches have proved to be instrumental for the identification of putative pathogenic proteins (e.g., [37, 38]), the characterization of molecular mimics [23, 39, 40], and the inference of their interactions with host proteins (for a review see ).
Prediction of F. nucleatum secreted proteome
Previous computational analyses highlighted that F. nucleatum has a reduced repertoire of secretion machinery [42, 43] meaning that it might exploit alternative “non-classical” translocation mechanisms to unleash virulence proteins. Thus, we sought to identify putative F. nucleatum secreted proteins by analyzing the 2046 protein sequences of the type species F. nucleatum subsp. nucleatum (strain ATCC 25586) proteome using two distinct algorithms: SignalP  for peptide-triggered secretion and SecretomeP  for leaderless protein secretion. While the SignalP algorithm predicted 61 F. nucleatum sequences being secreted via classical/regular secretion pathways, SecretomeP found 176 proteins as possibly secreted through non-classical routes. In total, we identified 237 putative secreted proteins in the F. nucleatum proteome (herein called “FusoSecretome”) (see Additional file 1: Table S1). Notably, we were able to correctly predict as secreted all the F. nucleatum virulence proteins known so far, namely FadA (FN0264), Fap2 (FN1449), RadD (FN1526), and the recently identified Aid1 adhesin (FN1253) . This result underlines the relevance of secretion prediction to identify novel putative virulence proteins in the F. nucleatum proteome.
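The combination of the two predictors described above amounts to a set union over protein identifiers. The following is a minimal sketch, assuming the SignalP and SecretomeP outputs have already been parsed into sets of locus tags; the function name is hypothetical, and the assignment of the four known virulence proteins to one predictor or the other is arbitrary and for illustration only.

```python
def build_secretome(signalp_hits, secretomep_hits):
    """Combine classical and non-classical secretion predictions.

    Proteins flagged by SignalP are counted as classically secreted; the
    SecretomeP set is reduced to proteins not already flagged by SignalP,
    so that the two categories do not overlap.
    """
    classical = set(signalp_hits)
    non_classical = set(secretomep_hits) - classical
    return {
        "classical": classical,
        "non_classical": non_classical,
        "secretome": classical | non_classical,
    }


# Toy input: locus tags of the four known virulence proteins named above.
# Which predictor flags which protein is NOT taken from the paper.
signalp_hits = {"FN0264", "FN1449"}
secretomep_hits = {"FN1449", "FN1526", "FN1253"}

result = build_secretome(signalp_hits, secretomep_hits)
print(sorted(result["secretome"]))
```

Keeping the two categories disjoint mirrors the way the counts are reported in the text, where the classical and non-classical predictions add up to the total FusoSecretome size.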
Enrichment of Pfam domains in the FusoSecretome compared to non-secreted proteins
Corrected P value
MORN repeat variant
2.17 × 10^−12
Outer membrane beta-barrel protein superfamily
2.4 × 10^−7
Haemolysin secretion/activation protein ShlB/FhaC/HecB
Outer membrane beta-barrel protein superfamily
TonB-dependent Receptor Plug Domain
TonB dependent receptor
Outer membrane beta-barrel protein superfamily
Surface antigen variable number repeat
POTRA domain superfamily
YadA-like C-terminal region
Pectate lyase-like beta helix
Bacterial extracellular solute-binding proteins, family 5 Middle
Periplasmic binding protein clan
Coiled stalk of trimeric autotransporter adhesion
Pyruvate flavodoxin/ferredoxin oxidoreductase, thiamine diP-bdg
Thiamin diphosphate-binding superfamily
Tetratrico peptide repeat superfamily
Inference of the FusoSecretome—human interaction network
Generally, pathogens employ a variety of molecular strategies to interfere with host-cell networks, controlling key functions such as plasma membrane and cytoskeleton dynamics, immune response, and cell death/survival. In particular, their proteins often carry a range of mimics, which resemble structures of the host at the molecular level, to “sneak” into host cells [20–22, 50].
Here, we focused on putative molecular mimicry events that can mediate the interaction with host proteins: (i) globular domains that occur in both the FusoSecretome and the human proteome and (ii) known eukaryotic short linear motifs (SLiMs) found in FusoSecretome proteins. SLiMs are short stretches of 3–10 contiguous amino acid residues that often mediate transient PPIs and tend to bind with low affinity.
We first scanned the sequences of the FusoSecretome and human proteins for the presence of domains as defined by Pfam . We identified 55 “host-like” domains in 50 FusoSecretome proteins out of 237, including several domains related to ribosomal proteins, aminopeptidases, and tetratricopeptide repeats (TPR) (Additional file 4: Table S3). Interestingly, 29 of these domains are also found in known bacterial binders of human proteins .
We next detected the occurrence of experimentally identified SLiMs gathered from the Eukaryotic Linear Motif (ELM) database. As linear motifs are short and degenerate in sequence, SLiM detection is prone to over-prediction. To reduce the number of false positives, we kept only occurrences falling in conserved and disordered protein sequences (see the “Methods” section). Indeed, known functional SLiMs show a higher degree of conservation compared to surrounding residues and are located in unstructured regions [55, 56]. In this way, we identified at least one putative mimicry SLiM in 139 FusoSecretome proteins. Most of the 57 different detected SLiMs represent binding sites such as motifs recognized by PDZ, SH3, and SH2 domains (Additional file 4: Table S3).
We exploited these putative mimicry events to infer the interaction with human proteins by using templates of domain–domain and SLiM–domain interactions (see the “Methods” section for further details). Doing so, we obtained 3744 interactions (1544 domain-mediated and 2201 SLiM-mediated) between 144 FusoSecretome proteins, which we designated as “candidate virulence proteins,” and 934 human proteins, designated as “human inferred interactors” (Additional file 5: Table S4 and Additional file 6: Table S5).
In order to assess the reliability of the inferences, we evaluated the biological relevance of the putative human interactors by performing enrichment analyses of different orthogonal datasets using as a reference background all the proteins encoded by the human genome.
First, human proteins experimentally identified as binders or targets of bacterial and viral proteins are over-represented among the 934 inferred human interactors of the FusoSecretome proteins (415 proteins, 1.3-fold, P value = 1.61 × 10^−11). Notably, the over-representation holds when bacterial and viral binders are considered separately (176 bacterial interactors, 1.1-fold, P value = 3.5 × 10^−3 and 338 viral interactors, 1.5-fold, P value < 2.2 × 10^−16). This result is consistent with current knowledge on convergent targeting of host proteins by distinct pathogens [30, 33, 57, 58]. Second, according to the Human Protein Atlas (see the “Methods” section), the vast majority of the inferred human interactors have been detected either in small intestine (652 proteins, 70%) or colorectal (671 proteins, 72%) tissues as well as in saliva (673 proteins, 72%), confirming their presence in human body sites hosting F. nucleatum. Third, we assessed whether the inferred human interactors are implicated in gastrointestinal disorders by testing for an over-representation of genes associated with such diseases (see the “Methods” section). Indeed, the human interactors of the FusoSecretome are enriched in (i) proteins identified in the human colon secretomes of colorectal cancer (CRC) tissue samples (3.5-fold, P value < 2.2 × 10^−16), (ii) proteins encoded by genes whose expression correlates with F. nucleatum abundance in CRC patients (twofold, P value = 4 × 10^−4), and (iii) genes associated with inflammatory bowel diseases (IBDs) (twofold, P value = 8 × 10^−4). We obtained very similar enrichments by using a reduced statistical background corresponding to the interaction inference space (see the “Methods” section and Additional file 7: Supplementary Results).
Altogether, the results of these analyses highlight the relevance of the inferred human interactors as putative binders of FusoSecretome proteins and their potential implication in gut diseases, therefore validating the undertaken inference approach.
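The enrichment figures reported above combine a fold change with a P value. This part of the text does not state which statistical test was used, so the sketch below assumes a one-sided hypergeometric test, a common choice for over-representation analysis. The function names and the toy counts are hypothetical and are not taken from the paper.

```python
from math import comb


def fold_enrichment(k, n, K, N):
    """Observed fraction (k of n interactors annotated) over the
    background fraction (K of N background proteins annotated)."""
    return (k / n) / (K / N)


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n proteins are drawn without replacement from a
    background of N proteins, K of which carry the annotation."""
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Toy numbers: background of 20 proteins, 5 annotated;
# 6 inferred interactors, 3 of them annotated.
N, K, n, k = 20, 5, 6, 3
fold = fold_enrichment(k, n, K, N)
p_value = hypergeom_upper_tail(k, n, K, N)
print(f"fold = {fold:.2f}, P = {p_value:.4f}")
```

Changing the background (for instance from the whole proteome to the interaction inference space) only changes `N` and `K`, which is why the same analysis can be repeated with a reduced statistical background.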
Functional role of the human proteins targeted by F. nucleatum
Significant Gene Ontology and pathways annotations among FusoSecretome inferred human interactors
7.57 × 10^−41
2.65 × 10^−34
2.51 × 10^−29
6.63 × 10^−27
Protein activation cascade
1.17 × 10^−25
5.45 × 10^−23
SRP-dependent cotranslational protein targeting to membrane
1.93 × 10^−22
3.62 × 10^−21
Regulation of immune effector process
4.03 × 10^−18
Regulation of complement activation
4.10 × 10^−18
9.81 × 10^−37
2.76 × 10^−23
4.15 × 10^−22
Side of membrane
2.46 × 10^−18
2.85 × 10^−17
9.20 × 10^−17
4.28 × 10^−12
Extrinsic component of plasma membrane
2.35 × 10^−09
4.54 × 10^−08
3.91 × 10^−07
4.34 × 10^−39
Complement and coagulation cascades
2.28 × 10^−22
Cell adhesion molecules (CAMs)
2.88 × 10^−09
Protein processing in endoplasmic reticulum
2.03 × 10^−08
T cell receptor signaling pathway
1.33 × 10^−07
Staphylococcus aureus infection
1.78 × 10^−06
7.25 × 10^−06
8.21 × 10^−06
Epstein-Barr virus infection
1.11 × 10^−05
1.77 × 10^−05
Activation of Matrix Metalloproteinases
4.77 × 10^−28
Viral mRNA Translation
2.14 × 10^−19
Peptide chain elongation
2.14 × 10^−19
Nonsense Mediated Decay independent of the Exon Junction Complex
1.83 × 10^−17
Regulation of Complement cascade
5.13 × 10^−16
Cell surface interactions at the vascular wall
7.22 × 10^−16
3.61 × 10^−14
Regulation of HSF1-mediated heat shock response
3.92 × 10^−14
Defective HLCS causes multiple carboxylase deficiency
1.33 × 10^−08
Nectin/Necl trans heterodimerization
1.33 × 10^−08
F. nucleatum targets topologically important proteins in the host network
The human interactome is composed of functional network modules, defined as groups of proteins densely connected through their interactions and involved in the same biological process (see the “Methods” section). We thus next investigated the 855 functional modules that we previously detected using the OCG algorithm, which decomposes a network into overlapping modules based on modularity optimization (Additional file 10: Table S8). A significant number of interactors participate in two or more of these functional units (259 proteins, 1.3-fold enrichment, P value = 1.4 × 10^−7), indicating that the FusoSecretome tends to target multifunctional proteins in the human interactome. Moreover, among the multifunctional inferred human interactors, we found an enrichment of extreme multifunctional proteins (52 interactors, twofold enrichment, P value = 1.0 × 10^−5), which are defined as proteins involved in unrelated cellular functions and may represent candidate moonlighting proteins. This suggests that the FusoSecretome might perturb multiple cellular pathways simultaneously by preferentially targeting a whole range of multifunctional proteins.
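Because the modules overlap, a protein's module count can serve as a simple proxy for multifunctionality. The sketch below assumes the modules are available as a mapping from module ID to a set of protein names; the module IDs, protein names, and function names are made up for illustration.

```python
from collections import Counter


def module_membership(modules):
    """Count, for each protein, the number of modules it belongs to."""
    return Counter(p for members in modules.values() for p in set(members))


def multifunctional(modules, min_modules=2):
    """Proteins found in at least `min_modules` overlapping modules."""
    counts = module_membership(modules)
    return {p for p, c in counts.items() if c >= min_modules}


# Toy overlapping modules (hypothetical IDs and protein names).
modules = {
    1: {"P1", "P2", "P3"},
    2: {"P3", "P4"},
    3: {"P3", "P4", "P5"},
}
interactors = {"P2", "P4", "P5"}  # toy set of inferred human interactors

multi = multifunctional(modules)
targeted_multi = multi & interactors
print(sorted(multi), sorted(targeted_multi))
```

This only captures membership in two or more modules. Identifying extreme multifunctional proteins additionally requires comparing the functional annotations of those modules, which is not modelled here.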
Functional subnetworks of the human interactome perturbed by F. nucleatum and identification of the main candidate virulence proteins
Network module significantly enriched in inferred human interactors
Immune response-regulating cell surface receptor signaling pathway (GO:0002768), cell-cell junction (GO:0005911)
Metal ion homeostasis (GO:0055065), cell surface (GO:0009986)
Cellular response to organonitrogen compound (GO:0071417), membrane raft (GO:0045121)
Endocytosis (GO:0006897), membrane raft (GO:0045121)
Extracellular structure organization (GO:0043062), cell surface (GO:0009986)
Immune response-regulating cell surface receptor signaling pathway (GO:0002768), cell surface (GO:0009986)
Immune response-activating cell surface receptor signaling pathway (GO:0002429), nucleolar ribonuclease P complex (GO:0005655)
I-kappaB kinase/NF-kappaB cascade (GO:0007249), inclusion body (GO:0016234)
I-kappaB kinase/NF-kappaB cascade (GO:0007249), perinuclear region of cytoplasm (GO:0048471)
Neuron projection guidance (GO:0097485), synapse (GO:0045202)
G1/S transition of mitotic cell cycle (GO:0000082), cyclin-dependent protein kinase holoenzyme complex (GO:0000307)
Blood coagulation (GO:0007596), membrane raft (GO:0045121)
T cell activation (GO:0042110), Golgi membrane (GO:0000139)
Collagen catabolic process (GO:0030574), extracellular matrix (GO:0031012)
Actin cytoskeleton organization (GO:0030036), Arp2/3 protein complex (GO:0005885)
Stress-activated MAPK cascade (GO:0051403), nuclear speck (GO:0016607)
Actin filament organization (GO:0007015), lamellipodium (GO:0030027)
Positive regulation of intracellular protein kinase cascade (GO:0010740), spindle (GO:0005819)
Mitotic cell cycle phase transition (GO:0044772), heterochromatin (GO:0000792)
Regulation of system process (GO:0044057), dendrite (GO:0030425)
Regulation of sequence-specific DNA binding transcription factor activity (GO:0051090), external side of plasma membrane (GO:0009897)
Cell cycle phase transition (GO:0044770), transcription factor complex (GO:0005667)
Complement activation (GO:0006956), ER membrane insertion complex (GO:0072379)
Axonogenesis (GO:0007409), signalosome (GO:0008180)
Response to unfolded protein (GO:0006986), perinuclear region of cytoplasm (GO:0048471)
Regulation of sequence-specific DNA binding transcription factor activity (GO:0051090), chromatin (GO:0000785)
Blood coagulation (GO:0007596), apical junction complex (GO:0043296)
Peptidyl-tyrosine phosphorylation (GO:0018108), nucleolar ribonuclease P complex (GO:0005655)
Axon guidance (GO:0007411), cell leading edge (GO:0031252)
Gamma-aminobutyric acid signaling pathway (GO:0007214), postsynaptic membrane (GO:0045211)
Fc-gamma receptor signaling pathway involved in phagocytosis (GO:0038096), cell leading edge (GO:0031252)
List of the main candidate virulence proteins in the FusoSecretome
Fusobacterium outer membrane protein family
LIG_FHA_1, LIG_FHA_2, LIG_PP1b, LIG_SH2_SRCb, LIG_SH2_STAT5, LIG_SH3_3b, LIG_SUMO_SBM_1b, MOD_N-GLC_1b, TRG_ENDOCYTIC_2b
LIG_FHA_1, LIG_FHA_2, LIG_Rb_pABgroove_1b, LIG_SH2_GRB2, LIG_SH2_SRCb, LIG_SH2_STAT5, LIG_SUMO_SBM_1b, MOD_CK1_1, MOD_CK2_1b, MOD_GSK3_1, MOD_N-GLC_1b, MOD_PIKK_1, TRG_ENDOCYTIC_2b
Peptide methionine sulfoxide reductase MsrA
CLV_PCSK_PC1ET2_1, LIG_FHA_1, LIG_SH2_GRB2, LIG_SH2_SRCb, LIG_SH2_STAT5, MOD_Cter_Amidation, MOD_PIKK_1, TRG_ENDOCYTIC_2b
Hypothetical exported 24-amino acid repeat protein
POR_N, POR, EKR, Fer4_7, TPP_enzyme_C
LIG_CYCLIN_1b, LIG_SH2_GRB2, LIG_SH2_STAT5, LIG_SH3_3b, LIG_SUMO_SBM_1b, LIG_WW_Pin1_4, MOD_CK1_1, MOD_CK2_1b, MOD_GSK3_1, MOD_PIKK_1, MOD_ProDKin_1, TRG_ENDOCYTIC_2b
Chaperone protein DnaJ
DnaJ, DnaJ_CXXCXGXG, CTDII
CLV_NDR_NDR_1, CLV_PCSK_SKI1_1, LIG_CYCLIN_1b, LIG_FHA_2, LIG_SH2_STAT5, LIG_SH3_3b, LIG_SUMO_SBM_1b, LIG_TRAF2_1b, LIG_WW_Pin1_4, MOD_CK2_1b, MOD_PLK, MOD_ProDKin_1, TRG_ENDOCYTIC_2b
POR_N, POR, EKR, Fer4_7, TPP_enzyme_C
LIG_BRCT_BRCA1_1, LIG_SH2_GRB2, LIG_SH2_STAT5, LIG_SH3_3b, LIG_SUMO_SBM_1b, LIG_WW_Pin1_4, MOD_CK1_1, MOD_CK2_1b, MOD_GSK3_1, MOD_ProDKin_1, TRG_ENDOCYTIC_2b
Fusobacterium outer membrane protein family
CLV_PCSK_SKI1_1, LIG_FHA_1, LIG_FHA_2, LIG_PDZ_Class_2, LIG_SH2_SRCb, LIG_SH2_STAT5, LIG_SUMO_SBM_1b, MOD_GSK3_1, MOD_N-GLC_1b, MOD_PLK
Peptidase_S8a, Autotrns_rpta, Autotransportera
CLV_PCSK_SKI1_1, LIG_FHA_2, LIG_PDZ_Class_2, LIG_SH2_STAT5, LIG_SUMO_SBM_1b, MOD_CK1_1, MOD_CK2_1b, MOD_GSK3_1, MOD_PKA_2, TRG_ENDOCYTIC_2b
Hypothetical cytosolic protein
MG1, A2M_N, A2M_N_2, A2M
A2M_N, A2M_N_2, A2M
LIG_CYCLIN_1b, LIG_FHA_2, LIG_SH2_STAT5, LIG_SUMO_SBM_1b, MOD_CK2_1b, MOD_PIKK_1, MOD_PKA_2
DNAse I homologous protein DHP2
Biotin carboxyl carrier protein of glutaconyl-COA decarboxylase
LIG_SUMO_SBM_1b, LIG_WW_Pin1_4, MOD_ProDKin_1, MOD_SUMOb
LIG_FHA_2, LIG_SH2_STAT5, LIG_SUMO_SBM_1b, MOD_CK1_1, MOD_CK2_1b, MOD_GSK3_1, MOD_PKA_2
Single-stranded DNA-binding protein
LIG_BRCT_BRCA1_1, LIG_FHA_1, LIG_FHA_2, LIG_PDZ_Class_2, LIG_SUMO_SBM_1b, MOD_PKA_2, TRG_ENDOCYTIC_2b
CLV_PCSK_SKI1_1, LIG_BRCT_BRCA1_1, LIG_CYCLIN_1b, LIG_SH2_GRB2, LIG_SUMO_SBM_1b, LIG_WW_Pin1_4, MOD_CK2_1b, MOD_ProDKin_1
CLV_PCSK_PC1ET2_1, LIG_CYCLIN_1b, LIG_FHA_1, LIG_MAPK_1, LIG_PDZ_Class_2, LIG_SH2_STAT5, LIG_SH3_3b, LIG_SUMO_SBM_1b, MOD_GSK3_1, MOD_N-GLC_1b
LIG_PP1b, LIG_SH2_GRB2, LIG_SUMO_SBM_1b, LIG_WW_Pin1_4, MOD_CK1_1, MOD_ProDKin_1
Hemolysin activator protein
CLV_PCSK_PC1ET2_1, LIG_SH2_STAT5, LIG_SUMO_SBM_1b, LIG_TRAF2_1b
Chaperone protein DnaK
CLV_PCSK_PC1ET2_1, CLV_PCSK_SKI1_1, LIG_BRCT_BRCA1_1, LIG_EVH1_1, LIG_FHA_2, LIG_SH2_STAT5, LIG_SH3_3b, LIG_SUMO_SBM_1b, LIG_WW_Pin1_4, MOD_CK1_1, MOD_CK2_1b, MOD_GSK3_1, MOD_PIKK_1, MOD_PLK, MOD_ProDKin_1, TRG_ENDOCYTIC_2b, TRG_LysEnd_APsAcLL_1b
Tetratricopeptide repeat protein
CLV_C14_Caspase3–7, LIG_FHA_1, LIG_FHA_2, LIG_SH2_STAT5, LIG_SUMO_SBM_1b, LIG_WW_Pin1_4, MOD_CK2_1b, MOD_GSK3_1, MOD_PIKK_1, MOD_PLK, MOD_ProDKin_1
LIG_CYCLIN_1b, LIG_FHA_1, LIG_FHA_2, LIG_SH2_STAT5, LIG_SUMO_SBM_1b, LIG_WW_Pin1_4, MOD_CK1_1, MOD_CK2_1b, MOD_GSK3_1, MOD_N-GLC_1b, MOD_PIKK_1, MOD_PLK, MOD_ProDKin_1, TRG_ENDOCYTIC_2b
Tetratricopeptide repeat family protein
50S ribosomal protein L2
CLV_PCSK_SKI1_1, LIG_BIR_II_1, LIG_PP1b, LIG_SH3_3b, LIG_SUMO_SBM_1b, LIG_USP7_1b, LIG_WW_Pin1_4, MOD_PLK, MOD_ProDKin_1, TRG_ENDOCYTIC_2b
LIG_TRAF2_1b, MOD_CK2_1b
CLV_C14_Caspase3–7, LIG_FHA_1, LIG_FHA_2, LIG_SH2_STAT5, LIG_SUMO_SBM_1b, LIG_WW_Pin1_4, MOD_CK2_1b, MOD_GSK3_1, MOD_PIKK_1, MOD_PLK, MOD_ProDKin_1, TRG_ENDOCYTIC_2b
F. nucleatum and gut diseases from a network perspective
Three other modules enriched in inferred human interactors show a significant dysregulation of the expression of their constituent proteins during CRC progression  and are implicated in infection-response pathways and cytoskeleton organization (Fig. 5). In particular, two of these modules (Modules 138 and 216) show significant and specific upregulation in stage II, whereas the third (Module 371) is significantly upregulated in normal and stage II samples. Overall, these results indicate that F. nucleatum could contribute to the onset and progression of IBDs and CRC by perturbing some of the underlying network modules.
Comparison with additional bacterial strains
We applied our computational approach to the recently released proteomes of six actively invading Fusobacteria strains isolated from biopsy tissues [8, 65] (i.e., four F. nucleatum subspecies and two F. periodonticum strains), and to the proteome of E. coli K-12 as a “control strain” (see Additional file 7: Supplementary Results, Table S12). We found that the secretomes of these seven bacteria share common features (i.e., disorder propensity, enriched domains, host-like domain and mimicry SLiM content) with the FusoSecretome (Additional file 7: Tables S13–S15 and Figures S2–S8). However, we observed a moderate overlap in terms of inferred interactors, enriched functions, and preferentially targeted network modules (Additional file 7: Tables S16–S18), and a modest concordance in terms of network module perturbators (Additional file 7: Table S19).
On the one hand, the results of these analyses suggest that actively invading Fusobacteria species share common mechanisms to interact with host cells; on the other hand, they are consistent with the fact that F. nucleatum is an unusually heterogeneous species at both the genotypic and phenotypic levels [8, 65, 67]. Finally, the commonalities between the FusoSecretome and the E. coli K-12 secreted proteins are not surprising, since previous work showed that E. coli K-12 carries cryptic genes coding for virulence factors, whose expression is activated by mutations in the histone-like protein HU, which convert this established commensal strain into an invasive species in intestinal cells.
Over the years, it has been shown that F. nucleatum can adhere to and invade human cells, triggering a pro-inflammatory response. Nevertheless, the current knowledge of the molecular players underlying the F. nucleatum–human cross-talk is still limited.
For this reason, we carried out a computational study to identify F. nucleatum putative secreted factors (FusoSecretome) that can interact with human proteins.
The originality of our study is manifold compared to previous work. First, we used secretion prediction to identify potential F. nucleatum proteins that can be present at the microbe–host interface. Second, we exploited both domain–domain and domain–motif templates to infer interactions with human proteins. Earlier works, including one on F. nucleatum, chiefly applied homology-based methods for interaction inference with host proteins (e.g., [70–73]). To our knowledge, domain–motif templates have so far been exploited only to infer or to resolve human–virus protein interaction networks [39, 74]. Indeed, SLiM mimicry is widespread among viruses [21, 75], but increasing evidence shows that it can be an effective subversion strategy in bacteria as well. Third, we performed a network-based analysis on the human interactome to identify the main candidate F. nucleatum virulence proteins and the sub-networks they likely perturb.
Our approach relies on two prediction steps: (i) the definition of the FusoSecretome based either on the presence of a signal peptide or on several protein features such as disorder content, and (ii) the detection of host mimicry elements involved in the interaction with the host. It could be argued that the SecretomeP algorithm may incorrectly predict some proteins as secreted because of their high disorder content. For instance, a previous study considered the secretion prediction of ribosomal proteins to be erroneous. We assigned 20 ribosomal proteins to the FusoSecretome. Although we cannot exclude a misprediction, ribosomal proteins can be secreted in some bacteria and be involved in host interaction [77, 78]. Furthermore, increasing evidence shows that ribosomal proteins are moonlighting proteins with extra-ribosomal functions, such as the E. coli ribosomal L2 protein, which moonlights by affecting the activity of replication proteins. Among the 337 inferred interactions between the 20 FusoSecretome ribosomal proteins and 183 human proteins, only a third of the latter belong to ribosomal protein families. Interestingly, only 3 of the 41 human interactors inferred for F. nucleatum L2 are ribosomal proteins, and we identified the L2 protein as a candidate virulence protein preferentially targeting Module 451. As this module is mainly involved in cell cycle and DNA repair, this result is consistent with the ability of L2 in E. coli to interfere with DNA processing factors and further reinforces the confidence in the secretome prediction. Moreover, we have here underlined the value of the proposed approach: the interactome provides, on the one hand, the proper biological context to filter out potential false positive inferred interactions and, on the other, pinpoints candidate proteins that can be involved in the F. nucleatum–host interface.
Concerning the host mimicry elements, SLiM detection is notorious for over-prediction, given the relatively short length and degeneracy of these motifs (i.e., few fixed amino acid positions). Our strategy to control for false positives was to consider only conserved SLiM occurrences in the FusoSecretome protein regions predicted as disordered. Indeed, the vast majority of known functional SLiMs fall in unstructured regions [54, 56] and show higher levels of conservation compared to neighboring sequences. Conversely, we might also have missed some “true” mimicry instances in the FusoSecretome by using overly stringent parameters for domain and SLiM identification, and our interaction inferences may well be incomplete due to the limited number of available interaction templates. However, the functional significance of the inferred interactions strengthens our confidence in the predictive approach. Indeed, the FusoSecretome shares similar features with known virulence proteins, highlighting its pathogenic potential. In addition, interactors are implicated in established biological processes and cellular districts of the host–pathogen interface and significantly overlap with known pathogen protein binders. Furthermore, more than 70% of interactors are expressed in either the saliva or intestinal tissues. This suggests that most of the inferred interactions can occur in known F. nucleatum niches in the human body. Finally, we found among the human interactors an over-representation of genes whose expression correlates with F. nucleatum in CRC patients as well as of IBD-related genes, which are mainly involved in immune- and infection-response pathways.
Moreover, we gained a broader view of the cellular functions that can be perturbed by the FusoSecretome by investigating the human interactome. Although our interactome contains some inherent functional biases typical of literature-based interaction networks (see Additional file 7), it better covers the interaction space of human secreted proteins, which are not easy to investigate using large-scale interaction screening methods such as yeast two-hybrid.
In agreement with previous experimental observations of host cell networks targeted by distinct pathogens, F. nucleatum targets hubs and bottlenecks in the human interactome [30, 33, 57]. Interestingly, the FusoSecretome tends to interact with multifunctional proteins. This can represent an effective strategy to interfere with distinct cellular pathways at the same time.
Among the network modules preferentially targeted by the FusoSecretome, we identified, besides the well-established functions related to host–pathogen interactions, several modules involved in chromatin modification and transcription regulation (Modules 246, 451, 571, and 625), and localized in compartments such as the perinuclear region of the cytoplasm (Modules 90, 138, and 615). Intriguingly, this is reminiscent of the fact that invading F. nucleatum strains localize in the perinuclear district of colorectal adenocarcinoma cells and that bacteria can tune the host-cell response by interfering, directly or indirectly, with chromatin organization and the regulation of gene expression.
We propose 26 FusoSecretome candidate virulence proteins as major network perturbators. They are the predominant interactors of preferentially targeted modules. Among the candidates, we identified the known virulence protein Fap2, which was recently shown to promote immune system evasion by interacting with the immunoreceptor TIGIT . Interestingly, Fap2 interacts specifically with Module 9, which is involved in immune response, thus suggesting novel potential binders mediating Fap2 subversion.
A recent report found that the abundance of F. nucleatum is associated with high microsatellite instability tumors and shorter survival. Notably, three preferentially targeted network modules (i.e., Modules 138, 216, and 371) show a significant upregulation in a stage associated with high microsatellite instability during CRC progression (stage II) [66, 85] and with poor prognosis [86, 87]. This suggests that these modules may be important for CRC progression and outcome and that the inferred interactions targeting these modules can mediate the cross-talk between F. nucleatum and the host in this particular subtype of CRC.
Overall, our functional and network-based analysis shows that the proposed interactions can occur in vivo and be biologically relevant for the F. nucleatum—human host dialog.
Over the last few years, many microbes have been identified as key players in chronic disease onset and progression. However, untangling these complex microbe–disease associations requires a lot of effort and time, especially in the case of emerging pathogens that are often difficult to manipulate genetically. By detecting the presence of host mimicry elements, we have inferred the protein interactions between the putative secretome of F. nucleatum and human proteins, and ultimately provided a list of candidate virulence proteins and their human interactors that can be experimentally exploited to test new hypotheses on the F. nucleatum–host cross-talk. Our computational strategy can be helpful in guiding and speeding up wet lab research in microbe–host interactions.
Protein sequence data
The reference proteomes of Fusobacterium nucleatum subsp. nucleatum strain ATCC 25586 (Proteome ID: UP000002521) and Homo sapiens (Proteome ID: UP000005640) were downloaded from the UniProtKB proteomes portal (April 2013). The protein sequences of known virulence factors from gram-negative bacteria were taken from the Virulence Factors DataBase (January 2014).
We identified putative secreted proteins among the F. nucleatum proteins by applying two algorithms: SignalP 4.1, which detects the presence of a signal peptide, and SecretomeP 2.0, which identifies non-classical secreted proteins (i.e., those whose secretion is not triggered by a signal peptide) using a set of protein features such as amino acid composition and intrinsic disorder content.
To evaluate the intrinsic disorder propensity of F. nucleatum proteins predicted as secreted, we used the stand-alone programs of the following algorithms: DISOPRED (version 2.0), IUPred (both long and short predictions), and DisEMBL (COILS and HOTLOOPS predictions, version 1.4). We compared the disorder propensity distribution of SignalP-predicted secreted proteins with that of non-secreted proteins using the Kolmogorov–Smirnov test (two-sided, alpha = 0.05).
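To illustrate the distribution comparison described above, the two-sample Kolmogorov–Smirnov statistic can be computed with the Python standard library alone. This is a minimal sketch and not the authors' code: the function names and the toy disorder fractions are hypothetical, and the p-value relies on the asymptotic Kolmogorov series, which is only approximate for small samples.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The gap between two step functions can only change at observed values.
    for value in a + b:
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_pvalue_asymptotic(d, n_a, n_b, terms=100):
    """Approximate two-sided p-value from the asymptotic Kolmogorov series."""
    lam = d * math.sqrt(n_a * n_b / (n_a + n_b))
    if lam < 0.3:
        # The series converges slowly here and the p-value is essentially 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Toy disorder fractions, made up for illustration only.
secreted = [0.31, 0.42, 0.18, 0.55, 0.47]
non_secreted = [0.12, 0.20, 0.15, 0.33, 0.09]
d = ks_statistic(secreted, non_secreted)
p = ks_pvalue_asymptotic(d, len(secreted), len(non_secreted))
print(f"D = {d:.2f}, p = {p:.3f}, significant at alpha 0.05: {p < 0.05}")
```

For real analyses, an established statistics package with an exact small-sample p-value would be the safer choice.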
Detection of functional domains
We ran the pfamscan program on F. nucleatum, H. sapiens, and virulence factor protein sequences to detect the presence of Pfam domains (release 26). We kept only Pfam-A matches with an E value < 10^-5.
Identification of short linear motifs
We used the SLiMSearch 2.0 tool from the SLiMSuite to identify occurrences of known short linear motifs from the ELM database (downloaded in May 2013) in the F. nucleatum proteome. To select putative mimicry motifs, we applied two SLiMSearch context filters: (i) the motif must lie in a disordered region (average motif disorder score > 0.2, calculated by IUPred) and (ii) it must be conserved in at least one putative ortholog detected in a database of 694 proteomes of commensal and pathogenic bacteria of Mammalia downloaded from UniProtKB (March 2014). Sequence alignments and conservation assessment were performed using the GOPHER program from the SLiMSuite with standard parameters.
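The two context filters can be expressed as a simple predicate over motif occurrences. The sketch below is illustrative only: the `MotifHit` record, its field names, and the example identifiers are hypothetical and do not reproduce the SLiMSearch output format. Only the 0.2 disorder cutoff and the "at least one ortholog" rule come from the text.

```python
from dataclasses import dataclass

DISORDER_CUTOFF = 0.2  # average IUPred score over the motif, as stated in the text


@dataclass(frozen=True)
class MotifHit:
    protein_id: str           # F. nucleatum protein carrying the motif (hypothetical id)
    elm_id: str               # ELM motif class identifier
    disorder_scores: tuple    # per-residue IUPred scores over the motif span
    conserved_orthologs: int  # putative orthologs in which the motif is conserved


def is_putative_mimicry_motif(hit):
    """Return True if the hit passes both the disorder and the conservation filter."""
    if not hit.disorder_scores:
        return False
    mean_disorder = sum(hit.disorder_scores) / len(hit.disorder_scores)
    return mean_disorder > DISORDER_CUTOFF and hit.conserved_orthologs >= 1


hits = [
    MotifHit("protein_1", "LIG_SH3_3", (0.35, 0.40, 0.28), 2),  # passes both filters
    MotifHit("protein_2", "LIG_SH3_3", (0.05, 0.10, 0.12), 5),  # ordered region
    MotifHit("protein_3", "LIG_SH3_3", (0.60, 0.70, 0.65), 0),  # not conserved
]
kept = [hit.protein_id for hit in hits if is_putative_mimicry_motif(hit)]
print(kept)
```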
Protein interaction inference
We built an interaction network between the F. nucleatum putative secretome and human proteins by using interaction templates from the 3did database, which stores 6290 high-resolution three-dimensional templates for domain–domain interactions, and the iELM resource [96, 97], which lists 578 high-confidence motif-mediated interfaces between 191 ELM motifs and 402 human proteins. Both datasets were downloaded in August 2013. The domain-based interaction inference works as follows: given a pair of known interacting domains A and B, if domain A is detected in the F. nucleatum protein a and domain B in the human protein b, then an interaction between a and b is inferred. The SLiM-mediated interaction inference is analogous: given a known ELM motif m interacting with the domain C in the human protein c, if the motif m occurs in the F. nucleatum protein a, then a is inferred to interact with c.
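The two inference rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: the function names, the input layout (plain dictionaries and tuples), and the identifiers in the example are hypothetical. Domain templates are treated as unordered pairs, which is an assumption about how the templates are used.

```python
from collections import defaultdict


def infer_domain_interactions(ddi_templates, fuso_domains, human_domains):
    """Infer (bacterial, human) protein pairs from domain-domain templates.

    ddi_templates: iterable of (domain_A, domain_B) pairs known to interact.
    fuso_domains / human_domains: dict mapping protein id -> set of domain ids.
    """
    human_by_domain = defaultdict(set)
    for protein, domains in human_domains.items():
        for domain in domains:
            human_by_domain[domain].add(protein)

    inferred = set()
    for dom_a, dom_b in ddi_templates:
        # Assumption: templates are unordered, so both orientations are tested.
        for fuso_dom, human_dom in ((dom_a, dom_b), (dom_b, dom_a)):
            for fuso_protein, domains in fuso_domains.items():
                if fuso_dom in domains:
                    for human_protein in human_by_domain.get(human_dom, ()):
                        inferred.add((fuso_protein, human_protein))
    return inferred


def infer_motif_interactions(motif_templates, fuso_motifs):
    """Infer pairs from motif-domain templates.

    motif_templates: iterable of (elm_motif, human_protein) pairs.
    fuso_motifs: dict mapping protein id -> set of ELM motifs it contains.
    """
    inferred = set()
    for motif, human_protein in motif_templates:
        for fuso_protein, motifs in fuso_motifs.items():
            if motif in motifs:
                inferred.add((fuso_protein, human_protein))
    return inferred


# Hypothetical toy inputs.
domain_pairs = infer_domain_interactions(
    [("DOM_A", "DOM_B")],
    {"fuso_a": {"DOM_A"}},
    {"human_b": {"DOM_B"}, "human_x": {"DOM_A"}},
)
motif_pairs = infer_motif_interactions(
    [("MOTIF_M", "human_c")],
    {"fuso_a": {"MOTIF_M"}, "fuso_z": set()},
)
print(sorted(domain_pairs | motif_pairs))
```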
Human proteins targeted by bacteria and viruses
We gathered a list of 3428 human proteins that were experimentally identified as interaction partners of proteins from three bacterial pathogens (Bacillus anthracis, Francisella tularensis, and Yersinia pestis) in a large-scale yeast two-hybrid screen. We downloaded interaction data with viruses for 4897 human proteins from the VirHostNet database.
Human expression data
RNA-seq expression data for 20,345 protein-coding genes in normal colorectal, salivary gland, and small intestine (i.e., jejunum and ileum) tissues were downloaded from the Human Protein Atlas (version 13), a compendium of gene and protein expression profiles in 32 tissues. We considered as expressed those protein-coding genes with an FPKM > 1, that is, 13,640 genes for colorectal, 13,742 for salivary gland, and 13,220 for small intestine tissue.
Functional enrichment analyses
We compiled several gut-related disease gene sets by gathering data from the literature and public repositories. Patient secretome profiles (2566 proteins) for colorectal tumor tissue samples were taken from a published tissue secretome study. We retrieved 152 colorectal cancer genes from the Network of Cancer Genes database (version 4.0). The list of human genes whose expression correlates with F. nucleatum abundance in colorectal cancer patients was kindly provided by Aleksandar Kostic (Broad Institute, USA). The compendium of 163 loci associated with inflammatory bowel diseases was taken from a large meta-analysis of Crohn’s disease and ulcerative colitis genome-wide association studies. The enrichment of these gut-related disease gene sets among inferred interactors was tested using a one-sided Fisher’s exact test.
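For over-representation, a one-sided Fisher's exact test is the upper tail of the hypergeometric distribution. The following sketch shows the calculation with the standard library. The function name and the toy counts are hypothetical, and the way the 2x2 table is built from the gene sets is an assumption rather than a description of the authors' scripts.

```python
from math import comb


def fisher_enrichment_pvalue(overlap, set_size, interactors, background):
    """One-sided (over-representation) Fisher's exact test.

    overlap: inferred interactors that belong to the gene set
    set_size: gene-set members present in the background
    interactors: inferred interactors present in the background
    background: total number of background genes
    Returns P(X >= overlap) under the hypergeometric distribution.
    """
    max_overlap = min(set_size, interactors)
    denominator = comb(background, interactors)
    tail = sum(
        comb(set_size, k) * comb(background - set_size, interactors - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


# Toy counts, made up for illustration: 3 of 3 interactors fall in a
# 3-gene set drawn from a 10-gene background.
p_value = fisher_enrichment_pvalue(overlap=3, set_size=3, interactors=3, background=10)
print(p_value)
```

Python integers have arbitrary precision, so the binomial coefficients stay exact even for a background of roughly 20,000 genes, although the computation becomes slower.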
We assessed the over-representation of cellular functions by performing an enrichment analysis on the list of inferred human interactors using the g:Profiler webserver (version: r1488_e83_eg30, build date: December 2015). We analyzed the following annotations: Biological Process and Cellular Component terms from the Gene Ontology, and biological pathways from KEGG and Reactome. Functional categories containing fewer than 5 or more than 500 genes were discarded.
We used two different reference backgrounds for these statistical analyses. The first background consists of the protein-coding genes in the human genome (i.e., 20,254 genes, UniProtKB, February 2013), whereas the second includes the 11,284 protein-coding genes for which we could infer an interaction based on the available domain–domain and motif–domain interaction templates. In both cases, P values were corrected for multiple testing with the Benjamini–Hochberg procedure, applying a significance threshold of 0.025.
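The Benjamini–Hochberg correction mentioned above can be written compactly. This is a generic sketch of the standard step-up procedure, not the g:Profiler implementation. The function names and the example P values are hypothetical, and calling a test significant when the adjusted value is at most 0.025 (rather than strictly below it) is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(pvalues, threshold=0.025):
    """Flag tests whose adjusted p-value is at or below the threshold."""
    return [q <= threshold for q in benjamini_hochberg(pvalues)]


raw = [0.01, 0.04, 0.03, 0.005]  # made-up P values
print(benjamini_hochberg(raw))
print(significant(raw))
```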
Human interactome building, network module detection and annotation
We used the human interactome that we assembled for previous studies [62, 66]. Briefly, protein interaction data were gathered from several databases (e.g., BioGRID, InnateDB, IntAct, MatrixDB, MINT, Reactome) through the PSICQUIC query interface and from large-scale interaction mapping experiments. We kept only likely direct (i.e., binary) interactions according to the experimental detection method and mapped protein identifiers to UniProtKB IDs. Given the redundancy among SwissProt and TrEMBL entries, protein sequences were clustered using the CD-HIT algorithm. SwissProt/TrEMBL pairs at 95% identity were considered as the same protein, and the interactions of the TrEMBL protein were assigned to the SwissProt protein. As a result, we obtained a human binary interactome containing 74,388 interactions between 12,865 proteins (February 2013).
We detected 855 network modules using the Overlapping Cluster Generator algorithm. Modules were functionally annotated by assessing the enrichment of Gene Ontology (GO) biological process and cellular component terms, and of cellular pathways from KEGG and Reactome. Enrichment P values were computed using the R package gProfileR and corrected for multiple testing with the Benjamini–Hochberg procedure (significance threshold = 0.025); annotated proteins in the human interactome were used as the statistical background. Similarly, the over-representation of inferred human interactors and gut disease gene sets in network modules of the human interactome was assessed using a one-sided Fisher’s exact test followed by Benjamini–Hochberg multiple testing correction (significance threshold = 0.025).
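The redundancy-removal step, in which interactions of a TrEMBL entry are reassigned to its SwissProt counterpart, can be sketched as a remapping of interaction endpoints. The function name, the mapping dictionary (which would have to be parsed from the clustering output), and the placeholder accessions are all hypothetical. Whether self-interactions created by the merge are kept is not stated in the text; this sketch keeps them.

```python
def merge_redundant_entries(interactions, trembl_to_swissprot):
    """Reassign interactions of TrEMBL entries to their SwissProt representative.

    interactions: iterable of (protein_a, protein_b) accession pairs.
    trembl_to_swissprot: dict mapping a TrEMBL accession to the SwissProt
        accession it was clustered with.
    Returns a set of unordered pairs, so duplicates collapse automatically.
    """
    merged = set()
    for a, b in interactions:
        a = trembl_to_swissprot.get(a, a)
        b = trembl_to_swissprot.get(b, b)
        merged.add(frozenset((a, b)))
    return merged


# Placeholder accessions, not real UniProtKB identifiers.
raw_interactions = [("TR_1", "SP_2"), ("SP_1", "SP_2"), ("SP_2", "SP_1"), ("SP_3", "TR_9")]
mapping = {"TR_1": "SP_1"}
binary_interactome = merge_redundant_entries(raw_interactions, mapping)
print(len(binary_interactome))
```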
Network module perturbation Z score
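The perturbation Z score of a FusoSecretome protein $f$ in a network module $m$ is defined as

$$Z_{f,m} = \frac{x_{f,m} - \mu_m}{\sigma_m}$$

where $x_{f,m}$ is the number of inferred interactions of the protein $f$ with module $m$, $Z_{f,m}$ is the perturbation Z score of the protein $f$ in the module $m$, and $\mu_m$ and $\sigma_m$ are the mean and the standard deviation, respectively, of the inferred interaction values in the module $m$.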
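Reading the definitions in this section as a standard Z score, i.e. the interaction count minus the module mean, divided by the module standard deviation, the calculation for one module can be sketched as follows. The function name and the toy counts are hypothetical. Using the population standard deviation and averaging over all proteins supplied for the module are assumptions, since the text does not specify either choice.

```python
from statistics import mean, pstdev


def perturbation_z_scores(interaction_counts):
    """Z score of each protein within one network module.

    interaction_counts: dict mapping protein id -> number of inferred
    interactions with the module.
    """
    values = list(interaction_counts.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; sample SD is equally plausible
    if sigma == 0:
        # All proteins target the module equally, so no protein stands out.
        return {protein: 0.0 for protein in interaction_counts}
    return {protein: (x - mu) / sigma for protein, x in interaction_counts.items()}


# Made-up interaction counts for three proteins in a single module.
z_scores = perturbation_z_scores({"protein_a": 1, "protein_b": 2, "protein_c": 3})
print(z_scores)
```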
Network modules significantly dysregulated during CRC progression
The 77 network modules showing a significant dysregulation during CRC progression were taken from our previous work , in which we devised a computational method that combines quantitative proteomic profiling of TCGA CRC samples, protein interaction network, and statistical analysis to identify significantly dysregulated cellular functions during cancer progression.
The authors would like to thank the members of the TAGC laboratory for fruitful discussion, Anaïs Baudot (I2M, CNRS, France) for critically reading the first draft of the manuscript, Aleksandar Kostic (Broad Institute, USA) for kindly providing the list of human genes whose expression correlates with F. nucleatum abundance in colorectal cancer patients, and Henrik Nielsen (DTU Bioinformatics, Denmark) for assistance in running SecretomeP predictions. AZ is grateful to Coralie, Olivia, and Claire for their constant support.
The project leading to this publication has received funding from Excellence Initiative of Aix-Marseille University—A*MIDEX, a French “Investissements d’Avenir” program, to CB, and was partially supported by the French ‘Plan Cancer 2009–2013’ program (Systems Biology call, A12171AS). The funding organizations had no role in the design of the study and collection, analysis, interpretation of data, and in writing the manuscript.
Availability of data and materials
All data generated or analyzed on the ATCC 25586 strain are included in this published article (and its Additional files). All other data are available from the corresponding author on reasonable request.
AZ conceived the study, designed and performed the experiments, analyzed the data, and wrote the manuscript. LS performed the experiments and analyzed the data. SB performed the experiments. CB designed the experiments, analyzed the data, and wrote the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
This study is based on publicly available datasets only. Thus, neither ethical approval nor consent from participants is applicable.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Moore WE, Moore LV. The bacteria of periodontal diseases. Periodontol. 1994;5:66–77.
- Han YW. Fusobacterium nucleatum: a commensal-turned pathogen. Curr Opin Microbiol. 2015;23C:141–7.
- Dharmani P, Strauss J, Ambrose C, Allen-Vercoe E, Chadee K. Fusobacterium nucleatum infection of colonic cells stimulates MUC2 mucin and tumor necrosis factor alpha. Infect Immun. 2011;79:2597–607.
- Fardini Y, Wang X, Témoin S, Nithianantham S, Lee D, Shoham M, et al. Fusobacterium nucleatum adhesin FadA binds vascular endothelial cadherin and alters endothelial integrity. Mol Microbiol. 2011;82:1468–80.
- Gursoy UK, Könönen E, Uitto V-J. Intracellular replication of fusobacteria requires new actin filament formation of epithelial cells. APMIS Acta Pathol Microbiol Immunol Scand. 2008;116:1063–70.
- Han YW, Shi W, Huang GT, Kinder Haake S, Park NH, Kuramitsu H, et al. Interactions between periodontal bacteria and human oral epithelial cells: Fusobacterium nucleatum adheres to and invades epithelial cells. Infect Immun. 2000;68:3140–6.
- Quah SY, Bergenholtz G, Tan KS. Fusobacterium nucleatum induces cytokine production through Toll-like-receptor-independent mechanism. Int Endod J. 2014;47:550–9.
- Strauss J, Kaplan GG, Beck PL, Rioux K, Panaccione R, Devinney R, et al. Invasive potential of gut mucosa-derived Fusobacterium nucleatum positively correlates with IBD status of the host. Inflamm Bowel Dis. 2011;17:1971–8.
- Castellarin M, Warren RL, Freeman JD, Dreolini L, Krzywinski M, Strauss J, et al. Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma. Genome Res. 2012;22:299–306.
- Kostic AD, Gevers D, Pedamallu CS, Michaud M, Duke F, Earl AM, et al. Genomic analysis identifies association of Fusobacterium with colorectal carcinoma. Genome Res. 2012;22:292–8.
- McCoy AN, Araújo-Pérez F, Azcárate-Peril A, Yeh JJ, Sandler RS, Keku TO. Fusobacterium is associated with colorectal adenomas. PLoS One. 2013;8:e53653.
- Gevers D, Kugathasan S, Denson LA, Vázquez-Baeza Y, Van Treuren W, Ren B, et al. The treatment-naive microbiome in new-onset Crohn’s disease. Cell Host Microbe. 2014;15:382–92.
- Kostic AD, Chun E, Robertson L, Glickman JN, Gallini CA, Michaud M, et al. Fusobacterium nucleatum potentiates intestinal tumorigenesis and modulates the tumor-immune microenvironment. Cell Host Microbe. 2013;14:207–15.
- Mima K, Nishihara R, Qian ZR, Cao Y, Sukawa Y, Nowak JA, et al. Fusobacterium nucleatum in colorectal carcinoma tissue and patient prognosis. Gut. 2016;65(12):1973–1980.
- Mima K, Sukawa Y, Nishihara R, Qian ZR, Yamauchi M, Inamura K, et al. Fusobacterium nucleatum and T cells in colorectal carcinoma. JAMA Oncol. 2015;1:653–61.
- Rubinstein MR, Wang X, Liu W, Hao Y, Cai G, Han YW. Fusobacterium nucleatum promotes colorectal carcinogenesis by modulating E-cadherin/β-catenin signaling via its FadA adhesin. Cell Host Microbe. 2013;14:195–206.
- Han YW, Ikegami A, Rajanna C, Kawsar HI, Zhou Y, Li M, et al. Identification and characterization of a novel adhesin unique to oral fusobacteria. J Bacteriol. 2005;187:5330–40.
- Kaplan CW, Ma X, Paranjpe A, Jewett A, Lux R, Kinder-Haake S, et al. Fusobacterium nucleatum outer membrane proteins Fap2 and RadD induce cell death in human lymphocytes. Infect Immun. 2010;78:4773–8.
- Gur C, Ibrahim Y, Isaacson B, Yamin R, Abed J, Gamliel M, et al. Binding of the Fap2 protein of Fusobacterium nucleatum to human inhibitory receptor TIGIT protects tumors from immune cell attack. Immunity. 2015;42:344–55.
- Elde NC, Malik HS. The evolutionary conundrum of pathogen mimicry. Nat Rev Microbiol. 2009;7:787–97.
- Davey NE, Travé G, Gibson TJ. How viruses hijack cell regulation. Trends Biochem Sci. 2011;36:159–69.
- Via A, Uyar B, Brun C, Zanzoni A. How pathogens use linear motifs to perturb host cell networks. Trends Biochem Sci. 2015;40:36–48.
- Ludin P, Nilsson D, Mäser P. Genome-wide identification of molecular mimicry candidates in parasites. PLoS One. 2011;6:e17546.
- Baxt LA, Garza-Mayers AC, Goldberg MB. Bacterial subversion of host innate immune pathways. Science. 2013;340:697–701.
- Rudel T, Kepp O, Kozjak-Pavlovic V. Interactions between bacterial pathogens and mitochondrial cell death pathways. Nat Rev Microbiol. 2010;8:693–705.
- Haglund CM, Welch MD. Pathogens and polymers: microbe-host interactions illuminate the cytoskeleton. J Cell Biol. 2011;195:7–17.
- Uetz P, Dong Y-A, Zeretzke C, Atzler C, Baiker A, Berger B, et al. Herpesviral protein networks and their interaction with the human proteome. Science. 2006;311:239–42.
- de Chassey B, Navratil V, Tafforeau L, Hiet MS, Aublin-Gex A, Agaugué S, et al. Hepatitis C virus infection protein network. Mol Syst Biol. 2008;4:230.
- Jäger S, Cimermancic P, Gulbahce N, Johnson JR, McGovern KE, Clarke SC, et al. Global landscape of HIV-human protein complexes. Nature. 2012;481:365–70.
- Dyer MD, Neff C, Dufford M, Rivera CG, Shattuck D, Bassaganya-Riera J, et al. The human-bacterial pathogen protein interaction networks of Bacillus anthracis, Francisella tularensis, and Yersinia pestis. PLoS One. 2010;5:e12089.
- Blasche S, Arens S, Ceol A, Siszler G, Schmidt MA, Häuser R, et al. The EHEC-host interactome reveals novel targets for the translocated intimin receptor. Sci Rep. 2014;4:7531.
- Mirrashidi KM, Elwell CA, Verschueren E, Johnson JR, Frando A, Von Dollen J, et al. Global mapping of the inc-human interactome reveals that retromer restricts chlamydia infection. Cell Host Microbe. 2015;18:109–21.
- Weßling R, Epple P, Altmann S, He Y, Yang L, Henz SR, et al. Convergent targeting of a common host protein-network by pathogen effectors from three kingdoms of life. Cell Host Microbe. 2014;16:364–75.
- Ahn H-J, Kim S, Kim H-E, Nam H-W. Interactions between secreted GRA proteins and host cell proteins across the paratitophorous vacuolar membrane in the parasitism of Toxoplasma gondii. Korean J Parasitol. 2006;44:303–12.
- Wu H-J, Wang AH-J, Jennings MP. Discovery of virulence factors of pathogenic bacteria. Curr Opin Chem Biol. 2008;12:93–101.
- Vidal M, Cusick ME, Barabási A-L. Interactome networks and human disease. Cell. 2011;144:986–98.
- McDermott JE, Corrigan A, Peterson E, Oehmen C, Niemann G, Cambronne ED, et al. Computational prediction of type III and IV secreted effectors in gram-negative bacteria. Infect Immun. 2011;79:23–32.
- Wang Y, Wei X, Bao H, Liu S-L. Prediction of bacterial type IV secreted effectors by C-terminal features. BMC Genomics. 2014;15:50.
- Garamszegi S, Franzosa EA, Xia Y. Signatures of pleiotropy, economy and convergent evolution in a domain-resolved map of human-virus protein-protein interaction networks. PLoS Pathog. 2013;9:e1003778.
- Ruhanen H, Hurley D, Ghosh A, O’Brien KT, Johnston CR, Shields DC. Potential of known and short prokaryotic protein motifs as a basis for novel peptide-based antibacterial therapeutics: a computational survey. Front Microbiol. 2014;5:4.
- Arnold R, Boonen K, Sun MGF, Kim PM. Computational analysis of interactomes: current and future perspectives for bioinformatics approaches to model the host-pathogen interaction space. Methods. 2012;57:508–18.
- Kapatral V, Anderson I, Ivanova N, Reznik G, Los T, Lykidis A, et al. Genome sequence and analysis of the oral bacterium Fusobacterium nucleatum strain ATCC 25586. J Bacteriol. 2002;184:2005–18.
- Desvaux M, Khan A, Beatson SA, Scott-Tucker A, Henderson IR. Protein secretion systems in Fusobacterium nucleatum: genomic identification of Type 4 piliation and complete type V pathways brings new insight into mechanisms of pathogenesis. Biochim Biophys Acta. 2005;1713:92–112.
- Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods. 2011;8:785–6.
- Bendtsen JD, Kiemer L, Fausbøll A, Brunak S. Non-classical protein secretion in bacteria. BMC Microbiol. 2005;5:58.
- Kaplan A, Kaplan CW, He X, McHardy I, Shi W, Lux R. Characterization of aid1, a novel gene involved in Fusobacterium nucleatum interspecies interactions. Microb Ecol. 2014;68:379–87.
- Marín M, Uversky VN, Ott T. Intrinsic disorder in pathogen effectors: protein flexibility as an evolutionary hallmark in a molecular arms race. Plant Cell. 2013;25:3153–7.
- Xue B, Blocquel D, Habchi J, Uversky AV, Kurgan L, Uversky VN, et al. Structural disorder in viral proteins. Chem Rev. 2014;114:6880–911.
- Chen L, Xiong Z, Sun L, Yang J, Jin Q. VFDB 2012 update: toward the genetic diversity and molecular evolution of bacterial virulence factors. Nucleic Acids Res. 2012;40:D641–5.
- Dean P. Functional domains and motifs of bacterial type III effector proteins and their roles in infection. FEMS Microbiol Rev. 2011;35:1100–25.
- Davey NE, Van Roey K, Weatheritt RJ, Toedt G, Uyar B, Altenberg B, et al. Attributes of short linear motifs. Mol BioSyst. 2012;8:268–81.
- Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, et al. The Pfam protein families database. Nucleic Acids Res. 2012;40:D290–301.
- Dinkel H, Michael S, Weatheritt RJ, Davey NE, Van Roey K, Altenberg B, et al. ELM–the database of eukaryotic linear motifs. Nucleic Acids Res. 2012;40:D242–51.
- Edwards RJ, Palopoli N. Computational prediction of short linear motifs from protein sequences. Methods Mol Biol. 2015;1268:89–141.
- Fuxreiter M, Tompa P, Simon I. Local structural disorder imparts plasticity on linear motifs. Bioinformatics. 2007;23:950–6.
- Edwards RJ, Davey NE, O’Brien K, Shields DC. Interactome-wide prediction of short, disordered protein interaction motifs in humans. Mol BioSyst. 2012;8:282–95.
- Mukhtar MS, Carvunis A-R, Dreze M, Epple P, Steinbrenner J, Moore J, et al. Independently evolved virulence effectors converge onto hubs in a plant immune system network. Science. 2011;333:596–601.
- Durmuş Tekir S, Cakir T, Ulgen KÖ. Infection strategies of bacterial and viral pathogens through pathogen-human protein-protein interactions. Front Microbiol. 2012;3:46.
- Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 2012;40:D109–14.
- Croft D, O’Kelly G, Wu G, Haw R, Gillespie M, Matthews L, et al. Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res. 2011;39:D691–7.
- Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology. Nature. 1999;402:C47–52.
- Chapple CE, Robisson B, Spinelli L, Guien C, Becker E, Brun C. Extreme multifunctional proteins identified from a human protein interaction network. Nat Commun. 2015;6:7412.
- Becker E, Robisson B, Chapple CE, Guénoche A, Brun C. Multifunctional proteins revealed by overlapping clustering in protein interaction network. Bioinformatics. 2012;28:84–90.
- Chapple CE, Brun C. Redefining protein moonlighting. Oncotarget. 2015;6:16812–3.
- Manson McGuire A, Cochrane K, Griggs AD, Haas BJ, Abeel T, Zeng Q, et al. Evolution of invasion in a diverse set of Fusobacterium species. mBio. 2014;5:e01864.
- Zanzoni A, Brun C. Integration of quantitative proteomics data and interaction networks: Identification of dysregulated cellular functions during cancer progression. Methods. 2016;93:103–9.
- Strauss J, White A, Ambrose C, McDonald J, Allen-Vercoe E. Phenotypic and genotypic analyses of clinical Fusobacterium nucleatum and Fusobacterium periodonticum isolates from the human gut. Anaerobe. 2008;14:301–9.
- Kar S, Edgar R, Adhya S. Nucleoid remodeling by an altered HU protein: Reorganization of the transcription program. Proc Natl Acad Sci U S A. 2005;102:16397–402.
- Koli P, Sudan S, Fitzgerald D, Adhya S, Kar S. Conversion of commensal Escherichia coli K-12 to an invasive form via expression of a mutant histone-like protein. MBio. 2011;2(5). doi:https://doi.org/10.1128/mBio.00182-11.
- Tyagi N, Krishnadev O, Srinivasan N. Prediction of protein-protein interactions between Helicobacter pylori and a human host. Mol BioSyst. 2009;5:1630–5.
- Doolittle JM, Gomez SM. Mapping protein interactions between Dengue virus and its human and insect hosts. PLoS Negl Trop Dis. 2011;5:e954.
- Schleker S, Garcia-Garcia J, Klein-Seetharaman J, Oliva B. Prediction and comparison of Salmonella-human and Salmonella-Arabidopsis interactomes. Chem Biodivers. 2012;9:991–1018.
- Kumar A, Thotakura PL, Tiwary BK, Krishna R. Target identification in Fusobacterium nucleatum by subtractive genomics approach and enrichment analysis of host-pathogen protein-protein interactions. BMC Microbiol. 2016;16:84.
- Evans P, Dampier W, Ungar L, Tozeren A. Prediction of HIV-1 virus-host protein interactions using virus and host sequence motifs. BMC Med Genomics. 2009;2:27.
- Hagai T, Azia A, Babu MM, Andino R. Use of host-like peptide motifs in viral proteins is a prevalent strategy in host-virus interactions. Cell Rep. 2014;7:1729–39.
- Christie-Oleza JA, Piña-Villalonga JM, Bosch R, Nogales B, Armengaud J. Comparative proteogenomics of twelve Roseobacter exoproteomes reveals different adaptive strategies among these marine bacteria. Mol Cell Proteomics. 2012;11:M111.013110.
- Tjalsma H, Lambooy L, Hermans PW, Swinkels DW. Shedding & shaving: disclosure of proteomic expressions on a bacterial face. Proteomics. 2008;8:1415–28.
- Pérez-Cruz C, Delgado L, López-Iglesias C, Mercade E. Outer-inner membrane vesicles naturally secreted by gram-negative pathogenic bacteria. PLoS One. 2015;10:e0116896.
- Chodavarapu S, Felczak MM, Kaguni JM. Two forms of ribosomal protein L2 of Escherichia coli that inhibit DnaA in DNA replication. Nucleic Acids Res. 2011;39:4180–91.
- Jostins L, Ripke S, Weersma RK, Duerr RH, McGovern DP, Hui KY, et al. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature. 2012;491:119–24.
- Cusick ME, Yu H, Smolyar A, Venkatesan K, Carvunis A-R, Simonis N, et al. Literature-curated protein interaction datasets. Nat Methods. 2009;6:39–46.
- Koegl M, Uetz P. Improving yeast two-hybrid screening systems. Brief Funct Genomic Proteomic. 2007;6:302–12.
- Navratil V, de Chassey B, Combe CR, Lotteau V. When the human viral infectome and diseasome networks collide: towards a systems biology platform for the aetiology of human diseases. BMC Syst Biol. 2011;5:13.
- Bierne H, Hamon M, Cossart P. Epigenetics and bacterial infections. Cold Spring Harb Perspect Med. 2012;2:a010272.
- Zhang B, Wang J, Wang X, Zhu J, Liu Q, Shi Z, et al. Proteogenomic characterization of human colon and rectal cancer. Nature. 2014;513:382–7.
- Sadanandam A, Lyssiotis CA, Homicsko K, Collisson EA, Gibb WJ, Wullschleger S, et al. A colorectal cancer classification system that associates cellular phenotype and responses to therapy. Nat Med. 2013;19:619–25.
- De Sousa E Melo F, Wang X, Jansen M, Fessler E, Trinh A, de Rooij LPMH, et al. Poor-prognosis colon cancer is defined by a molecularly distinct subtype and develops from serrated precursor lesions. Nat Med. 2013;19:614–8.
- UniProt Consortium. Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res. 2012;40:D71–5.
- Ward JJ, McGuffin LJ, Bryson K, Buxton BF, Jones DT. The DISOPRED server for the prediction of protein disorder. Bioinformatics. 2004;20:2138–9.
- Dosztányi Z, Csizmok V, Tompa P, Simon I. IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics. 2005;21:3433–4.
- Linding R, Jensen LJ, Diella F, Bork P, Gibson TJ, Russell RB. Protein disorder prediction: implications for structural proteomics. Structure. 2003;11:1453–9.
- Li W, Cowley A, Uludag M, Gur T, McWilliam H, Squizzato S, et al. The EMBL-EBI bioinformatics web and programmatic tools framework. Nucleic Acids Res. 2015;43:W580–4.
- Davey NE, Haslam NJ, Shields DC, Edwards RJ. SLiMSearch 2.0: biological context for short linear motifs in proteins. Nucleic Acids Res. 2011;39:W56–60.
- Davey NE, Shields DC, Edwards RJ. SLiMDisc: short, linear motif discovery, correcting for common evolutionary descent. Nucleic Acids Res. 2006;34:3546–54.
- Stein A, Céol A, Aloy P. 3did: identification and classification of domain-based interactions of known three-dimensional structure. Nucleic Acids Res. 2011;39:D718–23.
- Weatheritt RJ, Luck K, Petsalaki E, Davey NE, Gibson TJ. The identification of short linear motif-mediated interfaces within the human interactome. Bioinformatics. 2012;28:976–82.
- Weatheritt RJ, Jehl P, Dinkel H, Gibson TJ. iELM—a web server to explore short linear motif-mediated interactions. Nucleic Acids Res. 2012;40:W364–9.
- Navratil V, de Chassey B, Meyniel L, Delmotte S, Gautier C, André P, et al. VirHostNet: a knowledge base for the management and the analysis of proteome-wide virus-host interaction networks. Nucleic Acids Res. 2009;37:D661–8.
- Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Tissue-based map of the human proteome. Science. 2015;347:1260419.
- de Wit M, Kant H, Piersma SR, Pham TV, Mongera S, van Berkel MPA, et al. Colorectal cancer candidate biomarkers identified by tissue secretome proteome profiling. J Proteomics. 2014;99:26–39.
- An O, Pendino V, D’Antonio M, Ratti E, Gentilini M, Ciccarelli FD. NCG 4.0: the network of cancer genes in the era of massive mutational screenings of cancer genomes. Database J Biol Databases Curation. 2014;2014:bau015.
- Reimand J, Arak T, Adler P, Kolberg L, Reisberg S, Peterson H, Vilo J. g:Profiler-a web server for functional interpretation of gene lists (2016 update). Nucleic Acids Res. 2016;44(W1):W83–9.
- The Gene Ontology Consortium. The gene ontology in 2010: extensions and refinements. Nucleic Acids Res. 2010;38:D331–5.
- Aranda B, Blankenburg H, Kerrien S, Brinkman FSL, Ceol A, Chautard E, et al. PSICQUIC and PSISCORE: accessing and scoring molecular interactions. Nat Methods. 2011;8:528–9.
- Rolland T, Taşan M, Charloteaux B, Pevzner SJ, Zhong Q, Sahni N, et al. A proteome-scale map of the human interactome network. Cell. 2014;159:1212–26.
- Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437:1173–8.
- Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012;28:3150–2.