One method for rejuvenating land polluted with anthropogenic contaminants is through phytoremediation, the reclamation of land through the cultivation of specific crops. The capacity for phytoremediation crops, such as Salix spp., to tolerate and even flourish in contaminated soils relies on a highly complex and predominantly cryptic interacting community of microbial life.
Here, Illumina HiSeq 2500 sequencing and de novo transcriptome assembly were used to observe gene expression in washed Salix purpurea cv. ‘Fish Creek’ roots from trees pot grown in petroleum hydrocarbon-contaminated or non-contaminated soil. All 189,849 assembled contigs were annotated without a priori assumption as to sequence origin and differential expression was assessed.
The 839 contigs differentially expressed (DE) and annotated from S. purpurea revealed substantial increases in transcripts encoding abiotic stress response equipment, such as glutathione S-transferases, in roots of contaminated trees as well as the hallmarks of fungal interaction, such as SWEET2 (Sugars Will Eventually Be Exported Transporter). A total of 8252 DE transcripts were fungal in origin, with contamination conditions resulting in a community shift from Ascomycota to Basidiomycota genera. In response to contamination, 1745 Basidiomycota transcripts increased in abundance (the majority uniquely expressed in contaminated soil) including major monosaccharide transporter MST1, primary cell wall and lamella CAZy enzymes, and an ectomycorrhiza-upregulated exo-β-1,3-glucanase (GH5). Additionally, 639 DE polycistronic transcripts from an uncharacterised Enterobacteriaceae species were uniformly in higher abundance in contamination conditions and comprised a wide spectrum of genes cryptic under laboratory conditions but considered putatively involved in eukaryotic interaction, biofilm formation and dioxygenase hydrocarbon degradation.
Fungal gene expression, representing the majority of contigs assembled, suggests out-competition of white rot Ascomycota genera (dominated by Pyronema), a sometimes ectomycorrhizal (ECM) Ascomycota (Tuber) and ECM Basidiomycota (Hebeloma) by a poorly characterised putative ECM Basidiomycota due to contamination. Root and fungal expression involved transcripts encoding carbohydrate/amino acid (C/N) dialogue whereas bacterial gene expression included the apparatus necessary for biofilm interaction and direct reduction of contamination stress, a potential bacterial currency for a role in tripartite mutualism. Unmistakable within the metatranscriptome is the degree to which the landscape of rhizospheric biology, particularly the important but predominantly uncharacterised fungal genetics, is yet to be discovered.
The observation of gene expression across multiple interacting organisms has the potential to better reflect the complex reality of biology than the observation of organisms in isolation . By separating the assembly of RNA sequence data from annotation (identification) of assembled contigs, de novo metatranscriptome assembly allows for such observation without a prerequisite for, and therefore bias from, reference genome sequences from organisms expected to be present within any biological system [2, 3]. A metatranscriptomic approach designed without constraint to any a priori defined organism, but open to annotation from any sequenced strata of life, should be powerful in biological systems already recognised as highly complex, such as the human digestive tract or rhizosphere microbiome (although such microbiome complexity could arguably be defined by the current extent of study in a biological field). Here, the rhizospheric microbiome of Salix purpurea cv. ‘Fish Creek’ was challenged using hydrocarbon-contaminated soil and differential gene expression observed.
Pervasive organic pollutants, such as polycyclic aromatic hydrocarbons (PAHs), polychlorinated biphenyls (PCBs) and C10-C50 petroleum hydrocarbons, represent serious risk to human health and the environment . There are thought to be greater than 400,000 contaminated sites across North America [5, 6] and estimates are as high as 2.5 million sites across the EU . Currently, rehabilitation of such sites is constrained by the high costs (> 2 M$/ha, ) of standard restoration strategies such as excavation and transport (dig-and-dump), with the consequence that these sites are rarely restored. In recent decades, a consensus has grown that environmentally sustainable and economically viable land restoration methods should be developed; phytoremediation is one such green technology alternative [9, 10]. Phytoremediation relies on the interaction between plants and their associated microorganisms to absorb, immobilise, volatilise, degrade, translocate or transform organic and inorganic contaminants . In trials across Canada, a wide range of fast growing short rotation coppice willow cultivars (Salix spp.) have been shown as highly tolerant to PAH, PCBs and other organic petroleum hydrocarbon contaminants , as well as inorganic contaminants . While the societal, environmental and economic benefits of rehabilitating these sites can be extensive (estimated within Canadian metropolitan areas at 4.6–7 billion dollars annually ), an additional benefit of the use of crops such as willow is that the biomass yielded per hectare can also be utilised for several valuable end-uses, including lignocellulosic biofuels , renewable electricity and heat generation , as well as green phytochemical production. Cultivation of biomass is often seen as the most substantial hurdle for economically feasible bioenergy production [17,18,19]. By aligning feedstock production strategy with land decontamination, cultivation can serve as a positive value-stream, in terms of the financial profit as well as a clear local environmental benefit .
To achieve effective success as a phytoremediation crop, willow is thought to maintain and exploit intimate symbiotic relationships with fungi. Roughly 6000 species of fungi from Glomeromycota, Ascomycota, and Basidiomycota have been categorised to date as mycorrhizal . Recent research has explored the symbiotic interaction of arbuscular mycorrhizal fungi (AMF, endomycorrhizal fungi currently exclusively categorised within Glomeromycota) with willow [13, 21, 22] although trees and shrubs such as willow are characteristically known for interacting with ectomycorrhizal (ECM) fungi when mature [20, 23]. Such interactions are thought to be predicated on the exchange of nutrients from the fungi to the plant, in particular phosphate and nitrogen attained by the fungi from soils, and a highly controlled amount of sugars exchanged from the plant to fungi. However, the comprehensive identification of fungi from an extra-laboratory environment, let alone delineating their roles in a complex biological system, is confounded by culturing difficulty and that, of the estimated 1.5 million fungal species, less than 600 have been currently (2016) sequenced and annotated (JGI MycoCosm ).
The peril of confounding bacterial community assessment by culturing methodology alone is also widely acknowledged, but the progression of contemporary sequencing techniques and extensive research advancement driven by fields relating to the human microbiome has led to an understanding of the ubiquitous, exceedingly diverse, presence and involvement of bacteria in eukaryotic biology. The importance of bacteria to mycorrhizal fungi and/or plant health has been established even in conditions less challenging than anthropogenic soil contamination [25,26,27,28]. Bacteria whose presence and function is beneficial to mycorrhizal fungi and/or plants are often termed mycorrhizal helper bacteria MHB [29, 30] or plant growth-promoting bacteria (PGPB), such as the recently identified Enterobacter sp. 638, recognised as improving poplar growth on challenging marginal soils by up to 40% . In relation to the rhizosphere, the highly complex environment has been very well reviewed by Bonfante et al.  as comprising ‘tripartite’ interactions between plants, mycorrhizal fungi and bacteria. The distinction between the potential roles of bacteria within the rhizosphere, in terms of the level of host interaction (rhizospheric, extracellular interacting or intracellular) and the spectrum of interaction type (pathogeny, symbiosis or commensalism), is problematic; however, metagenomics and metatranscriptomics can help to unravel this complexity by allowing gene function to be observed.
Here, all the RNA assembled de novo from roots of 12 willow trees, pot-grown in either contaminated or non-contaminated soil from a former petroleum refinery, was annotated and differential gene expression from any and all organisms identified was explored to see if the functionality of a successful phytoremediation system can be elucidated.
Contamination composition, experimental design and sampling
Both contaminated and non-contaminated soils were gathered from the site of a former petrochemical refinery at Varennes, Canada. Contaminated soil had an average C10-C50 petroleum hydrocarbon concentration of 912 mg kg− 1 (non-contaminated soil was below detection limit: < 100 mg kg− 1). Salix purpurea cv. ‘Fish Creek’ cuttings were established for 8 weeks in conventional potting media before being transferred to 20-l pots containing treatment soil within a larger experiment consisting of six blocks where randomised contamination effect was investigated (further soil and experimental information available from Yergeau et al. ). Growth conditions were 16 h 20 °C day and 8 h 18 °C night with excess watering and individual plant pot saucers to reduce leeching. After 6 months of growth, roots were harvested from six replicate trees per growth condition; soil was removed manually and roots samples were flash frozen in liquid nitrogen within 5 min of the initial perturbation.
RNA extraction and Illumina sequencing
A modified CTAB protocol [32, 33] was used to extract RNA from roots with quality and quantity assessed using a BioAnalyser (Agilent, Mississauga, ON, Canada). Roots from 12 trees were sequenced: 6 from trees cultivated in contaminated soil and 6 from non-contaminated soil. Polyadenylated mRNA was amplified using Ambion’s MessageAmp™ II aRNA Amplification Kit. Amplified RNA was tested for genomic DNA content by PCR, using 18S rRNA gene primers and conditions described in Stewart et al. . Indexing of cDNA samples for sequencing was performed in accordance with Meyer and Kircher . The samples were sequenced (four separate runs) using an Illumina HiSeq 2500 platform.
De novo metatranscriptome assembly and differential expression
Trimmomatic  was used to trim nucleotides of poor quality and reads < 40 bp were removed. Reads from all 12 biological samples were assembled into a de novo transcriptome using Trinity software set to default parameters . Assembled contigs shorter than 200 bp were discarded. Bowtie2 [37, 38] was used to align the RNA-seq reads back to the de novo transcriptome with -a -X 600 parameters on top of default parameters. Low count contigs, below 29 total counts across all libraries, were filtered out (corresponding to those contigs below the 80th percentile ). Read back-mapping rates were an average of 64.91% across all 12 root samples. Raw and normalised transcript abundance (tpm) was calculated using eXpress  with default parameters. Differential expression (DE) was estimated using EBSeq based on median normalisation  and on an empirical Bayes model framework [42, 43], keeping contigs with a posterior probability of differential expression (PPDE) ≥ 0.95 (with target false discovery rate controlled at 5%). EdgeR  was used to generate MA plots based on trimmed mean of M-values normalisation method . Extended quality control, assembly and normalisation information is provided in Additional file 1, and scripts are provided in Additional file 2.
To find coding regions within bacterial polycistronic sequences, we used TransDecoder software (https://transdecoder.github.io/). Protocol was followed according to the manual with default parameters. Precedence of transcriptional unit structure (putative operons) was verified in all cases against the database of prokaryotic operons (DOOR ) unless otherwise stated. A final hand curation step was included.
Interrogation of unknown contigs
Further annotation was undergone for contigs with no hit on any database. Nucleotide blast was carried out using non-coding RNA database (NONCODE2016 from www.noncode.org), Salix purpurea 94006 genome, and NCBI ESTs database.
Results and discussion
Assembly, mapping and annotation
A total of 189,849 confident contigs were confidently assembled de novo from 456,182,049 sequence reads. Overall, 125,151 contigs (66%) were confidently annotated as potential protein encoding transcripts, but 64,698 contigs (34%) had no protein homologue in SwissProt, Trembl or nr databases and are referred to as unknown henceforth (a substantial number of these were DE and investigated in greater detail in Additional files 1 and 3). The annotated transcripts included 91,053 Eukaryota (48%), 33,222 Bacteria (18%), 187 Archaea (0.01%) and 139 viral sequences (0.01%; including the common Illumina spike) (Fig. 1). Transcripts best annotated from eukaryotes were, as could be expected, mostly derived from Viridiplantae (46,817), although a close second was Fungi (40,352), with Metazoa (1417), Amoebozoa (918), Stramenopiles (617), Euglenozoa (390), Alveolata (198) and others (such as metagenomic or environmental samples) comprising only minor proportions of the transcript diversity (324).
Total community makeup
Comparative multi-omics analysis by Hultman  and Tveit et al.  has revealed some shortfalls in community assessment using internal transcribed spacer (ITS) and 16S ribosomal RNA (16S rRNA), even at the phylum level (discussed with additional Basidiomycota identification investigations in Additional file 1). Others have expressed even stronger scepticism, such as Bent et al. , who discussed the difficulties in providing reliable diversity indices from microbial fingerprinting methodology [52,53,54,55,56,57]. An alternative, top-down approach is employed here which accepts uncertainty of the microbiome community, assembling as many contigs as possible within the system and performing differential expression analysis on all retained contigs without any necessity for a priori sequence information before independent annotation using the world’s major protein repositories.
Transcripts clearly identified as deriving from Salix (the mass majority from the Salix purpurea 94006 genome) made up 85.6% (40,098) of the contigs from plant species. While this was expected, it is perhaps more informative to recognise that 14.4% of plant transcripts were best annotated outside of Salix, mostly from the very close relative Populus (9.0%) but also from other genera (Vitaceae 0.6%; Cenchrinae 0.4%; Funariaceae 0.4; Acalypheae 0.4%; Oryza 0.2%; Amygdaleae 0.2%; Arabidopsis 0.2% and others 2.8%). This is roughly in line with previous research suggesting between 6 and 10% of Salix purpurea genes can be better identified by querying the broader protein repositories [2, 46] and that, importantly, over 10% of likely Salix gene expression can be obscured by only mapping to a reference genome, even when species specific.
Extraordinary fungal diversity was found within the transcriptome. Although RNA was extracted from washed roots, this diversity is not surprising as the majority of plant species are thought to employ (widely diverse) below-ground fungal association in order to survive and compete in the biosphere [12, 58]. De novo sequencing provides a unique opportunity to observe such opaque below-ground genetics (Fig. 1, only fungal genera with over 1000 hits are presented, comprehensive annotation is provided in Additional file 4 and interactive Krona graphs). The fungal community comprised 40,352 distinct contigs by primary annotation, the majority were Pyronemataceae (23.8%), then Hydnangiaceae (11.7%), Tuberaceae (8.0%), Polyporaceae (6.3%), Gloeophyllaceae (3.3%), Hymenogastraceae (3.3%), Marasmiaceae (2.9%), Serpulaceae (2.7%), Psathyrellaceae (2.7%) and Pleosporaceae (2.6%) contigs as well as a very extensive quantity of others (13,228 contigs or 32.8%).
Bacterial diversity was extremely high, even at class level, with the 33,222 transcripts originating from Betaproteobacteria (21.4%), Gammaproteobacteria (14.7%), Actinobacteria (14.1%), Alphaproteobacteria (14.0%), Deltaproteobacteria (12.6%), Bacteroidetes (3.9%), Verrucomicrobia (3.7%) and Firmicutes (3.2%) as well as from other genera (Planctomycetes 2.3%; Chloroflexi 1.8%; Cyanobacteria 1.7%; Acidobacteria 1.6%; uncharacterised bacteria 1.5%; Spirochaetes 1.1%; Deinococcus-Thermus 0.4%; Gemmatimonadetes 0.3%; others 1.6%). These bacterial contigs assembled from RNA extracted from roots very closely reflected the community reported in a previous experiment performed by Yergeau et al.  using 16S rRNA sequencing, where rhizospheric bacteria (from soil attached to roots), in association with the same willow cultivar grown in the same contamination, was also dominated by Betaproteobacteria (~ 39%) and Gammaproteobacteria (~ 38%). The authors used 16S rRNA sequencing to positively identify a wide spectrum of bacteria at the class level; while the methodology is not quantitatively comparable to that used here (Additional file 1), the proportions of the community makeup were roughly similar at a phylum and class level.
Differentially expressed contigs
Metaorganism patterning in gene expression
A total of 12,576 contigs were identified as DE between non-contaminated and contaminated soil conditions (Fig. 2). DE contigs (putative transcripts when annotated as protein encoding) may be responding directly to hydrocarbon toxicity or to environmental alterations due to the high soil concentrations of petroleum hydrocarbons, such as nutrient or water availability. Beyond this, differential expression could also be due to secondary responses representing the shift in the microbiome community or, indeed, expression of other organisms independent of the shift in community within the metaorganism. No sequences were discarded based on annotation, especially since previous studies have demonstrated the potential for the removal of sequences (unexpected a priori) to biologically  and technically  confound data integrity and interpretation. The necessity for altering the paradigm of experimental approaches in light of a modern understanding of metaorganismal complexity has recently been recognised as essential in mammals with respect to phenotypic assessment incorporating the microbiome . In particular, bacterial DE sequences were retained for analysis (despite polyA enrichment prior to RNA-seq).
A substantial proportion of all the contigs assembled (including non-DE) were unknown (34%), bearing no confident homologous sequence in the major protein repositories. Although unknown sequences are often discarded, the strategy reported here very deliberately maintains data where possible due to the potential for unknown DE sequences to represent novel organisms and functionality. Over 20% of the total DE sequences were indeed unknown (Additional file 3); extended investigation into these sequences is discussed in Additional file 1.
Pooling of highly complex gene expression data from non-model plants into ontology groups can give the impression of an overview of the biological system (GO terms are provided for query Additional file 4); however, an attempt to explore DE function at a transcript level across all the organisms within the samples is presented here.
A total of 839 DE contigs were annotated as Salix purpurea, representing 6.7% of all DE contigs from roots. The number of contigs up (45%) or down-regulated (55%) was similar in response to contamination (Figs. 2 and 3). Despite the strain/cultivar studied here (S. purpurea ‘Fish Creek’) being the same species as that of the sequenced reference genome, S. purpurea 94006, 131 contigs (which we presume originate from Salix) were better annotated from other plant species. This is consistent with another study suggesting (top-down) de novo strategies can capture more sequence variation than mapping to a single non-clonal reference genome alone . Although the diversity of Salix DE contigs was relatively low within the system, with the exception of the unknown, they had the highest relative abundance (analysed as transcripts per million, TPM); perhaps unsurprisingly, given RNA was sampled from washed Salix roots.
Direct detoxification/stress responses
The expectations for plant gene expression within a high hydrocarbon-contaminated environment would include an increase in cytochrome P450 monooxygenases. Out of 26 DE transcripts identified as encoding cytochrome P450 family proteins, 18 were in higher abundance within control roots (cultivated in non-contaminated soil) (Additional file 5). Only two were annotated as putative monooxygenases (c585325_g1_i1 and c601406_g5_i2), with one upregulated and one downregulated with respect to contamination. Another expected set of stress-induced detoxification equipment, known to be transcript abundance dependent, are glutathione S-transferases (GSTs), which catalyse glutathione conjugation to a broad range of xenobiotics in order to facilitate vacuolar loading (compartmentalisation of cellular pollutants). Only a single GST was downregulated in roots from contaminated soil, with 13 upregulated (8 tau class, including GSTU19, 37, 48 and 50) at consistently high relative abundance levels (tpm) (Fig. 2 and Additional file 5).
UDP-glucose:flavonoid 7-O-glucosyltransferase was the most abundant upregulated Salix gene in contaminated roots (c600230_g5_i1; 216.94 tpm), and two other putative isoforms were also highly upregulated (c600230_g5_i2, 90.71 tpm; c600230_g6_i1, 52.53 tpm; Fig. 2).
Such increases in flavonoid glucosyltransferase expression represent a common conjugation mechanism in plant detoxification metabolism . Alternatively, UDP-glucose:flavonoid 7-O-glucosyltransferase, which allows symbiotic interaction with microorganisms in Glycine max , can be limiting for vacuolar loading of the isoflavone conjugates, and the corresponding conjugate-hydrolysing β-glucosidase is localised to the plant root apoplast, intriguing in terms of the extensive interaction with microbes evidently ongoing within the roots.
Apart from the UDP-glucose:flavonoid 7-O-glucosyltransferase and glutathione transferases, the most highly expressed DE transcripts in contaminated roots were aquaporins and dehydrin. This could be expected given petroleum hydrocarbon contamination is likely to reduce access to water and thus induce osmotic stress in roots. Three PIP1.1 transcripts (c483320_g1_i2, 125.11 tpm; c589371_g5_i4, 88.10 tpm; c589371_g5_i8, 30.63 tpm), three PIP2.1 (c599604_g6_i1, 79.73 tpm; c576656_g7_i3, 70.73 tpm; c599604_g8_i6, 4.23) and PIP2.7 (c602203_g2_i9, 21.70 tpm) were all highly abundant in contaminated conditions. Beyond a direct plant response to soil conditions, it has long been known that root-associated fungi, particularly ECM fungi, can improve not only nutrient uptake in trees, but also water [63,64,65]. It is interesting to note, in light of subsequent fungal gene expression, that extreme PIP upregulation has previously been observed as induced by the ECM fungi Amanita muscaria in fine ectomycorrhized roots of poplar (a close relative of Salix, sharing macrosynteny)  as well as in Salix by the AMF Rhizophagus irregularis .
Nitrogen and carbohydrate transport
Five amino acid transporters were DE, two in higher abundance in non-contaminated trees and three in contaminated trees. Two were distinct cationic amino acid transporters, one in higher abundance in non-contaminated conditions (c598534_g2_i3, 6.65 tpm) and one higher in contaminated conditions (c584679_g1, 18.25 tpm). Another of the transcripts higher in contaminated conditions (c594518_g1_i1) was most similar to AAP6 in poplar (87% identity, blastn e-value = 7e-29), and the two remaining transporters (one in higher abundance in each condition) were both similar to vacuolar amino acid transporter 1. AAP6, an acidic and neutral amino acid transporter known to be expressed in roots, is distinctive in having high substrate affinity (so could potentially be relevant to low amino acid concentration uptake) as well as having an affinity for aspartate . Seven nitrate/peptide transporters where DE with six in higher abundance in contaminated conditions, sharing close homology with poplar NPF (NTR/PTR family) 5.4, 3.1 and 5.2 proteins. Alongside PIP expression, these are hallmarks of root symbiosis  in terms of differential expression and are thought to represent oligopeptide import of AM- and ECM-packaged nitrogen, the principal fungi to plant facet of resource exchange.
In terms of potential concomitant plant to fungi exchange of resources, differential expression of a putative sucrose transporter (c582330_g6_i4) with high abundance in roots from contaminated soil was identified, which would be expected if increased ECM interaction were underway in these conditions. Three distinct trehalose-6-phosphate synthase transcripts were DE (c598745_g5_i1, c536396_g3_i1, c599008_g3_i2), all in higher abundance in roots cultivated in contaminate soil. While the role of trehalose synthesis in plants is somewhat obscure , trehalose is the dominant storage carbohydrate in AMF hyphae and implicated in ECM abiotic stress tolerance [70, 71], so expression here is intriguing in light of concomitant fungal DE. Additionally, the sugar transporter SWEET2 (c555872_g1_i5) was in higher abundance in roots from contaminated soil. SWEETs (Sugars Will Eventually Be Exported Transporter) have only been identified relatively recently  but have already been strongly implicated in plant-fungal interactions , including Solanum tuberosum roots colonised by AMF . In Arabidopsis, SWEET2 has been shown to accumulate in root hairs, cap and epidermis, and in cells in close contact with the rhizosphere . Interestingly, the authors suggest that SWEET2 functions as a bidirectional glucose transporter that could control glucose secretion through limitation/vacuolar loading as part of highly complex and co-ordinated interactions with rhizospheric microbes.
The most abundant downregulated contigs in Salix roots from non-contaminated soil encoded two distinct organ-specific S2 proteins (Fig. 2).
While these proteins are of unknown function, they have been found to be downregulated in a Medicago truncatula 1-deoxy-d-xylulose 5-phosphate synthase 2 (catalysing the beginning of the MEP pathway) knockdown, which also reduced AMF colonisation of roots . An acidic endochitinase (EC 220.127.116.11, c583738_g7_i2) was upregulated in contamination-treated roots. While this would intuitively be associated with plant defence against pathogenic fungi, a number of studies have found that this endochitinase can encourage symbiotic association with ECM fungi, such as Hebeloma, Suillus, Wilcoxina, Pisolithus, Paxillus and Amanita with spruce, birch and Eucalyptus; the suggested mode of action being that non-fungi-damaging apoplastic chitinase modifies ectomycorrhizal elicitors to facilitate symbiotic interaction [77,78,79,80]. When considered alongside the preceding data, Salix root DE reflects plant recognition of foreign molecular patterns , the non-recognition impact of effectors  as well as alterations in nutrient exchange and water availability known to be drastically altered by fungal interactions .
Division by taxonomy
Fungi represented the kingdom with the most diverse genetic response to contamination within the root samples, being the annotation source of the 8252 distinct DE contigs. The paradigm of direct up vs down-regulation collapses when multiple organisms and natural biological complexity are acknowledged in sequencing studies (when the dynamic nature of the metagenome is considered ), as the presence of an organism can vary as well as gene expression within organisms. Therefore, DE contigs can represent direct or indirect responses to treatment within a given organism of stable presence (with respect to treatment), including responses to highly diverse, changeable and potentially hostile biological environment, but also represent baseline transcription and metabolism of newly present, absent, growing or diminishing organisms within the system. The contigs annotated as fungal had very distinct patterning in relation to contamination; of the contigs present in both conditions, 88.8% (6184 contigs) had higher abundance in non-contaminated trees while, in contrast to this, of the contigs only present in trees of one condition, 96.3% (1239 contigs) were in contaminated conditions (Figs. 2b, c; 4a and 5). These expression patterns, even before investigating gene function, can be interpreted as a general downregulation of constitutive fungal expression due to contamination and the arrival of a distinct, contamination specific, set of expressing genes potentially representing a shift in community makeup. By comparison, only seven Salix DE contigs were expressed in only trees of a single condition.
The two Ascomycota species most represented within DE transcripts were the closely related Pyronema omphalodes (a saprophyte) and Tuber melanosporum (an ECM fungi) (Fig. 4b and Additional file 6). These species both dominated fungal differential expression and had high enough levels of primary annotated contigs (2303 contigs and 1077 contigs) to merit very confident identification of these species, or of close relatives, within the system. Transcript levels were almost all more abundant within roots of non-contaminated trees (94.1 and 98.4% for P. omphalodes and T. melanosporum, respectively). While an alteration in gene expression in response to hydrocarbon conditions is possible, the extremity of polar expression might also suggest a lower abundance for these organisms due to the impact of contamination stress and/or increased competition from more contamination-tolerant life forms. Tuber has been previously identified as associated with Salix [83, 84]; non-constitutively expressed ECM high-abundance markers were DE in Tuber and were indeed in higher abundance in non-contaminated conditions, such as RAS protein (c594475_g3_i1) , implying successful Tuber ECM interaction was underway in non-contaminated conditions but was suppressed by hydrocarbon contamination.
Unlike Ascomycota, Basidiomycota expression was dynamic with respect to contamination, as 1639 contigs annotated from three closely related species (from the Agaricoid family’s Hymenogastraceae and Strophariaceae ) Heboloma cylindrosporum (an ECM fungi), Galerina marginata (predominantly white rot) and Hypholoma sublateritium (white rot) were downregulated, while 1745 contigs, with a highly distinctive expression pattern, were upregulated but broadly annotated from 61 genera (Fig. 4c and Additional files 6 and 7). These upregulated contigs were distinctive in the extreme diversity of annotation origin and as the majority (71%) were present only in roots from contaminated soil. To investigate the origin of these phytoremediation responsive contigs, and provide additional evidence towards their species of origin, we explored the species diversity of equally good hits using ‘secondary annotation’  (Fig. 4d; Additional file 1) as well as using nucleotide blastn (a strict blastn reduces observation of fungal data by 99.5%, Additional file 8).
By utilising secondary annotation, the retention of equally (statistically) ‘good’ annotation hits, all 12,576 DE contigs can be more confidently assigned to an organism of origin by elucidating any ambiguity in annotation which is often overlooked. The set of 1745 Basidiomycota genes most often had secondary annotation from known ECM (such as Scleroderma citrinum and Paxillus involutus) and saprophytes (such as Pleurotus ostreatus and Trametes versicolor), in agreement with the current awareness that the continuum of mutualism is complex . Very little crossover was observed between Basidomycota and Ascomycota in terms of differential expression due to contamination response, as Basidomycota extensively dominated the increased expression in response to contamination conditions (Fig. 4a–d). Secondary annotation provides an additional benefit in terms of mining useful functional description from across the world’s major data repositories (Additional files 2, 3, 4, 5, 6, 7, 8 and 9), but it is also interesting to note that retention of secondary annotation independently reflects classical Basidiomycota phylogeny (; Fig. 4d) (albeit with a crude thresholds limiting data retention to only the very closest homologues), particularly interesting is if this idea is considered in the context of the 189,839 total annotated contigs that include 491,505 secondary annotation (Additional file 4).
An important take home message here, when querying unknown sequences using BLAST, is the benefit of not taking the single top returns as fact. Pertsemlidis and Fondon  have detailed how differentiating proteins of high homology from BLAST scoring system should be performed with caution and others have outlined the extensive risk of confounding biological interpretation by not acknowledging the uncertainty of annotation . Secondary annotation of Basidiomycota was spread widely across Agaricomycotina for upregulated contigs whereas downregulated contigs (> 95%) derived from a specific Agaricoid clade comprising the three closely related genera (Hypholoma, Galerina and Hebeloma) (Fig. 4c) which, understandably, shared the majority of secondary annotation with each other (Fig. 4d). For the distinct, upregulated Basidiomycota, secondary annotation revealed no dominant source of annotation (homology) to our contigs within Agaricomycotina (more detailed sequence origin investigation is presented in Additional files 1 and 8).
Upregulated Basidiomycota function
Given that the intricacy, or perhaps futility, in differentiating between saprophytes and ECM fungi has been well discussed [89, 90], determining the fungal mode of action from expression study is problematic  and is further confounded by the generally accepted belief of multiple ECM evolutionary events in Basidiomycota (). The scale of mycorrhizospheric complexity is very high; once DE genes have been sub-selected as best annotated from fungi (keeping in mind that 2575 DE contigs were unknown, having no known sequence similarity), further sub-selected based on Basidiomycota annotation and even further sub-selected based on response to contamination (just those contigs in higher abundance in contaminated conditions), 1745 DE-annotated transcripts remain to describe potential functionality (driven by positive expression) within the system. Of these 1745 DE transcripts, 70% (1227 contigs) had unknown function and were annotated as hypothetical, predicted, putative or uncharacterised proteins from Basidiomycota species. These poor functional description terms are selected against during annotation if an equally good hit is available in secondary annotation, in practice re-mining any confident functional descriptions available within the major protein data repositories.
In terms of recognisable gene function, the most abundant fungal contig expressed in contaminated conditions was the relatively cryptic plasma membrane proteolipid 3 (Pmp3), best annotated from Moniliophthora, with 301.42 tpm (c553133_g5_i1, 2128.20 FC) (Additional file 7). The small hydrophobic Pmp3 (dissimilar to recognised hydrophobin structure) has been characterised as highly conserved in fungi, is environmental stress induced (cryptic in yeast under standard laboratory conditions), involved in cytotoxic cation tolerance  and sphingolipid synthesis  so could be related to membrane integrity maintenance given the challenge of contamination conditions. Generally, the most highly expressed contigs within the group of distinctive Basidiomycota (Figs. 4c and 5) were not putatively related to hydrocarbon degradation but, instead, were related to classical association with plant roots and/or the highly upregulated bacteria (Enterobacteriaceae sp.), namely carbohydrate import, nitrogen (nitrate, ammonia and amino acid) metabolism and transport, and bacterial interaction. This could be expected in light of the hydrocarbon degradation mechanisms DE in the Enterobacteriaceae sp. (outlined below) if the interaction did indeed involve some degree of tripartite mutualism.
Putative hydrocarbon degradation
Surprisingly none of the expected hydrocarbon degrading monooxygenases were identified within the 1745 DE Basidiomycota contigs which increased in abundance due to contamination (Fig. 5, Additional file 7). Only three dioxygenase encoding contigs were identified as DE, all of which had relatively low abundance levels. Two were poorly characterised dioxygenase family proteins (c596412_g1_i1, 8.78 tpm and c593762_g1_i1, 5.93 tpm), but one was a putative extradiol aromatic ring-opening dioxygenase (c598058_g2_i2, 4.23 tpm), functionality well recognised in bacterial PAH degradation studies [95,96,97] but less familiar in fungi. No lignin peroxidases, manganase-dependent peroxidases (class II peroxidases) or laccases (with potential degradation functionality ) were identified as DE, expected if a ligninolytic (white rot) mode of action was underway, particularly as the associated enzyme secretion is highly dependent on transcript levels . Two DE contigs, however, did encode glutathione peroxidase-like proteins (c591991_g1_i5, 8.78 tpm; c591991_g1_i10, 2.14 tpm), and one encoded a thioredoxin-dependent peroxidase (c565655_g1_i1, 48.94 tpm) were upregulated, although this is a common response to contamination-induced oxidative stress .
Carbohydrate transport and CAZy
One of most intuitive approaches to distinguish between saprophitic and ECM ecological strategies (the most likely in Basidiomycota here) would be to compare the expression of genes for plant cell wall degradation and carbohydrate import mechanisms, although this is a non-trivial task . For instance, the molecular mechanisms underpinning much of the carbon transfer to ECM fungi from plants is unclear; while there is strong evidence that up to 30% of the plant’s total photoassimilates can be transferred to ECM fungi , few hexose transporters have been experimentally validated. The monosaccharide transporter MST1 in both Amanita muscaria (fourfold upregulation during symbiosis ) and Laccaria bicolor  being the exception to this.
The third most abundant (275.43 tpm, Table 1) Basidiomycota DE contig positively responding to contamination was a probable monosaccharide transporter (c601571_g1_i1; equally well annotated from either Serendipita vermifera or Piriformospora indica, Additional file 7), as well as two similarly annotated putative isoforms (25.71 tpm and 18.19 tpm) (Fig. 5 and Table 1). This is especially noteworthy as Hynson et al.  detected potential ECM lineages within Serendipitaceae and because P. indica has been shown to stimulate plant growth, but the mechanics of the symbiosis is somewhat cryptic (and can even cause plant cell death ). Interestingly, Zuccaro et al.  also found that this monosaccharide transporter was highly upregulated in Piriformospora indica and identified expression as clearly associated with barley root biotrophism (as opposed to saprotropism). The Saccharomyces cerevisiae homologue (e-value = 5e−17, 67% identity) of this MST, the extracellular glucose sensor rgt2, is well studied as also having sensor functionality [99, 105]; however, given the extremely high relative abundance, we would speculate high glucose import function as more likely to drive differential expression (due to increased transcript dependency of function). This probable monosaccharide transporter (c601571_g1_i1) is also similar to the abovementioned Laccaria bicolor MST1.3 (monosaccharide importer CAQ53118.1; lacbi1:301992, blastx e-value = 2e−21, 48% identity), whose crucial role in ECM glucose import is further supported by 14C-labelled glucose trials which demonstrated not only strongly upregulation during ECM formation when compared to extraradical mycelium expression but also strong evidence of substantial glucose import functionality . Interestingly, in Laccaria bicolor, glucose uptake by MST1.3 was only very slightly affected by the presence of fructose (whereas uptake was inhibited by fructose in others, allowing for the possibility of sucrose hydrolysis at the rhizospheric interface). Alongside potential plant cell wall binding machinery, glucose import represents a highly expressed, recognisable function within the Basidomycota transcriptionally responding to contamination conditions and seems a credible candidate describing the currency of plant to fungi symbiosis with Salix roots.
Basidomycota carbohydrate active enzymes were identified as upregulated in response to contamination conditions, with the majority only expressed in contaminated trees (only 9 contigs were also expressed in control trees). Of these, 20 contigs belonged to one of nine glycosyl hydrolase (GH) families, eight contigs to one of six glycosyl transferase (GT) families and one contig was a pectin/pectate lyase (PL; pectate lyase 1). The most highly represented group was GH13, with seven contigs (5 distinct enzymes including 3 likely isoforms) including a secreted alpha-amylase. It is possible that these may represent extracellular invertase activity (sucrose hydrolysis) as opposed to native fungal α-glucan metabolism.
The most abundant CAZy contig (104.28 tpm) was a GH45 Barwin-like endoglucanase (expansin family protein) (Table 1). Expansin domain containing proteins have previously been recognised in Laccaria bicolor as expressed only in ECM tissue , which is fascinating in light of the functional similarity to cell wall loosening in plants  and if considered in the context of non-necrosis inducing Salix root tissue remodelling and potential interaction with middle lamella and primary cell walls. In agreement with this, a carbohydrate esterase family 12 protein (putative rhamnogalacturonan acetylesterase, pectin-related)  annotated only from the orchid symbiote Tulasnella calospora was DE. Further to this, a pectin/pectate lyase was also DE, recently identified as one of the few cell wall-degrading enzymes retained by the ECM fungus Tuber melanosporum for cell wall remodelling of Corylus avellana as part of their symbiotic interaction . The second most abundant CAZy was GH131 (best annotated from Hebeloma but with equally good hits in Plicaturopsis, Laccaria, Jaapia, Tulasnella and Gelatoporia), including a cellulose binding module (CBM1) with broad β-glucanase activity. While this activity allows for the potential for saprophytic action, remodelling of the plant cell wall and progression through lamella is in agreement with ECM colonisation of plant roots towards the effective formation of the hartig net  and, interestingly, GH131 has previously been identified as expressed in Jaapia argillacea as a potential adaption from saprotrophic ancestors . Three GH5 contigs (exo-beta-1,3-glucanase) were also highly expressed including an ectomycorrhiza-upregulated exo-beta-1,3-glucanase [106, 111]. In disagreement with this is the lack of identifiable lignases or maganase-dependent peroxidases necessary for deconstruction of heavily lignified middle lamella as well as the differential expression of galactose oxidase (AA5), a secreted extracellular catalyse which has twice as much activity on galactomannan (mannan backbone with galactose sidechains) compared to galactose , important given this very common hemicellulose component is more indicative of plant secondary cell walls .
In terms of the potential direct interface between the plant and putative ECM Basidiomycota, two Stereum hirsutum ricin B-like lectins of the carbohydrate-binding module family 13 (CMB13) contigs were very highly abundant in treated conditions at 144.48 tpm (14th most abundant, 1350 FC) and 98.05 tpm (22nd most abundant, 350 FC). Interchangeable homologous hits were also found using secondary annotation within Heterobasidion, Fomitiporia and Jaapia. Lectin carbohydrate-binding is well documented as one of the means of intense binding at the interface between the ECM hartig net and the plant host cell wall [114, 115]. CMB13 has been shown to have specificity for backbone xylan (such as the majority of hemicellulose within willow root 2° root cell wall), although is also found in a diverse array of non-xylanase glycosyl hydrolases .
The metabolic machinery necessary as a downstream consequence of monosaccharide import (in light of very highly expressed MST) includes the Embden-Meyerhof (EM) pathway, pentose phosphate pathway and tricarboxylic acid cycle (Additional file 7). Trehalose and glycogen metabolism within the EM pathway, clearly represented within DE contigs, are glycolytic pathways expected in ECM fungi . Alpha,alpha-trehalose-phosphate synthase (TPS1), two trehalose-6-phosphate phosphatases (TPS2), two glycogen synthases, glycogen phosphorylase, glucokinase regulator, hexokinase, phosphoglucomutase, glucose-6-phosphate isomerase, two phosphoglycerate kinases, two phosphoglycerate mutase-like proteins, enolase, pyruvate kinase and two fructose-1,6-bisphosphatases were identified as DE and in higher abundance in root from contaminated soil. Within the pentose phosphate pathway: glucose-6-phosphate 1-dehydrogenase, 6-phosphogluconolactonase, two 6-phosphogluconate dehydrogenases, transketolase and transaldolase were also DE. While carbon was fated towards ethanol fermentation (pyruvate carboxylase was DE), the citric acid cycle was also comprehensively represented as upregulated: pyruvate dehydrogenase (E1 component subunit alpha; acetyltransferase component), four phosphoenolpyruvate carboxykinases, pyruvate carboxylase, three succinate dehydrogenases, two fumarate reductases, succinyl-coa synthetase (alpha chain and beta chain), three malate dehydrogenases, 2-oxoglutarate dehydrogenase, two homocitrate synthases, two aconitate hydratases and isocitrate dehydrogenase, supporting carbon scaffold assembly towards amino acid production.
This extensive representation of the citric acid cycle upregulated in Basidiomycota from contaminated conditions is relevant in the context of potential amino acid export to Salix roots, given very similar ECM expression profiles in ECM Laccaria and Tuber [118, 119], and considering the differential expression of contigs encoding glutamate synthase and glutamine synthetase (Table 1). Contigs encoding N-related transport machinery were highly expressed (Table 2), including two ammonium transporters, which were some of the most highly abundant Basidiomycota contigs present in the samples (AmtB, tpm 66.39; Amt1, tpm 64.18) as well as a very highly expressed contig encoding a putative FUN34-transmembrane protein involved in ammonia production (tpm 47.03). Additionally, a peptide/nitrate transporter, five amino acid transporters, a purine transporter, a glutathione transporter, two oligopeptide transporters (OPTs), two plasma membrane h+ symports, two mitochondrial carriers as well as urease, amine oxidase, D-aspartate oxidase and two carbon-nitrogen hydrolases were upregulated, all enzymes involved in the reduction of organic nitrogen compounds and ammonia production (Additional file 7). This wide ranging selection of N-related compound transporters identified as DE very closely matched those previously reported in the ECM fungi Paxillus involutus  and Laccaria bicolor .
A broad suite of proteases was also DE including four endopeptidases (including subtilisin and bleomycin), two Zn-dependent exopeptidases, three carboxypeptidases (glutamate, serine and zinc), four metallopeptidases, a methionine aminopeptidase, two putative aminopeptidase isoforms and two distinct aspartic peptidase A1 (Table 2). Secreted suites of proteases are thought to be similar between Basidiomycota saprotrophs and EMC . However, the extracellular protein degradation pathways identified here distinctly match those expressed in the EMC P. involutus . One of the markers identified from Laccaria biocolor but recognised across ECM are Ras/Ras-like proteins [85, 122, 123], small GTPases involved in cargo sorting in coated vesicules  (vesicular turnover is thought to be high in ECM due nutrient and signal exchange with hosts ) that have been shown to increase in transcript abundance after successful establishment of symbiosis . Eleven Ras-like proteins were DE and in high abundance in the phytoremediation responsive Basidiomycota, importantly including likely Ras, or Ras interacting, proteins known as host interaction-specific ECM markers identified in Laccaria . These comprised three highly abundant uncharacterised small GTPase-binding proteins (47.83 tpm, 31.86 tpm, 9.42 tpm), Rho1 (20.67 tpm), Ras-related protein Rab-5B, Ras protein, Rab-type small GTPase, Ras GTPase-activating protein, Rab GDP-dissociation inhibitor, GTPase foz1, Sar1-like protein member of Ras-family as well as an extremely highly expressed Rab5 (ypt5) GTPase family protein (57.28 tpm), involved in endocytotic vascular trafficking .
While a clear depth of functional detail can be revealed by metatranscriptomics, the majority of putative fungal proteins were uncharacterised (having no functionally characterised homologue), a useful reminder that the vast majority of the natural world remains to be explored.
Polyadenylation in bacteria
Polyadenylation of RNA has long been known to occur in bacteria [128, 129]. Some studies have established the potential involvement of polyadenylation in mRNA degradation [130, 131] and RNA quality control [132, 133] in a limited number of bacterial species, although the contemporary picture of polyadenylation in bacteria suggests complexity beyond this, as recently described by Kushner . Using pulse-labelling, levels of polyadenylated RNA have been measured at up to 15% of RNA in Escherichia coli  (in contrast to those of up to 25% in gram-positive Bacillus brevis ) and more recent research using microarray analysis revealed, in an E. coli K-12 wild-type transcriptome, that 90% of transcribed ORFs underwent some degree of polyadenlyation .
Within the plant transcriptomic research community, bacterial mRNA is routinely discarded during early quality control of common bioinformatics pipelines as distinct from the target organism of study or discarded intrinsically when mapping to a reference genome. Gonzalez et al.  recently reported how the sub-selection of only transcriptomic sequences expected a priori can confound biological results, the example leading to S. purpurea biotic stress genes being misidentified as abiotic stress responsive genes due to a strong treatment-specific interaction of a plant herbivore (Tetranycus urticae). Similarly, Brereton et al.  demonstrated the potential for mistaken mapping (mis-mapping) RNA-seq reads from unexpected foreign organisms to technically confound results by mapping known foreign sequences to the S. purpurea genome. Given these potential pitfalls, it would seem prudent to acknowledge the complexity of extra-laboratory biological systems involving higher eukaryotes by investigating all mRNA molecules present within any sample even when strong expectations exist within the experimental design. The approach is limited here as an unknown absolute proportion (and community) of bacterial mRNA could be lost during polyA enrichment.
Despite polyA enrichment, an extremely broad diversity of bacteria was observed within the system (including contigs that were not DE; Fig. 6a)). The majority of these contigs derived from Proteobacteria, including Alphaproteobacteria (14%), Betaproteobacteria (34%), Deltaproteobacteria (13%) and Gammaproteobacteria (23%). Actinobacteria (14%), Bacteroidetes (4%) and Firmicutes (4%) were also highly represented. This community makeup is similar to that previously reported in metagenomic studies of contaminated soils .
A different picture was discerned in DE contigs (Fig. 6b); of 638 DE contigs, 86% were in higher abundance in contaminated trees. Enterobacteriaceae species were the most represented, annotating 72% of all the bacteria contigs. Importantly, 100% of these Enterobacteriaceae contigs were of higher abundance in contaminated trees (Fig. 2b, c), suggesting strong biological association with contamination and/or the other organisms. Therefore, their functionality was explored, albeit with caution. Of potential relevance to this, growth under more challenging conditions than standard rich media, or repression of growth using transformation, can increase the level of polyadenylated transcripts within E. coli [138,139,140].
Decoding polycistronic transcriptional units
Determining transcriptional units (putative operons) in bacteria is computationally problematic in terms of prediction from genomic sequence [141, 142]. While prediction of E. coli K12 operon structure is as advanced as in any bacteria, little confidence can be established in the transferability of such predictions given operon variation across organisms and very high responsiveness to environmental change. The unique nature of any extra-laboratory environment, such as that explored here, as well as the high likelihood of numerous novel organisms present (not the least being the bacteria themselves present within the system) further complicates transcriptional unit prediction. That being said, top-down de novo assembly provides very high confidence in the presence of observed sequences within the system, of particular technical value is the high requirement for equal mapping coverage applied within Trinity’s contig assembly . Of the 639 DE and potentially polycistronic contigs annotated as bacterial in origin, 3134 protein coding regions were identified using Trinity’s transdecoder. In total, 489 of these contigs were identified as deriving from species within the family Enterobacteriaceae (100% in higher abundance in contaminated trees) comprised 2834 proteins, although a substantial proportion were duplicates with only minor sequence variation (Additional file 10). These could represent common expression by distinct organisms or multiple operons from a single organism (being somewhat reminiscent of eukaryotic splice variants at a sequence level). The majority of the 2834 Enterobacteriaceae putative proteins were annotated from E. coli (68%, across a broad spectrum of strains, Additional file 10) or the genus Shigella (29%). A recent identification and genome sequence of a poplar growth promoting endophyte (Enterobacter sp. 638) by Taghavi et al.  shared similar levels of homology as those found here but at a genome level, with 69% of predicted CDS being similar to E. coli.
Upregulated bacterial gene function
A broad spectrum of contigs encoding proteins putatively involved in survival within a rhizospheric environment were identified as DE due to contamination conditions (Additional file 10), including those putatively interacting extracellularly and/or intracellularly with ECM hyphae and/or plant roots. As the vast majority of DE bacterial genes were best annotated as Enterobacteriaceae (either E. coli or Shilgella strains), E. coli K-12 MG1655 gene nomenclature was used where possible.
Hydrocarbon degradation and biosurfactant production
For PAH degradation, bacterial dioxygenase enzymes are considered the principal means of ring fission [26, 143] (although cytochrome p450 monooxygenase are common), whereas alpha-ketoglutarate-dependent dioxygenases (alkB) are one of the most recognised enzymes driving degradation of aliphatic hydrocarbons  (not to be confused with alkane monooxygenase, alkB). The contig c596278_g1_i5 was upregulated here (containing yojIalkBadaapbEmqo), functionally comprising an ABC transporter ATP-binding protein (yojI), alpha-ketoglutarate-dependent dioxygenase (alkB), regulatory protein (ada), thiamine biosynthesis lipoprotein (apbE) and malate:quinone oxidoreductase (mqo) (Fig. 7 and Additional file 10). Additionally to this, a number of genes well characterised as accompanying toluene tolerance [144, 145] were DE, including a putative operon which contained the toluene transporter subunits mlaCDEF and kdsCDlptACkdsCD encoding a lipopolysaccharide export system. Degradation of alkanesulphonates, previously identified as involved in crude oil degradation by metagenomic study , occurs through desulfonation and relies on expression of the operon ssuABCDE [146, 147] comprising an alkanesulfonate transporter subunit, a putative alkanesulfonate transporter subunit, an FMNH(2)-dependent alkanesulfonate monooxygenase, putative aliphatic sulfonate binding protein and NAD(P)H-dependent FMN reductase. Here, the entirety of the ssuABCDE operon was assembled and DE in higher abundance in contaminated conditions (two similar contigs c595976_g1_i2 and c595976_g1_i9). As would be expected alongside the ssu system, the associated tau system (comprising tauABCD genes) was also upregulated within two separate putative operons containing tauAB proteins (taurine transporter subunits A and B; c596422_g1_i2) and tauCD (taurine transporter subunit C and alpha-ketoglutarate-dependent taurine dioxygenase; c596422_g2_i1). These oxidising mechanisms, such as the well-known alkanesulfonate monooxygenase ssuD , suggest the bacterial species present in association with willow roots is actively expressing an enzyme suite capable of degrading the hydrocarbons present within contaminated soil and so may play an important role within any tripartite interaction of mutual benefit.
Genes necessary for three biosurfactants were found to be most represented in metagenomic study of crude oil contaminated soil , (in descending order of abundance): trehalose lipids , polyol lipids and mono/di-rhamnolipids . Interestingly, the principal enzymes involved in the trehalose degradation pathway were DE, with two putative operons, the first containing treAdhaLKMR (c602247_g2_i1), from E. coli and the second (c567422_g1_i2) containing treF (from shigella) alongside a transcriptional regulator yhjB (from E. coli). These potential operons comprise the periplasmic trehalase (treA) and PTS proteins (dhaLKM, although no transport activity is expected ) as well as the cytoplasmic trehalase (treF) expected for trehalase utilisation . While this suggests trehalose may not be being employed as a biosurfactant, it is compelling in terms of the potential interaction with ECM Basidiomycota as the specific contamination-responsive DE contigs include trehalose biosynthesis genes. The entire rmlABCD (often termed rfb) operon (c433320_g1_i1) was in higher abundance in contaminated conditions, comprising the l-rhamnose synthesis genes essential for rhamnolipid production in E. coli [150, 153]. Additionally to this, rhamnulose-1-phosphate aldolase (rhaD), rhamnulokinase l (rhaB) and rhamnose-isomerase (rhaA) were all upregulated allowing provision of l-rhamnose from dihydroxyacetone-phosphate , present within two operons comprising rhaDyiiL with genes for PTS system IIA and IIB components (c599913_g4_i1, operon structure common in Firmicutes) and rhaABSRyifCLMaidB (c433320_g1_i1). While the l-rhamnose synthesis pathway is considered well characterised, the exact 3-hydroxy fatty acid precursors of rhamnolipids are less confidently known, although it has been hypothesised that both the type II fatty acid synthesis pathway  and β-oxidation  can provide lipid precursors. An operon was upregulated (c601578_g3_i4) containing numerous members of the type II fatty acid synthesis pathway (FAS-II): fabG (syn: beta-ketoacyl-ACP reductase, similar to rhlG of rhamnolipid synthesis in Pseudomonas aeruginosa ) fabDFHychHyceDFGthiK, the operon structure being unsurprising in E. coli. rhlAB, recognisable as driving rhamnolipid production in Pseudomonas aeruginosa , was not detected however, and an operon containing fabA, potentially a competitor for β-hydroxydecanoyl-ACP , was DE. Of β-oxidation, two fadIJsixA upregulated operons (c591107_g2_i1 and c591107_g2_i2), a common operon if E. coli K-12, were in higher abundance under contaminated conditions. fadIJ are recently discovered homologues of β-oxidation fadAB in E. coli with a suggested preference for short and medium chain fatty acid degradation .
These DE potential operons encompass a wide array of genes considered cryptic in E. coli (silent under standard laboratory conditions), for example, cryptic genes in E. coli K-12 substr. MG1655  that were expressed here included ecpABCD (c589067_g3_i3), ebgAC (c601454_g4_i1), gspAB c602128_g2_i1), gspCD (c600465_g1_i1) and chiAgspO (c600465_g2_i2) (Fig. 7 and Additional file 10). The expression of genes which are silent in many E. coli species cultured under laboratory conditions is perhaps unsurprising given the mycorrhizal environment; this is further supported by DE of genes representing fungiphile metabolism  in the Enterobacteriaceae species responding to contamination conditions. The most convincing in this regard were genes involved in interaction with chitin (fungal cell wall) and the well-known fungal exudate oxalate, used for habitat manipulation through pH lowering and increasing availability of nutrients  as well as for lignocellulosic degradation in many Basidiomycota. The putative operon c591446_g1_i5 comprised the well-studied two-component (stimulus-response) regulatory system evgSA  with an Oxalyl-CoA decarboxylase, acetyl-CoA:oxalate CoA-transferase and transporter similar to the metabolic machinery essential for E. coli tolerance of oxalate . The putative operon c599707_g2_i2 comprised 8 proteins including chiPQ (oprD family, a chitoporin for uptake of chitosugars and chitosugar-induced lipoprotein which is not constitutively expressed [165, 166]) and a phosphoglymutase, suggesting tight interaction with the chitin phosphotransferase system (PTS) system . In line with this is the upregulation of a potential operon containing four proteins including chbBC (N,N′-diacetylchitobiose-specific enzyme IIB component and permease IIC component of the PTS system) alongside osmE and nadE (an uncharacterised osmotically induced lipoprotein and NH(3)-dependent NAD(+) synthase). Further supporting this, the operon chbACFGR (c599728_g5_i1) comprising the more classically recognisable chitobiose operon , with the exception that chbB is absent (expressed in the previous operon alongside a distinct chbC). The cryptic chitinase chiA (c600465_g2_i2) was also expressed in an operon including bfr and gspO, pertinent given the ECM fungal interaction suggested by gene expression (potentially explaining cryptic expression as the fungal environment is difficult to replicate in culture) and logical in light of associated  DE of (non-constitutively expressed) type II general secretion systems (gspAB, c602128_g2_i1 and gspCD, c600465_g1_i1 functionality is discussed below).
While the clear expression of chitin degrading mechanisms could suggest the interaction with ECM fungi is endocellular biotrophy or necrotrophy, the possibility of extracellular biotrophy cannot be discounted as any complex interaction could involve modification of, or macro-(biofilm)-adhesion with, the Hartig net involving chitinases. In line with this, Chittero et al.  report biotrophic interaction of Pseudomonas fluorescens and Bacillaceae through the use of chitinases with fruit bodies of the ECM Tuber borchii; while others found bacterial chitinase production did not inhibit ECM fungi [171, 172] and that when comparing rhizospheric and ectomycorrhizosphere bacteria, those capable of hydrolysing chitin were present only within the ectomycorrhizosphere .
There are five stages to biofilm formation which are generally accepted: (i) surface contact and reversible attachment, (ii) irreversible attachment, (iii) microcolony formation and early development of architecture, (iv) maturation and (v) dispersal. The accomplished mini-review by Van Houdt and Michiels  outlines the proposed surface determinants associated with each stage in E. coli. Flagella have been associated with the initial surface contact and reversible attachment as well as final dispersal and motility (although development and maturation stages are not dependent on flagella). Irreversible attachment coincides with production of type I fimbriae, extracellular polysaccharides and curli fimbriae. Development of biofilm architecture also involves curli and extracellular polysaccharides as well as colonic acid production and the self-recognising adhesion antigen 43 (also called flu). Maturation of the biofilm architecture also involves colonic acid production and curli as well as conjugative pili. Dispersal of bacterial cells back into the environment again involves flagella. Operons encoding proteins from each of these processes were DE, all being in higher abundance in contaminated conditions. The formation and adhesion of bacterial biofilms to fungal hyphae has been well characterised as facilitating the exchange of nutrients [28, 29] but has also been shown to allow the mobilisation of pollutant degrading bacteria through the environment for mutual benefit [175, 176].
As substantial numbers of genes involved in biofilm formation were DE, we compared gene expression here to the list generated by Tenorio et al. , who systematically investigated genes affecting biofilm formation in E. coli through overexpression. While the study was performed using LB media, we expect extensive overlap in machinery involved in potential biofilm formation within an ECM environment. Ten genes were identified as fundamentally altering biofilm abundance: ccmf; fdrA (syn yahF); flgFGIL, fliH (c601578_g1_i1); gspA; secY and wcaK (Fig. 7 and Additional file 10). All were DE here with the exception of secY and wcaK; however, other colonic acid production genes were upregulated (wbz wcaJLF) as well as extensive DE of the secYEG translocon (peptide export complex proteins) including, secA, secE and SecDFyajC. A total of 11 of the 35 genes whose overexpression altered biofilm architecture in LB media were DE with increased abundance in roots of contaminated soil, while 11 of the 27 genes causing filmentous morphology of the biofilm were also upregulated. The flagellin synthesis genes FliA and FliC, and flagellar biosynthesis master regulator FlhC, were identified as in high abundance in treated samples as well as fliEFGHIJKLMNOP, fliDST, flgKL and flgGHIL. Flagella can allow increased fitness when growing/forming biofilm on fungal hyphae , although E. coli are thought to require flagella only for the initial stages of biofilm formation . Expression of these genes has been shown to be positively regulated by polyadenylation ; this is particularly interesting given the suggestion that this mechanism could be specifically in place to allow the bacteria to adapt and survive in challenging environmental conditions (through PAPI regulation).
The chaperone-usher pathway is a delivery system for type I pili through the outer membrane. The periplasmic transport of proteins is essential for successful pili biogenesis and the chaperone-usher pathway utilises the SecYEG translocon to achieve this, also upregulated here in contaminated conditions. Concordantly, the fimDFGH operon (c600702_g6_i3, structurally common in E. coli K-12 strains), encoding the outer membrane usher protein for type 1 fimbrial synthesis and three minor component of F-type fimbriae, was upregulated in contaminated conditions, as was sfmCD (c575401_g1_i2; fimACI was not identified as DE). Interestingly, given the likely interaction with fungal cells, sfm is a chaperone-usher operon that is silent under laboratory conditions but was shown to express when the type 1 fimbriae complex is knocked out and could promote adhesion to eukaryotic cells . Exploring the idea that a substantial amount of biofilm-related expression within this microbiome environment could indeed be cryptic within laboratory environments, Korea et al.  identified a wide range of these chaperone-usher fimbrae as associated with distinct surface specialties in E. coli K-12.
The contigs assembled here from root samples and upregulated in contaminated conditions did indeed represent the majority of these previously identified cryptic chaperone-usher fimbriae, including yfc (c601154_g1_i3; yfcRQOU(usher papC)SP(chaperone papD)B(prmB)aroC), sfm (c575401_g1_i2; sfmCD); fim (c600702_g6_i3; fimDFGH), yad (c601794_g1_i1; yadHG, c601794_g2_i3; yadCKLMhtrEecpD and c601794_g3_i1; yadECDI), yde (c591908_g2_i3; ydeTQSR), yqi (c601683_g14_i1; yqiGHI), yeh (c598114_g3_i1; yehABC) and yra (c600252_g5_i1; yraJ) (Fig. 7 and Additional file 10). These represent putative operons, predicted in E. coli K-12, which do not contribute to adhesion of E. coli under normal laboratory conditions but do promote biofilm formation on uncommon abiotic surfaces (‘unknown environmental niches’ ) and do contribute to adhesion to eukaryotic surfaces. The ELF operon (elfADCG)  was observed within a single contig as the seven member elfADCG-ycbUVF operon (c601320_g2_i5) predicted in E. coli K-12 with an additional protein, pyrD dihydroorotate dehydrogenase gene (quinone). Interestingly, a number of proteins expressed within these contigs are suggestive of potential functionality of association with a host: ydeTQSR included a serine/threonine-protein kinase HipA and the toxin component of a HipA family toxin/antitoxin system, and pantothenate synthetase, 3-methyl-2-oxobutanoate hydroxymethyltransferase and aspartate 1-decarboxylase (both of which participate in pantothenate and coA biosynthesis) were expressed along with yadECDI.
In addition to these extra-laboratory expressed operons associated with biofilm formation, a number of DE genes were associated with swarming in Pseudomonas aeruginosa, a process strongly related to biofilm formation. Four of the six genes explored as swarming negative mutants by Overhage et al.  were necessary for good biofilm formation; all were DE here and in higher abundance in contaminated conditions: rhlE (c601160_g1_i1, ATP-dependent RNA helicase), lptA (c602252_g5_i2, lipopolysaccharide transport periplasmic protein), gshB (c586692_g1_i1, glutathione synthetase) and acsA (c600318_g2_i1, acetyl-CoA synthetase). Additionally, two variants of traABCEKLNPUVWYtrbCL (c582511_g1_i6 and c582511_g1_i7) are comprised of conjugative apparatus (pili construction) common to F-like plasmids and necessary for mobility of genetic elements between bacteria (Fig. 7 and Additional file 10). None of the bacterial secretion systems are thought to be constitutively expressed but are instead triggered by recognition of host molecular pattern through adhesins [181, 182]. As discussed with relation to chitinase secretion, gspAB and gspCDO (with chiA) of the type II secretion system were upregulated in contaminated conditions. Given this pattern, it would be expected that xcpU (the general secretion pathway protein H, gspH in E. coli) would be DE; however, it was not identified as such.
The common pilus (ECP) operon is widely conserved throughout E. coli and required for early-stage biofilm development and host cell recognition , although usually silent in E. coli K-12 MG1655 under laboratory conditions . As discussed above, an operon containing the regular common pilus machinery, ecpRABCDEykgJyagU (c589067_g3_i3; synonyms include but are not limited to matABCDEFykgJyagU) was in higher abundance under contamination conditions. Interestingly, the yagTSRQ operon (c589067_g1_i4; synonym paoABCD), common in E. coli K-12 strains, was also DE, encoding periplasmic detoxification enzymes (xanthine oxidase family ) including yagT, a twin arginine translocator (tat) pathway signal sequence. While the secYEG translocon can transport unfolded proteins, the tat can transport folded proteins to the periplasm (around 6% of all secreted proteins are thought to be tat dependent ). These secreted proteins include hydrogenases, dehydrogenase and nitrate reductases. The tat-dependent hydrogenase-2 operon (redox) (hybGFEDCBAO; c600034_g1_i12) was identified as DE, including the Tat pathway signal sequence domain protein hybA .
Curli fimbriae are outer membrane adhesins employed by E. coli to facilitate biofilm adhesion [177, 188, 189]. A master regulator of biofilm formation, csgD , is at the centre of the curli fimbriae regulatory network. Two variants of the csgD containing operon csgDEFGycdXYZ (c596842_g1_i2 and c596842_g1_i4) were DE and in higher abundance in contaminated roots (Fig. 7 and Additional file 10). csgDEFG encode four proteins necessary for curli assembly: csgG is involved in pore formation in the outer membrane, whereas csgE and csgF are periplasmic proteins which physically interact with csgG . csgD is a positive regulator of csgAB (classically expressed on a distinct operon), the major curli extracellular structural subunit (csgA) and its nucleator (csgB). While csgAB was not identified as DE here, it has been recognised that this is often the case (csgAB not being expressed in cells forming curli) and that assembly can occur via interbacterial complementation, it may also be true here that subunit production is not transcript dependent or is constitutive and constant (and so not DE). Additionally to curli, the prominent surface protein flu was also upregulated (c601850_g2_i6), a self-recognising adhesin known to induce autoaggregation to promote biofilm formation .
Similar to the recently sequenced plant growth-promoting endophyte Enterobacter sp. 638, which was isolated from poplar stems, it seems likely that no nitrogen fixation was occurring within the Enterobacteriaceae species responding to contamination as no nif genes were identified as DE, with the exception of the pyruvate-flavodoxin oxidoreductase (nifJ/ydbK) (Fig. 7 and Additional file 10). Two variants of the fixABCX operon were also upregulated (c602208_g1_i1 and c602208_g1_i3; intricately linked with N fixation in diazotrophs [191, 192]), but in E. coli, these genes form an important reducing role in carnitine metabolism [193, 194].
Differential expression of well-characterised respiratory nitrate reduction (nar) operons were identified including narK (c598952_g5_i3) used for nitrate/nitrite extrusion in E. coli ; narUZ (c598132_g7_i1) where narU is also used for nitrate/nitrite extrusion while narZ is part of nitrate reduction, as well as narXLychONChaC (c598952_g3_i7; similar to 282397 (TU0650)), nitrate/nitrite sensor and regulator proteins. Two slightly divergent versions of the periplasmic nitrate reduction (nap) operon was also DE, napABCGHccmABCFE (c602069_g2_i2 and c602069_g2_i4; similar to 425132 (TU1418)) . The important nitric oxide reduction (dissimilatory reduction) equipment in E. coli, via flavorubredoxin expression encoded by the NorVW operon , was also DE. The nitrogen assimilation control regulator nac was also present within an upregulated operon ((c598923_g2_i7; common to in structure to 580938 in E. coli BL21(DE3)) containing the cbl regulator, involved in aliphatic sulfonate utilisation, and yeeO MATE efflux transporter (involved in flavin secretion). In terms of ammonia generation, two versions of anaerobically expressed nitrite transport and reduction nir operons [31, 198] were DE: nirBtsgA (c598930_g2_i1) and nirBCDcysG (c598930_g1_i1). Amino acid (glutamine) synthesis under nitrogen demand has also been shown to be fine-tuned (ammonia/ammonium) by the nitrogen assimilatory protein glnK regulation of amtB (ammonium transporter) , present here within an upregulated operon containing the amtB and two multidrug efflux proteins, mdlAB, (c602197_g2_i5; all commonly expressed together). Additionally to this, nrfA, a cytochrome c-552 which catalyses nitrite to ammonia in a formate-dependent manner, was upregulated (c600318_g2_i1) as well as an operon containing yddG, an aromatic amino acid extrusion protein, in a operon (c601567_g3_i3) containing the nitrate-inducible formate dehydrogenase major subunit fdnG.
The conversion of inorganic nitrogen, through nitrate assimilation and reduction, to the biologically more useful nitrite and ammonia (for example, to construct amino acids) provides bacteria with nutrition for survival within the environment but also a potential currency for mutualism within the microbiome, or mycorrhizosphere, system. E. coli only assimilates nitrate under anaerobic conditions ; this broad spectrum of assimilatory and dissimilatory nitrate reduction equipment with higher abundance under contaminated conditions is perhaps unsurprising given that facultative anaerobic Enterobacteriaceae can outcompete obligate anaerobes in anoxic, organic compound-rich forest soils .
Bacterial functional role
So-called mycorrhizal helper bacteria MHB  are thought to promote the symbiotic association between plant roots and mycorrhizae, creating a tripartite relationship , such as the Bacillus sphaericus isolate EJP109 promoting Suillus luteus (ECM) growth in association with Pinus sylvestris as well as the Streptomyces sp. nov. 505 and S. anulatus (Beijerinck) Waksman 1003 promoting Amanita muscaria (ECM) growth in association with Picea abies [172, 202]. MHB and PGPR (plant growth-promoting rhizobacteria) strains of Pseudomonas have been visualised attaching to ECM hyphae (suggesting extracellular biotrophic mycophagy) of the genus Laccaria, in a bacterial and fungal strain-specific manner . Differentiating between the potential roles of interacting bacterial species within the biological system is difficult; Leveau and Preston  suggested three potential modes of interaction, necrotrophy, extracellular biotrophy and endocellular biotrophy, as well as a spectrum of expected functionality associated to each mode. In general here, expression suggested extracellular biotrophy, however, not only are the other two modes of interaction possible but a more complex continuum of interaction would seem likely given the diverse environment. Before such functionality can be confidently elucidated, in addition to the necessary promotion of cross-disciplinary microbiome research, observation of gene expression without constraint to an organism/s expected within a biological system needs to become a standard requirement for transcriptomic investigations.
From gene expression alone, the major responses of Salix purpurea cv. ‘Fish Creek’ to soil contamination do not seem to be direct degradation, immobilisation or exclusion of contaminants but rather widespread, extraordinarily complex, alterations to interactions with microbiota. The extremely diverse fungal community, revealed using a bioinformatics approach unconstrained by the requirement of a priori nucleotide sequence, responds dynamically to contamination both with changes to gene expression but also with substantial shifts in the community makeup. Much like the host plant, fungal gene expression was dominated by alterations in pathways associated to microbiome interactions as opposed to direct hydrocarbon degradation. Surprisingly, gene expression representing increased petroleum hydrocarbon metabolism equipment came from an Enterobacteriaceae species whose total expression was uniformly more abundant in roots of contaminated trees. It seems possible that, given the successful contamination tolerance, willows may depend on symbiosis to tolerate stressful environmental conditions instead of relying solely on their own metabolism.
While polyadenylated bacterial sequences are often not quantified across experimental systems investigated with eukaryote expression in mind, the bacterial functionality observed here is thought-provoking, revealing a plausible set of expressed genes indicating tripartite mutualism. A finding which is useful in providing a convincing reminder that observing the entirety of generated data, even that which is commonly discarded, is always of potential value for exposing the unexpected. Most importantly, with regard to the attempted all-inclusive strategy of observing all expressed sequences within a biological sample, regardless of origin, is how extensively interpretation of expression could be confounded if plant, fungal or bacterial expression were investigated as responsive to treatment (here being contamination) in isolation, and thus how crucial it is to attempt observation of the entire microbiome and wider metaorganism. This metatranscriptomic approach, which marries an understanding of the uncertainty of microbiome biology with the strengths and limitations of current bioinformatics, should be broadly transferable through utility for biological samples containing expressed sequences from well-characterised, poorly characterised or entirely unknown organisms at a diversity of low or high complexity.
Stappenbeck TS, Virgin HW. Accounting for reciprocal host–microbiome interactions in experimental science. Nature. 2016;534:191–9.
Brereton NJB, Gonzalez E, Marleau J, Nissim WG, Labrecque M, Joly S, Pitre FE. Comparative transcriptomic approaches exploring contamination stress tolerance in Salix sp. reveal the importance for a metaorganismal de novo assembly approach for nonmodel plants. Plant Physiol. 2016;171:3–24.
Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013;8:1494–512.
Panagos P, Van Liedekerke M, Yigini Y, Montanarella L. Contaminated sites in Europe: review of the current situation based on data collected through a European network. J Environ Public Health. 2013;2013:158764.
Bell TH, El-Din Hassan S, Lauron-Moreau A, Al-Otaibi F, Hijri M, Yergeau E, St-Arnaud M. Linkage between bacterial and fungal rhizosphere communities in hydrocarbon-contaminated soils is related to plant phylogeny. ISME J. 2014;8:331–43.
Bissonnette L, St-Arnaud M, Labrecque M. Phytoextraction of heavy metals by two Salicaceae clones in symbiosis with arbuscular mycorrhizal fungi during the second year of a field trial. Plant Soil. 2010;332:55–67.
Hassan SED, Bell TH, Stefani FOP, Denis D, Hijri M, St-Arnaud M. Contrasting the community structure of arbuscular mycorrhizal fungi from hydrocarbon-contaminated and uncontaminated soils following willow (Salix spp. L.) planting. PLoS One. 2014;9:e102838.
Almeida-Rodríguez AM, Gómes MP, Loubert-Hudon A, Joly S, Labrecque M. Symbiotic association between Salix purpurea L. and Rhizophagus irregularis: modulation of plant responses under copper stress. Tree Physiol. 2015;36(4):407–20. tpv119
Grigoriev IV, Nikitin R, Haridas S, Kuo A, Ohm R, Otillar R, Riley R, Salamov A, Zhao X, Korzeniewski F. MycoCosm portal: gearing up for 1000 fungal genomes. Nucleic Acids Res. 2014;42(Database issue):D699–704.
Abbasian F, Lockington R, Palanisami T, Ramadass K, Megharaj M, Naidu R. Microbial diversity and hydrocarbon degrading gene capacity of a crude oil field soil as determined by metagenomics analysis. Biotechnol Prog. 2016;32(3):638–48.
Taktek S, St-Arnaud M, Piché Y, Fortin JA, Antoun H. Igneous phosphate rock solubilization by biofilm-forming mycorrhizobacteria and hyphobacteria associated with Rhizoglomus irregulare DAOM 197198. Mycorrhiza. 2017;27(1):13–22.
Villacorta-Martin C, Núñez de Cáceres González FF, de Haan J, Huijben K, Passarinho P, Lugassi-Ben Hamo M, Zaccai M. Whole transcriptome profiling of the vernalization process in Lilium longiflorum (cultivar White Heaven) bulbs. BMC Genomics. 2015;16:550.
Leng N, Dawson JA, Thomson JA, Ruotti V, Rissman AI, Smits BMG, Haag JD, Gould MN, Stewart RM, Kendziorski C. EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics. 2013;29:1035–43.
Hultman J, Waldrop MP, Mackelprang R, David MM, McFarland J, Blazewicz SJ, Harden J, Turetsky MR, McGuire AD, Shah MB. Multi-omics of permafrost, active layer and thermokarst bog soil microbiomes. Nature. 2015;521:208–12.
Lindner DL, Banik MT. Intragenomic variation in the ITS rDNA region obscures phylogenetic relationships and inflates estimates of operational taxonomic units in genus Laetiporus. Mycologia. 2011;103:731–40.
Guo LL, Sui ZH, Zhang S, Ren YY, Liu Y. Comparison of potential diatom ‘barcode’ genes (the 18S rRNA gene and ITS, COI, rbcL) and their effectiveness in discriminating and determining species taxonomy in the Bacillariophyta. Int J Syst Evol Microbiol. 2015;65:1369–80.
Maslunka C, Gifford B, Tucci J, Gurtler V, Seviour RJ. Insertions or deletions (Indels) in the rrn 16S-23S rRNA gene internal transcribed spacer region (ITS) compromise the typing and identification of strains within the Acinetobacter calcoaceticus-baumannii (Acb) complex and closely related members. PLoS One. 2014;9(8):e105390. https://doi.org/10.1371/journal.pone.0105390.
Porras-Alfaro A, Liu KL, Kuske CR, Xie G. From genus to phylum: large-subunit and internal transcribed spacer rRNA operon regions show similar classification accuracies influenced by database composition. Appl Environ Microbiol. 2014;80:829–40.
Iffis B, St-Arnaud M, Hijri M. Petroleum hydrocarbon contamination, plant identity and arbuscular mycorrhizal fungal (AMF) community determine assemblages of the AMF spore-associated microbes. Environ Microbiol. 2016;18:2689–704.
Yergeau E, Bell TH, Champagne J, Maynard C, Tardif S, Tremblay J, Greer CW. Transplanting soil microbiomes leads to lasting effects on willow growth, but not on the rhizosphere microbiome. Front Microbiol. 2015;6:1436.
Brereton NJB, Ahmed F, Sykes D, Ray MJ, Shield I, Karp A, Murphy RJ. X-ray micro-computed tomography in willow reveals tissue patterning of reaction wood and delay in programmed cell death. BMC Plant Biol. 2015;15:83.
Noguchi A, Saito A, Homma Y, Nakao M, Sasaki N, Nishino T, Takahashi S, Nakayama T. A UDP-glucose: isoflavone 7-O-glucosyltransferase from the roots of soybean (glycine max) seedlings purification, gene cloning, phylogenetics, and an implication for an alternative strategy of enzyme catalysis. J Biol Chem. 2007;282:23581–90.
Fischer WN, Loo DD, Ludewig U, Boorer KJ, Tegeder M, Rentsch D, Wright EM, Frommer WB. Low and high affinity amino acid H+-cotransporters for cellular import of neutral and charged amino acids. Plant J. 2002;29:717–31.
Bago B, Pfeffer PE, Douds DD, Brouillette J, Bécard G, Shachar-Hill Y. Carbon metabolism in spores of the arbuscular mycorrhizal fungusGlomus intraradices as revealed by nuclear magnetic resonance spectroscopy. Plant Physiol. 1999;121:263–72.
Floß DS, Hause B, Lange PR, Kuester H, Strack D, Walter MH. Knock-down of the MEP pathway isogene 1-deoxy-D-xylulose 5-phosphate synthase 2 inhibits formation of arbuscular mycorrhiza-induced apocarotenoids, and abolishes normal expression of mycorrhiza-specific plant marker genes. Plant J. 2008;56:86–100.
Salzer P, Hubner B, Sirrenberg A, Hager A. Differential effect of purified spruce chitinases and [beta]-1, 3-glucanases on the activity of elicitors from ectomycorrhizal fungi. Plant Physiol. 1997;114:957–68.
Albrecht C, Burgess T, Dell B, Lapeyrie F. Chitinase and peroxidase activities are induced in eucalyptus roots according to aggressiveness of Australian ectomycorrhizal strains of Pisolithus sp. New Phytol. 1994;127:217–22.
Stefani FO, Tanguay P, Pelletier G, Piché Y, Hamelin RC. Impact of endochitinase-transformed white spruce on soil fungal biomass and ectendomycorrhizal symbiosis. Appl Environ Microbiol. 2010;76:2607–14.
Le Quéré A, Wright DP, Söderström B, Tunlid A, Johansson T. Global patterns of gene regulation associated with the development of ectomycorrhiza between birch (Betula pendula Roth.) and Paxillus involutus (Batsch) Fr. Mol Plant-Microbe Interact. 2005;18:659–73.
Plett JM, Daguerre Y, Wittulsky S, Vayssières A, Deveau A, Melton SJ, Kohler A, Morrell-Falvey JL, Brun A, Veneault-Fourrey C. Effector MiSSP7 of the mutualistic fungus Laccaria bicolor stabilizes the Populus JAZ6 protein and represses jasmonic acid (JA) responsive genes. Proc Natl Acad Sci. 2014;111:8299–304.
Leonardi M, Iotti M, Oddis M, Lalli G, Pacioni G, Leonardi P, Maccherini S, Perini C, Salerni E, Zambonelli A. Assessment of ectomycorrhizal fungal communities in the natural habitats of Tuber magnatum (Ascomycota, Pezizales). Mycorrhiza. 2013;23:349–58.
Sundaram S, Kim S, Suzuki H, Mcquattie C, Hiremath S, Podila G. Isolation and characterization of a symbiosis-regulated ras from the ectomycorrhizal fungus Laccaria bicolor. Mol Plant-Microbe Interact. 2001;14:618–28.
Matheny PB, Curtis JM, Hofstetter V, Aime MC, Moncalvo J-M, Ge Z-W, Yang Z-L, Slot JC, Ammirati JF, Baroni TJ. Major clades of Agaricales: a multilocus phylogenetic overview. Mycologia. 2006;98:982–95.
Kohler A, Kuo A, Nagy LG, Morin E, Barry KW, Buscot F, Canbäck B, Choi C, Cichocki N, Clum A. Convergent losses of decay mechanisms and rapid turnover of symbiosis genes in mycorrhizal mutualists. Nat Genet. 2015;47:410–5.
Sipilä TP, Keskinen A-K, Åkerman M-L, Fortelius C, Haahtela K, Yrjälä K. High aromatic ring-cleavage diversity in birch rhizosphere: PAH treatment-specific changes of IE 3 group extradiol dioxygenases and 16S rRNA bacterial communities in soil. ISME j. 2008;2:968–81.
Carmel-Harel O, Storz G. Roles of the glutathione-and thioredoxin-dependent reduction systems in the Escherichia coli and Saccharomyces cerevisiae responses to oxidative stress. Annu Rev Microbiol. 2000;54:439–61.
Nehls U, Wiese J, Guttenberger M, Hampp R. Carbon allocation in ectomycorrhizas: identification and expression analysis of an Amanita muscaria monosaccharide transporter. Mol Plant-Microbe Interact. 1998;11:167–76.
Fajardo López M, Dietz S, Grunze N, Bloschies J, Weiß M, Nehls U. The sugar porter gene family of Laccaria bicolor: function in ectomycorrhizal symbiosis and soil-growing hyphae. New Phytol. 2008;180:365–78.
Deshmukh S, Hückelhoven R, Schäfer P, Imani J, Sharma M, Weiss M, Waller F, Kogel K-H. The root endophytic fungus Piriformospora indica requires host cell death for proliferation during mutualistic symbiosis with barley. Proc Natl Acad Sci. 2006;103:18450–7.
Zuccaro A, Lahrmann U, Güldener U, Langen G, Pfiffi S, Biedenkopf D, Wong P, Samans B, Grimm C, Basiewicz M. Endophytic life strategies decoded by genome and transcriptome analyses of the mutualistic root symbiont Piriformospora indica. PLoS Pathog. 2011;7:e1002290.
Roy A, Hashmi S, Li Z, Dement AD, Cho KH, Kim J-H. The glucose metabolite methylglyoxal inhibits expression of the glucose transporter genes by inactivating the cell surface glucose sensors Rgt2 and Snf3 in yeast. Mol Biol Cell. 2016;27:862–71.
Martin F, Aerts A, Ahrén D, Brun A, Danchin E, Duchaussoy F, Gibon J, Kohler A, Lindquist E, Pereda V. The genome of Laccaria bicolor provides insights into mycorrhizal symbiosis. Nature. 2008;452:88–92.
Sillo F, Fangel JU, Henrissat B, Faccio A, Bonfante P, Martin F, Willats WG, Balestrini R. Understanding plant cell-wall remodelling during the symbiotic interaction between Tuber melanosporum and Corylus avellana using a carbohydrate microarray. Planta. 2016;244(2):347–59.
Veneault-Fourrey C, Kohler A, Morin E, Balestrini R, Plett J, Danchin E, Coutinho P, Wiebenga A, De Vries RP, Henrissat B. Genomic and transcriptomic analysis of Laccaria bicolor CAZome reveals insights into polysaccharides remodelling during symbiosis establishment. Fungal Genet Biol. 2014;72:168–81.
Peter M, Kohler A, Ohm RA, Kuo A, Krützmann J, Morin E, Arend M, Barry KW, Binder M, Choi C, et al. Ectomycorrhizal ecology is imprinted in the genome of the dominant symbiotic fungus Cenococcum geophilum. Nat Commun. 2016;7:12662.
Donaldson LA, Knox JP. Localization of cell wall polysaccharides in normal and compression wood of radiata pine: relationships with lignification and microfibril orientation. Plant Physiol. 2012;158:642–53.
Massicotte H, Ackerley C, Peterson R. Localization of three sugar residues in the interface of ectomycorrhizae synthesized between Alnus crispa and Alpova diplophloeus as demonstrated by lectin binding. Can J Bot. 1987;65:1127–32.
Giollant M, Guillot J, Damez M, Dusser M, Didier P, Didier E. Characterization of a lectin from Lactarius deterrimus (research on the possible involvement of the fungal lectin in recognition between mushroom and spruce during the early stages of mycorrhizae formation). Plant Physiol. 1993;101:513–22.
Ceccaroli P, Buffalini M, Saltarelli R, Barbieri E, Polidori E, Ottonello S, Kohler A, Tisserant E, Martin F, Stocchi V. Genomic profiling of carbohydrate metabolism in the ectomycorrhizal fungus Tuber melanosporum. New Phytol. 2011;189:751–64.
Shah F, Rineau F, Canbäck B, Johansson T, Tunlid A. The molecular components of the extracellular protein-degradation pathways of the ectomycorrhizal fungus Paxillus involutus. New Phytol. 2013;200:875–87.
Sundaram S, Brand JH, Hymes MJ, Hiremath S, Podila GK. Isolation and analysis of a symbiosis-regulated and Ras-interacting vesicular assembly protein gene from the ectomycorrhizal fungus Laccaria bicolor. New Phytol. 2004;161:529–38.
Rajashekar B, Kohler A, Johansson T, Martin F, Tunlid A, Ahrén D. Expansion of signal pathways in the ectomycorrhizal fungus Laccaria bicolor–evolution of nucleotide sequences and expression patterns in families of protein kinases and RAS small GTPases. New Phytol. 2009;183:365–79.
de Freitas PM, Betancourth BML, Teixeira JA, Zubieta MP, de Queiroz MV, Kasuya MCM, Costa MD, de Araújo EF. In vitro Scleroderma laeve and Eucalyptus grandis mycorrhization and analysis of atp6, 17S rDNA, and ras gene expression during ectomycorrhizal formation. J Basic Microbiol. 2014;54:1358–66.
Maes A, Gracia C, Brechemier D, Hamman P, Chatre E, Lemelle L, Bertin PN, Hajnsdorf E. Role of polyadenylation in regulation of the flagella cascade and motility in Escherichia coli. Biochimie. 2013;95:410–8.
Li H, Su H, Kim SB, Chang YK, Hong S-K, Seo Y-G, Kim C-J. Enhanced production of trehalose in Escherichia coli by homologous expression of otsBA in the presence of the trehalase inhibitor, validamycin A, at high osmolarity. J Biosci Bioeng. 2012;113:224–32.
Cabrera-Valladares N, Richardson A-P, Olvera C, Treviño LG, Déziel E, Lépine F, Soberón-Chávez G. Monorhamnolipids and 3-(3-hydroxyalkanoyloxy) alkanoic acids (HAAs) production using Escherichia coli as a heterologous host. Appl Microbiol Biotechnol. 2006;73:187–94.
Henkel M, Müller MM, Kügler JH, Lovaglio RB, Contiero J, Syldatk C, Hausmann R. Rhamnolipids as biosurfactants from renewable resources: concepts for next-generation rhamnolipid production. Process Biochem. 2012;47:1207–19.
Zhu K, Rock CO. RhlA converts beta-hydroxyacyl-acyl carrier protein intermediates in fatty acid synthesis to the beta-hydroxydecanoyl-beta-hydroxydecanoate component of rhamnolipids in Pseudomonas aeruginosa. J Bacteriol. 2008;190:3147–54.
Campos-Garcia J, Caro AD, Najera R, Miller-Maier RM, Al-Tahhan RA, Soberon-Chavez G. The Pseudomonas aeruginosa rhlG gene encodes an NADPH-dependent beta-ketoacyl reductase which is specifically involved in rhamnolipid synthesis. J Bacteriol. 1998;180:4442–51.
Fontenot EM, Ezelle KE, Gabreski LN, Giglio ER, McAfee JM, Mills AC, Qureshi MN, Salmon KM, Toyota CG. YfdW and YfdU are required for oxalate-induced acid tolerance in Escherichia coli K-12. J Bacteriol. 2013;195:1446–55.
Takanao S, Honma S, Miura T, Ogawa C, Sugimoto H, Suzuki K, Watanabe T. Construction and basic characterization of deletion mutants of the genes involved in chitin utilization by Serratia marcescens 2170. Biosci Biotechnol Biochem. 2014;78:524–32.
Toratani T, Shoji T, Ikehara T, Suzuki K, Watanabe T. The importance of chitobiase and N-acetylglucosamine (GlcNAc) uptake in N, N′-diacetylchitobiose [(GlcNAc) 2] utilization by Serratia marcescens 2170. Microbiology. 2008;154:1326–32.
Citterio B, Malatesta M, Battistelli S, Marcheggiani F, Baffone W, Saltarelli R, Stocchi V, Gazzanelli G. Possible involvement of Pseudomonas fluorescens and Bacillaceae in structural modifications of Tuber borchii fruit bodies. Can J Microbiol. 2001;47:264–8.
Bending GD, Poole EJ, Whipps JM, Read DJ. Characterisation of bacteria from Pinus sylvestris-Suillus luteus mycorrhizas and their effects on root-fungus interactions and plant growth. FEMS Microbiol Ecol. 2002;39:219–27.
Uroz S, Courty PE, Pierrat JC, Peter M, Buee M, Turpault MP, Garbaye J, Frey-Klett P. Functional profiling and distribution of the forest soil bacterial communities along the soil mycorrhizosphere continuum. Microb Ecol. 2013;66:404–15.
Sarand I, Timonen S, Nurmiaho-Lassila E-L, Koivula T, Haahtela K, Romantschuk M, Sen R. Microbial biofilms and catabolic plasmid harbouring degradative fluorescent pseudomonads in Scots pine mycorrhizospheres developed on petroleum contaminated soil. FEMS Microbiol Ecol. 1998;27:115–26.
Costa TR, Felisberto-Rodrigues C, Meir A, Prevost MS, Redzej A, Trokter M, Waksman G. Secretion systems in Gram-negative bacteria: structural and mechanistic insights. Nat Rev Microbiol. 2015;13:343–59.
Garnett JA, Martínez-Santos VI, Saldaña Z, Pape T, Hawthorne W, Chan J, Simpson PJ, Cota E, Puente JL, Girón JA. Structural insights into the biogenesis and biofilm formation by the Escherichia coli common pilus. Proc Natl Acad Sci. 2012;109:3950–5.
Lehti TA, Bauchart P, Heikkinen J, Hacker J, Korhonen TK, Dobrindt U, Westerlund-Wikström B. Mat fimbriae promote biofilm formation by meningitis-associated Escherichia coli. Microbiology. 2010;156:2408–17.
Otrelo-Cardoso AR, da Silva Correia MA, Schwuchow V, Svergun DI, Romão MJ, Leimkühler S, Santos-Silva T. Structural data on the periplasmic aldehyde oxidoreductase PaoABC from Escherichia coli: SAXS and preliminary X-ray crystallography analysis. Int J Mol Sci. 2014;15:2223–36.
Harborne NR, Griffiths L, Busby SJ, Cole JA. Transcriptional control, translation and function of the products of the five open reading frames of the Escherichia coli nir operon. Mol Microbiol. 1992;6:2805–13.
Boogerd FC, Ma H, Bruggeman FJ, van Heeswijk WC, García-Contreras R, Molenaar D, Krab K, Westerhoff HV. AmtB-mediated NH3 transport in prokaryotes must be active and as a consequence regulation of transport by GlnK is mandatory to limit futile cycling of NH 4+/NH 3. FEBS Lett. 2011;585:23–8.
Schrey SD, Schellhammer M, Ecke M, Hampp R, Tarkka MT. Mycorrhiza helper bacterium Streptomyces AcH 505 induces differential gene expression in the ectomycorrhizal fungus Amanita muscaria. New Phytol. 2005;168:205–16.
Sen R, Nurmiaho-Lassila E, Haahtela K, Korhonen K. Specificity and mode of primary attachment of Pseudomonas fluorescens strains to the cell walls of ectomycorrhizal fungi. Mycorrhizas in integrated systems: from genes to plant development ECSC-EC-EAEC. Brussels: ECSC-EC-EAEC Press; 1996. p. 661–4.
Leveau JH, Preston GM. Bacterial mycophagy: definition and diagnosis of a unique bacterial-fungal interaction. New Phytol. 2008;177:859–76.
Lehembre F, Doillon D, David E, Perrotto S, Baude J, Foulon J, Harfouche L, Vallon L, Poulain J, Da Silva C, et al. Soil metatranscriptomics for mining eukaryotic heavy metal resistance genes. Environ Microbiol. 2013;15:2829–40.
Brosche M, Vinocur B, Alatalo ER, Lamminmaki A, Teichmann T, Ottow EA, Djilianov D, Afif D, Bogeat-Triboulot MB, Altman A, et al. Gene expression and metabolite profiling of Populus euphratica growing in the Negev desert. Genome Biol. 2005;6(12):R101.
Nanjo T, Sakurai T, Totoki Y, Toyoda A, Nishiguchi M, Kado T, Igasaki T, Futamura N, Seki M, Sakaki Y, et al. Functional annotation of 19,841 Populus nigra full-length enriched cDNA clones. BMC Genomics. 2007;8:448.
Sterky F, Bhalerao RR, Unneberg P, Segerman B, Nilsson P, Brunner AM, Charbonnel-Campaa L, Lindvall JJ, Tandre K, Strauss SH, et al. A Populus EST resource for plant functional genomics. Proc Natl Acad Sci U S A. 2004;101:13951–6.
Ralph S, Oddy C, Cooper D, Yueh H, Jancsik S, Kolosova N, Philippe RN, Aeschliman D, White R, Huber D. Genomics of hybrid poplar (Populus trichocarpa× deltoides) interacting with forest tent caterpillars (Malacosoma disstria): normalized and full-length cDNA libraries, expressed sequence tags, and a cDNA microarray for the study of insect-induced defences in poplar. Mol Ecol. 2006;15:1275–97.
Varshney RK, Hiremath PJ, Lekha P, Kashiwagi J, Balaji J, Deokar AA, Vadez V, Xiao YL, Srinivasan R, Gaur PM, et al. A comprehensive resource of drought- and salinity-responsive ESTs for gene discovery and marker development in chickpea (Cicer arietinum L.). BMC Genomics. 2009;10:523.
Gupta V, Raghuvanshi S, Gupta A, Saini N, Gaur A, Khan MS, Gupta RS, Singh J, Duttamajumder SK, Srivastava S, et al. The water-deficit stress- and red-rot-related genes in sugarcane. Funct Integr Genomics. 2010;10:207–14.
Lambilliotte R, Cooke R, Samson D, Fizames C, Gaymard F, Plassard C, Tatry MV, Berger C, Laudié M, Legeai F. Large-scale identification of genes in the fungus Hebeloma cylindrosporum paves the way to molecular analyses of ectomycorrhizal symbiosis. New Phytol. 2004;164:505–13.
Karim N, Shibuya H, Kikuchi T. Analysis of expressed sequence tags from the wood-decaying fungus Fomitopsis palustris and identification of potential genes involved in the decay process. J Microbiol Biotechnol. 2011;21:347–58.
Spribille T, Tuovinen V, Resl P, Vanderpool D, Wolinski H, Aime MC, Schneider K, Stabentheiner E, Toome-Heller M, Thor G. Basidiomycete yeasts in the cortex of ascomycete macrolichens. Science. 2016;353:488–92.
We thank the Genome Quebec Innovation Centre for the support and Calcul Quebec for the computing resources.
The project was funded by the GenoRem Project (Genome Canada and Genome Québec), NSERC Strategic grant for projects (STPGP 494702) as well as BioFuelNet Canada and NCE (Networks of 1583 Center of Excellence).
Availability of data and materials
Sequence data from this article can be found in the European Nucleotide Archive online repository (PRJEB16316).
Authors and Affiliations
Canadian Center for Computational Genomics, McGill University and Genome Quebec Innovation Center, Montréal, H3A 1A4, Canada
Department of Human Genetics, McGill University, Montreal, H3A 1B1, Canada
Institut de recherche en biologie végétale, University of Montreal, Montreal, QC, H1X 2B2, Canada
F. E. Pitre, J. Marleau, M. St-Arnaud, M. Labrecque, S. Joly & N. J. B. Brereton
Montreal Botanical Garden, Montreal, QC, H1X 2B2, Canada
F. E. Pitre, M. St-Arnaud, M. Labrecque & S. Joly
Aquatic and Crop Resource Development (ACRD), National Research Council Canada, Montréal, QC, H4P 2R2, Canada
A. P. Pagé
Department of Agri-food and Environmental Science, University of Florence, Viale delle Idee, Sesto Fiorentino, FI, Italy
W. Guidi Nissim
Institut National de la Recherche Scientifique, Centre INRS–Institut Armand-Frappier, Laval, QC, Canada
EY, MSA, SJ, ML and FP conceived and designed the study. JM, WGN and AP performed the plant growth trials, sample and sequencing preparation. EG and NJBB analysed the data and drafted the manuscript. All authors commented on and approved the final manuscript.
Bacteria DE transdecoded (annotation of polycistronic contigs) spreadsheet. (XLSX 2104 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Gonzalez, E., Pitre, F.E., Pagé, A.P. et al. Trees, fungi and bacteria: tripartite metatranscriptomics of a root microbiome responding to soil contamination.
Microbiome6, 53 (2018). https://doi.org/10.1186/s40168-018-0432-5