A viability-linked metagenomic analysis of cleanroom environments: eukarya, prokaryotes, and viruses
- Thomas Weinmaier†1,
- Alexander J. Probst†2,
- Myron T. La Duc3, 4,
- Doina Ciobanu5,
- Jan-Fang Cheng5,
- Natalia Ivanova5,
- Thomas Rattei1 and
- Parag Vaishampayan3
© Weinmaier et al. 2015
Received: 1 September 2015
Accepted: 29 October 2015
Published: 8 December 2015
Recent studies posit a reciprocal dependency between the microbiomes associated with humans and indoor environments. However, none of these metagenome surveys has considered the viability of constituent microorganisms when inferring impact on human health.
Reported here are the results of a viability-linked metagenomics assay, which (1) unveil a remarkably complex community profile for bacteria, fungi, and viruses and (2) bolster the detection of underrepresented taxa by eliminating biases resulting from extraneous DNA. This approach enabled, for the first time ever, the elucidation of viral genomes from a cleanroom environment. Upon comparing the viable biomes and distribution of phylotypes within a cleanroom and adjoining (uncontrolled) gowning enclosure, the rigorous cleaning and stringent control countermeasures of the former were observed to select for a greater presence of anaerobes and spore-forming microflora. Sequence abundance and correlation analyses suggest that the viable indoor microbiome is influenced by both the human microbiome and the surrounding ecosystem(s).
The findings of this investigation constitute the literature’s first ever account of the indoor metagenome derived from DNA originating solely from the potentially viable microbial population. Results presented in this study should prove valuable to the conceptualization and experimental design of future studies on indoor microbiomes aimed at inferring impact on human health.
Over the past decade, numerous studies have reported correlations (of varying strengths and significance) between the microbial communities inhabiting indoor environments and the human microbiome. Most recently, Brooks et al. reported that microbes regularly found in hospitals were capable of colonizing infant guts and could profoundly affect human health. In addition, 16S rRNA gene analysis has been used to show that indoor environments accumulate potential human pathogens in much greater numbers than their surrounding outdoor environments. However, the composition of a given indoor microbiome has also been reported as being strongly influenced by both the architecture and control parameters (e.g., humidity, temperature, airflow, ventilation) of that particular facility. Capitalizing on antimicrobial attributes inherent in architectural design and control logistics is relevant and important to numerous industries, from hospitals to pharmaceutical, microprocessor, and spacecraft manufacturing.
Spacecraft hardware is assembled in controlled cleanroom environments. External to the actual cleanroom, there is an uncontrolled gowning area, i.e., a room in which personnel change into cleanroom garments and make preparations to enter the cleanroom. Due to the elevated extent of human activity, this enclosure is thought to be strongly influenced by the human microbiome. The cleanroom itself has previously been posited as representing an extreme environment, characterized by rigorous cleaning and bioburden control regimens, controlled humidity (45 ± 5 %) and temperature (25 °C), and a paucity of available nutrients. As a proactive measure to monitor cleanliness and ensure mission integrity, researchers have been diligently cataloging the diverse microbial populations detected about spacecraft and their assembly facilities for decades. Therefore, the indoor microbiome pertaining to spacecraft assembly cleanrooms represents one of the best-studied indoor microbiomes in the literature. The microbial signatures held in this collection were recovered by both cultivation and 16S rRNA gene sequencing [5–11]. As is the case for many other environmental settings, cultivation-based analyses lack the resolution required to capture the entire breadth of microbial diversity housed in indoor environments. It has been estimated that a mere fraction of all microorganisms on Earth are capable of being cultivated in the laboratory. This is due, in large part, to an insufficient understanding of microbial metabolism, interactions (e.g., quorum sensing, symbiosis), and dormancy (e.g., viable but not cultivable status). Ribosomal RNA gene sequence analysis allows for a much higher resolution of microbial diversity profiles than cultivation, despite being limited by primer bias and the generation of phylogenetic information only (no direct metabolic inference).
Consequently, environmental genomics based on nucleic acid targets has become an attractive technique for maximizing the coverage of microbial community profiles from indoor environments. However, these DNA-based techniques are incapable of distinguishing viable from dead microbial cells in the samples.
Controlled indoor microbiomes are influenced by several factors, including but not limited to routine facility maintenance and cleaning regimens, periodic acute bioburden reduction efforts (e.g., UV lights, vapor-phase H2O2), controlled humidity and temperature, and a paucity of available nutrients. Consequently, not all microbes can withstand the conditions they encounter in such environments. Recently, the findings of a 16S rRNA gene amplicon study conducted on cleanroom samples suggested that less than 10 % of the observed microbial signatures originated from living microorganisms. This work exploited the viability marker propidium monoazide (PMA), which is able to enter only microbial cells that have a compromised cell membrane. Once inside the compromised cell, PMA binds covalently to DNA molecules, thereby precluding downstream PCR amplification and detection. Previous studies convincingly demonstrated that surveys on microbiomes targeting nucleic acid signatures (e.g., 16S rRNA gene amplicon analysis or metagenomics) sans live/dead chemical markers fail to provide any information on the physiology or viability of the microorganisms from which the detected nucleic acids originated [10, 15, 16]. Consequently, metagenomic analyses based on total environmental DNA extracts do not render a meaningful understanding of the metabolic and/or functional characteristics of living microorganisms in indoor environments.
To overcome this hurdle in indoor microbiome research, we augmented, for the first time ever, metagenomic sequencing with the PMA-based viability assay. This enabled a comprehensive examination of the versatile genetic potential of living biological communities in indoor environments. The results and inferences generated in this study underscore the importance of live/dead chemical markers in studying controlled ecosystems. The experimental design and impactful insights presented here empower the conceptualization and execution of ongoing and future investigations of the indoor microbiome and its impact on human health.
Results and discussion
The viable indoor metagenome encompasses eukaryotes, bacteria, and viruses
No archaeal signatures were observed in the original metagenomic dataset. While archaea are known to colonize human skin and are thus readily introduced to indoor environments via shedding, the impact of their presence in spacecraft-associated cleanroom environments may have been overestimated in the past [6, 10, 19]. To date, studies have failed to show any evidence in support of archaea actively contributing to cleanroom environments, or posing any threat to cleanroom endeavors. At this time, therefore, archaea cannot be viewed as constituting a significant portion of the cleanroom microbiome.
Taxonomic assignments of metagenomic reads were compared to those presented in Mahnert et al., a study based on 16S rRNA amplicon sequencing of the very same samples (Additional file 2: Table S2). In both studies, Acinetobacter spp. were observed in very high abundance in the spacecraft assembly facility (SAF) and gowning area (GA) samples. Also congruent between the two investigations was the elevated abundance of Staphylococcus signatures in GA samples. The high abundance of Bacilli in SAF samples observed in the current study was not reported by Mahnert and co-workers. The differences in signature composition recorded between the two studies likely stem from subtle differences in sample preparation, possible primer bias in the PCR reactions, and the sampling of viral as well as eukaryotic DNA in the metagenomic analyses. While 16S rRNA gene amplicon sequencing can detect low-abundance taxa such as archaea, metagenomic approaches are able to resolve a much more comprehensive picture of the cleanroom biome, particularly of its abundant community members.
Genome reconstruction provides first ever evidence for the presence of viruses in the cleanroom environment
The taxonomic analysis of the metagenomes generated in this study identified a number of different viruses present in the samples. Two phages were detected, a Phi29-like virus and an unclassified Siphoviridae. In addition, several viruses associated with humans or other eukaryotes were detected, namely human herpesvirus 4, Cyclovirus TN12, Dragonfly cyclovirus 2, Hypericum japonicum-associated DNA virus, various Fecal-associated gemycircularviruses, and the Meles meles fecal virus.
The increased incidence of viral detection in PMA-treated samples is an intriguing finding, one which suggests that PMA preferentially selects for virions having an intact capsid. Another possibility is that certain phages incorporated themselves into the genomes of viable microorganisms as prophages. If this were indeed the case, however, one would expect to observe an elevated infection rate in the microorganisms that were viable. Unless demonstrated otherwise, the authors opine that such a phenomenon would stand in stark contrast to the actual function of viruses (infection and killing of the host). Ergo, we conclude that PMA treatment likely favors the detection of virions with intact capsids.
Indoor biomes are influenced by both the surrounding ecosystem and the human microbiome
Evaluating the bacterial diversity associated with cleanrooms via sequencing of 16S rRNA genes has led to two strong yet opposing opinions. Initial analyses of geographically distinct cleanrooms suggested that associated microbiomes were largely dependent on the surrounding ecosystem [5, 22, 23]. However, recent studies have claimed more and more congruency between the cleanroom microbiome and the human microbiome, though concrete evidence beyond 16S rRNA gene profile similarity remains elusive [7, 24, 25]. Considering that variation exists in the human skin microbiome due to differences in the biogeographical characteristics of people, the observed geographic dissimilarity of cleanroom microbiomes could be attributed to variability resulting from different personnel working in the cleanrooms.
Functional and taxonomic complexity of the viable cleanroom microbiome
Understanding the functional potential of the biological communities inhabiting cleanrooms is of importance to a number of industries, including medical, pharmaceutical, superconductor, and space exploration. Those charged with creating, imposing, and enforcing planetary protection policies and requirements have recently come to appreciate the resolving power of innovative molecular strategies to taxonomically and functionally characterize the microbial populations associated with the cleanrooms in which spacecraft are assembled. These endeavors help better estimate the risk of transporting life to foreign celestial bodies, as well as the probability of terrestrial microbiota surviving spaceflight and/or another celestial environment.
The variation observed across taxonomic clades and the influences of different ecosystems on the cleanroom microbiome suggest a fairly complex biological community. This is most likely a consequence of stochastic introduction of microorganisms to the cleanroom facility via the surrounding ecosystem and the shedding of skin from different personnel. Generally speaking, the skin microbiome has been shown to be dependent on the biogeography of the individual, which adds yet another level of complexity to the cleanroom ecosystem. A rank-abundance curve based on read abundances (Fig. 2) suggested a fairly simple community, with Bacillus and Clostridiales highly abundant in PMA-treated SAF samples (Fig. 2a) and the fungus Leotiomyceta dominant in PMA-treated GA samples (Fig. 2b). However, this analysis was somewhat limited in that it was predicated on genus, i.e., each genus represented numerous organisms, and thus an array of different genomes. For instance, at least 15 and 34 operational taxonomic units were reported for the highly abundant genera Bacillus and Clostridium, respectively, by another parallel study (data based on 16S rRNA gene amplicons of the very same samples). These genera are thus representative of at least 15 and 34 different genomes. This observed variability in constituent microbial species, coupled with the detection of various highly abundant eukaryotes (Amoebozoa and fungi) having larger and more complex genomes, leads the authors to conclude that the viable contingent of the cleanroom microbiome is considerably more complex than previously estimated. This complexity hampers sequence assembly for genome reconstruction, as has also been observed for the skin microbiome. Future investigations will necessitate substantially deeper sequencing than has been performed here, together with very recently developed metagenomic tools that allow resolution at strain level.
Genetic evidence for fermentative and respiratory processes was inferred from KEGG annotations. Lactate and alcohol dehydrogenases detected in the metagenome may enable growth of microbes under oxygen-limited conditions via substrate-level phosphorylation. Anaerobic respiration was inferred from genes that encode nitrate and nitrite reductases. Energy generation via respiratory processes may have occurred via NADH dehydrogenases, cytochrome oxidases, and ATP synthases annotated in the metagenome.
Carbon metabolism was inferred from the detection of genes encoding enzymes involved in glycolysis and the TCA cycle. These metabolic processes not only generate ATP but also NADH, which is re-oxidized by either fermentative or respiratory processes (see above). Autotrophic metabolisms were inferred from the detected presence of ATP citrate lyase, a key enzyme for carbon fixation in bacteria operating the reverse TCA cycle. Also found were genes annotated as small subunits of the RuBisCO gene. Although this enzyme’s catalytic subunit is localized on the large chain (i.e., encoded on the marker gene), the presence of the small subunit of the most important enzyme in the Calvin-Benson-Bassham cycle suggests that some organisms may be able to fix carbon dioxide via this pathway. In oligotrophic cleanroom environments, the only readily available source of carbon for microbial proliferation is atmospheric CO2, rendering carbon fixation a particularly attractive strategy for the continued persistence and outgrowth of contaminant microorganisms. This metabolic capability has previously been reported in a handful of microbes isolated from cleanrooms. With respect to extraterrestrial environments targeted by future space exploration efforts, organic carbon is most likely limited and autotrophy might very well be the only type of metabolism capable of furnishing hitchhiking microbes with the molecular building blocks required to survive and proliferate.
Cleanroom maintenance significantly affects microbiome structure
The total microbiome and viable contingent thereof have very different taxonomic and functional features
In this investigation, PMA treatment was shown to dramatically alter the structure of recovered biomes, at various levels. At the community level, this viability assay significantly affected the entire, bacterial, eukaryotic, fungal, and viral communities irrespective of the metric applied (binary or abundance; Fig. 4). Permutational MANOVA (PERMANOVA) and multiresponse permutation procedure (MRPP) tests showed high congruency with each other and across the taxonomic groupings tested (genus level and family level). The greatest chance-corrected within-group agreements (Fig. 4a) of all of the tests performed were those for viability assay versus total biome of both fungi and viruses. These two taxonomic groups appeared to be very sensitive to PMA treatment, confirming that PMA pretreatment did in fact affect the detectability of community members other than bacteria. PMA chemistry has previously been used to discern viable from dead fungi [28, 29] and viruses [30–33]. The authors are aware, however, that PMA-based viability assays are limited in their ability to accurately distinguish viable spores and archaea from their expired counterparts. This limitation might very well explain why no archaea were detected in any of the PMA-treated samples. However, Mahnert et al. detected archaeal signatures at very low abundance in the very same samples via amplicon sequencing. Future experiments might benefit from co-treatment with dithiothreitol, to promote the penetration of PMA into inactivated spores. Whether or not “all” non-viable cells are precluded from downstream molecular detection remains a point of heated debate, and concrete evidence one way or the other continues to elude the PMA community. Although signatures of the human genome decreased significantly after viability treatment (paired Student’s t test, p value 0.025, Fig. 1), some were still found in treated samples.
Since it can be assumed that none of the human cells found in these indoor environmental samples are viable, PMA must sometimes struggle to permeate the (thick) glycocalyx enveloping these cells. On the other hand, human cells may still have an intact cell membrane and thus escape PMA treatment. Nevertheless, the PMA chemistry appeared remarkably effective at manipulating the bias of a molecular reaction (and an entire investigation for that matter) in a favorable manner, i.e., towards the viable community members of interest and away from dead cells and large amounts of human DNA.
On the taxon level (Fig. 5a), the abundance of numerous genera decreased significantly when treated with PMA, while unclassified Aspergillaceae and unclassified Coxiellaceae increased markedly. This effect on Aspergillaceae is of particular interest, as these organisms have been shown to affect human health in indoor environments. The ability to more accurately gauge the abundance of these and other pathogens sans artifacts and bias resulting from the DNA of dead cells should be of interest to health and medical professionals. The authors believe that the enabling capabilities made possible by PMA treatment (resolution of functional, viral, and eukaryotic nucleic acid signatures) largely outweigh the limitations of such treatment on endospores. Therefore, we recommend the augmentation of viability assays as a complement to non-PMA treatment whenever screening for the taxonomic signatures of potentially viable pathogenic organisms. The viable biome resulting from cleanroom samples was far more laden with unclassified Clostridia signatures yet significantly depleted in unclassified Coxiellaceae signatures. This could be a consequence of the physiological flexibility (i.e., anaerobic growth and endospore formation) of Clostridia. The entire (i.e., viable + dead) biome resulting from cleanroom samples also exhibited a reduced abundance of unclassified Coxiellaceae and unclassified Bacilli signatures, whereas the abundance of unclassified Rhizobiales and unclassified Alphaproteobacteria increased significantly.
Changes in functional genetic potential were evaluated at the pathway level (Fig. 5b) while also considering KEGG orthologs (KO; Additional file 5: Figure S3). Functional differences between KO were often observed in pathways whose signature abundances were significantly altered. The detected abundance of most pathway signatures decreased in the viable biome (Fig. 5b, PMA vs. non-PMA). Focusing on cleanroom samples, the viability assay resulted in a slight increase in cell communication signatures and a marked decrease in genes involved in regulation of autophagy, signaling, genetic information processing, and pyruvate and nucleotide metabolism (Fig. 5b, SAF: PMA vs. non-PMA). Of all of the PMA-treated samples analyzed, cleanroom samples were markedly depleted in peroxisome and pyruvate metabolism pathways (Fig. 5b, PMA: SAF vs. GA).
In conclusion, the results of analyses of taxonomic and functional variability indicate that the gowning area harbors more strictly aerobic and non-spore-forming taxa, while the cleanroom is richer in facultative and obligate anaerobes and spore-forming taxa. These results are in good agreement with findings presented in . Also, the functional profile of the cleanroom biome suggests that this population might be less dependent on oxygen for energy generation and slightly more amenable to other sources, such as nitrogen. Focusing on the viable portion of the microbial community is advantageous for many reasons. Quelling the DNA molecules originating from dead cells imposes a bias in favor of detecting signatures arising from the viable cells of interest. This is of immense importance as researchers attempt to accurately infer microbiome composition and/or function from a given biotope. In natural environments not undergoing drastic changes, the majority of microorganisms exist in an active, viable state, while the majority of signatures recovered from indoor microbiomes, cleanrooms in particular, originate from non-viable microorganisms. Understanding the natural status (i.e., viable vs. non-viable) of source organisms is crucial when inferring risk to human health from environmental samples (intensive care units) via nucleic acid-based analyses. Results convincingly demonstrate that the cleanroom microbiome consists of bacteria, eukaryotes, and even viruses, and as such, is much more complex than was previously posited. Adding to this complexity, at least in part, is an appreciable reciprocal dependency on the human microbiome. The work described here provides a well-established infrastructure for future studies centered on the indoor microbiome and should prove of significant relevance to those interested in epidemiology, pharmaceutical manufacturing and packaging, and operating theater cleanliness or human health in general.
Collectively, the experimental design, molecular techniques, and conclusions discussed here constitute the scientific literature’s first ever functional and taxonomic characterization of the viable indoor biome.
Samples were collected from floors of the Jet Propulsion Laboratory’s Spacecraft Assembly Facility (SAF; Pasadena, CA) and adjacent gowning area (GA) via wet surface wiping with biological sampling kits (BiSKit; QuickSilver Analytics, Abingdon, MD), as previously described. In total, ten samples were collected from the SAF and three samples were collected from the adjacent GA (1 m² each), all in triplicate. Negative controls (sterile PBS prewash of all sampling kits), handling controls (sampling kits briefly exposed to the ambient sampling environment), and other reagent controls (PBS, DNA extraction reagents) were also prepared. None of the control samples yielded enough DNA to construct metagenome libraries. Hence, these control samples were not considered for sequencing or any further analysis. The SAF is a Class 100K certified cleanroom per Fed-Std-209E (equivalent ISO 14644-1 Class 8), within which spacecraft hardware was actively being assembled at the time that samples were collected.
To minimize microbial contamination of the SAF floor, an all-purpose cleaning and degreasing agent (Kleenol 30, Accurate Industrial Supply, Inc., Cerritos, CA, Cat #: J-CC-00040) was routinely applied by maintenance personnel. Cleanroom surfaces were cleaned twice a day while spacecraft hardware was present and undergoing assembly. In addition, the cleanroom portion of the facility was maintained with stringent protocols geared towards minimizing the influx of particulate matter, including HEPA filtration, the routine replenishment of tacky mats at points of ingress/egress, and daily vacuuming and mopping of floors. Prior to entering the cleanroom, personnel were required to take necessary precautions in the gowning area, including the donning of cleanroom garments, the gloving of hands, and the taping of gloves to garments. Hence, the gowning area was also sampled as a means of evaluating the extent to which microbes gain entry into the cleanroom via this portal.
Sample volumes were extracted from each BiSKit device in accordance with manufacturer-provided protocols. Biological materials from each 45 ml sample were concentrated with Amicon Ultra-50 Ultracel centrifugal filter tubes (Millipore, Billerica, MA). Each filter unit, having a molecular mass cutoff of 50 kDa, facilitated the concentration of cells, spores, and nucleic acid fragments greater than 100 bp. All concentrated samples (1 ml final) were divided into two separate 500 μl fractions, one to be treated with PMA prior to analysis (viability assessment), and the other to serve as a null environmental sample (viable + non-viable, i.e., total DNA).
Each 500 μl aliquot of filter-concentrated sample suspension to undergo viability assessment was treated with PMA (2 mM; Biotium, Inc., Hayward, CA) to a final concentration of 50 μM [16, 38], mixed thoroughly, and incubated in the dark for 5 min at room temperature. Tubes were inverted 5–6 times manually during the incubation to promote homogeneous PMA exposure. Both PMA-treated and non-treated samples were subjected to PMA photoactivation at room temperature for 15 min using an LED light source (λ = 464–476 nm, 60 W; PhAST Blue, GenIUL, Barcelona, Spain). To facilitate recovery of the broadest spectrum of DNA molecules possible, one-half of the volume of each sample (250 μl) was subjected to bead beating in Lysing Matrix E tubes (60 s at 10 m/s) on a FastPrep®-24 (MP Biomedicals, Solon, OH, USA). Following agitation, respective sample fractions were combined (500 μl) and subjected to automated DNA extraction in a Maxwell® 16 instrument, in accordance with the manufacturer’s instructions (Promega, Madison, WI). The DNA extracts resulting from the ten cleanroom samples were then pooled, as were those from the three gowning area samples. As samples were collected in triplicate from each sampling location, processing in this manner resulted in three representative samples each from the cleanroom and gowning area.
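The PMA dosing step reduces to a simple dilution calculation. The following minimal Python sketch is an illustration, not part of the published protocol: it estimates how much 2 mM stock would bring a 500 μl aliquot to 50 μM. The function name is hypothetical, and counting the added stock in the final volume is an assumption of this sketch.

```python
def stock_volume_ul(stock_um: float, final_um: float, sample_ul: float) -> float:
    """Volume of stock (in microliters) to add to a sample so that the
    mixture reaches the target final concentration.

    Solves stock * v = final * (sample + v) for v, i.e. the added stock
    volume is counted in the final volume (an assumption of this sketch).
    """
    if not 0 < final_um < stock_um:
        raise ValueError("final concentration must be positive and below the stock")
    return final_um * sample_ul / (stock_um - final_um)


# Values taken from the text: 2 mM (2000 uM) stock, 50 uM final, 500 ul aliquot.
volume = stock_volume_ul(stock_um=2000.0, final_um=50.0, sample_ul=500.0)
print(f"Add about {volume:.1f} ul of PMA stock")
```

If the added volume is ignored (plain C1V1 = C2V2 against 500 μl), the estimate is 12.5 μl; the two conventions differ by well under a microliter at this dilution.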
All manipulations were performed in a bleach-cleaned biohood, which resided in an ultra-clean laboratory environment (i.e., single-use lab coats, bleached gloves, booties, etc.). Each sample was divided into 1 μl aliquots, which were amplified via Multiple Displacement Amplification (MDA) using Repli-g single-cell whole genome amplification kit (Qiagen part #150345) according to the manufacturer’s instructions. Reaction mixture consisted of Phi29 Reaction Buffer (1X final concentration), 50 ng in hexamers with phosphorothioate modification of the two 3’-terminal nucleotides (IDT), 0.4 mM dNTP, 5 % DMSO (Sigma), 10 mM DTT (Sigma), 100 U Phi29, and 0.5 μM Syto 13 (Invitrogen) in a final volume of 15 μl. A master mixture of MDA reagents was prepared and subsequently dispensed into Safe-Lock 1.5 ml clear microcentrifuge tubes (Eppendorf). Syto 13 was omitted from the master mixture as it is easily degraded by UV radiation. All plastic ware, water, lysis, and stop buffer were UV treated in a Stratalinker 2400 UV Crosslinker (Stratagene) with 254-nm UV for 30 to 90 min on ice. This represents a UV dose range of 5.7 to 17.1 J/cm², calculated by measuring the distance from inside the tubes to the light bulb (4 cm). Following UV irradiation, master mixture was augmented with Syto 13 and dispensed into each well of a 384-well plate. MDA reactions were real-time monitored and stopped when sample amplification reached saturation.
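The reported UV dose range follows from dose = irradiance × exposure time. The sketch below is an illustrative check rather than the authors' calculation: it back-computes the irradiance implied by the stated 5.7 J/cm² at 30 min and confirms that the same irradiance reproduces the 90 min figure. The irradiance is not given in the text and is derived here only from those two numbers; the function name is hypothetical.

```python
def uv_dose_j_per_cm2(irradiance_mw_per_cm2: float, minutes: float) -> float:
    """UV dose in J/cm2 for a constant irradiance (mW/cm2) over an exposure time."""
    return irradiance_mw_per_cm2 / 1000.0 * minutes * 60.0


# Irradiance implied by the reported lower bound (5.7 J/cm2 after 30 min).
# This value is an inference from the text, not a measured figure.
implied_irradiance = 5.7 / (30 * 60) * 1000.0  # mW/cm2

dose_30 = uv_dose_j_per_cm2(implied_irradiance, 30)
dose_90 = uv_dose_j_per_cm2(implied_irradiance, 90)
print(f"implied irradiance: {implied_irradiance:.2f} mW/cm2")
print(f"dose after 30 min: {dose_30:.1f} J/cm2, after 90 min: {dose_90:.1f} J/cm2")
```

Both reported bounds are consistent with a single constant irradiance of roughly 3.2 mW/cm² at the stated 4 cm distance.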
Amplified fractions of each sample were combined, and this pooled DNA product (100 μl) was sheared using a Covaris E210 instrument (Covaris, Woburn, MA) set to 10 % duty cycle, intensity 5, and 200 cycles per burst for 1 min. The concentration and fragment size of each sheared product was determined using Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA) in accordance with the manufacturer’s recommended conditions. The sheared DNA was end-repaired, A-tailed, and ligated to Illumina adaptors according to standard Illumina (Illumina, San Diego, CA) PE protocols. The concentration of the resulting Illumina-indexed libraries was again determined using Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). JPL samples GA-A, GA-B, GA-C, GA-A + PMA, GA-B + PMA, and GA-C + PMA were pooled into one library; JPL samples SAF-A, SAF-B and SAF-C, SAF-A + PMA, SAF-B + PMA, SAF-C + PMA were pooled into a second library. In this context, “pooling” refers to the barcoding and multiplexing of numerous sample sets into a single library. The pooled libraries were normalized to a final concentration of 400 mM each, and the primary bands corresponding to the sizes were gel-purified and dissolved in 30 μl TE. One flow-cell was generated from each pooled library, which was subsequently subjected to sequencing in an Illumina MiSeq instrument, in accordance with manufacturer-provided protocols. The raw sequence data are available within IMG/M (http://img.jgi.doe.gov/).
Sequence data analysis
MiSeq-generated paired-end reads 250 bp in length were merged using PEAR software (default parameters), and both the merged reads and each of the non-merged reads (forward and reverse) were retained. FastQC was used to determine the base quality throughout the reads, and all merged and non-merged reads were processed using prinseq-lite with the parameters “-min_len 100 -trim_qual_right 20 -trim_qual_left 20 -trim_left 8”. Adapter sequences and overrepresented homooligonucleotides were identified with FastQC and removed using Cutadapt. The remaining high-quality reads were mapped against the genome of the Illumina positive sequencing control, Bacteriophage PhiX174, and a JGI standard collection of potential contaminant genomes (Additional file 6: Table S3) using the BBMap short read aligner. Any reads matching any of these contaminant genomes were removed from the dataset. Remaining high-quality, non-contaminant reads were assembled using the Velvet, Ray Meta, and IDBA-UD assembly tools. The assemblies resulting from each of the three tools were of low value (largest contig, 2–17 Kb; N50, 0.6–1 Kb; coverage, 1–4), and as such, all subsequent analyses were based on unassembled read data.
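The N50 statistic used to judge assembly quality can be computed from contig lengths alone. The following is a minimal sketch of the standard definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly size). It is not taken from any of the assemblers named above, and the example lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the cumulative length first covers at least half of the
    total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("N50 is undefined for an empty assembly")


# Hypothetical contig lengths in bp, for illustration only.
example_contigs = [100, 200, 300, 400, 500]
print(n50(example_contigs))
```

An N50 below 1 Kb, as reported here, means that half of the assembled bases sit in contigs shorter than a typical gene, which is why the analysis falls back to unassembled reads.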
All high-quality, non-contaminant reads were compared against the NCBI non-redundant database (NR) using RAPSearch2, and results were imported into MEGAN (min score, 80). Read counts per taxon were exported for family and genus level, as were counts for functional assignments against KEGG on KEGG ortholog (KO) and pathway level. This represented the total abundance dataset. For various sample groups, genus level taxon abundances were summed, ranked, and normalized based on the total abundance of all taxa in the respective group of samples. The top ten taxa in each group of samples were then plotted in a rank-abundance curve.
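The sum, rank, and normalize procedure behind the rank-abundance curve can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' script: the function name, the dictionary input format, and the read counts are all hypothetical.

```python
from collections import Counter


def rank_abundance(samples, top_n=10):
    """Sum genus-level read counts over a group of samples, normalize by the
    total abundance of all taxa in the group, and return the top_n genera
    as (genus, relative_abundance) pairs in descending order.
    """
    totals = Counter()
    for counts in samples:
        totals.update(counts)  # adds per-genus counts across samples
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    # Sort by descending count; ties are broken alphabetically for stability.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [(genus, count / grand_total) for genus, count in ranked[:top_n]]


# Hypothetical read counts for two samples of one group.
group = [
    {"Bacillus": 50, "Acinetobacter": 30},
    {"Bacillus": 10, "Staphylococcus": 10},
]
for genus, fraction in rank_abundance(group, top_n=10):
    print(f"{genus}\t{fraction:.2f}")
```

Normalizing by the total of all taxa, not just the top ten, keeps the plotted fractions comparable between sample groups of different sequencing depth.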
Bacterial taxa were compared to the results from Mahnert et al., which are based on the same samples. Genus level taxon abundances were summed and normalized based on the total abundance of all taxa in the respective samples of both studies, and the top 20 taxa for each sample were extracted.
For univariate statistics, human reads were removed from the dataset. To this end, all high-quality, non-contaminant reads were mapped against the human assembly “GRCh38” (including the mitochondrial genome) using BBMap, and matching reads were removed. The remaining reads were compared against NCBI NR using RAPSearch2, and the results were imported into MEGAN (min score, 80). For each sample, the “primate” sub-branch was removed, and read counts per taxon, as well as counts for functional assignments against KEGG at the KO and pathway levels, were then determined as described above. This represented the non-human abundance dataset.
High-quality, non-contaminant reads were also mapped against all viral genomes in NCBI RefSeq using BBMap. Reads matching viral genomes were extracted, grouped by environment (SAF or GA), and assembled with the metagenome assembler Ray Meta. For each environment (SAF or GA), the reads used for assembly were mapped to the resulting contigs to derive coverage and validate the assembly. Assembled contigs were then compared to the NCBI NR database via BlastX and aligned against the genome sequences of the best BlastX hits using MAUVE. Capsid proteins detected by BlastX in each of the contig subsets were aligned to amino acid sequences of capsid homologs in closely related taxa using Muscle with default parameters, and a maximum-likelihood phylogenetic tree was constructed from the alignment with FastTree 2 using default parameters.
Taxonomically and functionally classified sequences were analyzed using the R programming environment. Multivariate statistics were based on rarefaction of the non-human abundance dataset to the lowest number of reads among all samples. Rarefaction, followed by calculation of the Bray-Curtis or Sorensen distance, was performed 10,000 times, and the average distance was calculated. Analyses based on this averaged distance comprised PERMANOVA (Adonis testing), MRPP, and principal coordinate analysis (PCoA), all calculated using the R package vegan. The corresponding R script is provided in the supplementary material (Additional file 7: Zipfile S1).
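The repeated rarefaction and distance averaging can be sketched in Python. This sketch is not the authors' R script (Additional file 7). It assumes subsampling without replacement, covers the Bray-Curtis distance only, and uses far fewer iterations by default than the 10,000 stated above. All function names are hypothetical.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample a count vector to `depth` reads without replacement."""
    # One pool entry per read, labeled with the index of its taxon.
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    out = [0] * len(counts)
    for i in rng.sample(pool, depth):
        out[i] += 1
    return out


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator if denominator else 0.0


def mean_rarefied_distance(a, b, depth, iterations=100, seed=0):
    """Average the Bray-Curtis distance over repeated rarefactions."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    total = 0.0
    for _ in range(iterations):
        total += bray_curtis(rarefy(a, depth, rng), rarefy(b, depth, rng))
    return total / iterations
```

Here `depth` would be the lowest read count among all samples, so the smallest sample is returned unchanged by `rarefy` and only the larger samples are subsampled.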
Calculations for determining significantly increased/depleted taxa were based on log10-transformed sequence abundance data (normalized by number of reads) and included paired t tests (when pairing was possible due to PMA treatment) and Welch tests for non-paired data (comparisons across non-paired samples, e.g., cleanroom vs. gowning area). Additionally, a permutation test was carried out to check for false discovery. Abundance differences for significant taxa were visualized as a heatmap of Z-scores.
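The two test statistics named above can be sketched with the Python standard library. The inputs are assumed to be abundance values that are already log10-transformed and normalized. Only the t statistic and the degrees of freedom are computed here, because converting them to p-values requires the t distribution from a statistics package. The function names are hypothetical.

```python
import math
from statistics import mean, variance


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / math.sqrt(variance(diffs) / n)
    return t, n - 1


def welch_t(x, y):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard errors of the two group means.
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df
```

The paired form fits PMA-treated versus untreated aliquots of the same sample, whereas the Welch form fits comparisons such as cleanroom versus gowning area, where group sizes and variances may differ.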
Correlations between the abundance of human sequences and that of every classified taxon were calculated based on the total abundance dataset using Spearman’s correlation coefficient. Taxa within the “Eumetazoa” lineage were excluded from the correlation analysis, as these were likely to represent nonspecific human sequences. Abundance data were log10-transformed and normalized by number of reads. Human abundance per non-PMA sample was determined by summing the abundances of all taxa that contain “Primates” in their lineage. Human abundances in non-PMA samples were correlated with the abundances in PMA samples for each taxon by Spearman’s correlation coefficient.
Functional annotations were derived from MEGAN. Significantly increased/depleted pathways and KEGG orthology (KO) were based on log10-transformed non-human sequence abundance data (normalized by number of reads) and were identified by paired t tests and Welch tests as described above for taxa. A permutation test was carried out to check for false discovery, and abundance differences for significant pathways/KOs were visualized as a heatmap of Z-scores. The complete set of KO annotations was searched for key terms related to stress response and DNA repair. Search terms used were “sulfoxide,” “thioredoxin,” “homologous,” “repair,” “sbc,” “recombination,” “exopolysaccharide,” “glycosylase,” “heat,” and “cold.” The relative abundances of enzymes annotated with these key terms, as well as of enzymes contained in the KEGG pathways “Carbon fixation in prokaryotes,” “Nitrogen metabolism,” and “Sulfur metabolism,” were calculated by dividing absolute abundance values by the total number of functional annotations. Coverage of KEGG pathway maps was manually inspected using MEGAN.
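Spearman's coefficient is the Pearson correlation of the ranks of the two variables. The following sketch shows one way to compute it and assumes that tied values receive their average rank. The function names are hypothetical, and the sketch is not the authors' code.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))
```

In this setting, `x` would hold the human abundance in each non-PMA sample and `y` the abundance of one taxon in the corresponding PMA samples. Because the coefficient depends only on ranks, the log10 transformation does not change its value.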
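The key-term search and the relative abundance calculation can be sketched as follows. The sketch assumes a case-insensitive substring match on KO descriptions, which the text does not specify. The function name and the input dictionaries are hypothetical.

```python
# Search terms as listed in the text.
SEARCH_TERMS = (
    "sulfoxide", "thioredoxin", "homologous", "repair", "sbc",
    "recombination", "exopolysaccharide", "glycosylase", "heat", "cold",
)


def keyword_relative_abundance(ko_counts, ko_descriptions, terms=SEARCH_TERMS):
    """Relative abundance of KOs whose description contains a search term.

    ko_counts       -- dict mapping KO identifier to absolute read count
    ko_descriptions -- dict mapping KO identifier to its description text
    """
    total = sum(ko_counts.values())  # total number of functional annotations
    if total == 0:
        return {}

    result = {}
    for ko, count in ko_counts.items():
        description = ko_descriptions.get(ko, "").lower()
        # Assumption: plain substring matching, ignoring case.
        if any(term in description for term in terms):
            result[ko] = count / total
    return result
```

Substring matching is permissive (a short term such as "heat" also matches inside longer words), so hits found this way would still need manual review.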
Availability of data and materials
The data sets supporting the results of this article are available within IMG/M (http://img.jgi.doe.gov/).
Part of the research described in this study was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration. This research was funded by NASA Research Announcement (NRA) ROSES 2011 awarded to PV and NI.
The authors are grateful to Drs. Catharine Conley and Melissa Jones for valuable discussion and oversight. The authors also thank Alexander Mahnert, Jessica Cisneros, and Christa Pennacchio for assistance with sample collection, processing, and management. Copyright © 2015, California Institute of Technology.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Brooks B, Firek BA, Miller CS, Sharon I, Thomas BC, Baker R, et al. Microbes in the neonatal intensive care unit resemble those found in the gut of premature infants. Microbiome. 2014;2:1.
- Lax S, Smith DP, Hampton-Marcell J, Owens SM, Handley KM, Scott NM, et al. Longitudinal analysis of microbial interaction between humans and the indoor environment. Science. 2014;345:1048–52.
- Kembel SW, Jones E, Kline J, Northcutt D, Stenson J, Womack AM, et al. Architectural design influences the diversity and structure of the built environment microbiome. ISME J. 2012;6:1469–79.
- La Duc MT, Venkateswaran K, Conley CA. A genetic inventory of spacecraft and associated surfaces. Astrobiology. 2014;14:15–23.
- Moissl C, Osman S, La Duc MT, Dekas A, Brodie E, DeSantis T, et al. Molecular bacterial community analysis of clean rooms where spacecraft are assembled. FEMS Microbiol Ecol. 2007;61:509–21.
- Moissl C, Bruckner JC, Venkateswaran K. Archaeal diversity analysis of spacecraft assembly clean rooms. ISME J. 2008;2:115–9.
- Stieglmeier M, Wirth R, Kminek G, Moissl-Eichinger C. Cultivation of anaerobic and facultatively anaerobic bacteria from spacecraft-associated clean rooms. Appl Environ Microbiol. 2009;75:3484–91.
- Probst A, Vaishampayan P, Osman S, Moissl-Eichinger C, Andersen GL, Venkateswaran K. Diversity of anaerobic microbes in spacecraft assembly clean rooms. Appl Environ Microbiol. 2010;76:2837–45.
- La Duc MT, Osman S, Vaishampayan P, Piceno Y, Andersen G, Spry JA, et al. Comprehensive census of bacteria in clean rooms by using DNA microarray and cloning methods. Appl Environ Microbiol. 2009;75:6559–67.
- La Duc MT, Vaishampayan P, Nilsson HR, Torok T, Venkateswaran K. Pyrosequencing-derived bacterial, archaeal, and fungal diversity of spacecraft hardware destined for Mars. Appl Environ Microbiol. 2012;78:5912–22.
- Vaishampayan P, Probst AJ, La Duc MT, Bargoma E, Benardini JN, Andersen GL, et al. New perspectives on viable microbial communities in low-biomass cleanroom environments. ISME J. 2013;7:312–24.
- Tyson GW, Banfield JF. Cultivating the uncultivated: a community genomics perspective. Trends Microbiol. 2005;13:411–5.
- Tyson GW, Chapman J, Hugenholtz P, Allen EE, Ram RJ, Richardson PM, et al. Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature. 2004;428:37–43.
- Nocker A, Sossa-Fernandez P, Burr MD, Camper AK. Use of propidium monoazide for live/dead distinction in microbial ecology. Appl Environ Microbiol. 2007;73:5111–7.
- Nocker A, Sossa KE, Camper AK. Molecular monitoring of disinfection efficacy using propidium monoazide in combination with quantitative PCR. J Microbiol Methods. 2007;70:252–60.
- Nocker A, Richter-Heitmann T, Montijn R, Schuren F, Kort R. Discrimination between live and dead cells in bacterial communities from environmental water samples analyzed by 454 pyrosequencing. Int Microbiol Off J Span Soc Microbiol. 2010;13:59–65.
- La Duc MT, Dekas A, Osman S, Moissl C, Newcombe D, Venkateswaran K. Isolation and characterization of bacteria capable of tolerating the extreme conditions of clean room environments. Appl Environ Microbiol. 2007;73:2600–11.
- Probst AJ, Auerbach AK, Moissl-Eichinger C. Archaea on human skin. PLoS One. 2013;8:e65388.
- Moissl-Eichinger C. Archaea in artificial environments: their presence in global spacecraft clean rooms and impact on planetary protection. ISME J. 2011;5:209–19.
- Oh J, Byrd AL, Deming C, Conlan S, NISC Comparative Sequencing Program, Kong HH, Segre JA. Biogeography and individuality shape function in the human skin metagenome. Nature. 2014;514:59–64.
- Li L, Kapoor A, Slikas B, Bamidele OS, Wang C, Shaukat S, et al. Multiple diverse circoviruses infect farm animals and are commonly found in human and chimpanzee feces. J Virol. 2010;84:1674–82.
- La Duc MT, Nicholson W, Kern R, Venkateswaran K. Microbial characterization of the Mars Odyssey spacecraft and its encapsulation facility. Environ Microbiol. 2003;5:977–85.
- La Duc MT, Kern R, Venkateswaran K. Microbial monitoring of spacecraft and associated environments. Microb Ecol. 2004;47:150–8.
- Stieglmeier M, Rettberg P, Barczyk S, Bohmeier M, Pukall R, Wirth R, et al. Abundance and diversity of microbial inhabitants in European spacecraft-associated clean rooms. Astrobiology. 2012;12:572–85.
- Moissl-Eichinger C, Pukall R, Probst AJ, Stieglmeier M, Schwendner P, Mora M, et al. Lessons learned from the microbial analysis of the Herschel spacecraft during assembly, integration, and test operations. Astrobiology. 2013;13:1125–39.
- Mahnert A, Vaishampayan P, Probst AJ, Auerbach A, Moissl-Eichinger C, Venkateswaran K, et al. Cleanroom maintenance significantly reduces abundance but not diversity of indoor microbiomes. PLoS One. 2015;10(8):e0134848.
- Sharon I, Kertesz M, Hug LA, Pushkarev D, Blauwkamp TA, Castelle CJ, et al. Accurate, multi-kb reads resolve complex populations and detect rare microorganisms. Genome Res. 2015;25(4):534–43. doi:10.1101/gr.183012.114.
- Agustí G, Fittipaldi M, Morató J, Codony F. Viable quantitative PCR for assessing the response of Candida albicans to antifungal treatment. Appl Microbiol Biotechnol. 2013;97:341–9.
- Andorrà I, Esteve-Zarzoso B, Guillamón JM, Mas A. Determination of viable wine yeast using DNA binding dyes and quantitative PCR. Int J Food Microbiol. 2010;144:257–62.
- Fittipaldi M, Rodriguez NJP, Codony F, Adrados B, Peñuela GA, Morató J. Discrimination of infectious bacteriophage T4 virus by propidium monoazide real-time PCR. J Virol Methods. 2010;168:228–32.
- Kim SY, Ko G. Using propidium monoazide to distinguish between viable and nonviable bacteria, MS2 and murine norovirus. Lett Appl Microbiol. 2012;55:182–8.
- Parshionikar S, Laseke I, Fout GS. Use of propidium monoazide in reverse transcriptase PCR to distinguish between infectious and noninfectious enteric viruses in water samples. Appl Environ Microbiol. 2010;76:4318–26.
- Sánchez G, Elizaquível P, Aznar R. Discrimination of infectious hepatitis A viruses by propidium monoazide real-time RT-PCR. Food Environ Virol. 2012;4:21–5.
- Jarvis BB, Miller JD. Mycotoxins as harmful indoor air contaminants. Appl Microbiol Biotechnol. 2005;66:367–72.
- Nocker A, Fernández PS, Montijn R, Schuren F. Effect of air drying on bacterial viability: a multiparameter viability assessment. J Microbiol Methods. 2012;90:86–95.
- Oberauner L, Zachow C, Lackner S, Högenauer C, Smolle K-H, Berg G. The ignored diversity: complex bacterial communities in intensive care units revealed by 16S pyrosequencing. Sci Rep. 2013;3:1413.
- Kwan K, Cooper M, La Duc MT, Vaishampayan P, Stam C, Benardini JN, et al. Evaluation of procedures for the collection, processing, and analysis of biomolecules from low-biomass surfaces. Appl Environ Microbiol. 2011;77:2943–53.
- Rawsthorne H, Dock CN, Jaykus LA. PCR-based method using propidium monoazide to distinguish viable from nonviable Bacillus subtilis spores. Appl Environ Microbiol. 2009;75:2936–9.
- Dean FB, Hosono S, Fang L, Wu X, Faruqi AF, Bray-Ward P, et al. Comprehensive human genome amplification using multiple displacement amplification. Proc Natl Acad Sci U S A. 2002;99:5261–6.
- Woyke T, Sczyrba A, Lee J, Rinke C, Tighe D, Clingenpeel S, et al. Decontamination of MDA reagents for single cell whole genome amplification. PLoS One. 2011;6:e26161.
- Zhang J, Kobert K, Flouri T, Stamatakis A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinforma Oxf Engl. 2014;30:614–20.
- FastQC: a quality control tool for high throughput sequence data [http://www.bioinformatics.babraham.ac.uk/projects/fastqc/]
- Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinforma Oxf Engl. 2011;27:863–4.
- Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 2011;17:10–2.
- BBMap: short read aligner for DNA and RNA-seq data [http://sourceforge.net/projects/bbmap/]
- Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–9.
- Boisvert S, Raymond F, Godzaridis E, Laviolette F, Corbeil J. Ray Meta: scalable de novo metagenome assembly and profiling. Genome Biol. 2012;13:R122.
- Peng Y, Leung HCM, Yiu SM, Chin FYL. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinforma Oxf Engl. 2012;28:1420–8.
- NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2014.
- Zhao Y, Tang H, Ye Y. RAPSearch2: a fast and memory-efficient protein similarity search tool for next-generation sequencing data. Bioinforma Oxf Engl. 2012;28:125–6.
- Huson DH, Weber N. Microbial community analysis using MEGAN. Methods Enzymol. 2013;531:465–85.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
- Darling AE, Mau B, Perna NT. ProgressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010;5:e11147.
- Edgar RC. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 2004;5:113.
- Price MN, Dehal PS, Arkin AP. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5:e9490.
- R Core Team. R: a language and environment for statistical computing. 2014.
- Oksanen J, Blanchet FG, Kindt R, Legendre P, Minchin PR, O’Hara RB, et al. Vegan: community ecology package. 2014.