Human research participants
Human research participants were sampled at Johns Hopkins Bayview Medical Center according to a protocol approved by the Johns Hopkins and the U.S. Army Human Research Protection Office Institutional Review Boards. The purpose of this study was to generate microbial ecology data for the skin microbiome. Participants were healthy volunteers, aged 18–50, with no history of chronic skin conditions or autoimmune diseases. Research participants were asked not to shower or bathe for 2 days prior to sampling and answered a 400-point questionnaire (Additional file 1 Table S1). Two swabs were then collected from each site. To quantify skin properties, commercially available probes were used adjacent to swab sites in conjunction with the Multi Probe Adapter 10 system (Courage and Khazaka, GmbH) according to manufacturer instructions. Specifically, a Sebumeter was used to quantify sebum content and a Tewameter TM300 was used to quantify transepidermal water loss. Metadata for each volunteer are included in Additional file 1 Table S1 and summarized briefly in Fig. 1. Isolate data were not compared to subject metadata. There was no clinical intervention in this study by which to group volunteers and perform additional analyses.
Sample plating, growth conditions, and isolations
Healthy human research participants were sampled from the forehead, forearm, or antecubital fossa (inner elbow) by swabbing for 30 s with cotton swabs saturated in 50 mM Tris (Amresco), 1 mM EDTA (Amresco), and 0.5% Tween 20 (Sigma) in nuclease-free water. For all participants, swabs were plated immediately on blood agar (Hardy Diagnostics) and incubated until colonies were observed. For four selected individuals, a second adjacent swab from the same site was added to 4 ml sterile TSB and mixed to resuspend bacterial samples for further plating, as described in Table 1.
From each plate, all phenotypically distinct colonies were picked onto fresh media for isolation. Single colonies were picked and re-streaked at least three times to isolate individual strains. Strains were grown in TSB, DSMZ 73, or on plates under corresponding aerobic/anaerobic and temperature conditions and frozen at −80 °C in 25% glycerol. A subset of representative isolates (Additional file 1 Table S2) has been deposited at BEI resources (https://www.beiresources.org) for curation and distribution to the scientific community.
Isolate identification and 16S gene analysis
Aliquots of glycerol stocks were processed at Genewiz LLC (South Plainfield, NJ, USA) by colony PCR of the full 16S rRNA gene and subsequent Sanger sequencing. Resulting forward and reverse reads were merged using the EMBOSS merger, and the merged reads were then aligned and classified against available databases using the assignment and classification functions at SINA with default parameters (Additional file 1 Table S3). Low-quality and unmerged reads were manually curated, and then forward and reverse reads were classified using BLAST.
Generation of 16S and ITS phylogenetic trees
In addition to the 16S sequences from isolates, 16S sequences from 83 reference strains were added to help speciate isolate groups (Additional file 1 Table S4). To facilitate tree generation, 16S sequences that did not contain at least 1280 unambiguous nucleotides were removed. The remaining sequences were then aligned in a multiple sequence alignment using MAFFT (version 7.123b) along with Escherichia coli K-12 for identification of the variable regions. All sequences were then trimmed to the variable regions V2–V8 using the E. coli reference [41,42,43] to normalize for quality issues in some V1 regions. These truncated 16S sequences (739 sequences) were analyzed using BLAST with the SILVA SEED database (release 132), and the full-length 16S sequences from the top BLAST hit for each isolate were used as reference sequences (95 total). Reference identity was taken from the corresponding SILVA entry except for the asterisked sequences, which were classified using the SILVA classification service. The isolates and reference sequences were aligned and trimmed to the V2–V8 region using the default settings of MAFFT (version 7.123b). Isolates with more than 5 unidentifiable bases (0.5%) within this region were not included in the tree (46 sequences). Sequences were realigned with Infernal (1.1.2) using the bacterial SSU rRNA covariance matrix downloaded from Rfam. Initial tree making was run with FastTree (version 2.1.7). Tree refinement was run with RAxML (version 8.2.0), first using the rapid hill-climbing method with the GTRCAT substitution model with bootstrapping. Further refinement was run with the RAxML model and branch length optimization method using the GAMMA substitution model. Trees were visualized using the r2d3 package in R.
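The two sequence-quality filters described above (a minimum of 1280 unambiguous nucleotides before alignment, and no more than 5 unidentifiable bases within the trimmed V2–V8 region) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, and treating alignment gap characters (`-`, `.`) as neither ambiguous nor unambiguous is an assumption.

```python
# Illustrative sketch only; function names and gap handling are assumptions.
UNAMBIGUOUS = set("ACGTU")
GAP_CHARS = set("-.")


def count_unambiguous(seq: str) -> int:
    """Number of bases that are plain A, C, G, T, or U."""
    return sum(1 for base in seq.upper() if base in UNAMBIGUOUS)


def count_ambiguous(seq: str) -> int:
    """Number of non-gap characters that are not plain A, C, G, T, or U."""
    return sum(
        1
        for base in seq.upper()
        if base not in UNAMBIGUOUS and base not in GAP_CHARS
    )


def passes_length_filter(seq: str, min_unambiguous: int = 1280) -> bool:
    """Keep sequences with at least `min_unambiguous` unambiguous nucleotides."""
    return count_unambiguous(seq) >= min_unambiguous


def passes_ambiguity_filter(trimmed_seq: str, max_ambiguous: int = 5) -> bool:
    """Keep trimmed sequences with no more than `max_ambiguous` ambiguous bases."""
    return count_ambiguous(trimmed_seq) <= max_ambiguous
```

Both thresholds are exposed as parameters so that the cut-offs stated in the text remain visible at the call site.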
Molds were identified by tissue growth on plates and maintained by plug passaging, and yeasts were identified by microscopy and maintained as streak cultures. gDNA was extracted from fungal samples using the DNeasy UltraClean Microbial DNA isolation kit (QIAGEN) using 4× bead-beating/freeze-thaw cycles to lyse cells: flash freeze in liquid nitrogen, heat to 65 °C, and bead beat in the TissueLyser II for 10 min at 25 Hz. For yeasts, lysing was replaced by colony PCR. Internal transcribed spacer (ITS) regions were amplified using published primers F-5′-GTAAAAGTCGTAACAAGGTTTC and R-5′-GTTCAAAGAYTCGATGATTCAC (ITS1) and F-5′-GTGAATCATCGARTCTTTGAAC and R-5′-TATGCTTAAGTTCAGCGGGTA (ITS2). PCR products were purified using magnetic beads and then sequenced at Genewiz using the forward and reverse primers. Reads were merged using the EMBOSS merger, and then the top hit from BLAST analysis was used to determine putative species. Identification of fungal strains is included in Table S4.
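As a consistency check on the primer sequences as printed above, the ITS1 reverse primer and the ITS2 forward primer can be compared by reverse complementation. The sketch below uses the standard IUPAC complement pairs (including R/Y for the degenerate positions); the helper name `reverse_complement` is hypothetical and is not taken from any tool used in the study.

```python
# Illustrative sketch only; the helper name is an assumption.
# Standard IUPAC nucleotide complements, including degenerate codes.
_COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVN",
    "TGCAYRSWMKVHDBN",
)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


ITS1_REVERSE = "GTTCAAAGAYTCGATGATTCAC"
ITS2_FORWARD = "GTGAATCATCGARTCTTTGAAC"

# True when the two primers anneal to the same site on opposite strands.
same_site = reverse_complement(ITS2_FORWARD) == ITS1_REVERSE
```

With the sequences as given, the two primers are exact reverse complements of each other, so the ITS1 and ITS2 amplicons share that 22-nucleotide primer site rather than being separated by unsequenced bases.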
Preparation of skin compound utilization assay
Skin-relevant compounds were selected based on a literature search for compounds detected in sweat, in sebum, and as residual skin surface chemicals. All skin-relevant compounds assessed, their sources, and the literature citing their presence on the skin are listed in Additional file 1 Table S6. To prepare the assay plates, compounds were dissolved in molecular biology-grade water or chloroform (Fisher Scientific) at 10 mg/ml. Stock solutions were distributed into polypropylene 96-well plates. Negative controls consisted of three wells containing only water and three wells containing only chloroform. A positive growth control consisted of three wells with 10% TSB in water. Water and chloroform were evaporated so that assay plates contained only 0.3 mg of a single carbon source per well. Plates were stored covered at 4 °C until use.
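The stock concentration and per-well mass stated above imply the volume dispensed per well, which is not given explicitly. A minimal sketch of that arithmetic follows, assuming every compound well was loaded from the 10 mg/ml stock; the function name is hypothetical.

```python
# Illustrative sketch only; the function name is an assumption.
def stock_volume_ul(target_mass_mg: float, stock_conc_mg_per_ml: float) -> float:
    """Volume of stock (in microliters) that delivers `target_mass_mg` of compound."""
    if target_mass_mg < 0 or stock_conc_mg_per_ml <= 0:
        raise ValueError("mass must be non-negative and concentration positive")
    volume_ml = target_mass_mg / stock_conc_mg_per_ml
    return volume_ml * 1000.0  # convert ml to microliters


# 0.3 mg per well from a 10 mg/ml stock corresponds to about 30 microliters.
volume_per_well_ul = stock_volume_ul(0.3, 10.0)
```

After evaporation and addition of 0.2 ml of bacterial solution, 0.3 mg per well would correspond to a nominal 1.5 mg/ml if the compound dissolved completely, which is itself an assumption for chloroform-soluble compounds.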
Bacteria culturing and preparation of assay inoculation cultures
Bacteria were stored in 10% glycerol stocks at −80 °C. To generate starter cultures, glycerol stocks were streaked individually on TSA plates and incubated at 30 °C. Single colonies were picked to inoculate 5 ml of TSB, which was incubated at 30 °C with shaking for 1 to 3 days. To prepare the assay inoculation culture, starter cultures were diluted ~1:500 and cultured for approximately 4 h in TSB. Bacterial pellets were washed three times by centrifugation at 4300 × g for 10 min, aspiration of the supernatant, and gentle resuspension in an essential salt solution adapted from Bochner et al. consisting of 100 mM sodium chloride, 30 mM triethanolamine, 25 mM sodium pyruvate, 5.0 mM ammonium chloride, 2.0 mM monosodium phosphate, 0.25 mM sodium sulfate, 0.05 mM magnesium chloride, 1.0 mM potassium chloride, and 1.0 μM ferric chloride (all reagents were purchased from Sigma). The optical density at 600 nm (OD) of the final bacterial solution was measured using a Nanodrop 2000c spectrophotometer (Thermo Scientific), and bacteria were brought to an assay OD of 0.001 and supplemented with Biolog dye mix A (Biolog, Inc.) to a 1× concentration immediately before the assay.
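Bringing the washed culture to the assay OD is a standard C1V1 = C2V2 dilution. The sketch below illustrates the calculation under the assumption that OD scales linearly with cell density over the range involved; the function name and the example values are hypothetical and do not come from the study.

```python
# Illustrative sketch only; function name and example values are assumptions.
def dilution_volumes(measured_od: float, target_od: float, final_volume_ml: float):
    """Return (culture_ml, diluent_ml) needed to reach `target_od` in `final_volume_ml`.

    Assumes OD is proportional to cell density (C1 * V1 = C2 * V2).
    """
    if measured_od <= 0 or target_od <= 0 or final_volume_ml <= 0:
        raise ValueError("ODs and final volume must be positive")
    if target_od > measured_od:
        raise ValueError("cannot reach a target OD above the measured OD by dilution")
    culture_ml = target_od / measured_od * final_volume_ml
    return culture_ml, final_volume_ml - culture_ml


# Hypothetical example: a washed culture at OD 0.5 diluted to OD 0.001 in 20 ml.
culture_ml, diluent_ml = dilution_volumes(0.5, 0.001, 20.0)
```

In this hypothetical case about 0.04 ml of culture is needed, so in practice an intermediate dilution would likely be pipetted more accurately than a single step.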
Assay for skin compound utilization
Bacterial solution (0.2 ml) was added to each well of the assay plate, and sterile water was introduced in the spaces between wells to increase local humidity. Plates were incubated at 30 °C in a humidity chamber without shaking. Compound utilization was assessed by measuring the OD at 590 nm using a CLARIOstar plate reader (BMG Labtech) immediately after plate preparation and 72 h later. At least three assay plates were examined for each bacterial isolate. Absorbance values from the 0 h time point were subtracted from the 72 h values to yield background-subtracted values. Compound utilization was then assessed by an ANOVA followed by a Dunnett’s test against the negative control with a cut-off p value of 0.05 using JMP® (Version 13.0.0, SAS Institute Inc., Cary, NC, 1989–2019). If positive control wells did not show growth for a bacterial isolate, the bacterial concentration was increased 10-fold up to two times. Each set of assays included a plate with the essential salt solution and Biolog dye mix A without a bacterial inoculation to ensure sterility of the assay plate. Compounds were classified using the BioCyc database, where the most specific parent class was chosen that allowed for molecule classifications with three or more compounds. Compounds not in the BioCyc database were classified with the ClassyFire tool. Any compounds that could not be categorized by either tool were grouped into the ‘miscellaneous’ category.
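The background subtraction and the inoculum escalation described above can be sketched as follows. The well identifiers, function names, and the default starting OD of 0.001 (taken from the inoculation protocol) are assumptions made for illustration; the ANOVA and Dunnett's comparison were run in JMP and are not reproduced here.

```python
# Illustrative sketch only; function names and well identifiers are assumptions.
def background_subtract(od_t0: dict, od_t72: dict) -> dict:
    """Subtract each well's 0 h absorbance from its 72 h absorbance."""
    mismatched = set(od_t0) ^ set(od_t72)
    if mismatched:
        raise ValueError(f"wells missing from one read: {sorted(mismatched)}")
    return {well: od_t72[well] - od_t0[well] for well in od_t0}


def inoculum_series(base_od: float = 0.001, max_increases: int = 2) -> list:
    """Assay ODs to try in order: the base OD, then up to two 10-fold increases."""
    return [base_od * 10 ** step for step in range(max_increases + 1)]


# Hypothetical reads for two wells on one plate.
t0 = {"A1": 0.05, "A2": 0.06}
t72 = {"A1": 0.45, "A2": 0.07}
net = background_subtract(t0, t72)
```

Raising an error on mismatched wells, rather than silently dropping them, keeps the replicate count per compound explicit before the values are passed to a statistical test.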
Phylogenetic distance analysis
Carbon source utilization was compared to the phylogenetic similarity of microbial taxa using two approaches: (i) Mantel tests and (ii) linear regression models. For both approaches, phylogenetic similarity was measured using phylogenetic distances calculated from the trimmed V2–V8 16S rRNA sequences described above, and the similarity in carbon source utilization was measured using the Jaccard index. For a given pair of microbial taxa, the Jaccard index equaled 1 if the two taxa utilized the same set of carbon sources, 0 if they utilized completely different carbon sources, and a value between 0 and 1 reflecting the proportional overlap of carbon sources otherwise. A Mantel test returns the correlation between two matrices, in this case between a square matrix of phylogenetic similarity values (each element was the phylogenetic similarity of a pair of microbial taxa) and a corresponding square matrix of Jaccard similarity values (each element was the Jaccard index for a pair of microbial taxa). In the linear regression models, Jaccard similarity was used as the response variable and phylogenetic similarity as the predictor variable. Because most pairs of microbial taxa did not share any carbon sources, linear regression models were also fit after excluding pairs of microbial taxa that did not share at least one carbon source.
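The Jaccard index and the Mantel test described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, the significance test is a one-sided permutation test on the Pearson correlation of the upper triangles, and returning 0 for two taxa that each utilize no carbon sources is a convention chosen here because the index is otherwise undefined.

```python
# Illustrative sketch only; names, permutation scheme, and the empty-set
# convention are assumptions.
import math
import random


def jaccard(a, b) -> float:
    """Shared carbon sources divided by all carbon sources used by either taxon."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: index is undefined when both sets are empty
    return len(a & b) / len(union)


def upper_triangle(matrix):
    """Off-diagonal upper-triangle elements of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y) -> float:
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    var_x = sum((p - mx) ** 2 for p in x)
    var_y = sum((q - my) ** 2 for q in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant matrix")
    return cov / math.sqrt(var_x * var_y)


def mantel(m1, m2, permutations: int = 999, seed: int = 0):
    """Return (observed correlation, one-sided permutation p value)."""
    n = len(m1)
    x = upper_triangle(m1)
    r_obs = pearson(x, upper_triangle(m2))
    rng = random.Random(seed)
    order = list(range(n))
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of m2 together so taxa stay consistent.
        permuted = [[m2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= r_obs:
            at_least_as_large += 1
    return r_obs, (at_least_as_large + 1) / (permutations + 1)
```

Rows and columns are permuted jointly because the elements of a pairwise matrix are not independent observations, which is the reason a Mantel test is used here instead of an ordinary correlation test.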