Characterization and comparison of bacterial communities in benign vocal fold lesions

Background: Benign vocal fold lesions, including cysts, nodules, polyps, and Reinke's edema, are common causes of hoarseness and subsequent voice disorders. Despite the prevalence of these lesions, disease etiology and pathophysiology remain unclear, and their microbiota has not been studied to date because of the paucity of available biopsies for investigation. We sought to characterize and compare the bacterial communities in biopsies of cysts, nodules, polyps, and Reinke's edema collected from patients in Germany and Wisconsin. These samples were then compared to the communities found in healthy saliva and throat samples from the Human Microbiome Project (HMP).

Results: 454 pyrosequencing of the V3–V5 region of the 16S rRNA gene revealed five phyla that explained most of the bacterial diversity: Firmicutes (73.8%), Proteobacteria (12.7%), Bacteroidetes (9.2%), Actinobacteria (2.1%), and Fusobacteria (1.9%). Every lesion sample, regardless of diagnosis, had operational taxonomic units (OTUs) identified as Streptococcus, with a mean abundance of 68.7%. Most of the lesions, 31 out of 44, were indistinguishable in a principal coordinates analysis (PCoA) due to dominance by OTUs phylogenetically similar to Streptococcus pseudopneumoniae. Thirteen lesions not dominated by S. pseudopneumoniae were more similar to HMP throat and saliva samples, though 12 of them contained Pseudomonas, which was detected in only one of the HMP samples. Community structure and abundance could not be correlated with lesion diagnosis or any other documented patient factor, including age, sex, or country of origin.

Conclusions: Dominance by S. pseudopneumoniae could be a factor in disease etiology, as could the presence of Pseudomonas in some samples. Likewise, decreased diversity, as compared to healthy saliva and throat samples, may be associated with disease, similar to disease models in other mucosal sites.
Electronic supplementary material The online version of this article (doi:10.1186/2049-2618-2-43) contains supplementary material, which is available to authorized users.


Background
At any one time, an estimated 20.7 million people in the US report problems with their voice, while 93.8 million report having problems during their lifetime [1]. Additionally, an estimated 22.6 million people in the general population report missing one or more days of work annually because of voice problems [1]. Voice problems are also associated with negative effects on quality of life, including impaired communication, social isolation, and decreased occupational productivity [2][3][4]. Vocal folds play a key role in voice production and protection of the lungs, vibrating to produce sound and closing to protect the lungs against food and liquid aspiration during swallowing. Tissue of the vocal folds is unique and complex, composed of an epithelial layer made up of non-keratinized stratified squamous epithelium undergoing constant regeneration, which is bound to a three-layered lamina propria via basement membrane zone anchoring fibers. Vocal folds are also immunologically active, serving both as a physical barrier and as a site that may detect early shifts in microbial presence [5].
While voice problems are associated with many causes, benign vocal fold lesions are one of the most frequent medical diagnoses [6]. Benign vocal fold lesions are generally classified into two broad categories: non-neoplastic lesions and neoplastic lesions. Non-neoplastic vocal fold lesions make up the majority of benign lesions and include cysts, nodules, polyps, Reinke's edema, granulomas, ectasias, sulcus vocalis, and scar. Benign vocal fold lesions are typically distinguished by their phenotype, but there is limited knowledge of the etiology and progression of these diseases [7]. Mechanical, chemical, and thermal trauma; overuse and misuse; or combinations of these factors are thought to contribute to lesion development via remodeling of the lamina propria. Studies investigating histological differences and gene expression profiles have demonstrated differential stages of wound maturation and disordered wound healing [8][9][10]. While inflammation has historically been associated with polyps, Kotby et al. suggest that polyps and Reinke's edema may represent a continuum of vocal fold injury [11]. Many comorbidities have been associated with vocal fold lesions, including smoking, laryngopharyngeal reflux, infection, and allergies [12][13][14]. To date, studies of microbial contributions to the etiology of vocal fold lesions have focused mostly on microbes stereotypically considered pathogens, such as human papillomavirus and Mycobacterium tuberculosis [15][16][17]. However, there remains a lack of data in the literature regarding microbial community membership for the normal and diseased larynx [18], owing to a paucity of available tissue for study.
The only study published to date on laryngeal bacterial communities suggested that laryngeal squamous cell carcinoma (LSCC) could be associated with shifts in bacterial communities in the larynx [19]. Changes in the abundance of 15 genera were associated with carcinoma versus control patients with vocal fold polyps. The authors postulated that increases in Fusobacterium and Prevotella might be involved in initiating biofilm formation, which could contribute to disease progression. Similar work in other mucosal sites also suggests that changes in bacterial community membership and abundance, despite the absence of classically defined pathogens, may be associated with disease. Asthmatics have been found to have altered microbial communities in their lungs [20]. While the community members responsible for this change varied between patients, a shift towards Gram-negative-dominated communities was associated with elevated lipopolysaccharide levels. The authors postulated that this may be associated with increased airway cell stimulation contributing to disease. Likewise, Wang et al. found that the abundance of particular operational taxonomic units (OTUs) was shifted in feces from patients with colorectal cancer, particularly a reduction in butyrate-producing bacteria from the family Lachnospiraceae [21]. They surmised that an increase in potential pathogens coupled with a decrease in butyrate-producing bacteria could play a role in tumor formation or otherwise contribute to colorectal cancer.
Shifts in bacterial communities have been associated with diseases of mucosal surfaces, and we postulated that similar changes might be found in benign vocal fold lesions. In this investigation we characterized and compared the bacterial communities of benign vocal fold lesions, including biopsies of cysts, nodules, polyps, and Reinke's edema. We hypothesized that distinct lesion types would be represented by distinct bacterial communities. Communities were characterized by 454 pyrosequencing of the V3-V5 region of the 16S rRNA gene, using lesion biopsies from patients in the state of Wisconsin, USA, and in Germany.

Clinical diagnoses and patient characteristics
Forty-nine patients diagnosed with benign vocal fold lesions were enrolled in this study. Five of the 49 initial samples yielded fewer than 500 sequences and were eliminated from further analyses. The remaining 44 lesions comprised 7 cysts, 5 nodules, 18 polyps, and 14 Reinke's edema (Table 1). Characteristics of these patients, including mean age, gender, and country of origin, are also detailed in Table 1.

Bacterial communities associated with benign vocal fold lesions
Three 454 GS Junior runs resulted in a total of 248,579 high-quality sequences with a mean length of 471 bp after quality filtering and removal of primers and adapter sequences. A total of 44 samples were successfully pyrosequenced and yielded over 500 sequences each. Samples had a mean of 5,650 sequences (range 892-23,996). The lowest Good's coverage was 0.979, suggesting that communities were well sampled. No significant differences were detected between the lesion types by Chao, inverse Simpson, or Shannon diversity indexes (Table 2).
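The coverage and diversity summaries above can be illustrated with a minimal Python sketch. It assumes the common textbook definitions: Good's coverage as one minus the fraction of sequences that fall in singleton OTUs, the Shannon index with the natural logarithm, and the finite-sample form of Simpson's index. The OTU counts are invented for illustration and are not data from this study, and the functions are a sketch rather than mothur's own implementation.

```python
import math


def goods_coverage(otu_counts):
    """Good's coverage: 1 - (singleton OTUs / total sequences)."""
    total = sum(otu_counts)
    singletons = sum(1 for count in otu_counts if count == 1)
    return 1.0 - singletons / total


def shannon_index(otu_counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over observed OTUs."""
    total = sum(otu_counts)
    return -sum((c / total) * math.log(c / total) for c in otu_counts if c > 0)


def inverse_simpson(otu_counts):
    """Inverse Simpson 1/D with D = sum n_i(n_i - 1) / (N(N - 1))."""
    total = sum(otu_counts)
    dominance = sum(c * (c - 1) for c in otu_counts) / (total * (total - 1))
    return 1.0 / dominance


# Hypothetical sample dominated by a single OTU (toy numbers only).
toy_counts = [90, 5, 3, 1, 1]
print(round(goods_coverage(toy_counts), 3))
print(round(shannon_index(toy_counts), 3))
print(round(inverse_simpson(toy_counts), 3))
```

In this toy sample, two of the 100 sequences fall in singleton OTUs, so coverage is 0.98, and the strong dominance by one OTU keeps both diversity indexes low.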
(Figure 1). Most of the sequences identified as Bacilli belonged to the genus Streptococcus. This genus was the most common and abundant, found in every lesion sample. Bacterial community comparison of individual samples supported the above observations. Communities did not cluster in the principal coordinates analysis (PCoA) plot based on lesion type (Figure 2), gender, country of origin, or age (data not shown), nor any other documented clinical information. Dominance by Streptococcus is evident in Figure 2, where the 31 samples dominated by Streptococcus all cluster together.

Comparison of healthy HMP samples to lesion communities
Due to the unethical nature of taking biopsies from healthy human vocal folds, for further analyses we included 15 randomly selected saliva samples and 15 throat samples from the Human Microbiome Project (HMP) as proxies for healthy vocal fold bacterial communities. Saliva was the most diverse sample type and was significantly different from all other samples by both inverse Simpson and Shannon (p < 0.0001) (Table 3). Throat samples were intermediate, and the differences in diversity compared to lesion samples were not always statistically supported.
Tight clustering of lesion samples was not altered in the PCoA plot with the inclusion of the HMP samples (Figure 3). However, some lesions had communities more similar to saliva and throat samples than to other lesions. OTUs identified as Pseudomonas were present in 12 of the 13 lesions that did not cluster based on dominance by Streptococcus, though Pseudomonas was found in only one throat sample.
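As a hedged illustration of the kind of comparison behind these p values, the sketch below computes the one-way ANOVA F statistic for diversity values grouped by sample type. The three groups and their values are hypothetical, and neither the p value lookup nor the pairwise Tukey correction used in the study is reproduced here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical Shannon values for three sample types (toy numbers only).
lesion = [1.0, 2.0, 3.0]
throat = [2.0, 3.0, 4.0]
saliva = [5.0, 6.0, 7.0]
print(one_way_anova_f([lesion, throat, saliva]))
```

A large F indicates that the spread between group means is large relative to the spread within groups; converting it to a p value requires the F distribution with the returned degrees of freedom.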

The abundance of Streptococcus in lesion samples
At 97% sequence identity, one OTU identified as Streptococcus was present in every lesion sample at a mean abundance of 66.21% (range 0.76%-99.76%, 5.01 SE). The sequence selected by mothur to represent this OTU was identified as S. pseudopneumoniae. However, 16S sequences in the mitis group, of which S. pseudopneumoniae is a member, often have greater than 99% sequence identity [22,23]. We used the 100 most abundant unique Streptococcus OTUs, together with 21 reference Streptococcus sequences, to examine their phylogenetic relationships, which resolved three major clades (Figure 4). Clade I OTUs dominated the lesion samples and were absent from the HMP samples. OTUs in clade II grouped with Streptococcus dentisani, a recently described Streptococcus species [24]. Clade III OTUs could not be identified to the species level, as some species in the mitis group are indistinguishable in this region of the 16S rRNA gene [23]. OTUs that grouped outside these clades were also present, but at very low abundance (data not shown).
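The resolution problem described here comes down to simple arithmetic: two reads of 471 bp (the mean read length in this study) that differ at a single position share 470/471, or about 99.8%, identity, far above a 97% OTU threshold. The sketch below, which uses short invented sequences rather than real 16S data, computes percent identity for two aligned sequences and lists the columns at which they differ.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity of two aligned sequences of equal length.

    Columns where both sequences carry a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared


def differing_columns(seq_a, seq_b):
    """Zero-based alignment columns at which the two sequences differ."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Toy 100-base sequences that differ at one column (not real 16S reads).
reference = "ACGT" * 25
variant = reference[:10] + "C" + reference[11:]
print(percent_identity(reference, variant))
print(differing_columns(reference, variant))
```

Sequences this similar would be placed in the same OTU at 97% identity, which is why a phylogeny and inspection of individual alignment columns are needed to separate closely related species.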

Discussion
Maintenance of vocal fold function is essential for human health, and the consequences of impaired voice production have profound implications for individual health and wellness, social and occupational function, and societal productivity [3,4]. Benign vocal fold lesions are one of the most common diagnoses when problems with voice arise, but a full pathophysiologic understanding of how and why these lesions form is still lacking. We sought to describe the bacterial communities in a large set of vocal fold biopsies that included cysts, nodules, polyps, and Reinke's edema. Surprisingly, the majority of our samples, 31 out of 44, had highly similar bacterial communities dominated by S. pseudopneumoniae, regardless of lesion type or any other documented patient characteristic. To date, only one other study has looked at whole bacterial communities in the larynx, comparing LSCC to nearby "healthy" tissue and to a control group of patients with polyps [19]. Broad differences are apparent between their control polyp patients and ours, from the phylum level to the genus level, where we found considerably more Streptococcus (71% vs. 56%), but less of all other genera they deemed dominant, including Fusobacterium (0.6% vs. 8%), Prevotella (5% vs. 7%), Neisseria (2% vs. 5%), and Gemella (0.5% vs. 2%). It is widely accepted that even minor differences in sample preparation can lead to discrepancies in community composition, and the differences found between our polyp samples and those in Gong et al. could be due to any of several such factors, including the use of different primer pairs or DNA extraction protocols, and modifications to PCR protocols [19][25][26][27][28][29][30][31][32]. The presence of Streptococcus is well noted in healthy oral and respiratory sites [33][34][35]. However, due to the high similarity of the 16S gene in many Streptococcus species, finer resolution based on partial-length sequences is difficult.
A phylogeny of the OTUs identified as Streptococcus in our study demonstrated that many of them were either S. pseudopneumoniae or S. pneumoniae. These two Streptococcus species vary by a single base pair in the region sequenced here, and a manual check of the alignment showed that all sequences in clade I contained a cytosine at that position, like S. pseudopneumoniae, and no OTU out of the top 100 could be identified as S. pneumoniae. Dominance by S. pseudopneumoniae in the majority of our samples may be playing a role in disease progression. This recently described member of the mitis group has many genes associated with host cell interaction, some of which are thought to be virulence factors [36,37]. S. pseudopneumoniae, previously thought to be atypical S. pneumoniae, has been isolated from patients with a number of respiratory diseases including chronic obstructive pulmonary disease (COPD), cystic fibrosis, pneumonia, bronchitis, and chronic sinusitis [38][39][40]. However, its status as a pathogen or mutualist is still not clear. All 20 human subjects in a tonsillar crypt study were found to have S. pseudopneumoniae, regardless of tonsil health status [41].
Helicobacter pylori is perhaps one of the most prevalent "infections" in the world, estimated to colonize the stomach of over 50% of the world's human population [42]. Work in the last few decades has shown a direct relation between H. pylori and stomach cancer [42], and more recently, correlations have been made with other aerodigestive tract diseases, such as otitis media with effusion [43] and oral aphthous ulcers [44]. H. pylori has also been associated with LSCC [45][46][47], though there is debate about whether it is associated with benign vocal fold lesions [48] or not [47,49]. Only a few of the samples presented here contained Helicobacter and at very low abundances. Many studies seeking to tie vocal fold lesions to H. pylori infections have looked for the specific presence of the bacterium in the larynx [47], but not necessarily at its abundance. It could be that H. pylori is common in vocal folds, but at a low enough abundance that it was mostly missed in our patient population and with our sampling technique, even in those with previously diagnosed infections.
One limitation in our study design was the inability to compare these lesions to the microbial community that may be found in biopsied healthy human vocal folds. The lamina propria of the vocal folds is very thin, only 3 mm thick, and the possibility of creating vocal scar and impairing voice production is present with every surgical procedure including biopsy. As such, it is considered unethical to biopsy vocal folds of healthy human subjects. As a proxy, we included in our analyses 15 randomly selected saliva samples and 15 throat samples from the HMP, on the basis that many of the habitats above the stomach have similar microbial communities due to the buffering nature of saliva, regular nutrient availability in the form of mucin, and a common epithelial lining (non-keratinized, stratified, squamous epithelium) [33]. Of the 15-18 body sites sampled by the HMP, non-tooth oral sites, including saliva and throat, were deemed to be highly similar [50]. Samples from these body sites might have the most in common with our samples, considering that they all face similar environmental exposures such as inhaled air, ingested substances (food and beverage), and constant contact via saliva and a continuous mucus layer. Indeed, 13 of our 44 samples were highly similar to these throat and saliva samples. However, the majority harbored a distinct community, one dominated by S. pseudopneumoniae, as described above. Interestingly, of the 13 samples more similar to saliva and throat, 12 contained Pseudomonas, a genus found in only 1 throat sample. However, many of the common and dominant genera found in the saliva and throat samples were also found in these 13 samples, including Actinomyces, Prevotella, Fusobacterium, Leptotrichia, Neisseria, Haemophilus, Gemella, Granulicatella, Oribacterium, Veillonella, unclassified Prevotellaceae, and unclassified Lachnospiraceae [33].
Pseudomonas has been associated with diseases of the lower respiratory system including bronchiolitis obliterans syndrome [51] and cystic fibrosis [52] and could be playing a role in benign vocal fold lesion etiology.

Conclusions
Benign vocal fold lesions are usually diagnosed based on phenotypic differences present upon videostroboscopic examination. While histological and gene expression differences have been noted [8,9], the data presented here add to the idea that despite phenotypic differences, benign vocal fold lesions share many similarities, including highly similar bacterial communities that are distinct from healthy throat and saliva samples. The possibility remains that these shifts in the bacterial community could inhibit wound healing, or that the presence of inflammation creates a niche for a community dominated by S. pseudopneumoniae.
Figure 4 Phylogenetic relationship and abundance of the top 100 unique Streptococcus OTUs. The top 100 unique Streptococcus OTUs along with 21 reference Streptococcus sequences were phylogenetically analyzed. Three major clades were found. Sequences from clade I dominated the lesion samples while being completely absent from HMP samples. Panel inset represents the mean relative abundance of all Streptococcus OTUs in cysts, nodules, polyps, Reinke's edema, saliva, and throat, in addition to the portion represented by clades I, II, and III. Error bars represent standard error.

Subjects and collection of benign lesions
Benign vocal fold lesions were collected at the University of Wisconsin-Madison, USA, and the University of Hamburg, Germany. Laryngeal microsurgical techniques were used to remove lesions. For each lesion, the clinical diagnosis was made based on initial videostroboscopic exam and confirmed under direct visualization by the surgeon at the time of surgical removal. Samples were immediately placed in RNAlater (Ambion Inc., Austin, Texas) and stored at −80°C until use. The University of Wisconsin-Madison Health Sciences IRB and the University of Hamburg Ethics Committee approved the protocol for attainment of all tissue samples. All subjects provided written consent to participate in this study. Metadata for each sample are detailed in Additional file 1: Table S1.
DNA extraction and PCR for 454 pyrosequencing
DNA was extracted from tissue with the Epicentre MasterPure Complete DNA and RNA Purification Kit (Illumina, Madison, WI) with modifications to the manufacturer's protocol. Samples were gently thawed at room temperature and briefly centrifuged to collect tissue. RNAlater was removed by pipetting and 300 μl of tissue and cell lysis solution was added to the tube with the tissue. Lysis solution and tissue were then transferred to a sterile screw top tube containing 150-200 mg of 400 μm silica beads. Proteinase K, 100 μg, was added; the tubes were vortexed and incubated at 55°C for 1 h, with vortexing every 15 min. Bead tubes were then shaken in a horizontal adapter on the vortex for 10 min. RNase A, 5 μg, was added; the tubes were vortexed and incubated at 37°C for 30 min. The remainder of the manufacturer's protocol was followed as written. DNA was resuspended in TE buffer and stored at 4°C until use.
PCRs were performed in triplicate, each containing 50-100 ng of template DNA, 0.2 μl AccuPrime Taq DNA Polymerase (Life Technologies, Grand Island, NY), 2.5 μl Buffer II, 400 nM each of the forward and reverse primers, and water to 25 μl total. Thermocycling conditions were as follows: 95°C 2 min, followed by 30 cycles of 95°C 20 s, 56°C 30 s, 72°C 1 min, and a final extension of 72°C 8 min. The primers were 357F and 926R, as suggested by the HMP [53], where 357F contained the B adapter for 454 pyrosequencing, and 926R contained both the A adapter and a 10-base pair multiplex identifier. Triplicate PCRs were pooled and cleaned using the Purelink PCR purification kit (Invitrogen, Grand Island, NY) as per the manufacturer's directions for removal of primer dimers and short PCR products <300 bp. Cleaned products were eluted in 30 μl of elution buffer. Samples were then gel extracted from a low-melt agarose gel using the Zymoclean Gel DNA Recovery Kit (Zymo Research, Irvine, CA) by visualizing on a blue light transilluminator (Clare Chemical Research, Dolores, CO). Cleaned PCR products were quantified using a Qubit fluorometer (Invitrogen, Grand Island, NY). Products were diluted and pooled at equal concentrations for 454 pyrosequencing.
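The final pooling step is a small dilution calculation. Because all amplicons come from the same primer pair and are of similar length, contributing an equal mass of each product gives roughly equal representation in the pool. The sketch below shows that arithmetic under stated assumptions: the sample names, the fluorometer readings, and the 20 ng target are hypothetical and are not values reported in this study.

```python
def pooling_volumes(concentrations_ng_per_ul, mass_per_sample_ng):
    """Volume (in microliters) of each product needed to add the same mass.

    concentrations_ng_per_ul maps a sample name to its measured
    concentration; every concentration must be positive.
    """
    volumes = {}
    for name, conc in concentrations_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"non-positive concentration for {name}")
        volumes[name] = mass_per_sample_ng / conc
    return volumes


# Hypothetical fluorometer readings in ng/ul (toy numbers only).
readings = {"lesion_A": 10.0, "lesion_B": 4.0, "lesion_C": 2.5}
for name, volume in pooling_volumes(readings, mass_per_sample_ng=20.0).items():
    print(f"{name}: {volume:.1f} ul")
```

Samples with lower measured concentrations contribute proportionally larger volumes, so each sample adds the same mass of amplicon to the pool.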
454 pyrosequencing, data analysis, and statistics
454 pyrosequencing was conducted on a Roche GS Junior (Roche, Indianapolis, IN) using titanium chemistry and the long read modifications found in Hanshew et al. [26]. Samples were sequenced across three picotiter plates. Raw data were processed using mothur (v. 1.33.1) [54], with most of the defaults put forth in the Schloss 454 SOP (http://www.mothur.org/wiki/Schloss_SOP; accessed Jan 27, 2014) [55], but with minflows = 350 and maxflows = 720 [56]. Sequences were aligned to a Silva-derived reference database (v. 102 as implemented for mothur) [57]. Chimeras were detected using UCHIME and removed [58]. Sequences were assigned to taxonomic groups using the Ribosomal Database Project (RDP)-derived reference database [59]. All eukaryotic and unclassifiable reads were removed after classify.seqs. Sequences were assigned to OTUs at 97% sequence identity, used to construct a distance matrix using theta Yue and Clayton values, and analyzed with PCoA plots in Prism. Good's coverage, Chao, inverse Simpson, and Shannon were calculated in mothur. One-way ANOVA with Tukey's HSD p value correction for pairwise comparisons was used to assess differences in Chao, inverse Simpson, and Shannon. In addition to lesion diagnosis, bacterial communities were also compared based on age, gender, date of surgery, and geographic location. The data sets supporting the results of this article are available in the NCBI sequence read archive, SRP047304.
Thirty HMP samples, including 15 haphazardly selected saliva and 15 throat (oropharynx) samples, were included in further analyses as proxies for "healthy" communities (Additional file 1: Table S1). As described above, Good's coverage, Chao, inverse Simpson, and Shannon diversity indexes were calculated along with theta Yue and Clayton values for PCoA at 97% sequence identity [53,60].
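The theta Yue and Clayton measure used for the distance matrix can be sketched in a few lines of Python. The sketch assumes the usual definition, θ = Σ a_i b_i / (Σ a_i² + Σ b_i² − Σ a_i b_i), where a_i and b_i are the relative abundances of OTU i in the two communities, with the dissimilarity taken as 1 − θ. The sample names and OTU counts are invented for illustration, the ordination step of PCoA is not included, and this is not mothur's implementation.

```python
def theta_yc_dissimilarity(counts_a, counts_b):
    """1 - theta_YC for two communities given counts over the same OTUs."""
    if len(counts_a) != len(counts_b):
        raise ValueError("both communities must list the same OTUs")
    total_a = sum(counts_a)
    total_b = sum(counts_b)
    a = [c / total_a for c in counts_a]
    b = [c / total_b for c in counts_b]
    ab = sum(x * y for x, y in zip(a, b))
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    return 1.0 - ab / (aa + bb - ab)


def distance_matrix(samples):
    """Pairwise dissimilarities keyed by (sample, sample) name pairs."""
    names = sorted(samples)
    return {
        (p, q): theta_yc_dissimilarity(samples[p], samples[q])
        for p in names
        for q in names
    }


# Hypothetical OTU counts for three samples (toy numbers only).
toy_samples = {
    "lesion_1": [90, 10, 0],
    "lesion_2": [95, 5, 0],
    "saliva_1": [10, 40, 50],
}
for pair, value in distance_matrix(toy_samples).items():
    print(pair, round(value, 3))
```

Two samples dominated by the same OTU sit close to zero, while a sample with a different abundance profile sits closer to one; this abundance weighting is consistent with the tight clustering of Streptococcus-dominated lesions in the PCoA plots.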