Hierarchical clustering analysis of all visits by community abundance and composition identified five unique bacterial community clusters named herein bacterial community groups (CGs), which were distinguished by
Lactobacillus-dominance (CG1 and 2) or higher diversity bacteriomes (CG3–5;
Figure 1A). Similar to other African cohorts
[4,8,55][4][8][18], the majority (
n = 173, 72.7%) of subjects had high diversity FRT bacteriomes with a low prevalence of
Lactobacillus-dominant bacteriomes. Samples that were dominated by a single species, defined as >50% community composition, made up 42.0% of all samples and were mostly clustered with CG1 and 2, while the remaining samples showed no individual dominant species and mainly clustered in CG3–5. CG1 (
n = 15, 6.3%;
Figure 1B), a low diversity FRT bacteriome, was comprised almost exclusively of
Lactobacillus species (78%) that were unable to be further delineated by 16S rRNA gene amplicon sequencing. To further define the predominant
Lactobacillus constituents of CG1, qPCR of 16S rRNA gene sequences from the key FRT
Lactobacillus species
L. iners, L. crispatus, L. gasseri, and
L. jensenii [9,55][9][18] was performed, revealing that approximately 50% of samples in CG1 were
L. crispatus-dominant, similar to what has been previously described as CST I
[8] (
Figure 1C). One vaginal swab in CG1 contained sufficient volume to only perform qPCR for
L. iners and
L. crispatus, and
L. crispatus was most abundant (not shown).
L. jensenii tended to predominate when present (
Figure 1C). CG2 (
n = 54, 22.7%) was
L. iners-dominant, with a few samples showing notable amounts of
Gardnerella and
Prevotella as well (
Figure 1B). The identified high diversity CGs 3–5 were compositionally different from the conventional CST4 subtypes
[8]. The second to largest and most diverse group, CG3 (
n = 64, 26.9%), consisted mainly of
Gardnerella, Prevotella, and
L. iners (
Figure 1B). The smallest high diversity CG was CG4 (
n = 30, 12.6%), in which
Shuttleworthia and
Gardnerella were predominant. CG5 contained the largest number of samples (
n = 75, 31.5%) and was dominated by
Sneathia and
Prevotella (
Figure 1B). CG3, CG4 and CG5 exhibited significantly higher alpha diversity than CG1 and 2 (
Figure 2A;
p = 0.0001, 0.0065, and <0.0001, respectively). Beta diversity significantly differed between these five bacterial CGs (
Figure 2B;
p = 0.0001).
Figure 2. FRT bacteriome clusters into distinct community groups (CGs) that differ by alpha and beta diversity. (A) Bacterial Shannon diversity (y-axis) by CG (x-axis) as determined by linear mixed-effects model. Center bar represents median; gray box bounded by upper/lower interquartile ranges (IQR); whiskers represent range; dots represent outliers; color-filled areas are representative of density/distribution of diversity values. **, p < 0.01; ***, p < 0.001. (B) Principal coordinate analysis (PCoA) plot of the weighted UniFrac distances colored by CG (color key shown).
3. Bacteriophages Comprise the Majority of the FRT DNA Virome
While the FRT bacteriome has been well-studied, the FRT virome, especially bacteriophage populations, is relatively unknown. Therefore, the FRT DNA virome was characterized using a subset of baseline samples by enriching for VLPs from resuspended vaginal swabs and extracting viral nucleic acid
[47][21]. Libraries were constructed and sequenced using the Illumina NovaSeq platform for a subset of 38 baseline samples, 14 of which were BV-negative and 24 were BV-positive. The resulting viral sequences underwent quality control, removal of bacterial and human reads, and then were assigned to known viral taxa using VirusSeeker, a BLAST-based NGS virome analysis pipeline
[40][22]. On average, there were 29 million reads per sample, 86.8% of them being high quality, with approximately 58% unique reads per sample.
The DNA eukaryotic virome was comprised almost entirely of
Papillomaviridae (94% of eukaryotic viral reads).
Herpesviridae (including herpes simplex virus, cytomegalovirus, and Epstein Barr virus) comprised 5% of viral reads, while
Polyomaviridae, Poxiviridae (mulluscum contagiosum virus), and
Anelloviridae combined were less than 1%. There was no significant difference in reads assigned to DNA eukaryotic viruses by BV or HIV status after correction for multiple comparisons. However, FRT bacteriophages were abundant, representing 83% of virally assigned reads. Samples contained sequences identified as belonging to the
Myoviridae, Siphoviridae, Podoviridae, Inoviridae, Ackermannviridae, Microviridae, Lipothrixviridae, Plasmaviridae, and
Tectiviridae bacteriophage families. Sequences assigned to members of the
Caudovirales order, lytic-tailed dsDNA bacteriophages, including
Myoviridae, Siphoviridae, and
Podoviridae, were the most abundant in all samples regardless of BV, HAART, or HIV status.
Bacteriophages within the FRT are clustered into two distinct, novel bacteriophage community groups based on composition and abundance that were named viral state types (VSTs;
Figure 3A). VST1 represented 44.7% (
n = 17) of all samples while VST2 contained the remaining 55.3% (
n = 21). Bacteriophage Shannon diversity differed between VSTs (
Figure 3B), with VST2 having the highest diversity bacteriophage populations. The VSTs were also grouped distinctly by beta diversity analysis (
Figure 3C; PERMANOVA
p = 0.0001). Neither VST exhibited a dominant bacteriophage member. VST1 contained several samples with a high relative abundance of
Rhodococcus viruses,
Spounavirinae,
phi29 virus and
Biseptimavirus-assigned bacteriophage reads. VST2 composition was more evenly distributed. This is the first study to identify bacteriophage community groups in the FRT.
Figure 3. FRT DNA bacteriophages cluster into two unique community groups. Self-collected vaginal swabs were processed for DNA virome analysis by enriching for viral nucleic acid, libraries built and sequenced. (A) Relative abundance of the 32 most frequent bacteriophage species (y-axis) by sample (x-axis). Ward’s linkage hierarchical clustering analysis was used to cluster samples into distinct bacteriophage community profiles called viral state types (VSTs). VST, bacterial vaginosis (BV) status, highly active antiretroviral therapy (HAART) status, and human immunodeficiency virus (HIV) status (color key) are shown. Percent abundance is indicated by gradient key. (B) Bacteriophage Shannon diversity (y-axis) by VST (x-axis) as determined by linear regression model. Center bar represents median; gray box is bounded by upper/lower interquartile ranges (IQR); whiskers represent range; dots represent outliers; color-filled areas are representative of density/distribution of diversity values. ***, p < 0.001. (C) Principal coordinate analysis (PCoA) plots of beta diversity distances, as determined by permutational multivariate analysis of variance, colored by VST.
4. Transkingdom Associations within the FRT Microbiome of South African Women
Bacteriophages can directly impact bacterial composition and abundance through infection of their host. Therefore, the transkingdom associations between bacteriophage and bacteria in the FRT were investigated. The VSTs significantly correlated with bacterial CG (
p = 0.00015;
Figure 3A), with VST1 associating with the
Lactobacillus-dominant CG1 and 2, while VST2 associated with CG3 and 4, both higher diversity CG. CG5 contained samples belonging to both VSTs (
Figure 3A). These data indicate a strong association between bacteriophage communities and the host bacterial populations.
The transkingdom interplay between bacteriophage and bacteria in the FRT was further supported by a significant correlation identified between bacteriophage and bacterial alpha diversity (
p = 0.00032). To further investigate specific bacteriophage-bacterial interactions, correlations between FRT bacterial and bacteriophage composition were identified by Kendell’s rank correlation coefficient. Reads assigned to bacteriophages
Bacillus virus Camphawk and
Bacillus virus Pony, which infect members of the
Bacillus genus, positively correlated with the BV-associated bacteria
Gardnerella, A. vaginae, Prevotella, Sneathia, and
Dialister (
Figure 4). Reads assigned to unclassified bacteriophages of the
E125 genus are also positively associated with
Gardnerella and
A. vaginae (
Figure 4). A number of
Bacillus-infecting bacteriophages, including
Bacillus virus Pony and Bacillus virus Staley, were inversely associated with bacteria protective from BV, particularly
L. iners (
Figure 4). Interestingly, in this cohort, bacteriophage associations revealed that
Veillonella more closely grouped with
Lactobacillus rather than BV-associated bacteria despite the known role of
Veillonella in lactose fermentation and higher diversity FRT microbiomes
[8]. These data suggest that bacteriophages directly or indirectly interact with FRT bacterial populations in disease states and identify putative FRT bacteriophage-host networks that may play a role in the development and maintenance of BV.
Figure 4. Transkingdom associations within the FRT of South African women. Heatmap of estimated Kendall’s correlation coefficients between FRT bacterial taxa (x-axis) and sequences assigned to bacteriophage (y-axis). Asterix indicate significant correlations after multiple comparisons correction by the Benjamini–Hochberg procedure, *, p < 0.05; **, p < 0.01. Magnitude and sign of the Kendall’s rank correlation coefficient are indicated by gradient key. Red indicates positive correlations; blue indicates negative correlations.
5. Effects of Bacterial Vaginosis on the FRT Virome
Next, the impact of different disease states on this cohort was examined. BV is a clinically significant condition with high morbidity characterized by high bacterial diversity
[13,14][13][14]. Data showed significant associations between high diversity bacterial CGs and VSTs, and further suggested specific bacteriophage interactions with BV-associated bacteria. Similar to published cohorts
[2], BV in this cohort was positively associated with increased bacterial alpha diversity compared to healthy subjects (
p = 0.0001). The high diversity CG3–5 (
p = 0.0001) also correlated positively with BV. Bacterial taxa, including
Gardnerella, Prevotella, Sneathia, and
Megasphaera (
Figure 5A), varied significantly by clinical BV status, corroborating distinct bacterial signatures in BV. Identification of specific bacterial taxa associated with recovery and transition to BV that could be predictive of the dynamic changes in bacteriome profiles between health and BV were sought. A significant increase in relative abundance of the Gram-positive, facultatively anaerobic cocci
Aerococcus (
p = 0.00216) and
Gemella (
p = 0.00746) among participants who recovered from or transitioned to BV was observed, suggesting these may be useful clinical biomarkers of impending change in BV status. While these bacteria are less frequently detected in BV
[56][23], they may represent regulators of FRT bacteriome structure in health and BV.
Figure 5. Discriminant FRT bacterial and bacteriophage species associated with bacterial vaginosis. (A) Significantly discriminant bacterial taxa by BV status were determined by univariate analysis using mixed-effects models. Relative abundance of each taxon (y-axis) is shown in BV-negative (orange) and BV-positive subjects (green; x-axis). All bacterial taxa presented are significant with an FDR-adjusted p < 0.05. (B) Bacteriophage Shannon diversity (y-axis) by BV status (x-axis) as determined by a linear regression model. *, p < 0.05. (C) Grouped bar plot showing the number of identified lytic and lysogenic bacteriophage contigs per sample by BV status. Lytic genes are represented as black triangles and lysogenic genes as black circles. Significance assessed using Mann–Whitney test with multiple comparisons correction by the Benjamini–Hochberg procedure. ***, p < 0.001; ****, p < 0.0001. (D) Significantly discriminant bacteriophage species by clinical BV diagnosis was determined by univariate analysis using a linear regression model. All bacteriophage taxa presented are significant with an FDR-adjusted p < 0.05.
Since clinical BV was associated with significant changes in FRT bacterial composition and correlated with bacterial CGs, and bacteriophage VSTs also correlated with bacterial CGs, correlations between bacteriophage populations and clinical BV were determined. Regression analysis accounting for HIV status and VST revealed bacteriophage Shannon diversity significantly differed by clinical BV diagnosis (
Figure 5B;
p = 0.0193). To further investigate the manner in which bacteriophage could be affecting bacterial populations in BV, assembled contigs were assessed for the presence of lytic or lysogenic bacteriophage genes to determine bacteriophage lifestyle (
Figure 5C). BV-positive samples had significantly more contigs per sample containing lytic (
p = 0.000283) and lysogenic (
p = 0.000283) genes compared to BV-negative samples (
Figure 5C). Moreover, in both BV-positive and BV-negative participants, there were considerably more contigs containing genes associated with a lysogenic lifecycle than those identified as lytic (
Figure 5C). Specific bacteriophage taxa differentially abundant in the FRT of BV-positive and -negative women were then ascertained.
Bacillus-infecting bacteriophages are known to belong to the
Herelleviridae and
Podoviridae families
[57,58][24][25]. Members of these families, including
Bacillus virus Camphawk and
Bacillus virus Pony (
Figure 5D), were significantly associated with BV diagnosis (
p= 0.019058 and
p = 0.014546, respectively). BV diagnosis also strongly correlated with the bacteriophages
Escherichia virus FV3 and
unclassified E125 virus (
Figure 5D), which are known to infect the BV-associated bacteria
Escherichia coli and
Burkholderia, respectively
[59,60][26][27]. Together, these data uncover a link between highly diverse FRT bacteriophage populations, a distinct subset of bacterial hosts, and BV.
6. The Effect of HIV and HPV on the FRT Microbiome
The effect of HIV on the FRT microbiome was also examined in this cohort. No significant difference in bacterial richness, alpha or beta diversity between HIV-positive and HIV-negative subjects was found. Further, there were no significant alterations in bacteriome or bacteriophage diversity by HIV status. Thus, FRT bacterial and bacteriophage communities were not detectably altered by HIV infection in this cohort, suggesting that localized infections may play a more important role in the FRT composition than systemic infections. The relationship between HPV, the main eukaryotic virus found, and bacterial populations were also assessed. Upon examination of 63 HPV-positive and 37 HPV-negative baseline samples, there were no significant associations between HPV infection and bacteriophage or bacterial diversity. Analysis of HPV subtypes revealed increased bacterial alpha diversity in HPV6-positive subjects (p = 0.0181), suggesting certain HPV subtypes may directly or indirectly benefit from the presence of higher diversity bacterial populations, although this analysis may have been underpowered for less prevalent subtypes.