Combined Genomic and Metabolomic Screening: Comparison
Please note this is a comparison between Version 1 by Janina Krause and Version 2 by Conner Chen.

Since the golden age of antibiotics in the 1950s and 1960s, actinomycetes have been the most prolific source of bioactive natural products. However, the number of newly discovered bioactive compounds has been decreasing for decades. New procedures (e.g., activating strategies or innovative fermentation techniques) were developed to enhance the productivity of actinomycetes. Nevertheless, compound identification remains challenging, among other reasons because of high rediscovery rates. Rapid and cheap genome sequencing and the advent of bioinformatic tools for biosynthetic gene cluster (BGC) identification, in combination with mass spectrometry-based molecular networking, have eased the tedious process of dereplication. In recent years, several studies have been dedicated to accessing the biosynthetic potential of actinomycete species, especially streptomycetes, by using integrated genomic and metabolomic screening in order to boost the discovery rate of new antibiotics.

  • bioactive natural products
  • actinomycetes
  • genome mining
  • biosynthetic gene cluster
  • dereplication
  • molecular networking

1. Discovery of New Analogs

Integrated genomic and metabolomic screening is a powerful tool for the detection of new derivatives of known compounds.
Liu et al. [1][27] investigated the genome and metabolome of the known daptomycin producer Streptomyces roseosporus in search of nonribosomal peptide (NRP) antibiotics. They searched the molecular network of the strain’s metabolome for specific peptidic signatures, that is, masses of amino acid fragments. In this way, they discovered subnodes of the daptomycin cluster that correspond to daptomycin variants missing the N-terminal lipid chain and tryptophan. Additionally, sodium and potassium adducts of arylomycin [1][27], a lipohexapeptide with antibiotic activity [2][28], were discovered. The authors especially emphasized the existence of several arylomycin intermediates, which lack the typical biaryl linkage and a tryptophan residue at the C-terminus. In an antiSMASH search for nonribosomal peptide synthetase (NRPS) BGCs, the clusters encoding the biosynthetic pathways of daptomycin and arylomycin could be assigned to the produced antibiotics. The production of another known natural product [1][27], the Pseudomonas-active peptidylnucleoside napsamycin (Table 1) [3][29], was suggested by genome analysis and confirmed by the molecular network. Higher molecular weight variants seemed to be produced by S. roseosporus as well. It was previously unknown that S. roseosporus was able to produce napsamycins at all. The focus of this study lay on stenothricin, a barely characterized antibiotic that was discovered in 1974. Until then, four analogs of stenothricin were known [4][30], and these could be detected in the metabolome of S. roseosporus. A corresponding BGC, which matches the metabolomic information, could be identified in an antiSMASH analysis. Additionally, the existence of further stenothricin analogs was indicated by the molecular network (Table 1). These analogs differ in the length of the lipid side chains and in amino acid substitutions. Hydrolysed and glycosylated products could also be identified. Liu et al. verified the activity of the extracted stenothricin variant mixture against Gram-negative and Gram-positive bacteria [1][27].
Table 1. List of newly discovered compounds from actinomycetes via combined genomic and metabolomic screening.
Compound | Strain | Activity | Reference
napsamycin analogs; stenothricin analogs | Streptomyces roseosporus | antibiotic | [1][27]
spiroindimicins E and F; lagunapyrones D and E | Streptomyces sp. MP131-18 | none | [5][31]
strecacansamycin A, B, C | Streptomyces cacaoi subsp. asoensis | antiproliferative | [6][25]
valinomycin derivatives | Streptomyces sp. CBMAI 2042 | antibiotic | [7][26]
cyclomarin analogs; arenicolide analogs; retimycin A | Salinispora sp. | suggested antiproliferative | [8][24]
This study demonstrates that it is worth investigating old strains with new methods, such as integrated genomic and metabolomic screening. As shown here, such an approach can result in an extended production profile of the known strain as well as point out bioactivities which have not been observed before.
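The search for peptidic signatures described for S. roseosporus can be illustrated with a minimal sketch. It assumes that such signatures show up as spacings between fragment peaks that equal amino acid residue masses; the function name, the tolerance, and the example peaks are hypothetical and not taken from Liu et al. Only a few proteinogenic residues are listed, whereas nonribosomal peptides often contain non-proteinogenic residues that would have to be added.

```python
# Monoisotopic residue masses (Da) of a few proteinogenic amino acids.
RESIDUE_MASSES = {
    "Gly": 57.02146,
    "Ala": 71.03711,
    "Ser": 87.03203,
    "Val": 99.06841,
    "Thr": 101.04768,
    "Leu/Ile": 113.08406,
    "Asp": 115.02694,
    "Trp": 186.07931,
}


def find_residue_gaps(fragment_mzs, tol=0.01):
    """Return (low_mz, high_mz, residue) for peak pairs spaced by a residue mass."""
    peaks = sorted(fragment_mzs)
    hits = []
    for i, low in enumerate(peaks):
        for high in peaks[i + 1:]:
            delta = high - low
            for residue, mass in RESIDUE_MASSES.items():
                if abs(delta - mass) <= tol:
                    hits.append((low, high, residue))
    return hits


# Hypothetical singly charged fragment ions, not measured data.
example_peaks = [200.0, 257.02146, 443.10077]
gaps = find_residue_gaps(example_peaks)
for low, high, residue in gaps:
    print(f"{low:.4f} -> {high:.4f}: {residue}")
```

A chain of consecutive gaps of this kind hints at a peptide sequence tag, which can then be compared with the substrate predictions for NRPS clusters.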
In a study by Duncan et al. [8][24], 30 Salinispora isolates were examined via combined genomic and metabolomic screening, likewise in order to detect new analogs. The strains were identified as Salinispora arenicola, Salinispora tropica and Salinispora pacifica. Like Streptomyces, Salinispora belongs to the order of actinomycetes [9][32]. For several BGCs identified in the genomes of the strains, no corresponding compound could be assigned, indicating that these clusters are not expressed under the culture conditions used or that the product is not detectable with the applied method. The raw extracts of the cultures were subjected to HR-MS/MS, and a molecular network was calculated from the resulting ions. To identify the microbial products, the MS spectra were compared to a library of standard Salinispora biosynthetic products [8][24]. Besides the known compounds cyclomarin A [10][33] and D [11][34] (Table 1), the existence of a methylated, a demethylated and a hydrated analog is implied by the MS data. Also, a hydroxylated, a dehydrogenated and a methylated version [8][24] of arenicolide A [12][35] could be detected (Table 1). Eight of the detected compounds could be linked to Salinispora BGCs, among them the previously unidentified BGC of arenicolide A. This BGC could be associated with the production of arenicolide A because, on the one hand, it was present exclusively in arenicolide-producing strains and, on the other hand, the annotated ketosynthase domains correspond to the sequence of the enzyme required for arenicolide production. Another BGC could be linked bioinformatically to a compound newly identified in this study, named retimycin A, an analog of the quinomycins [8][24], which display antiproliferative activity [13][36] (Table 1). Here, too, the occurrence of both the parent ion and the BGC in only one single strain indicated the match. Additionally, similarity to a BGC of a known quinomycin-like compound was observed, with variations in an adenylation domain and a domain coding for an oxygenase. The MS/MS analysis indicated the presence of an oxidized, methylated thioacetal version of retimycin A. This functional group has not been observed before in quinomycins [8][24]. So far, no bioactivity testing has been performed on retimycin A.
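The linking of BGCs to compounds by their shared distribution across strains can be sketched as a comparison of presence patterns. This is only an illustration under simplifying assumptions: strain names, cluster names, and ions are hypothetical, and an exact match of the two strain sets is required, which is stricter than a real analysis would likely be, since silent clusters produce no ion.

```python
def match_patterns(bgc_presence, ion_presence):
    """Pair each BGC with every parent ion found in exactly the same strains."""
    links = []
    for bgc, bgc_strains in bgc_presence.items():
        for ion, ion_strains in ion_presence.items():
            # Empty sets are skipped so that absent entities never match.
            if bgc_strains and bgc_strains == ion_strains:
                links.append((bgc, ion))
    return links


# Hypothetical presence data: which strains carry a BGC or show a parent ion.
bgc_presence = {
    "bgc_X": frozenset({"strain_1", "strain_2"}),
    "bgc_Y": frozenset({"strain_3"}),
    "bgc_Z": frozenset({"strain_1", "strain_2", "strain_3"}),
}
ion_presence = {
    "ion_a": frozenset({"strain_1", "strain_2"}),
    "ion_b": frozenset({"strain_3"}),
}

links = match_patterns(bgc_presence, ion_presence)
print(links)
```

In this toy data, bgc_Y and ion_b both occur in a single strain only, which mirrors the argument used for retimycin A; such a match is a hypothesis that still needs support from the predicted chemistry of the cluster.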
One advantage of genomic and metabolomic screening over classical bioactivity-based screening is the possibility of detecting inactive analogs. Paulus et al. [5][31] investigated the biosynthetic potential of strain MP131-18, which was sampled from a Norwegian fjord. 16S phylogeny showed the highest similarity to Streptomyces specialis and Streptomyces avicenniae. Culture extracts displayed activity against Bacillus subtilis and Pseudomonas putida. Analysis with antiSMASH revealed 36 gene clusters. To identify the corresponding metabolites, HR-LC-QTOF MS was performed, followed by dereplication against the Dictionary of Natural Products (DNP) database. In this way, the production of the bisindole pyrrole antibiotics lynamicins A to G [14][37] and spiroindimicins B and C [15][38] could be confirmed. Known biosynthetic genes for the production of bisindole pyrroles were identified in BGC 36, so the cluster could be associated with the production of these compounds. Additionally, analogs of the known secondary metabolites (named lynamicin H, spiroindimicin E and spiroindimicin F; Table 1) could be identified in the culture extract [5][31]. Besides these, the polyketides lagunapyrones A-C [16][39] accumulated. Genes coding for type I and type III polyketide synthases (PKSs) were present in BGC 3. Because lagunapyrones are polyketides, this indicated that the cluster is responsible for lagunapyrone production. For this type III PKS, high flexibility in the choice of the acyl-CoA unit was predicted. Subsequently, the production of two further lagunapyrones, D and E (Table 1), with C2 and C5 acyl chains, whose masses could be found in the culture extract, was indeed confirmed [5][31]. The new compounds did not display antimicrobial activity, with the exception of spiroindimicin B, which showed moderate activity against Bacillus subtilis [5][31]. Due to their lack of bioactivity, the analogs would not have been detected by classical activity-based screening of HPLC fractions.
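Dereplication against a compound database, as used in this study, essentially comes down to comparing accurate masses within a tolerance. The following minimal sketch assumes protonated ions ([M+H]+) and a tolerance of 5 ppm; the library entries and their masses are hypothetical placeholders, not values from the DNP, and the function names are made up for illustration.

```python
PROTON_MASS = 1.007276  # Da, added to the neutral mass for [M+H]+


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def dereplicate(observed_mz, library, tol_ppm=5.0):
    """Return library compounds whose [M+H]+ lies within tol_ppm of observed_mz."""
    matches = []
    for name, neutral_mass in library.items():
        theoretical_mz = neutral_mass + PROTON_MASS
        if abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm:
            matches.append(name)
    return matches


# Hypothetical library of neutral monoisotopic masses (Da).
library = {"known_compound_1": 500.2000, "known_compound_2": 650.3000}

known_hit = dereplicate(501.2075, library)   # close to known_compound_1
no_hit = dereplicate(720.4000, library)      # nothing in the library fits
print(known_hit, no_hit)
```

An ion without any match is a candidate for a new compound or analog, although a mass match alone does not prove identity because isomers share the same mass.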

2. Exploring the Productive Spectrum

Integrated genomic and metabolomic screening can be used to explore the productive capability of strains and to prioritize those strains that display the highest number of uncharacterized genomic and metabolomic entities.
Ishaque et al. [17][40] investigated the crude extract of a novel Streptomyces isolate named Streptomyces tendae VITAKN. The culture extract showed quorum sensing inhibition (QSI), and the group therefore aimed to identify the compound responsible for this effect. To this end, they performed a whole-genome analysis with antiSMASH, in which 33 BGCs could be detected. Only nine of these clusters showed more than 75% similarity to those deposited in the antiSMASH database. The remaining clusters were suspected to code for the production of so far unknown chemical entities. The crude extract was examined via LC-HRMS and LC-HRMS/MS. This resulted in a molecular network consisting of 327 nodes, of which four correlated to the spectra of cyclic dipeptides (2,5-diketopiperazines) [17][40]. Cyclic dipeptides act as LuxR-type activators or inhibitors and exhibit antiproliferative, antibiotic and anti-inflammatory activity [18][41]. The genes coding for the key enzymes for the formation of 2,5-diketopiperazines, cyclodipeptide synthases (CDPSs) [19][42], were identified in the genome. A comparison of the spectra with data from the GNPS-MassIVE database did not result in exact matches. This indicates that the compound sought has not been characterized before [17][40].
Nevertheless, the compound with QSI activity could not be identified via this combined genomic and metabolomic screening approach. However, as no matching MS data could be found for the detected parent ions, an uncharacterized compound with QSI activity is likely produced by S. tendae VITAKN. The existence of unexplored chemical entities also makes the metabolome of S. tendae VITAKN a promising object of further investigation [17][40].
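How a molecular network of the kind mentioned here arises from MS/MS spectra can be shown with a simplified sketch. It uses a plain cosine score on binned peaks, which is a simplification and not the scoring implemented in GNPS; the bin width, the threshold of 0.7, the node names, and the spectra are assumptions for illustration only.

```python
import math
from itertools import combinations


def bin_spectrum(peaks, bin_width=0.01):
    """Sum the intensities of (mz, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine(a, b):
    """Cosine similarity of two binned spectra (0 = unrelated, 1 = identical)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_network(spectra, threshold=0.7):
    """Return edges between spectra whose cosine score reaches the threshold."""
    binned = {name: bin_spectrum(peaks) for name, peaks in spectra.items()}
    return [
        (x, y)
        for x, y in combinations(binned, 2)
        if cosine(binned[x], binned[y]) >= threshold
    ]


# Hypothetical fragment spectra as (m/z, intensity) pairs.
spectra = {
    "node_1": [(100.0, 10.0), (150.0, 5.0), (200.0, 1.0)],
    "node_2": [(100.0, 9.0), (150.0, 6.0), (250.0, 1.0)],
    "node_3": [(300.0, 10.0)],
}
edges = build_network(spectra)
print(edges)
```

Spectra that share most fragments end up connected, so structurally related metabolites form clusters, while a node such as node_3 stays isolated. Fixed-width binning can split a peak across two bins, which is one reason why real tools match peaks within a tolerance instead.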
A detailed screening of both genome and metabolome can help not only to estimate the number of unknown natural products but also to elucidate the full biosynthetic potential of putative producer strains. Streptomyces clavuligerus, Streptomyces jumonjinensis, and Streptomyces katsurahamanus [20][23] are all known to produce the β-lactamase inhibitor clavulanic acid [21][43]. The study by AbuSara et al. [20][23] aimed to examine whether the three species have other secondary metabolites in common. Therefore, a comparative analysis of the metabolomes as well as of the genomes was performed. Via LC-MS and LC-MS/MS analysis, it was observed that all three species produce desferrioxamines [22][44] and ectoine [23][45], which are very common in streptomycetes as these metabolites are required for general cell functions. The antibiotics holomycin [24][46] and thiolutin [25][47] are produced exclusively by S. clavuligerus [20][23]. In this study, the production of the antiproliferative nucleoside pentostatin [26][48] was reported for the first time in S. clavuligerus, though the corresponding BGC had already been discovered in this strain. However, neither production nor a BGC of pentostatin could be detected in S. jumonjinensis or S. katsurahamanus [20][23]. Naringenin, a flavonoid previously known only from plants [27][49], was found here to be produced by all three Streptomyces species. The same is true for the plant-associated monoterpenes carveol and cuminyl alcohol. The terpene hydroxyvalerenic acid could be found exclusively in the metabolome of S. clavuligerus. To elucidate the corresponding BGCs, an analysis with antiSMASH was performed. In this way, 49 BGCs could be detected in the genomes of S. jumonjinensis and S. katsurahamanus, of which 44 could be associated with known clusters. Terpene-like BGCs were observed in all three species; these could be the BGCs corresponding to the above-mentioned plant-derived metabolites. Besides clavulanic acid, S. clavuligerus is also able to produce its analog 5S clavam [20][23], which displays no inhibition of β-lactamases due to its inverted stereochemistry, and another unknown paralog of clavulanic acid [28][50]. While S. jumonjinensis and S. katsurahamanus are producers of clavulanic acid as well, no production of 5S clavam or the paralog could be detected. This is reflected in the genomes of these two strains, which lack the corresponding BGCs [20][23]. All three species are capable of producing cephalosporin C [29][51], which is linked to the production of clavulanic acid in S. clavuligerus [30][52]. In contrast to S. clavuligerus, the corresponding BGCs of S. jumonjinensis and S. katsurahamanus lack one gene of the cephalosporin BGC, blp, which indicates that this gene is not essential for the production of cephalosporin C [20][23].
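The comparison of the three clavulanic acid producers can be summarized with simple set operations. The sketch below encodes only the metabolites named in the paragraph above as present or absent; the uncharacterized paralog of clavulanic acid is left out, and the variable names are chosen for illustration.

```python
# Metabolites reported above for all three species.
common = {
    "clavulanic acid", "cephalosporin C", "desferrioxamines", "ectoine",
    "naringenin", "carveol", "cuminyl alcohol",
}

# Metabolite profiles per species, as summarized in the text.
profiles = {
    "S. clavuligerus": common | {
        "holomycin", "thiolutin", "pentostatin",
        "hydroxyvalerenic acid", "5S clavam",
    },
    "S. jumonjinensis": set(common),
    "S. katsurahamanus": set(common),
}

# Metabolites shared by every species.
shared = set.intersection(*profiles.values())

# Metabolites found in one species but in none of the others.
unique = {
    species: metabolites - set.union(
        *(m for other, m in profiles.items() if other != species)
    )
    for species, metabolites in profiles.items()
}

print(sorted(shared))
print(sorted(unique["S. clavuligerus"]))
```

The same operations applied to the BGC inventories of the strains would show whether a missing metabolite coincides with a missing cluster, as reported for pentostatin and 5S clavam.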

3. Elucidation of Biosynthesis

Not only can new compounds be detected via combined genomic and metabolomic screening, it is also possible to identify their biosynthetic origin. For a study by Paulo et al. [31][57], Streptomyces sp. CBMAI 2042 was isolated from Citrus sinensis branches. This strain inhibited the growth of Bacillus megaterium, Staphylococcus aureus and Candida albicans and suppressed the proliferation of Citrus pathogens. Whole-genome sequencing followed by an analysis with antiSMASH revealed 35 BGCs, among them the NRPS cluster for the depsipeptide valinomycin. Valinomycin has proven antibacterial [32][58], antiviral [33][59] and antiproliferative [34][60] properties. Via UHPLC-QTOF-MS/MS, the metabolite itself and its ammonium adduct were detected, and their identities were verified by matching against the GNPS database. Additionally, the bioinformatic tools DEREPLICATOR and VarQuest, which are specialized in the detection of peptide natural products, were used. Both valinomycin and its ammonium adduct appeared as distinct nodes in the molecular network. Besides valinomycin, Streptomyces sp. CBMAI 2042 was able to produce the cyclic depsipeptide montanastatin [31][57], which also displays antiproliferative activity [32][58]. Although montanastatin had been known before, its biosynthetic origin had never been elucidated. Judging by its structure, however, montanastatin could stem from the same cluster as valinomycin. Thus, the authors expressed the valinomycin cluster in S. coelicolor as a heterologous host and performed a metabolomic analysis. Montanastatin was detected as well as valinomycin and five so far unknown valinomycin analogs (Table 1). This demonstrates that all mentioned metabolites originate from the same NRPS cluster.
Here, metabolomic and genomic approaches were combined in order to elucidate the common biosynthetic origin of valinomycin and montanastatin. Nevertheless, heterologous expression for confirmation remained indispensable.
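Adducts such as the ammonium adduct of valinomycin appear as separate nodes because they differ from the protonated ion by a fixed mass shift. The following sketch lists the shifts of common singly charged positive-mode adducts and checks whether the spacing between two ions matches a pair of them; the function name, the tolerance, and the example ions are hypothetical.

```python
# Mass shifts (Da) added to the neutral monoisotopic mass M.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def annotate_pair(mz_low, mz_high, tol=0.005):
    """Return adduct pairs whose shift difference matches the m/z spacing."""
    delta = mz_high - mz_low
    pairs = []
    for low_name, low_shift in ADDUCT_SHIFTS.items():
        for high_name, high_shift in ADDUCT_SHIFTS.items():
            if high_shift > low_shift and abs(delta - (high_shift - low_shift)) <= tol:
                pairs.append((low_name, high_name))
    return pairs


# Hypothetical ions of one compound with a neutral mass of 1000.0 Da.
protonated = 1000.0 + ADDUCT_SHIFTS["[M+H]+"]
ammoniated = 1000.0 + ADDUCT_SHIFTS["[M+NH4]+"]

pair = annotate_pair(protonated, ammoniated)
print(pair)
```

Recognizing such pairs avoids counting adducts of one metabolite as separate compounds when the nodes of a network are interpreted.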
Liu et al. [6][25] also aimed to elucidate biosynthetic details, using combined genomic and metabolomic screening to determine the absolute configuration of the stereocenters of the newly isolated strecacansamycins A, B and C (Table 1), which are produced by Streptomyces cacaoi subsp. asoensis H2S5. Strecacansamycins belong to the class of aliphatically bridged aromatic ansamycins [35][61]. In in vitro activity tests against PC-3, HepG2, and U87-MG cells, the isolated analogs displayed antiproliferative properties [6][25]. LC-HR-ESI-MS data of the culture extract were evaluated with GNPS. In this way, nodes for ansamycin analogs were detected. MS data, however, give no information about the absolute stereochemistry of a molecule. Liu et al. therefore used the genetic information to reconstruct the biosynthetic assembly line and to derive which stereochemistry would be introduced by the modules. For this purpose, the whole genome was sequenced and analyzed with antiSMASH. The analysis revealed 31 BGCs. A type I PKS-NRPS hybrid cluster is probably responsible for the production of strecacansamycins. PKSs and NRPSs are composed of several biosynthetic units called modules, which contain a set of catalytically active domains. The type I PKS-NRPS hybrid cluster contains acyltransferase domains in module 4 that are stereoselective for S-methylmalonyl-CoA, but stereoinversion occurs in the subsequent condensation reaction catalyzed by a ketosynthase domain, so the final configuration at C-12 is R. The configurations of the methoxy or hydroxy groups at C-3, C-11 and C-13 could be determined as R, S and R, respectively, based on the direction of the hydride addition at the ketoreductase domains. The ketoreductase domain type also determines whether cis or trans double bonds are formed, depending on the orientation of the hydroxy group generated by the reduction. In this way, it could be deduced that the double bonds at C-5, C-7, and C-9 exhibit trans configuration and the one at C-15 cis configuration [6][25].