Computational Approaches to Enzyme Inhibition by Natural Products

The exploration of biologically relevant chemical space for the discovery of small bioactive molecules present in marine organisms has led not only to important advances in certain therapeutic areas, but also to a better understanding of many life processes. The still largely untapped reservoir of countless metabolites that play biological roles in marine invertebrates and microorganisms opens new avenues and poses new challenges for research. Computational technologies provide the means to (i) organize chemical and biological information in easily searchable and hyperlinked databases and knowledgebases; (ii) carry out cheminformatic analyses on natural products; (iii) mine microbial genomes for known and cryptic biosynthetic pathways; (iv) explore global networks that connect active compounds to their targets (often including enzymes); (v) solve structures of ligands, targets, and their respective complexes using X-ray crystallography and NMR techniques, thus enabling virtual screening and structure-based drug design; and (vi) build molecular models to simulate ligand binding and understand mechanisms of action in atomic detail.

Keywords: enzyme inhibitors; databases; cheminformatics

1. Bibliographical Sources and Virtual NP Databases

Chemical libraries encompassing millions of compounds include (i) the Chemical Abstracts Service (CAS) REGISTRY database (http://www.cas.org/expertise/cascontent/registry/index.html, accessed on 20 December 2022), which is updated daily and contains >250,000 natural products (NPs) out of >150 million chemical substances; (ii) PubChem (including PCSubstance, PCCompound, and PCBioAssay) [1]; (iii) ChEMBL (a manually curated database of >2,300,000 bioactive molecules with drug-like properties, last updated in July 2022) [2]; and (iv) ChemSpider (with various levels of partial to complete stereochemistry) [3]. The free-to-access resource DrugBank is a web-enabled database (https://go.drugbank.com/, accessed on 20 December 2022) that incorporates comprehensive molecular information about drugs, their mechanisms, their interactions, and their targets. First described in 2006 as a knowledgebase for drugs, drug actions, and drug targets [4], DrugBank has evolved over time in response to improvements in web standards and changing needs for drug research and development. The latest update, DrugBank 5.0 [5], was expanded to cover not only drug binding data, numerous investigational drugs, drug-drug and drug-food interactions, and SNP-associated drug effects, but also information on the influence of hundreds of drugs on metabolite levels (pharmacometabolomics), gene expression levels (pharmacotranscriptomics), and protein expression levels (pharmacoproteomics). DrugBank describes enzyme inhibitors (DBCAT000003) as “compounds or agents that combine with an enzyme in such a manner as to prevent the normal substrate-enzyme combination and the catalytic reaction”.
Reviews on marine natural products (MNPs) have been published on a regular basis in the scientific literature [6][7][8]. The renewed upsurge of interest in NPs, and MNPs in particular, over the last two decades has led to a rapid multiplication of databases in both the private sector and the public domain that compile general-purpose or thematic information on these naturally occurring compounds, often incorporating supplementary material published in scientific papers. A dedicated, searchable, and continuously updated database (MarinLit, https://marinlit.rsc.org/, accessed on 20 December 2022) that was established in the 1970s by Prof. John Blunt and Prof. Murray Munro (University of Canterbury, New Zealand) has been maintained by the Royal Society of Chemistry (UK) since 2014. MarinLit covers ~40,000 compounds from marine macro- and microorganisms and about the same number of references to journal articles. Among the specialized MNP databases, the Dictionary of Marine Natural Products (DMNP) [9] appeared as the first of its kind in 2008 and encompassed a subset of data from the Dictionary of Natural Products (DNP, one of several Chapman & Hall chemical dictionaries) based on the biological source of the compounds. DMNP was marketed as a book together with a CD-ROM for a desktop version, and the searchable web-based version CHEMnetBASE (https://dmnp.chemnetbase.com/, accessed on 20 December 2022) is still available (v. 31.1; updated in 2022), but only to subscribing institutions.
Virtual chemical libraries of NPs can be categorized into (i) encyclopedic and general NP databases; (ii) special subsets within fully enumerated, ultra-large scale chemical libraries specifically built to facilitate VS campaigns, e.g., ZINC [10][11]; (iii) compound collections enriched with NPs used in traditional medicines; and (iv) specialized databases focused on specific habitats, geographical regions, organisms, biological activities, or even specific NP classes. Unfortunately, many NP databases belonging to the latter two categories are rather ephemeral or rapidly become either outdated or unavailable to the scientific community [12], and the same criticism applies to many bioinformatics web services related to NPs [13]. This is most likely due to (i) a lack of funds (and/or human resources) for their sustained management and continuous upgrading, and (ii) the current overwhelming “data deluge”. For these reasons, there is an urgent need for nonredundant, community-wide efforts that optimize the use of contemporary bioinformatic and chemoinformatic capabilities, as exemplified by the recently established open platform LOTUS (https://lotus.naturalproducts.net, accessed on 20 December 2022), a knowledgebase that is expected to have strong transformative potential for research on NPs and beyond [14]. In this praiseworthy initiative, data sharing within the Wikidata framework broadens interoperability and facilitates access to >750,000 referenced structure-organism pairs.
Another large and freely available NP database is Super Natural II (https://bioinf-applied.charite.de/supernatural_new/index.php; last updated: October 2022, accessed on 20 December 2022), which provides two-dimensional (2D) structures and physicochemical properties for ~326,000 molecules, as well as information about the pathways associated with their synthesis, degradation, and mechanisms of action with respect to structurally similar drugs [15]. An additional recent compilation of 400,000 non-redundant NPs was made available in 2021 [16] as the open-access COlleCtion of Open NatUral producTs (COCONUT, https://coconut.naturalproducts.net/, accessed on 20 December 2022).
One important goal of these NP databases is to facilitate a quick assessment of novelty for any newly identified compound in a natural extract. To distinguish between known and unknown compounds, it is important to have rapid and trustworthy “dereplication” methods, which rely heavily on the interpretation of molecular mass and molecular formula, as well as UV and NMR spectral data [17]. Nevertheless, the dereplication process can be problematic because (i) the validity and accuracy of the collected information are only as good as those of the original data source; and (ii) stereochemical information on NPs is often inaccurate or incomplete. In the field of MNPs alone, it was recently reported that more than 200 structures were misassigned in the last ten years [18]. A comparative analysis of the original and the revised structures revealed that major pitfalls still plague the structural elucidation of small molecules and, consequently, that quite a few 3D molecular structures present in databases may be inaccurate. This finding emphasizes the roles of total synthesis, X-ray crystallography, as well as chemical and biosynthetic logic, to complement spectroscopic data. Nevertheless, it is noteworthy that a much lower incidence of “impossible” structures was found in MNPs compared to NPs of plant origin.
The utilization of computer-assisted structure elucidation (CASE) programs can minimize the risk of misassignment and help identify truly novel compounds (the “unknown unknowns”) [19] by generating all structures that are consistent with key data from 2D correlation spectroscopy (COSY), heteronuclear multiple bond correlation (HMBC), and 1,1-adequate sensitivity double-quantum spectroscopy (ADEQUATE) NMR experiments, and by ranking the resulting structures in order of probability. The algorithms may additionally benefit from both stereospecific NMR data and the use of optimized geometries and predicted chemical shifts provided by density functional theory (DFT) quantum mechanical calculations [20]. The absolute configuration of an MNP can be unequivocally confirmed by crystallographic analysis and, in the case of noncrystalline compounds containing a pseudo-meso core structure that results in a specific rotation ([α]D) of almost zero (e.g., elatenyne), it may be necessary to absorb the compound into a porous coordination network (a “crystalline sponge”) [21].
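As a rough illustration of the mass-based step of dereplication described above, the following Python sketch computes the monoisotopic mass of a molecular formula and looks for library entries within a parts-per-million tolerance. It is a minimal sketch under stated assumptions: the mini-library, the compound names, the 5 ppm default, and the function names are all hypothetical; formulas are restricted to C, H, N, O, and S without brackets; and neutral masses are compared directly, whereas real workflows must account for ion adducts and combine mass with UV and NMR evidence.

```python
import re

# Monoisotopic masses (in u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}


def monoisotopic_mass(formula: str) -> float:
    """Sum element masses for a simple formula such as 'C15H24O' (no brackets)."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def dereplicate(observed_mass: float, library: dict, tol_ppm: float = 5.0) -> list:
    """Return names of library entries whose mass lies within tol_ppm of observed_mass."""
    hits = []
    for name, formula in library.items():
        reference = monoisotopic_mass(formula)
        error_ppm = abs(observed_mass - reference) / reference * 1e6
        if error_ppm <= tol_ppm:
            hits.append(name)
    return hits


# Hypothetical mini-library: names and formulas are illustrative placeholders.
library = {
    "compound_A": "C15H24O",
    "compound_B": "C14H20O2",
    "compound_C": "C15H22O2",
}

if __name__ == "__main__":
    # compound_A and compound_B share a nominal mass of 220 but differ at high resolution.
    print(dereplicate(220.1827, library))
```

An empty result does not prove novelty; it only means that no entry in the chosen library matches within the tolerance, which is why the quality of the underlying database matters so much.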
The exploration of the identities and biological activities of metabolites present in complex mixtures has benefited enormously in recent years from scalable native and functional metabolomics approaches [22]. Novel techniques, such as affinity selection mass spectrometry (MS), complemented with pulsed ultrafiltration, size exclusion chromatography, and magnetic microbead affinity selection screening, now allow the separation of non-covalent ligand-receptor complexes from other nonbinding compounds [23].
Recognizing the need for community-wide platforms to effectively share and analyze raw, processed, or identified tandem MS (MS/MS or MS2) data of NPs, in an analogous fashion to what has been achieved in genomics and proteomics research with the GenBank® at the National Center for Biotechnology Information (NCBI) [24] and the UniProtKB [25], the open-access knowledgebase known as Global Natural Products Social Molecular Networking (GNPS, http://gnps.ucsd.edu, accessed on 20 December 2022) was presented in 2017 [26]. The spectral libraries enable unambiguous dereplication (by matching spectral features of the unknown compound(s) to curated spectral databases of reference compounds, i.e., identification of “known unknowns”) [19], variable dereplication (approximate matches to spectra of related molecules), and the identification of spectra in molecular networks. Importantly, GNPS allows for the community-driven, iterative re-annotation of reference MS/MS spectra in a wiki-like fashion, and therefore it will contribute to library improvements and eventual convergence of all curated MS/MS spectra. The visualization of molecular networks in GNPS represents each spectrum as a node, and spectrum-to-spectrum alignments as edges (connections) between nodes.
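The node-and-edge representation of a molecular network can be sketched in a few lines of Python. This is only a toy illustration, not the GNPS implementation: it bins peaks to unit m/z, scores pairs of spectra with a plain cosine similarity, and keeps an edge when the score exceeds a threshold. The spectra, the bin width, the 0.7 cutoff, and all function names are assumptions chosen for the example; production tools use more elaborate spectral alignment.

```python
import math
from itertools import combinations


def bin_spectrum(peaks, bin_width=1.0):
    """Map (m/z, intensity) peaks to {bin index: summed intensity}."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine(a, b):
    """Cosine similarity between two binned spectra stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_network(spectra, min_cosine=0.7):
    """Each spectrum becomes a node; pairs scoring >= min_cosine become edges."""
    binned = {name: bin_spectrum(peaks) for name, peaks in spectra.items()}
    edges = []
    for u, v in combinations(sorted(binned), 2):
        score = cosine(binned[u], binned[v])
        if score >= min_cosine:
            edges.append((u, v, round(score, 3)))
    return {"nodes": sorted(binned), "edges": edges}


# Hypothetical MS/MS spectra as lists of (m/z, intensity) pairs.
spectra = {
    "s1": [(100.1, 50.0), (150.2, 100.0), (200.3, 30.0)],
    "s2": [(100.2, 45.0), (150.1, 100.0), (200.4, 35.0)],
    "s3": [(80.0, 100.0), (120.5, 60.0)],
}

network = build_network(spectra)

if __name__ == "__main__":
    print(network)
```

In this toy data set, s1 and s2 share all their binned fragments and end up connected, whereas s3 remains an isolated node.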

2. Linking Chemical Diversity of Secondary Metabolites to Biosynthetic Gene Clusters

Secondary metabolites can be considered genetically encoded small molecules that play a variety of roles in cell biology and therefore have the potential to become chemical probes or drug leads. Their identification and characterization can benefit from a growing number of databases and genomics-based computational tools that have been compiled and hyperlinked at the Secondary Metabolite Bioinformatics Portal (SMBP, http://www.secondarymetabolites.org/, accessed on 20 December 2022) [27]. Inherent limitations related to their low production and difficult detection, and also high rediscovery rates, can be addressed, at least in part, by searching for biosynthetic gene clusters (BGCs) in genomic data and unveiling their (sometimes cryptic) metabolic potential [28]. However, the highly repetitive nature of the associated genes creates major challenges for accurate sequence assembly and analysis, hence the need for new bioinformatic tools. An example is the Natural Product Domain Seeker (NaPDoS) web service (https://npdomainseeker.sdsc.edu/napdos2/, accessed on 20 December 2022), which provides an automated method to assess the secondary metabolite biosynthetic gene diversity and novelty of strains or environments. NaPDoS analyses are based on the phylogenetic relationships of sequence tags derived from genes encoding polyketide synthases (PKS) and nonribosomal peptide synthetases (NRPS). The sequence tags correspond to PKS-derived ketosynthase (KS) domains and NRPS-derived condensation (C) domains and are compared to an internal database of experimentally characterized biosynthetic genes, so that genes associated with uncharacterized biochemistry can be identified [29]. The latest update (NaPDoS2) greatly expands the taxonomic and functional diversity represented in the webtool database and allows larger datasets to be analyzed. Importantly, NaPDoS2 can be used to detect genes involved in the biosynthesis of specific structural classes or new biosynthetic mechanisms, and also to predict biosynthetic potential [30].
The key role of marine microbial symbionts of invertebrates in MNP biosynthesis has been increasingly recognized [31] and “genome mining” (i.e., the exploitation of genomic information for the discovery of biosynthetic pathways) [32] provides unique opportunities for (i) the identification of yet undisclosed specialized metabolites [33] and their chemical variants [32]; (ii) the genetic engineering of BGCs to obtain novel “unnatural” NPs [34]; and (iii) the heterologous expression of secondary metabolic pathways that remain silent or are poorly expressed in the absence of a specific trigger or elicitor [35]. In fact, the results of a variety of genome sequencing projects have unveiled the metabolic diversity of microorganisms (which may be overlooked under standard fermentation and detection conditions) and their tremendous biosynthetic potential. Furthermore, studies on the evolutionary history of BGCs in relation to that of the bacteria harboring them (“comparative genomics”) beautifully illustrate the mechanisms by which chemical diversity is created in nature and how some NPs represent ecotype-defining traits while others appear selectively neutral [36].
Novel algorithms have been devised to systematically identify BGCs in microbial genomic sequences [32][37][38]. A network analysis of the predicted BGCs in Proteobacteria (aka Pseudomonadota, a major phylum of Gram-negative bacteria) has revealed large gene cluster families, and the experimental characterization of the most prominent one uncovered two subfamilies consisting of hundreds of BGCs encoding the biochemical machinery for the synthesis of a series of remarkably conserved lipids with an aryl head group conjugated to a polyene tail (i.e., aryl polyenes) that are likely to play important roles in Gram-negative cell biology [39]. The systematic study of BGCs in Actinobacteria (actinomycetes mainly associated with sponges in marine habitats) is complicated by numerous repetitive motifs. By combining several metrics, a method for the global classification of these gene clusters into families (GCFs) has been developed, and the biosynthetic capacity of the resulting GCF network has been validated in hundreds of strains by correlating confident MS detection of known NPs with the presence or absence of their established BGCs [40].
The Minimum Information about a Biosynthetic Gene cluster (MIBiG, https://mibig.secondarymetabolites.org/, accessed on 20 December 2022) specification is a data standard that facilitates the consistent and systematic deposition and retrieval of metadata on BGCs and their molecular products [41]. MIBiG is a Genomic Standards Consortium project that builds on the Minimum Information about any Sequence (MIxS) framework to (i) identify which genes are responsible for the biosynthesis of which chemical moieties, thus systematically connecting genes and chemistry; (ii) understand the natural genetic diversity of BGCs within their environmental and ecological context; and (iii) develop an evidence-based parts registry for engineering biosynthetic pathways and gene clusters through synthetic biology. The MIBiG standard contains dedicated class-specific checklists for gene clusters encoding pathways to produce alkaloids, saccharides, terpenes, polyketides, NRPs, and RiPPs [42].
Natural antimicrobial peptides (AMPs) have been found not only in marine fish [43][44] but also in marine invertebrates [45][46] as major components of their innate host defense systems. The Antimicrobial Peptide Database (APD, https://aps.unmc.edu/, accessed on 20 December 2022), online since 2003 and last updated in June 2022 [47], defines four unified classes of AMPs on the basis of the polypeptide chain’s connection patterns: (I) linear polypeptide chains (e.g., cathelicidins) [48]; (II) sidechain-linked peptides, such as disulfide-containing defensins and lantibiotics (i.e., lanthionine-containing antibiotics, e.g., microbisporicin, produced by the soil actinomycete Microbispora corallina [49] and mathermycin from the marine actinomycete Marinactinospora thermotolerans [50]); (III) polypeptide chains with side chain to backbone connection (e.g., bacterial lassos and fusaricidins); and (IV) circular peptides with a seamless backbone, i.e., N- and C-termini linked by a peptide bond (e.g., plant cyclotides and animal θ-defensins) [51]. The manually curated Database of Antimicrobial Activity and Structure of Peptides (DBAASP, http://dbaasp.org, accessed on 20 December 2022) provides detailed information (including chemical structure and activity against specific targets) on experimentally tested peptides (both natural and synthetic) that have shown antimicrobial activity as monomers, multimers, or multi-peptides [52]. The Collection of Antimicrobial Peptides (CAMP), CAMPSign, and ClassAMP are open-access resources that have been developed to advance the current understanding of AMPs, from N- and C-terminal modifications and the presence of unusual amino acids to 3D structures, through family-specific signatures that facilitate AMP identification and classification as antibacterial, antifungal, or antiviral [53][54].
Synthetic AMPs are substantially enriched in residues with physicochemical properties known to be critical for antimicrobial activity, such as high α-helical propensity, positive charge, and hydrophobicity.
The Natural Products Atlas [55] was created as an open-access centralized knowledgebase encompassing ~25,000 microbially produced NPs using a combination of manual curation and automated data mining approaches, and was developed as a community-supported resource under findable, accessible, interoperable, and reusable (FAIR) [56] principles. It contains referenced data for molecular structure, source organism, isolation, total synthesis, and instances of structural reassignment for compounds of bacterial, fungal, and cyanobacterial origin. Its associated web interface (https://www.npatlas.org, v. 2.3.0, accessed on 20 December 2022) allows users to search by structure, substructure, and physical properties, as well as to explore the chemical space of these NPs from a variety of perspectives. The NP Atlas is integrated with other NP databases, including the MIBiG repository and the GNPS platform cited above. The NP Atlas was recently updated [57] and currently embodies (i) >32,000 compounds; (ii) a full RESTful (REST is an acronym for REpresentational State Transfer and an architectural style for distributed hypermedia systems) application programming interface (API); (iii) full taxonomic descriptions for all microbial taxa; (iv) integrated data from external resources, including CyanoMetDB (https://www.eawag.ch/en/department/uchem/projects/cyanometdb/, accessed on 20 December 2022), a comprehensive public database of secondary metabolites from cyanobacteria (aka “blue-green algae”) [58]; and (v) chemical ontology terms from both ClassyFire [59] (see below) and NPClassifier (a deep-learning tool for the automated structural classification of NPs from their counted Morgan fingerprints) [60].
Finally, more than seven terabases of metagenomic data from samples collected in epipelagic and mesopelagic water locations across the globe by the Tara Oceans project (https://fondationtaraocean.org/en/foundation/, accessed on 20 December 2022) have been used to generate an ocean microbial reference gene catalog (http://ocean-microbiome.embl.de/companion.html, accessed on 20 December 2022) with >40 million nonredundant sequences from viruses, prokaryotes, and picoeukaryotes. Remarkably, almost three quarters of ocean microbial core functionality is shared with the human gut microbiome, and epipelagic community composition was found to be mostly driven by water temperature rather than geography or any other environmental factor [61]. A more recent analysis of 214 metagenome-assembled genomes (MAGs) recovered from polar seawater microbiomes revealed strains that are prevalent in the polar regions while nearly undetectable in temperate seawater [62].

3. Classification and Chemoinformatic Analyses of Natural Products

The long-established Gene Ontology (GO) resource [63][64] describes knowledge of the “universe” of biology with respect to (i) molecular functions, (ii) cellular locations, and (iii) biological processes of gene products, in terms of a dynamic, controlled vocabulary that can be applied to prokaryotes and eukaryotes, as well as to single and multicellular organisms. In the same vein, a standardized and purely structure-based chemical ontology (ChemOnt) was recently developed to automatically assign over 77 million compounds to a taxonomy consisting of >4800 different categories by means of a computer program named ClassyFire (http://classyfire.wishartlab.com/, accessed on 20 December 2022) that is freely accessible as a web server [59]. This new taxonomy for chemical substances consists of up to 11 different levels (kingdom, superclass, class, subclass, etc.), with each of the categories defined by unambiguous, computable structural rules.
As a follow-on, the Chemical Functional Ontology (ChemFOnt), another FAIR-compliant, web-enabled resource (https://www.chemfont.ca, accessed on 20 December 2022), describes the functions and actions of >341,000 biologically important chemical substances, including primary and secondary metabolites, as well as drugs and NPs. The functional hierarchy within ChemFOnt consists of four functional “aspects” (physiological effect; disposition; process; and role), which are subdivided into twelve functional categories (health effects and organoleptic effects; sources, biological locations, and routes of exposure; environmental, natural, and industrial processes; adverse biological roles, normal biological roles, environmental roles, and industrial applications) and a total of >170,000 functional terms. At the time of publishing, ChemFOnt contained almost four million protein-chemical relationships and more than ten million chemical-functional relationships that can be adopted by other databases and software tools and be of utility not only to general chemists but also to researchers involved in genomics, metagenomics, proteomics, and metabolomics [65].
NPs are the result of nature’s exploration of biologically relevant chemical space through eons of evolutionary time, hence their high diversity regarding atom connectivity and functional groups. Because they cover a broad range of sizes, 3D structures, and physicochemical properties that can be related to drug-likeness (including favorable ADME characteristics), NPs are considered not only as potential drugs, but also as an invaluable source of chemical inspiration for the development of new bioactive small molecules useful in chemical biology and medicinal chemistry research. The structural diversity of drugs was early assessed by making use of shape description methods and grouping the atoms of each drug molecule into ring, linker, framework (or scaffold) [66], and side chain [67]. A methodology that calculated the NP-likeness score—a Bayesian measure of similarity with respect to the structural space covered by NPs—proved capable of efficiently separating NPs from synthetic (i.e., man-made) molecules in a cross-validation experiment [68]. Nevertheless, rule-based procedures applied to the automated assignment of NPs to different classes, such as alkaloids, steroids, and flavonoids, have unveiled database-dependent differences in the coverage of chemical space [69]. Beyond that, several cheminformatics techniques have been used to analyze NPs and decompose them into fragments in the belief that their unique substructural features and chemical properties are likely to be optimized for protein recognition and enzyme inhibition. A recent cheminformatic analysis of the structural and physicochemical properties of NP-based drugs in comparison to top-selling brand-name synthetic drugs revealed that macrocycles occupied distinctive and relatively underpopulated regions of chemical space, while chemical probes largely overlapped with synthetic drugs [70].
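The idea behind a Bayesian NP-likeness measure, mentioned above, can be conveyed with a toy log-odds calculation. The sketch below is written in the spirit of such scores rather than reproducing the published method [68]: each molecule is reduced to a set of fragment labels, fragments are weighted by the logarithm of their relative frequency in NP versus synthetic training sets, and a molecule is scored by averaging the weights of its fragments. The fragment labels, the training sets, the pseudocount smoothing, and the averaging step are all assumptions made for illustration.

```python
import math
from collections import Counter


def fragment_log_odds(np_molecules, synthetic_molecules, pseudocount=1.0):
    """Weight each fragment by log(P(fragment | NP) / P(fragment | synthetic))."""
    np_counts = Counter(f for mol in np_molecules for f in set(mol))
    sm_counts = Counter(f for mol in synthetic_molecules for f in set(mol))
    n_np, n_sm = len(np_molecules), len(synthetic_molecules)
    weights = {}
    for frag in set(np_counts) | set(sm_counts):
        # Smoothed fraction of molecules in each set that contain the fragment.
        p_np = (np_counts[frag] + pseudocount) / (n_np + 2 * pseudocount)
        p_sm = (sm_counts[frag] + pseudocount) / (n_sm + 2 * pseudocount)
        weights[frag] = math.log(p_np / p_sm)
    return weights


def np_likeness(fragments, weights):
    """Average log-odds over the fragments of one molecule that were seen in training."""
    known = [weights[f] for f in set(fragments) if f in weights]
    return sum(known) / len(known) if known else 0.0


# Hypothetical training sets: each molecule is a set of fragment labels.
np_set = [{"lactone", "glycoside"}, {"lactone", "polyene"}, {"glycoside"}]
synthetic_set = [{"sulfonamide", "biaryl"}, {"biaryl"}, {"sulfonamide", "lactone"}]

weights = fragment_log_odds(np_set, synthetic_set)

if __name__ == "__main__":
    print(np_likeness({"lactone", "glycoside"}, weights))
    print(np_likeness({"biaryl", "sulfonamide"}, weights))
```

Positive scores indicate fragments that are over-represented in the NP training set and negative scores the opposite; a real implementation would derive fragments from the chemical structures themselves rather than from hand-assigned labels.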
Ideally, molecular diversity in drug discovery efforts should be focused on what is usually considered drug-like chemical space (aka “drug space”), which may (or may not) fully comply with Lipinski’s “rule of five” [71]. A pioneering initiative to map this space made use of 72 descriptors accounting for size, lipophilicity (calculated log Po/w), polarizability, charge, flexibility (number of nonterminal rotatable bonds), rigidity (total number of rings and rigid bonds), and hydrogen bonding abilities for a set of ~400 compounds encompassing both representative drugs (“core structures”) and a number of “satellite molecules” intentionally placed outside of the drug space (i.e., possessing extreme values in one or several of the desired properties, while containing drug-like chemical fragments). By means of principal component analysis (PCA) and projections to latent structures (PLS) it was possible, after some iterations that involved the inclusion of additional randomly selected active molecules, to extract map coordinates in the form of t-score values and construct a chemical global positioning system (ChemGPS) [72]. The ChemGPS scores were found to describe well the latent structures extracted with PCA from a large set of compounds and appeared to be suitable for comparing multiple libraries and for keeping track of previously explored regions of chemical space. Later work (largely based on cyclooxygenase 1 and/or cyclooxygenase 2 (COX-1/2) inhibition) proposed an expansion of ChemGPS to better cover space for NPs, giving birth to ChemGPS-NP [73], which was further tuned for the improved handling of the chemical diversity encountered in NP research with a view to increasing the probability of hit identification [74]. 
The public ChemGPS-NP Web tool (http://chemgps.bmc.uu.se/, accessed on 20 December 2022) was then developed to allow for the exploration of NPs by navigating in a consistent 8-dimensional global map of structural characteristics built by means of PCA [75].
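Lipinski’s “rule of five”, invoked above as a common yardstick for drug-like space, is simple enough to express directly in code. The sketch below assumes that the four descriptors have already been computed by some other tool; the class name, the function name, and the descriptor values in the examples are hypothetical and serve only to show how a large NP-like macrocycle can fall outside the rule while remaining of pharmacological interest.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float        # molecular weight in Da
    log_p: float             # calculated octanol/water partition coefficient
    h_bond_donors: int       # number of hydrogen bond donors
    h_bond_acceptors: int    # number of hydrogen bond acceptors


def rule_of_five_violations(d: Descriptors) -> list:
    """Return a label for each rule-of-five criterion that the compound violates."""
    checks = [
        ("molecular weight > 500", d.mol_weight > 500),
        ("log P > 5", d.log_p > 5),
        ("H-bond donors > 5", d.h_bond_donors > 5),
        ("H-bond acceptors > 10", d.h_bond_acceptors > 10),
    ]
    return [label for label, violated in checks if violated]


# Illustrative descriptor values (not taken from any database).
small_molecule = Descriptors(mol_weight=180.2, log_p=1.2, h_bond_donors=1, h_bond_acceptors=4)
macrocycle = Descriptors(mol_weight=734.0, log_p=3.1, h_bond_donors=5, h_bond_acceptors=14)

if __name__ == "__main__":
    print(rule_of_five_violations(small_molecule))
    print(rule_of_five_violations(macrocycle))
```

Returning the list of violated criteria, rather than a single pass/fail flag, makes it easier to see why a given NP lies outside conventional drug space.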
Following a different philosophy to chart the known chemical space explored by nature, the structural classification of natural products (SCONP) was devised to accomplish a hierarchical grouping of the scaffolds present in ~170,000 entries from the DNP by establishing parent–child relationships between them and arranging the scaffolds in a tree-like fashion [76]. Some previous processing was necessary that included structure cleansing (i.e., separation from accompanying molecules) and deglycosylation (in the case of glycosides whose active component is the aglycon part). Unfortunately, stereochemistry could not be considered in this early cheminformatic analysis so that the different possible configurations of the NP scaffolds had to be treated as being equivalent. The conversion of the resulting NP scaffolds to SMILES (simplified molecular-input line-entry system) strings [77] allowed for the comparison with those of standard synthetic molecules represented by over 10 million drug-like commercially available samples from the ZINC database [10]. This analysis revealed interesting differences not only between natural and synthetic (i.e., man-made) molecules, but also between scaffolds originating from distinct classes of organisms, i.e., plants, bacteria, and fungi. Visual comparisons of the respective structural features were effectively displayed by plotting the scaffolds according to their frequency distributions [78]. Moreover, a flexible analytics framework named Scaffold Hunter (https://scaffoldhunter.sourceforge.net/, accessed on 20 December 2022) generates and enables the visualization of virtual scaffold trees in bioactive compound collections that easily allow for the identification of new starting points for the design and synthesis of biology-oriented small molecule libraries [79]. 
Interestingly, a recent cluster analysis of chemical fingerprints and molecular scaffolds of >55,000 compounds reportedly isolated from marine and terrestrial microorganisms showed that three quarters of the MNPs are closely related to compounds isolated from their terrestrial counterparts [80].
Historically, the total synthesis of NPs followed by derivative synthesis (“active analogue approach” or “analogue-oriented synthesis” [81][82]) and semisynthetic procedures aimed at modifying the chemical structure of complex fermentation products have enabled a deeper understanding of structure–activity relationships (SAR). In contrast, the de novo combination of NP fragments in unique arrangements, often by virtue of innovative strategies such as “diversity-oriented synthesis” [83][84], “target-oriented and diversity-oriented organic synthesis” [85], and “synthesis-informed design” [86], has been shown to generate focused NP-like libraries containing compounds endowed with bioactivities unrelated to those of the guiding NP(s) [87][88][89]. Examples of successful workflows of pseudo-NP design and development are “biology-oriented synthesis” [90][91] and “pharmacophore-directed retrosynthesis” [92]. In applying the latter approach, a key first step is to elaborate a tentative pharmacophore, i.e., “an ensemble of steric and electronic features that is necessary to ensure the optimal supramolecular interactions with a specific biological target and to trigger (or block) its biological response”, as defined by the International Union of Pure and Applied Chemistry (IUPAC) [93], and then devise a retrosynthetic procedure that ensures that the proposed pharmacophore is present in multiple intermediates of increasing complexity, ultimately leading to the NP. An important goal of these synthetic approaches is to find structurally simplified and optimized derivatives with lower molecular weights that can overcome commonly observed limitations, such as poor oral absorption, short half-life, and low blood–brain barrier permeability.

4. Linking NPs to Their Targets: Computational Methodologies for Building Global Networks

The popular term “druggable genome” [94] refers to the genes (or, more appropriately, gene products) that are known or predicted to interact with drugs, ideally resulting in a therapeutic benefit. Although drugs are intended to be selective (i.e., have high affinity for one single target), many molecules bind to more than one protein, giving rise to polypharmacology and side effects. Because many drug-target combinations are theoretically possible, the computational exploration of possible interactions can help identify potential targets.
Because the systematic identification of drug targets for NPs, regardless of their origin, using a battery of experimental binding or affinity assays, is both costly and time-consuming, a substantial amount of effort has gone into devising in silico tools that allow for the construction of global networks that connect active compounds to their cellular targets. It is expected that, by using these methods, the resulting systems pharmacology infrastructure will help to predict new drug targets for pharmacologically uncharacterized NPs and identify secondary targets (off-targets) that can aid in the rationalization of side effects of known molecules [95]. The Drug-Gene Interaction Database (DGIdb 4.0, https://www.dgidb.org/, accessed on 20 December 2022) provides information on drug-gene interactions and druggable gene products collected from publications, databases, and other web sites [96]. The latest update mostly focused on (i) the integration with crowdsourced efforts (e.g., Wikidata) to facilitate term normalization and with the open-data web platform Drug Target Commons (https://dataverse.harvard.edu/dataverse/dtc2tdc, accessed on 20 December 2022) [97] to enable the upload of community-contributed interaction data; and (ii) export to a Network Data Exchange (NDEx) infrastructure [98] for storing, sharing, and publishing biological network knowledge. The tool named substructure-drug-target network-based inference (SDTNBI) was devised to prioritize potential targets for old drugs (“drug repositioning”), failed drugs, and new chemical entities by bridging the gap between new chemical entities and known drug-target interactions (DTIs) [99]. A later modification (wSDTNBI) [100] uses weighted DTI networks, whose edge weights are correlated with binding affinities, and network-based VS, which does not rely on the receptors’ 3D structures [101].
The publicly available SwissTargetPrediction web server (http://www.swisstargetprediction.ch, accessed on 20 December 2022) [102] also attempts to predict the most likely target(s) (in mice, rats, or humans) for a SMILES-defined input molecule by using a computational method that combines different similarity measures (both in 2D chemical structure and in 3D molecular shape) with known ligands [103]. All of these approaches, together with highly efficient receptor-based ligand docking [104], can be useful to narrow down the number of potential targets, but strict experimental confirmation and validation are needed [105][106].
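The principle behind ligand-based target prediction can be illustrated with a short sketch that ranks targets by the highest 2D similarity between a query molecule and each target's known ligands. It is a simplified stand-in, not the SwissTargetPrediction algorithm: only a Tanimoto coefficient on fingerprint "on bits" is used, no 3D shape term is included, and the bit sets, target names, and function names are hypothetical. In practice, the fingerprints would be generated from SMILES strings with a cheminformatics toolkit.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_targets(query_bits, known_ligands):
    """Rank targets by their most similar known ligand.

    known_ligands: iterable of (target_name, ligand_bits) pairs.
    Returns (target_name, best_similarity) pairs, best first.
    """
    best = {}
    for target, ligand_bits in known_ligands:
        similarity = tanimoto(query_bits, ligand_bits)
        if similarity > best.get(target, -1.0):
            best[target] = similarity
    return sorted(best.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical fingerprints (sets of on-bit indices) for annotated ligands.
annotated = [
    ("kinase_X", {1, 2, 3, 4}),
    ("kinase_X", {1, 2, 7}),
    ("protease_Y", {5, 6, 7, 8}),
    ("esterase_Z", {9, 10}),
]
for target, score in rank_targets({1, 2, 3, 5}, annotated):
    print(f"{target}: {score:.2f}")
```

Taking the maximum similarity per target is only one possible aggregation; the resulting ranking is a prioritization aid, not evidence of binding.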
Attention was initially drawn [107] to certain synthetic molecules that were responsible for disproportionate percentages of hits in enzyme-based bioassays but, on closer inspection, turned out to be false actives and therefore nonprogressible hits. This observation led to the PAINS acronym (Pan Assay INterference compoundS) [108], and the concept was later extended to NPs [109]. As a result, some NPs have been designated as “invalid metabolic panaceas” and the concept of “residual complexity” (http://go.uic.edu/residualcomplexity, accessed on 20 December 2022) has emerged [110]. Nowadays, compounds with a PAINS chemotype can be recognized and excluded from bioassays by the judicious use of electronic substructure filters [111] and machine learning approaches [112] (e.g., Hit Dexter, https://nerdd.univie.ac.at/hitdexter3/, accessed on 20 December 2022).
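The notion of a compound that accounts for a disproportionate share of hits can be sketched as a simple hit-rate calculation over a panel of assays. This is a crude heuristic shown only for illustration: it is neither a PAINS substructure filter nor the machine learning model behind Hit Dexter, and the function name, the threshold of 0.5, the minimum of four assays, and the compound labels are all arbitrary assumptions.

```python
def flag_frequent_hitters(assay_results, threshold=0.5, min_assays=4):
    """Return compounds active in at least `threshold` of their assays.

    assay_results: dict mapping compound ID -> list of booleans
                   (True = scored as active in that assay).
    Compounds tested in fewer than `min_assays` assays are not assessed,
    because a hit rate from very few assays is not informative.
    """
    flagged = []
    for compound, outcomes in assay_results.items():
        if len(outcomes) < min_assays:
            continue
        hit_rate = sum(outcomes) / len(outcomes)
        if hit_rate >= threshold:
            flagged.append(compound)
    return sorted(flagged)


# Hypothetical screening outcomes across unrelated enzyme assays.
panel = {
    "cmpd_1": [True, True, True, False, True],     # active in 4 of 5 assays
    "cmpd_2": [False, False, True, False, False],  # active in 1 of 5 assays
    "cmpd_3": [True, True],                        # too few assays to judge
}
print(flag_frequent_hitters(panel))
```

A flagged compound is not necessarily a false active; a high hit rate is only a prompt for closer inspection with orthogonal assays.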
Because the best link connecting NPs to their targets is arguably the experimentally determined 3D structure of the respective complexes, in the following section, I will provide some examples of MNPs and synthetic analogues that were selected on the basis of chemical novelty and submicromolar inhibition data, preferably supported by structural evidence of complex formation with pharmacologically relevant enzyme targets.

References

  1. Kim, S.; Chen, J.; Cheng, T.; Gindulyte, A.; He, J.; He, S.; Li, Q.; Shoemaker, B.A.; Thiessen, P.A.; Yu, B.; et al. PubChem in 2021: New data content and improved web interfaces. Nucleic Acids Res. 2021, 49, D1388–D1395.
  2. Mendez, D.; Gaulton, A.; Bento, A.P.; Chambers, J.; De Veij, M.; Felix, E.; Magarinos, M.P.; Mosquera, J.F.; Mutowo, P.; Nowotka, M.; et al. ChEMBL: Towards direct deposition of bioassay data. Nucleic Acids Res. 2019, 47, D930–D940.
  3. Pence, H.E.; Williams, A. Chemspider: An online chemical information resource. J. Chem. Educ. 2010, 87, 1123–1124.
  4. Wishart, D.S.; Knox, C.; Guo, A.C.; Shrivastava, S.; Hassanali, M.; Stothard, P.; Chang, Z.; Woolsey, J. DrugBank: A comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 2006, 34, D668–D672.
  5. Wishart, D.S.; Feunang, Y.D.; Guo, A.C.; Lo, E.J.; Marcu, A.; Grant, J.R.; Sajed, T.; Johnson, D.; Li, C.; Sayeeda, Z.; et al. DrugBank 5.0: A major update to the DrugBank database for 2018. Nucleic Acids Res. 2018, 46, D1074–D1082.
  6. Faulkner, D.J. Marine natural products. Nat. Prod. Rep. 2002, 19, 1R–49R.
  7. Blunt, J.W.; Carroll, A.R.; Copp, B.R.; Davis, R.A.; Keyzers, R.A.; Prinsep, M.R. Marine natural products. Nat. Prod. Rep. 2018, 35, 8–53.
  8. Carroll, A.R.; Copp, B.R.; Davis, R.A.; Keyzers, R.A.; Prinsep, M.R. Marine natural products. Nat. Prod. Rep. 2022, 39, 1122–1171.
  9. Blunt, J.; Munro, M.H.G. Dictionary of Marine Natural Products, with CD-ROM; Chapman & Hall/CRC: Boca Raton, FL, USA, 2008.
  10. Sterling, T.; Irwin, J.J. ZINC 15—Ligand discovery for everyone. J. Chem. Inf. Model. 2015, 55, 2324–2337.
  11. Irwin, J.J.; Tang, K.G.; Young, J.; Dandarchuluun, C.; Wong, B.R.; Khurelbaatar, M.; Moroz, Y.S.; Mayfield, J.; Sayle, R.A. ZINC20-A free ultralarge-scale chemical database for ligand discovery. J. Chem. Inf. Model. 2020, 60, 6065–6073.
  12. Sorokina, M.; Steinbeck, C. Review on natural products databases: Where to find data in 2020. J. Cheminform. 2020, 12, 20.
  13. Kern, F.; Fehlmann, T.; Keller, A. On the lifetime of bioinformatics web services. Nucleic Acids Res. 2020, 48, 12523–12533.
  14. Rutz, A.; Sorokina, M.; Galgonek, J.; Mietchen, D.; Willighagen, E.; Gaudry, A.; Graham, J.G.; Stephan, R.; Page, R.; Vondrasek, J.; et al. The LOTUS initiative for open knowledge management in natural products research. Elife 2022, 11, e70780.
  15. Banerjee, P.; Erehman, J.; Gohlke, B.O.; Wilhelm, T.; Preissner, R.; Dunkel, M. Super Natural II--a database of natural products. Nucleic Acids Res. 2015, 43, D935–D939.
  16. Sorokina, M.; Merseburger, P.; Rajan, K.; Yirik, M.A.; Steinbeck, C. COCONUT online: Collection of Open Natural Products database. J. Cheminform. 2021, 13, 2.
  17. Blunt, J.; Munro, M.; Upjohn, M. The Role of Databases in Marine Natural Products Research. In Handbook of Marine Natural Products; Fattorusso, E., Gerwick, W.H., Taglialatela-Scafati, O., Eds.; Springer Science+Business Media B.V.: Dordrecht, The Netherlands, 2012; pp. 389–421.
  18. Shen, S.M.; Appendino, G.; Guo, Y.W. Pitfalls in the structural elucidation of small molecules. A critical analysis of a decade of structural misassignments of marine natural products. Nat. Prod. Rep. 2022, 39, 1803–1832.
  19. Wishart, D.S. Computational strategies for metabolite identification in metabolomics. Bioanalysis 2009, 1, 1579–1596.
  20. Burns, D.C.; Mazzola, E.P.; Reynolds, W.F. The role of computer-assisted structure elucidation (CASE) programs in the structure elucidation of complex natural products. Nat. Prod. Rep. 2019, 36, 919–933.
  21. Urban, S.; Brkljaca, R.; Hoshino, M.; Lee, S.; Fujita, M. Determination of the absolute configuration of the pseudo-symmetric natural product elatenyne by the crystalline sponge method. Angew. Chem. Int. Ed. Engl. 2016, 55, 2678–2682.
  22. Rinschen, M.M.; Ivanisevic, J.; Giera, M.; Siuzdak, G. Identification of bioactive metabolites using activity metabolomics. Nat. Rev. Mol. Cell Biol. 2019, 20, 353–367.
  23. Muchiri, R.N.; van Breemen, R.B. Affinity selection-mass spectrometry for the discovery of pharmacologically active compounds from combinatorial libraries and natural products. J. Mass Spectrom. 2021, 56, e4647.
  24. Benson, D.A.; Cavanaugh, M.; Clark, K.; Karsch-Mizrachi, I.; Lipman, D.J.; Ostell, J.; Sayers, E.W. GenBank. Nucleic Acids Res. 2013, 41, D36–D42.
  25. Magrane, M.; UniProt Consortium. UniProt Knowledgebase: A hub of integrated protein data. Database 2011, 2011, bar009.
  26. Wang, M.; Carver, J.J.; Phelan, V.V.; Sanchez, L.M.; Garg, N.; Peng, Y.; Nguyen, D.D.; Watrous, J.; Kapono, C.A.; Luzzatto-Knaan, T.; et al. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat. Biotechnol. 2016, 34, 828–837.
  27. Weber, T.; Kim, H.U. The secondary metabolite bioinformatics portal: Computational tools to facilitate synthetic biology of secondary metabolite production. Synth. Syst. Biotechnol. 2016, 1, 69–79.
  28. Scherlach, K.; Hertweck, C. Mining and unearthing hidden biosynthetic potential. Nat. Commun. 2021, 12, 3864.
  29. Ziemert, N.; Podell, S.; Penn, K.; Badger, J.H.; Allen, E.; Jensen, P.R. The natural product domain seeker NaPDoS: A phylogeny based bioinformatic tool to classify secondary metabolite gene diversity. PLoS ONE 2012, 7, e34064.
  30. Klau, L.J.; Podell, S.; Creamer, K.E.; Demko, A.M.; Singh, H.W.; Allen, E.E.; Moore, B.S.; Ziemert, N.; Letzel, A.C.; Jensen, P.R. The Natural Product Domain Seeker version 2 (NaPDoS2) webtool relates ketosynthase phylogeny to biosynthetic function. J. Biol. Chem. 2022, 298, 102480.
  31. Albarano, L.; Esposito, R.; Ruocco, N.; Costantini, M. Genome mining as new challenge in natural products discovery. Mar. Drugs 2020, 18, 199.
  32. Medema, M.H.; Fischbach, M.A. Computational approaches to natural product discovery. Nat. Chem. Biol. 2015, 11, 639–648.
  33. Winter, J.M.; Behnken, S.; Hertweck, C. Genomics-inspired discovery of natural products. Curr. Opin. Chem. Biol. 2011, 15, 22–31.
  34. Lane, A.L.; Moore, B.S. A sea of biosynthesis: Marine natural products meet the molecular age. Nat. Prod. Rep. 2011, 28, 411–428.
  35. Bonet, B.; Teufel, R.; Crusemann, M.; Ziemert, N.; Moore, B.S. Direct capture and heterologous expression of Salinispora natural product genes for the biosynthesis of enterocin. J. Nat. Prod. 2015, 78, 539–542.
  36. Jensen, P.R. Natural products and the gene cluster revolution. Trends Microbiol. 2016, 24, 968–977.
  37. Blin, K.; Shaw, S.; Kloosterman, A.M.; Charlop-Powers, Z.; van Wezel, G.P.; Medema, M.H.; Weber, T. antiSMASH 6.0: Improving cluster detection and comparison capabilities. Nucleic Acids Res. 2021, 49, W29–W35.
  38. Medema, M.H. The year 2020 in natural product bioinformatics: An overview of the latest tools and databases. Nat. Prod. Rep. 2021, 38, 301–306.
  39. Cimermancic, P.; Medema, M.H.; Claesen, J.; Kurita, K.; Wieland Brown, L.C.; Mavrommatis, K.; Pati, A.; Godfrey, P.A.; Koehrsen, M.; Clardy, J.; et al. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell 2014, 158, 412–421.
  40. Doroghazi, J.R.; Albright, J.C.; Goering, A.W.; Ju, K.S.; Haines, R.R.; Tchalukov, K.A.; Labeda, D.P.; Kelleher, N.L.; Metcalf, W.W. A roadmap for natural product discovery based on large-scale genomics and metabolomics. Nat. Chem. Biol. 2014, 10, 963–968.
  41. Medema, M.H.; Kottmann, R.; Yilmaz, P.; Cummings, M.; Biggins, J.B.; Blin, K.; de Bruijn, I.; Chooi, Y.H.; Claesen, J.; Coates, R.C.; et al. Minimum information about a biosynthetic gene cluster. Nat. Chem. Biol. 2015, 11, 625–631.
  42. Arnison, P.G.; Bibb, M.J.; Bierbaum, G.; Bowers, A.A.; Bugni, T.S.; Bulaj, G.; Camarero, J.A.; Campopiano, D.J.; Challis, G.L.; Clardy, J.; et al. Ribosomally synthesized and post-translationally modified peptide natural products: Overview and recommendations for a universal nomenclature. Nat. Prod. Rep. 2013, 30, 108–160.
  43. Masso-Silva, J.A.; Diamond, G. Antimicrobial peptides from fish. Pharmaceuticals 2014, 7, 265–310.
  44. Barroso, C.; Carvalho, P.; Goncalves, J.F.M.; Rodrigues, P.N.S.; Neves, J.V. Antimicrobial peptides: Identification of two β-defensins in a teleost fish, the European sea bass (Dicentrarchus labrax). Pharmaceuticals 2021, 14, 566.
  45. Tincu, J.A.; Taylor, S.W. Antimicrobial peptides from marine invertebrates. Antimicrob. Agents Chemother. 2004, 48, 3645–3654.
  46. Sychev, S.V.; Sukhanov, S.V.; Panteleev, P.V.; Shenkarev, Z.O.; Ovchinnikova, T.V. Marine antimicrobial peptide arenicin adopts a monomeric twisted beta-hairpin structure and forms low conductivity pores in zwitterionic lipid bilayers. Pept. Sci. 2017, 110, e23093.
  47. Wang, G.; Li, X.; Wang, Z. APD3: The antimicrobial peptide database as a tool for research and education. Nucleic Acids Res. 2016, 44, D1087–D1093.
  48. Broekman, D.C.; Zenz, A.; Gudmundsdottir, B.K.; Lohner, K.; Maier, V.H.; Gudmundsson, G.H. Functional characterization of codCath, the mature cathelicidin antimicrobial peptide from Atlantic cod (Gadus morhua). Peptides 2011, 32, 2044–2051.
  49. Castiglione, F.; Lazzarini, A.; Carrano, L.; Corti, E.; Ciciliato, I.; Gastaldo, L.; Candiani, P.; Losi, D.; Marinelli, F.; Selva, E.; et al. Determining the structure and mode of action of microbisporicin, a potent lantibiotic active against multiresistant pathogens. Chem. Biol. 2008, 15, 22–31.
  50. Chen, E.; Chen, Q.; Chen, S.; Xu, B.; Ju, J.; Wang, H. Mathermycin, a lantibiotic from the marine actinomycete Marinactinospora thermotolerans SCSIO 00652. Appl. Environ. Microbiol. 2017, 83, e00926-17.
  51. Wang, G. The antimicrobial peptide database provides a platform for decoding the design principles of naturally occurring antimicrobial peptides. Protein Sci. 2020, 29, 8–18.
  52. Pirtskhalava, M.; Amstrong, A.A.; Grigolava, M.; Chubinidze, M.; Alimbarashvili, E.; Vishnepolsky, B.; Gabrielian, A.; Rosenthal, A.; Hurt, D.E.; Tartakovsky, M. DBAASP v3: Database of antimicrobial/cytotoxic activity and structure of peptides as a resource for development of new therapeutics. Nucleic Acids Res. 2021, 49, D288–D297.
  53. Waghu, F.H.; Idicula-Thomas, S. Collection of antimicrobial peptides database and its derivatives: Applications and beyond. Protein Sci. 2020, 29, 36–42.
  54. Gawde, U.; Chakraborty, S.; Waghu, F.H.; Barai, R.S.; Khanderkar, A.; Indraguru, R.; Shirsat, T.; Idicula-Thomas, S. CAMPR4: A database of natural and synthetic antimicrobial peptides. Nucleic Acids Res. 2022, 51, D377–D383.
  55. van Santen, J.A.; Jacob, G.; Singh, A.L.; Aniebok, V.; Balunas, M.J.; Bunsko, D.; Neto, F.C.; Castano-Espriu, L.; Chang, C.; Clark, T.N.; et al. The Natural Products Atlas: An open access knowledge base for microbial natural products discovery. ACS Cent. Sci. 2019, 5, 1824–1833.
  56. Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.W.; da Silva Santos, L.B.; Bourne, P.E.; et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 2016, 3, 160018.
  57. van Santen, J.A.; Poynton, E.F.; Iskakova, D.; McMann, E.; Alsup, T.A.; Clark, T.N.; Fergusson, C.H.; Fewer, D.P.; Hughes, A.H.; McCadden, C.A.; et al. The Natural Products Atlas 2.0: A database of microbially-derived natural products. Nucleic Acids Res. 2022, 50, D1317–D1323.
  58. Jones, M.R.; Pinto, E.; Torres, M.A.; Dorr, F.; Mazur-Marzec, H.; Szubert, K.; Tartaglione, L.; Dell’Aversano, C.; Miles, C.O.; Beach, D.G.; et al. CyanoMetDB, a comprehensive public database of secondary metabolites from cyanobacteria. Water Res. 2021, 196, 117017.
  59. Djoumbou Feunang, Y.; Eisner, R.; Knox, C.; Chepelev, L.; Hastings, J.; Owen, G.; Fahy, E.; Steinbeck, C.; Subramanian, S.; Bolton, E.; et al. ClassyFire: Automated chemical classification with a comprehensive, computable taxonomy. J. Cheminform. 2016, 8, 61.
  60. Kim, H.W.; Wang, M.; Leber, C.A.; Nothias, L.F.; Reher, R.; Kang, K.B.; van der Hooft, J.J.J.; Dorrestein, P.C.; Gerwick, W.H.; Cottrell, G.W. NPClassifier: A deep neural network-based structural classification tool for natural products. J. Nat. Prod. 2021, 84, 2795–2807.
  61. Sunagawa, S.; Coelho, L.P.; Chaffron, S.; Kultima, J.R.; Labadie, K.; Salazar, G.; Djahanschiri, B.; Zeller, G.; Mende, D.R.; Alberti, A.; et al. Ocean plankton. Structure and function of the global ocean microbiome. Science 2015, 348, 1261359.
  62. Cao, S.; Zhang, W.; Ding, W.; Wang, M.; Fan, S.; Yang, B.; McMinn, A.; Wang, M.; Xie, B.B.; Qin, Q.L.; et al. Structure and function of the Arctic and Antarctic marine microbiota as revealed by metagenomics. Microbiome 2020, 8, 47.
  63. Ashburner, M.; Ball, C.A.; Blake, J.A.; Botstein, D.; Butler, H.; Cherry, J.M.; Davis, A.P.; Dolinski, K.; Dwight, S.S.; Eppig, J.T.; et al. Gene ontology: Tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000, 25, 25–29.
  64. Gene Ontology Consortium. The Gene Ontology resource: Enriching a GOld mine. Nucleic Acids Res. 2021, 49, D325–D334.
  65. Wishart, D.S.; Girod, S.; Peters, H.; Oler, E.; Jovel, J.; Budinski, Z.; Milford, R.; Lui, V.W.; Sayeeda, Z.; Mah, R.; et al. ChemFOnt: The chemical functional ontology resource. Nucleic Acids Res. 2022, 51, D1220–D1229.
  66. Bemis, G.W.; Murcko, M.A. The properties of known drugs. 1. Molecular frameworks. J. Med. Chem. 1996, 39, 2887–2893.
  67. Bemis, G.W.; Murcko, M.A. Properties of known drugs. 2. Side chains. J. Med. Chem. 1999, 42, 5095–5099.
  68. Ertl, P.; Roggo, S.; Schuffenhauer, A. Natural product-likeness score and its application for prioritization of compound libraries. J. Chem. Inf. Model. 2008, 48, 68–74.
  69. Chen, Y.; García de Lomana, M.; Friedrich, N.O.; Kirchmair, J. Characterization of the chemical space of known and readily obtainable natural products. J. Chem. Inf. Model. 2018, 58, 1518–1532.
  70. Stone, S.; Newman, D.J.; Colletti, S.L.; Tan, D.S. Cheminformatic analysis of natural product-based drugs and chemical probes. Nat. Prod. Rep. 2022, 39, 20–32.
  71. Zhang, M.Q.; Wilkinson, B. Drug discovery beyond the ‘rule-of-five’. Curr. Opin. Biotechnol. 2007, 18, 478–488.
  72. Oprea, T.I.; Gottfries, J. Chemography: The art of navigating in chemical space. J. Comb. Chem. 2001, 3, 157–166.
  73. Larsson, J.; Gottfries, J.; Bohlin, L.; Backlund, A. Expanding the ChemGPS chemical space with natural products. J. Nat. Prod. 2005, 68, 985–991.
  74. Larsson, J.; Gottfries, J.; Muresan, S.; Backlund, A. ChemGPS-NP: Tuned for navigation in biologically relevant chemical space. J. Nat. Prod. 2007, 70, 789–794.
  75. Rosén, J.; Lövgren, A.; Kogej, T.; Muresan, S.; Gottfries, J.; Backlund, A. ChemGPS-NP(Web): Chemical space navigation online. J. Comput. Aided Mol. Des. 2009, 23, 253–259.
  76. Koch, M.A.; Schuffenhauer, A.; Scheck, M.; Wetzel, S.; Casaulta, M.; Odermatt, A.; Ertl, P.; Waldmann, H. Charting biologically relevant chemical space: A structural classification of natural products (SCONP). Proc. Natl. Acad. Sci. USA 2005, 102, 17272–17277.
  77. Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Model. 1988, 28, 31–36.
  78. Ertl, P.; Schuhmann, T. Cheminformatics analysis of natural product scaffolds: Comparison of scaffolds produced by animals, plants, fungi and bacteria. Mol. Inform. 2020, 39, e2000017.
  79. Schafer, T.; Kriege, N.; Humbeck, L.; Klein, K.; Koch, O.; Mutzel, P. Scaffold Hunter: A comprehensive visual analytics framework for drug discovery. J. Cheminform. 2017, 9, 28.
  80. Voser, T.M.; Campbell, M.D.; Carroll, A.R. How different are marine microbial natural products compared to their terrestrial counterparts? Nat. Prod. Rep. 2022, 39, 7–19.
  81. Seiple, I.B.; Zhang, Z.; Jakubec, P.; Langlois-Mercier, A.; Wright, P.M.; Hog, D.T.; Yabu, K.; Allu, S.R.; Fukuzaki, T.; Carlsen, P.N.; et al. A platform for the discovery of new macrolide antibiotics. Nature 2016, 533, 338–345.
  82. Könst, Z.A.; Szklarski, A.R.; Pellegrino, S.; Michalak, S.E.; Meyer, M.; Zanette, C.; Cencic, R.; Nam, S.; Voora, V.K.; Horne, D.A.; et al. Synthesis facilitates an understanding of the structural basis for translation inhibition by the lissoclimides. Nat. Chem. 2017, 9, 1140–1149.
  83. Tan, D.S.; Foley, M.A.; Shair, M.D.; Schreiber, S.L. Stereoselective synthesis of over two million compounds having structural features both reminiscent of natural products and compatible with miniaturized cell-based assays. J. Am. Chem. Soc. 1998, 120, 8565–8566.
  84. Galloway, W.R.J.D.; Isidro-Llobet, A.; Spring, D.R. Diversity-oriented synthesis as a tool for the discovery of novel biologically active small molecules. Nat. Commun. 2010, 1, 80.
  85. Schreiber, S.L. Target-oriented and diversity-oriented organic synthesis in drug discovery. Science 2000, 287, 1964–1969.
  86. Wender, P.A. Toward the ideal synthesis and molecular function through synthesis-informed design. Nat. Prod. Rep. 2014, 31, 433–440.
  87. Cremosnik, G.S.; Liu, J.; Waldmann, H. Guided by evolution: From biology oriented synthesis to pseudo natural products. Nat. Prod. Rep. 2020, 37, 1497–1510.
  88. Karageorgis, G.; Foley, D.J.; Laraia, L.; Waldmann, H. Principle and design of pseudo-natural products. Nat. Chem. 2020, 12, 227–235.
  89. Karageorgis, G.; Foley, D.J.; Laraia, L.; Brakmann, S.; Waldmann, H. Pseudo natural products-chemical evolution of natural product structure. Angew. Chem. Int. Ed. Engl. 2021, 60, 15705–15723.
  90. Wetzel, S.; Bon, R.S.; Kumar, K.; Waldmann, H. Biology-oriented synthesis. Angew. Chem. Int. Ed. Engl. 2011, 50, 10800–10826.
  91. van Hattum, H.; Waldmann, H. Biology-oriented synthesis: Harnessing the power of evolution. J. Am. Chem. Soc. 2014, 136, 11853–11859.
  92. Abbasov, M.E.; Alvariño, R.; Chaheine, C.M.; Alonso, E.; Sánchez, J.A.; Conner, M.L.; Alfonso, A.; Jaspars, M.; Botana, L.M.; Romo, D. Simplified immunosuppressive and neuroprotective agents based on gracilin A. Nat. Chem. 2019, 11, 342–350.
  93. Wermuth, C.G.; Ganellin, C.R.; Lindberg, P.; Mitscher, L.A. Glossary of terms used in medicinal chemistry (IUPAC Recommendations 1998). Pure Appl. Chem. 1998, 70, 1129–1143.
  94. Hopkins, A.L.; Groom, C.R. The druggable genome. Nat. Rev. Drug Discov. 2002, 1, 727–730.
  95. Fang, J.; Wu, Z.; Cai, C.; Wang, Q.; Tang, Y.; Cheng, F. Quantitative and systems pharmacology. 1. In silico prediction of drug-target interactions of natural products enables new targeted cancer therapy. J. Chem. Inf. Model. 2017, 57, 2657–2671.
  96. Freshour, S.L.; Kiwala, S.; Cotto, K.C.; Coffman, A.C.; McMichael, J.F.; Song, J.J.; Griffith, M.; Griffith, O.L.; Wagner, A.H. Integration of the Drug-Gene Interaction Database (DGIdb 4.0) with open crowdsource efforts. Nucleic Acids Res. 2021, 49, D1144–D1151.
  97. Tang, J.; Tanoli, Z.U.; Ravikumar, B.; Alam, Z.; Rebane, A.; Vaha-Koskela, M.; Peddinti, G.; van Adrichem, A.J.; Wakkinen, J.; Jaiswal, A.; et al. Drug Target Commons: A community effort to build a consensus knowledge base for drug-target interactions. Cell Chem. Biol. 2018, 25, 224–229.e2.
  98. Pillich, R.T.; Chen, J.; Churas, C.; Liu, S.; Ono, K.; Otasek, D.; Pratt, D. NDEx: Accessing network models and streamlining network biology workflows. Curr. Protoc. 2021, 1, e258.
  99. Wu, Z.; Cheng, F.; Li, J.; Li, W.; Liu, G.; Tang, Y. SDTNBI: An integrated network and chemoinformatics tool for systematic prediction of drug-target interactions and drug repositioning. Brief. Bioinform. 2017, 18, 333–347.
  100. Wu, Z.; Ma, H.; Liu, Z.; Zheng, L.; Yu, Z.; Cao, S.; Fang, W.; Wu, L.; Li, W.; Liu, G.; et al. wSDTNBI: A novel network-based inference method for virtual screening. Chem. Sci. 2022, 13, 1060–1079.
  101. Wu, Z.; Li, W.; Liu, G.; Tang, Y. Network-based methods for prediction of drug-target interactions. Front. Pharmacol. 2018, 9, 1134.
  102. Gfeller, D.; Michielin, O.; Zoete, V. Shaping the interaction landscape of bioactive molecules. Bioinformatics 2013, 29, 3073–3079.
  103. Gfeller, D.; Grosdidier, A.; Wirth, M.; Daina, A.; Michielin, O.; Zoete, V. SwissTargetPrediction: A web server for target prediction of bioactive small molecules. Nucleic Acids Res. 2014, 42, W32–W38.
  104. Gentile, F.; Yaacoub, J.C.; Gleave, J.; Fernandez, M.; Ton, A.T.; Ban, F.; Stern, A.; Cherkasov, A. Artificial intelligence-enabled virtual screening of ultra-large chemical libraries with deep docking. Nat. Protoc. 2022, 17, 672–697.
  105. Keiser, M.J.; Setola, V.; Irwin, J.J.; Laggner, C.; Abbas, A.I.; Hufeisen, S.J.; Jensen, N.H.; Kuijer, M.B.; Matos, R.C.; Tran, T.B.; et al. Predicting new molecular targets for known drugs. Nature 2009, 462, 175–181.
  106. Lounkine, E.; Keiser, M.J.; Whitebread, S.; Mikhailov, D.; Hamon, J.; Jenkins, J.L.; Lavan, P.; Weber, E.; Doak, A.K.; Cote, S.; et al. Large-scale prediction and testing of drug activity on side-effect targets. Nature 2012, 486, 361–367.
  107. McGovern, S.L.; Caselli, E.; Grigorieff, N.; Shoichet, B.K. A common mechanism underlying promiscuous inhibitors from virtual and high-throughput screening. J. Med. Chem. 2002, 45, 1712–1722.
  108. Baell, J.B.; Holloway, G.A. New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. J. Med. Chem. 2010, 53, 2719–2740.
  109. Baell, J.B. Feeling Nature’s PAINS: Natural products, natural product drugs, and pan assay interference compounds (PAINS). J. Nat. Prod. 2016, 79, 616–628.
  110. Bisson, J.; McAlpine, J.B.; Friesen, J.B.; Chen, S.N.; Graham, J.; Pauli, G.F. Can invalid bioactives undermine natural product-based drug discovery? J. Med. Chem. 2016, 59, 1671–1690.
  111. Baell, J.B.; Nissink, J.W.M. Seven year itch: Pan-assay interference compounds (PAINS) in 2017-utility and limitations. ACS Chem. Biol. 2018, 13, 36–44.
  112. Stork, C.; Chen, Y.; Sicho, M.; Kirchmair, J. Hit Dexter 2.0: Machine-learning models for the prediction of frequent hitters. J. Chem. Inf. Model. 2019, 59, 1030–1043.
Update Date: 23 Feb 2023