Spinal muscular atrophy (SMA), one of the leading inherited causes of child mortality, is a rare neuromuscular disease arising from loss-of-function mutations of the survival motor neuron 1 (SMN1) gene, which encodes the SMN protein. When the SMN protein is lacking in neurons, patients suffer from muscle weakness and atrophy and, in severe cases, respiratory failure and death. Several therapeutic approaches have shown promise in human testing, and three medications have been approved by the U.S. Food and Drug Administration (FDA) to date. Despite their promise, these approved therapies have some crucial limitations, one of the most important being cost. The FDA-approved drugs rank among the most expensive treatments in the world; their price remains far from affordable and may place a heavy burden on patients. The rapid growth of biomedical data and the advancement of computational approaches have opened new possibilities for SMA therapeutic development.
Despite the discovery of promising therapeutic strategies, limitations remain, including treatment viability (in the case of nusinersen), long-term effects, side effects and cost, among others. Because the drug needs to get past the blood–brain barrier (BBB), nusinersen must be administered locally through an intrathecal injection. This route of administration is challenging and requires skilled personnel and specialized techniques, such as image guidance, particularly for patients with scoliosis and/or spinal deformity [56]. Moreover, the high cost of nusinersen (~USD 125,000 per injection), together with screening and subsequent treatment (~USD 750,000 in the first year and ~USD 375,000 annually in subsequent years), places it among the most expensive drugs [6][57]. The more recently approved gene therapy, onasemnogene abeparvovec, costs ~USD 2.125 million per injection, although only a single treatment is required for each SMA type I patient [58], while the cost of risdiplam (the most recent FDA-approved drug) is as yet unknown. Additionally, as all are relatively new therapies, there are no longitudinal studies of long-term effects, although there is a plethora of studies on side effects. Therefore, a more cost-effective drug with an alternative route of administration is needed for this devastating disease.
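To make the cost comparison above concrete, the following minimal sketch works through the arithmetic using only the approximate prices quoted in the text. It assumes, purely for illustration, that prices stay constant, that no discounting is applied and that only the drug cost is counted; the function names are hypothetical.

```python
# Approximate list prices quoted in the text (USD).
NUSINERSEN_PER_INJECTION = 125_000
NUSINERSEN_FIRST_YEAR = 750_000
NUSINERSEN_LATER_YEAR = 375_000
ONASEMNOGENE_ONE_TIME = 2_125_000


def nusinersen_cumulative_cost(years: int) -> int:
    """Cumulative nusinersen cost after `years` full years of treatment.

    Assumes constant prices and no discounting (illustrative only).
    """
    if years <= 0:
        return 0
    return NUSINERSEN_FIRST_YEAR + NUSINERSEN_LATER_YEAR * (years - 1)


def break_even_year(one_time_cost: int) -> int:
    """First year in which cumulative nusinersen cost reaches `one_time_cost`."""
    year = 1
    while nusinersen_cumulative_cost(year) < one_time_cost:
        year += 1
    return year


# Number of injections per year implied by the quoted prices.
first_year_injections = NUSINERSEN_FIRST_YEAR // NUSINERSEN_PER_INJECTION
later_year_injections = NUSINERSEN_LATER_YEAR // NUSINERSEN_PER_INJECTION

print("Injections implied in year 1:", first_year_injections)
print("Injections implied per later year:", later_year_injections)
print("Cumulative cost after 5 years:", nusinersen_cumulative_cost(5))
print("Break-even year vs. one-time gene therapy:",
      break_even_year(ONASEMNOGENE_ONE_TIME))
```

Under these simplifying assumptions, the quoted prices imply six injections in the first year and three per year thereafter, and the cumulative cost of nusinersen (~USD 2.25 million) overtakes the one-time cost of onasemnogene abeparvovec in the fifth year of treatment.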
Drug repurposing, also known as drug repositioning, is an emerging approach to reducing the cost and time required to develop an efficacious treatment [60][63]. It is defined as the process of identifying new therapeutic indications for an approved drug. Recently, encouraged by fast-track marketing authorization procedures (FDA approvals), this approach has been widely used for rare diseases [63], including SMA [64], because it offers several benefits over the classical de novo drug development process. Approved drug compounds have, in essence, already passed safety evaluation, allowing Phase I clinical trials to be omitted [64][65].
Several studies have successfully repurposed FDA-approved drugs for SMA treatment and demonstrated plausible in vitro activities, such as enhancing SMN2 promoter activity, modulating SMN2 splicing and stabilizing SMN2 mRNA or the SMN protein [51][64][66]. Histone deacetylase (HDAC) inhibitors, including sodium butyrate, phenylbutyrate and valproic acid (VPA), among others, have been explored for their effect on SMN2 promoter activity [66][67][68][69][70]. SMN-independent drugs, in essence, centre on neuroprotective and muscle-enhancing approaches. Given the localization of the SMN protein in neuronal cells, neuroprotective drugs developed for other CNS diseases could be good candidates to reposition for preventing and/or delaying motor neuron death in SMA. Approved neuroprotective drugs, such as riluzole, hydroxyurea and rasagiline, which modulate regulatory pathways in the CNS, may be an option for SMA therapy [51][64][66].
Given the potential of drug repurposing, an in silico approach that combines publicly available databases with computational methods may save time and cost in the drug discovery process by narrowing candidates down to the top hits through in silico validation [71]. Public repositories of relevant experimental and biological data, including chemical structures, gene expression, drug–disease associations, phenotypic traits, side effects and more, are treasure troves for in silico drug repurposing. Owing to the wealth of multi-omics data, different methods have been adopted in drug repurposing, which can be divided into two major categories: (i) drug-oriented and (ii) disease/therapy-oriented [72].
Drug-oriented repurposing strategies build on cheminformatics and bioinformatics knowledge, including drug information, the chemical structures of drugs and targets, drug-target networks, signalling or metabolic pathways and genomic information. The disease-oriented approach is applicable only if information on the disease model is available; it is commonly used to study how pharmacological characteristics contribute to the repositioning effort for a particular disease. The growth of drug repurposing resources and advances in the computational sciences have given rise to novel algorithms, tools and approaches capable of capitalizing on publicly available data.
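As an illustration of how a drug-oriented strategy can exploit chemical structure information, the sketch below ranks a small library of approved compounds by structural similarity to a reference compound. The Tanimoto coefficient is used here as one widely used similarity measure; it is our choice for illustration and is not named in the source. The fingerprints (sets of feature indices) and the compound names are hypothetical placeholders, not real drug data.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modelled as the set of structural-feature indices that
# are "on" for a compound (hypothetical values, for illustration only).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared features / features present in either."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: Fingerprint, library: Dict[str, Fingerprint]
) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs sorted from most to least similar."""
    scores = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical reference compound with known activity of interest.
reference: Fingerprint = frozenset({1, 4, 7, 9, 12})

# Hypothetical library of approved drugs to screen for repurposing.
library: Dict[str, Fingerprint] = {
    "approved_drug_A": frozenset({1, 4, 7, 9, 13}),
    "approved_drug_B": frozenset({2, 3, 5}),
    "approved_drug_C": frozenset({1, 4, 8, 12}),
}

ranking = rank_by_similarity(reference, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice, fingerprints would be computed from real chemical structures with a cheminformatics toolkit, and structural similarity would be only one of several lines of evidence alongside drug-target networks and pathway data.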
Network biology views the cell as a collection of molecules interacting with one another and aims to explain how cellular phenotypes emerge from the network of molecular interactions [73]. Networks can thus be regarded as the mechanistic bridge between the constituent molecules of a cell and the phenotypes that the cell displays. From this perspective, the cellular mechanism of a disease arises from networks of pathological interactions that occur only in the disease state. Drug discovery can hence be perceived as the search for agents that significantly disrupt these pathological networks. NDD, as a whole, aims to identify signatures of molecular perturbation, that is, sets of multiple proteins whose perturbation significantly disturbs the structural integrity of the cellular networks that give rise to the targeted disease mechanism [74]. The search space of therapeutics, such as small molecules, biologics or other agents, can then be screened and narrowed down according to their ability to produce the identified perturbation signature. Compounds identified in this way are not expected to bind directly to all proteins within the signature, but rather to produce a downstream, functional effect on the molecules that make up the signature [75]. This approach differs markedly from traditional target-driven drug discovery, which focuses on specific drug targets whose downstream effects are expected to perturb the disease phenotype, with little emphasis on cellular networks for understanding the underlying disease mechanisms.
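The idea of scoring a perturbation signature by its effect on network integrity can be sketched as follows. This is a minimal illustration, not the method of [74]: it assumes that "structural integrity" is approximated by the size of the largest connected component, and it uses a small toy network with generic node labels rather than real protein interaction data.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected adjacency map from a list of edges."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def largest_component_size(graph: Graph, removed: Iterable[str] = ()) -> int:
    """Size of the largest connected component after removing some nodes."""
    removed_set = set(removed)
    seen: Set[str] = set()
    best = 0
    for start in graph:
        if start in removed_set or start in seen:
            continue
        # Breadth-first search over the nodes that remain.
        queue = deque([start])
        seen.add(start)
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in graph[node]:
                if neighbour not in removed_set and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def perturbation_impact(graph: Graph, signature: Iterable[str]) -> float:
    """Fractional loss of the largest component when `signature` is removed."""
    baseline = largest_component_size(graph)
    if baseline == 0:
        return 0.0
    return 1.0 - largest_component_size(graph, signature) / baseline


# Toy network (hypothetical): hub H linked to A, B, C, plus a chain C-D-E.
toy = build_graph([("H", "A"), ("H", "B"), ("H", "C"),
                   ("A", "B"), ("C", "D"), ("D", "E")])

for signature in [{"E"}, {"H"}, {"H", "D"}]:
    print(sorted(signature), round(perturbation_impact(toy, signature), 3))
```

In this toy network, removing the peripheral node E barely changes the network, whereas removing the hub H, or H together with D, fragments it much more. Candidate signatures could be compared in this way before asking which agents are able to produce them, although real analyses would use richer integrity measures and curated interaction data.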
In contrast to the canonical SMN-independent treatments based on disease-modifying pathways, the NDD approach may reveal potential drug targets on the periphery of these pathways. A network analysis using the two main proteins, SMN1 and SMN2, as input to GeneMANIA (https://genemania.org/) [76] generated a network of putative interacting proteins that work in unison to bring about the phenotypes seen in SMA (Figure 4). Proteins such as the GEMINs [77], SNRPB [78], DDX20 [79] and PFN2 [80] appear to be closely associated with the functioning of SMN1 and SMN2. These proteins are essential for SMN to form macromolecular complexes (e.g., SMN-GEMINs, SMN-snRNPs) that chaperone the assembly of small nuclear ribonucleoproteins (snRNPs), which are vital to the pre-mRNA splicing that produces the final SMN1 and SMN2 proteins [77]. Modulating these proteins within the cellular network in the context of SMA may offer an opportunity to develop novel therapeutics that complement conventional SMN-dependent treatments, addressing the challenge of creating a robust and sustainable way of curing SMA.
Figure 4. Protein network based on the two main proteins, SMN1 and SMN2, and their respective interactions with other proteins related to SMA, generated using GeneMANIA [76] (https://genemania.org/). The most prevalent network relationship among the proteins reported in the literature is physical interaction (pink), at 67.64%, as shown visually by the line thickness, while the least prevalent is shared protein domains, at 0.59%.
With advances in network biology, the rapid growth of publicly available biomedical data and increasingly advanced computational analytics, the NDD approach, a mechanism-based approach, offers an alternative way to identify novel targets for potential SMN-independent treatments.
Driven by big data in the biomedical and/or healthcare fields, advances in algorithms and technology, such as deep learning (DL), graphics processing units (GPUs) and Google’s tensor processing units (TPUs), enable better predictive capability while shortening computing time [81][82]. To date, AI has been extensively adopted to support healthcare services and research. Virtual screening [83], quantitative structure-activity relationship (QSAR) modelling [84], de novo drug design [85][86], drug repurposing [87] and chemical space visualization [88] have used machine learning (ML) extensively to close gaps in the conventional methods of drug discovery, while DL shows promise in proposing potent drug candidates based on their properties and toxicity risks [9]. Uptake by the pharmaceutical industry still lags, especially for rare diseases. Given the breadth of AID, we summarize the pipeline and its prerequisites (Figure 5).
Figure 5. Machine learning applications in the drug discovery pipeline. Pioneering ML research has brought unprecedented advances across various stages of the traditional drug development pipeline, especially through automation of the early drug discovery steps of target identification and validation and of compound screening and lead discovery. Natural language processing (NLP) is used to find prospective drug targets by scanning thousands of relevant publications for contextual information, and AI is integrated with synthesis robots to explore unknown reaction space in search of drug candidates, with multiple chemical experiments conducted automatically in real time to assess the reproducibility of chemical reactions and discover new reaction outcomes. AI in preclinical development has been a game-changer for patient selection in Phase II and III clinical trials: by identifying and predicting human-relevant biomarkers of disease, it helps the designated patients avoid unnecessary toxicities and side effects from the experimental drugs [89].
A closer inspection of AI techniques for accelerating drug discovery shows that several common machine learning methods are employed to address challenges in two major areas of drug development: (i) design, discovery and preclinical research; and (ii) clinical research and safety monitoring.
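As a minimal illustration of a method from the first area, the sketch below fits a one-descriptor QSAR model by ordinary least squares and uses it to predict the activity of an untested compound. The linear model, the single descriptor and all numerical values are hypothetical choices made for illustration; real QSAR models typically use many descriptors and more flexible learners.

```python
from typing import Sequence, Tuple

Model = Tuple[float, float]  # (intercept, slope)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Ordinary least-squares fit of y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (descriptor, activity) pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model: Model, x: float) -> float:
    """Predicted activity for a compound with descriptor value x."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical training data: one molecular descriptor per compound and a
# measured activity value (arbitrary units, not real measurements).
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [2.0, 4.1, 5.9, 8.0]

model = fit_line(descriptor, activity)
print("intercept, slope:", model)
print("predicted activity at descriptor 5.0:", predict(model, 5.0))
```

The same fit-then-predict pattern underlies virtual screening with more elaborate models: a model trained on compounds with known activity is used to prioritize untested compounds for experimental follow-up.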
Finding a successful, novel drug is a daunting and arduous process even for common diseases, and it is more challenging still for a rare genetic neurological disorder such as SMA. Many pharmaceutical research and development companies and research institutions are hesitant to pursue drug development for rare diseases because of the small market size, high cost, possibly low return and lack of information about the disease, drugs and corresponding drug targets. Recently, CADD approaches have shown promise in facilitating the drug discovery process and may be able to overcome the bottlenecks of their traditional counterparts. With advances in computational biology and informatics databases, the opportunities provided by drug repurposing should not be underestimated. The interaction between a drug and its target is a critical point in drug discovery. This information helps to establish correlations between diseases and targets in order to determine the therapeutic effect of drugs on various diseases. Hence, well-known drug–disease relationships established using network biology will help accelerate the target identification and lead optimization processes of preclinical drug development. Integrated with domain-specific AI applied to ‘chemical big data’, this novel approach could potentially increase the efficiency of certain aspects of the drug discovery process. Despite the promising potential of CADD, several challenges must be overcome to capitalize on its benefits in advancing drug research and development, including access to databases containing all approved drugs and their detailed profiles and in-depth knowledge of the disease, particularly for multifaceted diseases, among others.
This entry is adapted from the peer-reviewed paper 10.3390/ijms22168962