Alternative splicing (AS) is a critical post-transcriptional regulatory mechanism used by more than 95% of transcribed human genes and responsible for structural transcript variation and proteome diversity.
In higher eukaryotes, the primary gene transcripts, also called precursor messenger RNAs (pre-mRNAs), undergo a finely tuned post-transcriptional regulatory process that removes the non-coding regions (introns) and splices together the coding sequences (exons), thus generating the mature mRNAs. This mechanism is designated pre-mRNA splicing and is a critical step in gene expression. In addition, it is well known that the splicing patterns of a gene vary widely as a result of alternative splicing (AS), which differentially retains or excludes certain exons from the pre-mRNA transcript. Consequently, various combinations of exons from a single gene can produce a diversity of mRNA variants, which underlies structural transcript variation and proteome diversity [1] and can generate different protein isoforms with related, distinct, or even opposing functions [2,3]. Remarkably, AS is a widespread event affecting more than 95% of transcribed human genes, as suggested by whole-transcriptome sequencing projects [2,4]. This complex and tightly regulated mechanism is shared across different tissues and developmental stages, and is frequently dysregulated in various human diseases, including cancer [5]. This dysregulation has been verified in various types of cancer through the detection, by high-throughput sequencing techniques, of aberrant splicing patterns in tumor tissues compared with their normal counterparts [6,7,8,9]. Additionally, accumulating evidence supports that the aberrant splicing profiles found in cancer contribute to neoplastic transformation, cancer progression, and therapy resistance [10,11]. Therefore, it is of utmost relevance to identify pathological splicing isoforms for the development of new effective biomarkers, as well as to clarify the mechanisms behind aberrant AS, thereby elucidating its impact on cancer and providing novel therapeutic strategies.
Pre-mRNA splicing is a multistep process orchestrated by the spliceosome, a large RNA/protein complex comprising five small nuclear ribonucleoproteins (snRNPs; U1, U2, U4, U5, and U6) and numerous associated proteins [12,13]. Briefly, the reaction initiates with the assembly of an initial spliceosome complex through recognition of critical consensus splice sites in the pre-mRNA transcript, as schematically represented in Figure 1A. This stepwise process begins with the recruitment of U1 snRNP to the 5′ splice site. Then, splicing factor 1 (SF1), U2 snRNP auxiliary factor (U2AF) 2, and U2AF1 recognize the branch point site (BPS), the polypyrimidine (poly-Y) tract, and the AG dinucleotide of the 3′ splice site region, respectively. The occupancy of these three consensus sequences induces the association of U2 snRNP with the BPS, which is further stabilized by the U2 snRNP component SF3B1. Consequently, intronic recognition prompts the engagement of the U4/U6/U5 tri-snRNP with the complex and the subsequent formation of a catalytically inactive complex. This leads to several conformational and compositional rearrangements of spliceosomal components, including the dissociation of the U1 and U4 snRNPs, which in turn promotes the formation of the activated spliceosome that catalyzes the splicing reaction [14].

Transcripts from nearly all protein-coding genes undergo one or more types of AS, giving rise to different mRNAs that differ in transcript degradation or are translated into alternative protein isoforms in a cell type-, organ-, or tissue-specific manner [2,4,15]. In higher eukaryotes, among the currently known AS events represented in Figure 1B, the most common is exon skipping [16], which accounts for approximately 40% of all AS events and in which a cassette exon is removed from the pre-mRNA together with its flanking introns. Besides this, switching between alternative 5′ and 3′ splice site positions, mutually exclusive splicing of adjacent exons, and differential retention of introns are also important variations of AS (Figure 1B). Other types of AS events include the use of alternative transcription start sites and alternative polyadenylation.
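As a minimal illustration of exon skipping, the following Python sketch assembles two mature transcripts from a toy pre-mRNA, one that includes a cassette exon and one that skips it. The sequences, the segment names, and the `splice` helper are hypothetical and were chosen only for illustration; they do not correspond to any real gene or to a method used in the cited studies.

```python
# Toy model of exon skipping. All sequences and names are hypothetical.

# A pre-mRNA as an ordered list of (name, kind, sequence) segments.
# The toy introns start with GU and end with AG, like canonical introns.
PRE_MRNA = [
    ("exon1", "exon", "AUGGCU"),
    ("intron1", "intron", "GUAAGUUUCAG"),
    ("exon2", "exon", "GAUCCA"),  # cassette exon
    ("intron2", "intron", "GUGAGUCCUAG"),
    ("exon3", "exon", "UUCUAA"),
]


def splice(pre_mrna, skipped=()):
    """Join exons in order, dropping introns and any exon named in `skipped`."""
    return "".join(
        seq
        for name, kind, seq in pre_mrna
        if kind == "exon" and name not in skipped
    )


# Inclusion isoform: every exon is retained.
inclusion_isoform = splice(PRE_MRNA)
# Skipping isoform: the cassette exon is removed with its flanking introns.
skipping_isoform = splice(PRE_MRNA, skipped={"exon2"})

print(inclusion_isoform)
print(skipping_isoform)
```

The same pre-mRNA therefore yields two distinct mRNAs, which is the sense in which a single gene can encode several transcript variants.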
Figure 1. Regulation of pre-mRNA splicing. (A) Stepwise assembly of the spliceosome on the pre-mRNA and catalysis of the splicing reaction to generate mature spliced mRNA. (B) Schematic representation of the most common AS events. The grey, yellow, red, and blue boxes represent different exons. The solid black and dotted grey lines indicate distinct splicing events. (C) Complex interplay between cis- and trans-acting factors in the regulation of AS. RNA-binding motif (RBM) proteins, serine/arginine-rich (SR) proteins, and heterogeneous nuclear ribonucleoproteins (hnRNPs) bind to exonic or intronic regulatory elements to promote or prevent the recognition of either 3′ or 5′ splice sites (ss) by the snRNPs and splicing factors. The solid and dotted black arrows represent binding stimulation and inhibition, respectively. Abbreviations: ss, splice sites; BPS, branch point site; poly-Y, polypyrimidine tract; pre-mRNA, precursor messenger RNA; snRNP, small nuclear ribonucleoprotein particle; SF1, splicing factor 1; U2AF, U2 snRNP auxiliary factor.
In AS, the regulated step is the recognition of an exon by the spliceosome. Splice site utilization is further regulated by cis-acting splicing-regulatory elements, which either promote or inhibit the use of adjacent splice sites by recruiting trans-acting splicing factors [17]. These elements are classified as exonic or intronic splicing enhancers (ESE/ISE) or silencers (ESS/ISS), depending on their positions and functions (Figure 1C). In general, enhancers are recognized by trans-acting factors belonging to the serine/arginine-rich (SR) protein family to facilitate splice site recognition and exon inclusion [18]. Silencers, on the other hand, usually interact with other types of trans-acting factors, such as heterogeneous nuclear ribonucleoproteins (hnRNPs), to inhibit splice site recognition and promote exon skipping [2]. However, several AS events exist in which SR or hnRNP proteins act as inhibitors or enhancers of splicing, respectively.
Cancer mainly evolves through successive genetic alterations and genomic dysregulation, but is also affected by the tumor microenvironment. These alterations render oncogenes constitutively active and inactivate tumor-suppressor genes. As a result, cancer cells acquire specific abilities during tumor development, including self-sufficiency in growth signals, insensitivity to growth inhibitory signals, evasion of apoptosis, limitless replicative potential, sustained angiogenesis, and tissue invasion and metastasis [19]. These processes can also be dysregulated by AS, which in turn can generate variant proteins with altered physiological function [3]. In particular, a recent systematic study by Kahles et al. reported that AS events are more frequent in cancer tissues than in normal ones, and that many of them are cancer-type specific [20]. Among the factors that can trigger aberrant AS, somatic mutations that disrupt splicing regulatory motifs, as well as mutations or expression changes in components of the core splicing machinery or splicing auxiliary factors, are frequently described [6,7,21,22,23,24].
Aberrant splicing in cancer has been widely linked to mutations creating cis-regulatory motifs that generate novel splice sites, as demonstrated by the discovery of almost 2000 splice site-creating mutations through a robust whole-exome analysis encompassing more than 8000 tumor samples across 33 cancer types [25]. One of the AS events frequently associated with these somatic mutations is intron retention, which mainly affects tumor suppressor genes such as TP53, ARID1A, and PTEN [7]. Importantly, most intron retention events are able to induce frameshifts in the pre-mRNA sequence, resulting in the generation of premature termination codons (PTCs), which in turn leads to the degradation of the transcript through nonsense-mediated mRNA decay (NMD) or to the production of truncated proteins (e.g., dominant negative isoforms or neo-antigens). Interestingly, somatic exonic mutations have also been reported in oncogenes, particularly in ESE and ESS sequences [6], and have been associated with the generation of pro-tumorigenic variants.
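The link between intron retention, frameshifts, and PTCs can be made concrete with a small sketch. The Python snippet below, which uses hypothetical toy sequences and a hypothetical `first_stop_index` helper, scans a transcript codon by codon from its first nucleotide and reports the first in-frame stop codon. It assumes the reading frame starts at position 0 and does not model NMD itself.

```python
# Toy check for a premature termination codon (PTC) after intron retention.
# Sequences and helper names are hypothetical; the frame starts at position 0.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_stop_index(mrna):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for i in range(0, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


exon1 = "AUGGCUGAU"
intron = "GUAUGAUUCAG"  # 11 nt: not a multiple of 3, so retention shifts the frame
exon2 = "CCAUUCUAA"

spliced = exon1 + exon2            # intron correctly removed
retained = exon1 + intron + exon2  # intron retained in the transcript

normal_stop = first_stop_index(spliced)
retained_stop = first_stop_index(retained)

print("stop codon index, spliced transcript:", normal_stop)
print("stop codon index, intron retained:", retained_stop)
print("intron shifts the frame:", len(intron) % 3 != 0)
```

In this toy case the retained intron places a stop codon upstream of the normal one, so the open reading frame ends early. Whether such a transcript is degraded by NMD or translated into a truncated protein depends on factors that this sketch does not capture.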
Recurrent somatic mutations affecting components involved in early spliceosome complex formation have frequently been described in cancer, particularly in hematological malignancies, including myelodysplastic syndromes (MDS), other myeloid neoplasms, and chronic lymphocytic leukemia (CLL) [26,27,28]. The genes most affected by these mutations, which almost always occur in a mutually exclusive manner, are SF3B1 (splicing factor 3b subunit 1), SRSF2 (serine/arginine-rich splicing factor 2), U2AF1 (U2 small nuclear RNA auxiliary factor 1), and ZRSR2 (zinc finger RNA binding motif and serine/arginine rich 2) [26]. SF3B1, a subunit of the U2 snRNP that recognizes the BPS, is the most commonly mutated splicing regulator in numerous cancers, with a prevalence ranging from 5% in breast cancer to 81% in an MDS subtype [29]. Cancer-associated SF3B1 mutations are located within HEAT (Huntingtin, Elongation factor 3, protein phosphatase 2A, Targets of rapamycin 1) domains, which are involved in protein–protein interactions, and are clustered in hotspots, namely K700, E622, R625, H662, and K666. Specifically, they are mainly related to the binding of SF3B1 to cryptic 3′ splice sites, located in regions with shorter and weaker poly-Y tracts, and are consequently linked to aberrant BPS usage [22,30,31]. This abnormal spliceosome assembly gives rise to many mRNAs with a PTC, which are subsequently degraded by NMD.
Although the mechanism by which SF3B1 mutations change 3′ splice site usage is not fully elucidated, it is hypothesized that these mutations alter the interaction of SF3B1 with other spliceosomal components required for BPS recognition. SRSF2 is a member of the SR protein family that binds to specific ESE sequences, namely CCNG or GGNG, through its RNA recognition motif (RRM) domain, and recruits U1 snRNP and U2AF to the 5′ and 3′ flanking splice sites, respectively [32]. This splicing regulator has also been found to be recurrently mutated, particularly in patients with MDS and chronic myelomonocytic leukemia (CMML) [26]. SRSF2 mutations predominantly occur at the P95 residue, which is located near the RRM domain [26]. According to several reports, these mutations change the RNA-binding affinity of SRSF2, favoring the recognition of C-rich CCNG over G-rich GGNG motifs in ESE consensus sites, which in turn leads to misregulation of exon inclusion [33,34]. The gene encoding U2AF1 is also mutated in myeloid malignancies, as well as in lung adenocarcinomas [26,35,36]. U2AF1 hotspot mutations occur almost exclusively at the S34 and Q157 residues within the two conserved zinc-finger domains, thus affecting the recognition of the 3′ splice site AG motif [37,38]. In contrast to the mutually exclusive hotspot mutations described for SF3B1, SRSF2, and U2AF1, ZRSR2 mutations are distributed throughout the gene, and most are consistent with a loss-of-function phenotype [23]. In 2015, ZRSR2 was characterized as an essential component not only of the major (or U2) spliceosome but also of the minor (or U12) spliceosome, which catalyzes the processing of a distinct class of introns (U12-type introns). In particular, it is involved in 3′ splice site recognition in U12 snRNA-dependent splicing, so that mutations in this gene are associated with an increase in the retention of U12-type introns [23].
Apart from genomic mutations, the pre-mRNA splicing of many genes related to cancer pathogenesis can also be disturbed by changes in the copy number or expression levels of splicing factors [39]. Indeed, abnormal expression of several splicing factors has frequently been reported in solid tumors and is closely associated with cancer development and progression, even in the absence of mutations [40,41,42,43]. One of the best characterized is serine/arginine-rich splicing factor 1 (SRSF1; formerly known as ASF or SF2), an SR protein involved in both constitutive splicing and AS, as well as in other cellular processes. It is upregulated in several human tumors, including colon, breast, thyroid, small intestine, kidney, and lung tumors, and its experimentally induced overexpression leads to the transformation of human and mouse mammary epithelial cells, suggesting that it acts as a proto-oncogene [44,45,46]. SRSF1 upregulation has so far been shown to affect many AS events in cancer-associated genes. In particular, SRSF1 overexpression increases the levels of oncogenic protein isoforms of RON [47], MNK2, and S6K1 [44] and of the anti-apoptotic isoforms Bcl-xL and MCL-1L [48], and causes a loss of the tumor suppressor isoform of BIN1 [44]. Curiously, the overexpression of hnRNP A1 and hnRNP A2/B1, two factors previously suggested to antagonize SR proteins, was also reported in lung, breast, and brain tumors [49,50,51,52]. Interestingly, in glioblastoma (GBM) cells, hnRNP A2/B1 showed splicing effects similar to those of the proto-oncogenic SR protein SRSF1 [52]. More recently, hnRNP A2 (as well as B1 and K) has been associated with enhanced expression of anti-apoptotic variants of BIN1 and CASP9, and decreased expression of the pro-apoptotic variant Bcl-xS [48], promoting the same phenotypic response as SRSF1 overexpression.
The major drivers of aberrant splicing profiles appear to be changes in the expression levels of splicing factors; however, the mechanisms behind the altered expression of splicing factors in tumors are not yet fully understood. Although sporadic somatic mutations in genes encoding splicing factors have already been recurrently detected in solid tumors [43], it is widely recognized that oncogenic signaling has a central role [53]. Indeed, abnormal activation of signaling pathways has been extensively reported in cancer. For instance, in colon cancer, oncogenic Kirsten rat sarcoma viral oncogene homolog (KRAS) activates the RAS–MAPK pathway, leading, via the transcription factor ELK1, to an increase in the expression levels of the AS factor polypyrimidine tract-binding protein 1 (PTBP1). In turn, increased PTBP1 levels induce a shift in the AS of tumor-associated transcripts, namely those of the small GTPase Ras-related C3 botulinum toxin substrate 1 (RAC1), the adaptor protein NUMB, and PKM [54]. In addition to transcriptional stimulation of PTBP1 downstream of RAS, ERK was reported to phosphorylate the splicing factor SAM68, thereby inducing the binding of phospho-SAM68 to the 3′UTR of the SRSF1 transcript [55]. This binding promotes the retention of an intron required for production of full-length SRSF1 and prevents the downregulation of SRSF1 transcripts through the NMD pathway. Consequently, the increased SRSF1 levels, comparable in effect to the above-described SRSF1 gene amplifications [44], induce a switch in AS of the RON gene transcripts, favoring the production of the oncogenic isoform RONΔex11. Phosphorylated SAM68 further stimulates inclusion of the variable exon 5 sequence into the CD44 mRNA, generating a pro-invasive cell adhesion protein variant [56].
Another MAPK pathway responds when cells experience physiologic stress. Osmotic stress triggers the MKK3/6 signaling cascade, leading to p38 activation, which upon nuclear translocation induces hnRNP A1 phosphorylation, followed by its export into the cytoplasm [57,58]. The corresponding decrease in nuclear splicing factor abundance is sufficient to change AS patterns.

PI3K/AKT signaling is another key pathway involved in cell survival and escape from apoptosis in numerous solid tumors. In non-small cell lung cancers (NSCLC), it was demonstrated that the activation of the PI3K/AKT pathway by oncogenic factors mediates the exclusion of the exon 3,4,5,6 cassette of CASP9 transcripts via the phosphorylation state of SRSF1, thus generating the anti-apoptotic Casp-9b isoform [59]. At the same time, AKT-mediated phosphorylation of hnRNP L induces its binding to a splice silencer element in the CASP9 pre-mRNA, further enhancing the exclusion of the exon cassette [60,61]. AKT activation also leads to phosphorylation and nuclear translocation of SR proteins, causing alternative exon inclusion in the fibronectin pre-mRNA [62]. Interestingly, in colorectal cells, inhibition of PI3K/AKT signaling led to increased expression of endogenous SRSF1, which promoted the inclusion of an alternative exon, termed 3b, in the mRNA of the small GTPase RAC1 and thereby generated the pro-tumorigenic splice variant RAC1B [63]. Later, it was described that SRPK1 and GSK3β act upstream of SRSF1 and are required to sustain RAC1B splicing in colorectal cancer (CRC) cells [64]. In particular, it was shown that GSK3β indirectly regulates the levels of SRSF1 and RAC1B via SRPK1, since its depletion leads to a reduction of SRPK1 activity towards SRSF1 and a concomitant decrease in nuclear SRSF1 levels, resulting in less RAC1B being generated.

Another central hub of oncogenic signaling is the Wnt pathway, which is activated in many colorectal tumors. Remarkably, this pathway also modulates RAC1B splicing in CRC cells: the SRSF3 gene, which encodes the splicing factor SRSF3/SRp20, was described as a transcriptional target of activated β-catenin/TCF4 complexes, leading to increased SRSF3 protein levels [65]. A subsequent study demonstrated that increased SRSF3 transcription following activation of the β-catenin/TCF4 pathway suppresses RAC1B splicing through SRSF3-mediated exclusion of exon 3b from the RAC1 mRNA [63]. Together, these examples show how signaling mechanisms affect alternative pre-mRNA splicing and change tumor-related gene expression.
Several splice variants have been associated with different hallmarks of cancer, including initiation, progression, and metastasis. In Table 1, we highlight some of the most relevant AS events in cancer-associated genes involved in different steps of oncogenic transformation, as well as the types of cancer with which they are most often associated. Other examples were listed in a recent review [66].
Table 1. Tumor-associated AS variants and the respective cancer-promoting process.
Gene | Splicing Event | Biological Function | Cancer Types | References
---|---|---|---|---
BCL2L1 | 5′ alternative splice site usage in exon 2 | Bcl-xL inhibits apoptosis | Lymphoma, glioma, breast, prostate, and liver cancer | [67,68,69,70,71]
MKNK2 | Skipping of exon 14a and inclusion of exon 14b | MNK2b acts p38-MAPK-independent and promotes cell growth | Breast, colon, and lung cancer | [44,72,73]
PKM | Skipping of exon 9 and inclusion of exon 10 | PKM2 stimulates aerobic glycolysis | Ovarian, gastric, liver, and colon cancer | [74,75,76,77]
MST1R (RON) | Skipping of exon 11 | RONΔex11 induces cell motility and invasion | Colon, ovarian, brain, lung, and gastric cancer | [78,79,80,81,82]
RPS6KB1 | Inclusion of three cassette exons 6a, 6b, and 6c with a PTC in exon 6c | RPS6KB1-2 promotes cell proliferation and tumor growth | Breast and lung cancer | [83,84]
CCND1 | 5′ alternative splice site usage in exon 4 introduces a PTC | Cyclin D1b induces invasion and metastasis | Breast, lung, and prostate cancer | [85,86,87]
VEGFA | Alternative 3′ splice site in exon 8 | VEGFA165 has pro-angiogenic activity | Colon, prostate, renal, and skin cancer | [88,89,90,91]
CEACAM1 | Inclusion of exon 7 | CEACAM1-L accelerates metastasis progression | Colon cancer and metastatic melanoma | [92,93]
CD44 | Inclusion of variable exon 6 | CD44-v6 induces migration and expression of mesenchymal markers | Colon cancer | [94,95,96]
RAC1 | Inclusion of exon 3b | RAC1B increases cell survival and transformation | Colon, pancreas, thyroid, breast, and lung cancer | [63,97,98,99,100,101,102]
EGFR | Skipping of exon 4 | de4-EGFR promotes malignant transformation as constitutively active receptor variant | Glioma, prostate, and ovarian cancer | [103,104,105]
KLF6 | 5′ alternative splice site usage in exon 2 | KLF6-SV1 lacks nuclear localization and contributes to mesenchymal phenotype | Breast, lung, pancreatic, prostate, and liver cancer | [106,107,108,109,110]
CTTN | Inclusion of exon 11 | Cortactin isoform-a increases cell migration | Colorectal cancer | [111]
FAK | Deletion of exon 26 | The −26-exon FAK isoform is caspase-resistant and inhibits apoptosis | Breast cancer | [112]
The listed genes are B-cell CLL/lymphoma 2-like 1 (BCL2L1), MAPK interacting serine/threonine kinase 2 (MKNK2), pyruvate kinase M (PKM), macrophage stimulating 1 receptor (MST1R), ribosomal protein S6 kinase B1 (RPS6KB1), cyclin D1 (CCND1), vascular endothelial growth factor A (VEGFA), CEA cell adhesion molecule 1 (CEACAM1), cluster of differentiation 44 (CD44), Ras-related C3 botulinum toxin substrate 1 (RAC1), epidermal growth factor receptor (EGFR), Krüppel-like factor 6 (KLF6), cortactin (CTTN), and focal adhesion kinase (FAK). PTC, premature termination codon.