More recently, large-scale comprehensive genomic studies, including single-cell RNA sequencing and characterization, have revealed multiple processes by which the processing of protein-coding and noncoding RNAs is dysregulated in many cancers. Among these, mutations that drive cancer by perturbing the co-transcriptional and post-transcriptional regulation of gene expression have been implicated in the pathogenesis of many cancers. Such alterations can affect each phase of RNA processing, including the transcription, splicing, transport, editing, and decay of protein-coding and noncoding RNAs such as microRNAs (miRNAs).
2. Basic Mechanisms of Alternative Splicing Regulation
Alternative splicing relies on the distinction between intronic and exonic sections of DNA within genes. The precursor mRNA (pre-mRNA) transcript bears these same sections, which are recognized by the spliceosome, a large complex of five small nuclear ribonucleoproteins (snRNPs) and associated proteins that removes the introns and splices the exons together
[7][8]. Specific consensus sequences, such as the 5′ GU and 3′ AG dinucleotides in introns, are critical to intron recognition. In brief, splicing begins with enzyme-assisted lariat formation, in which the phosphodiester bond at the 5′ splice site (SS) is attacked by the 2′ OH of a specific adenosine residue (the branch point) located within the intron, approximately 18–40 nucleotides upstream of the 3′ SS
[8][9]. The 3′ OH freed at the 5′ SS is then able to attack the phosphodiester bond at the 3′ SS, leading to exon ligation and lariat release
[8][9].
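The sequence features described above can be sketched in a few lines of Python. This is a minimal illustration only, not a splice site predictor: the function name `describe_intron`, the example sequence, and the convention of counting distance from the 3′ end of the intron (with the last intronic nucleotide as position 1) are all assumptions made for this sketch.

```python
def describe_intron(intron, window=(18, 40)):
    """Check the canonical GU...AG boundaries of an intron and list
    adenosines lying in the window where the branch point is expected.

    Returns (starts_with_GU, ends_with_AG, candidate_indices), where the
    indices are 0-based positions within the intron.
    """
    # Accept DNA or RNA input; work in RNA letters throughout.
    rna = intron.upper().replace("T", "U")
    n = len(rna)
    lo, hi = window
    has_gu = rna.startswith("GU")   # 5' dinucleotide
    has_ag = rna.endswith("AG")     # 3' dinucleotide
    # Assumed convention: distance = n - i, so the last base is at distance 1.
    candidates = [i for i, base in enumerate(rna)
                  if base == "A" and lo <= n - i <= hi]
    return has_gu, has_ag, candidates


# Hypothetical 35-nt intron with one adenosine 23 nt from its 3' end.
example = "GU" + "C" * 10 + "A" + "C" * 20 + "AG"
print(describe_intron(example))
```

Real branch points are not identified by position alone, so the candidate list should be read as "adenosines in the right neighborhood" rather than as a prediction.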
Another considerable layer of complexity arises from the propensity for a single gene to be spliced in different ways in different cells, or even within the same cell, with varying patterns of exon inclusion
[9][10]. While much remains to be learned about the regulatory mechanisms involved, a few have been uncovered.
The first of these are cis-acting elements along the pre-mRNA, which represent regulatory sequences facilitating everything from protein interaction with the pre-mRNA to folding and the three-dimensional structure of the molecule
[10][11]. SSs themselves, in fact, fall under this category, defining the initial set of options upon which the spliceosome machinery can act. SS properties depend not only on the sites themselves, which remain highly conserved regions of the genome, but also on the surrounding sequences, which have been found to strengthen or attenuate the binding interaction between the site and the spliceosomal snRNP U1 that recognizes it
[11][12]. This effect in fact allows the classification of SSs as “strong” or “weak”, with weak SSs typically flanking alternatively spliced exons (as opposed to constitutively spliced exons)
[11][12].
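As a rough illustration of how sequence context can grade a 5′ SS as "strong" or "weak", the sketch below counts how many positions of a nine-nucleotide site (three exonic plus six intronic nucleotides) match CAG|GUAAGU, the sequence fully complementary to the 5′ end of U1 snRNA. The function names and the cutoff of seven matches are assumptions chosen for this example; they are not taken from the cited studies, and practical tools use more elaborate statistical models.

```python
# 3 exonic + 6 intronic nucleotides with full complementarity to U1 snRNA.
U1_COMPLEMENTARY_9MER = "CAGGUAAGU"


def five_prime_ss_matches(site):
    """Count positions of a 9-nt 5' splice site that match the
    U1-complementary sequence (DNA or RNA letters accepted)."""
    rna = site.upper().replace("T", "U")
    if len(rna) != len(U1_COMPLEMENTARY_9MER):
        raise ValueError("expected 3 exonic + 6 intronic nucleotides")
    return sum(a == b for a, b in zip(rna, U1_COMPLEMENTARY_9MER))


def classify_five_prime_ss(site, strong_cutoff=7):
    """Label a site 'strong' or 'weak'; the cutoff is an arbitrary assumption."""
    return "strong" if five_prime_ss_matches(site) >= strong_cutoff else "weak"


print(classify_five_prime_ss("CAGGTAAGT"))  # perfect match to the 9-mer
print(classify_five_prime_ss("UUGGUCCAU"))  # keeps GU but few other matches
```

A simple match count treats every position as equally important, which is a deliberate simplification for readability.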
Certain sequences on the pre-mRNA additionally can serve as binding sites for trans-acting RNA-binding proteins (RBPs), allowing for a higher level of granularity in promoting or inhibiting certain splicing events
[12][13]. Modulated accessibility to RBPs or even the spliceosome itself through pre-mRNA folding has also been shown to have a significant regulatory effect, and conversely, RBPs may act directly by altering the structure of the pre-mRNA to promote or inhibit favorable spliceosome–SS interactions
[13][14][14,15].
3. Functions of Alternative Splicing in Gastrointestinal Malignancies
Work on profiling alternative splicing in cancers, especially gastrointestinal malignancies, has risen dramatically in recent years, uncovering a host of previously unknown disease mechanisms.
A different hnRNP, hnRNP K, has been implicated, along with the SR RBPs SRSF1 and SRSF2 and hnRNP A1, in apoptotic dysregulation in pancreatic and liver cancers through altered regulation of a host of target genes involved in both extrinsic and intrinsic apoptosis (including Fas, caspase-8, and caspase-9) as well as the anti-apoptotic factors Bcl-x and Mcl-1. These RBPs are currently being explored for therapeutic targeting
[15][52]. Overexpression of Linc01232 in pancreatic cancers (PCs) leads to the inhibition of hnRNP A2/B1 ubiquitination and degradation, leading to AS of A-Raf and thus dysregulation of MAPK/ERK signaling driving tumor progression
[16][53]. Similar to CRC, pancreatic ductal adenocarcinoma (PDAC) has also been shown to favor the PKM2 isoform over PKM1 secondary to PTBP1 upregulation and increased incidence of PTBP1 pre-mRNA binding, particularly in drug-resistant PDAC (DR-PDAC)
[17][54]. In the same study, knockdown of PTBP1 in vitro was shown to decrease PKM2 levels and sensitize cells to drug treatment
[17][54]. PCs are also observed to be high in microRNA miR-193a-5p, linked to the disruption of AS through the targeting of splicing factors
[18][55]. Specifically, miR-193a-5p overexpression is hypothesized to target SRSF6, leading to AS of OGDHL and ECM1, driving epithelial–mesenchymal transition (EMT) and increasing metastatic events
[19][56]. Beyond specific molecular targets, a 2020 study of 177 patient PCs found the overall AS signature to have significant predictive prognostic power
[20][57].
Alongside the discovery of similar disease mechanisms involving mutations of trans-factors such as SR proteins or upstream regulators of these factors, a study of alternative splicing in gastric cancers (GCs) recently highlighted the importance of a class of circular noncoding RNAs (circRNAs). While these molecules have been known since as early as 1976, certain effects on alternative splicing that promote tumorigenesis have been described in recent literature. For one, circRNAs are generated through the AS process, and their biogenesis therefore competes with normal AS
[21][58]. Moreover, because circRNAs are noncoding, in addition to competing for splicing machinery, the biogenesis of a circRNA disables a potentially coding pre-mRNA, directly regulating gene expression
[21][58]. Other effects, such as the regulatory activity of circRNAs on RNAP II leading to a change in transcriptional environment or association with snRNPs involved in the spliceosome, also affect the dynamics of AS, suggesting a possible disease mechanism explaining observed abnormal circRNA levels in GCs
[22][59]. It should be emphasized, however, that general disease mechanisms related to circRNAs are plentiful and that their role in AS is only one of these pathways.
Disease mechanisms in other GI system cancers bear overall similarity to those described previously in CRC, PCs, and GCs, with individual efforts underway to profile the AS landscape and establish predictive links between AS events and prognosis. One such effort with hepatocellular carcinoma (HCC) in 2020, for instance, has identified over 3000 candidate AS events associated with almost 400 splicing factors, ultimately producing a predictive model for prognosis and metastatic potential
[23][60]. The authors found, in particular, a strong correlation between YBX3 and both prognosis and metastasis, proposing a mechanism through ABCA6 and PLIN5 and their effects on the primary bile acid biosynthesis pathway
[23][60]. Esophageal squamous cell carcinoma (ESCC) analysis has revealed the role of long intergenic noncoding RNA (lincRNA) uc002yug.2 in carcinogenesis, particularly through the modulation of the nuclear AS environment to favor the RUNX1 isoform RUNX1a and reduce CEBPα, an event found to have predictive potential over prognoses in ESCC patients
[24][61]. Interestingly, literature on gallbladder cancer (GBC)-related AS events is far sparser than that on other GI malignancies. However, circRNA overexpression, particularly of circERBB2, has been implicated in poor GBC prognoses and may provide a clue as to pathological AS events in such cancers in a manner similar to that described for GC, though, as previously mentioned, the broad scope of circRNA functions makes it difficult to narrow its impact to AS specifically
[25][48].
Nevertheless, the role of AS in carcinogenesis, and in GI malignancies in particular, is not to be understated. Promising preclinical work showing the therapeutic power of targeting aberrantly regulated players within this pathway suggests an emerging treatment strategy on the patient-facing front.
Two major classes of trans-acting RBPs are serine/arginine-rich proteins (SR proteins, often classed as “SRSF” for serine/arginine-rich splicing factor) and heterogeneous nuclear ribonucleoproteins (hnRNPs)
[26][27][16,17]. SR proteins typically work by directly recruiting the spliceosome snRNP U1 to the 5′ SS or by recruiting U2AF, an auxiliary splicing factor, to the 3′ SS, leading to overall splicing enhancement
[28][18]. In contrast, hnRNPs typically interact with intronic splicing silencer (ISS) motifs to repress splicing at a specific SS
[29][19]. However, many exceptions to this generalization have been uncovered, and both SR proteins and hnRNPs have been shown to both positively and negatively regulate splicing through binding various pre-mRNA motifs and cooperative and competitive direct interaction
[29][30][31][32][33][34][19,20,21,22,23,24]. The phosphorylation of RBPs presents another means of modulating their activity and pre-mRNA binding effect
[35][25]. Dysregulation of SR proteins or hnRNPs is a frequently observed trait in many GI malignancies.
Tissue-specific RBP expression also plays an integral role in the regulation of alternative splicing. Direct interactive effects between RBPs, the interplay of cis-element type and positioning along the pre-mRNA transcript, chemical regulation (such as through phosphorylation), and physical and structural realities within the cellular environment together create a unique regulatory environment for alternative splicing in different cell types
[36][26]. Such variance among different cell types allows alternative splicing to play a major contributory role in the determination of tissue identity and cell phenotype
[36][26].
Because human alternative splicing typically occurs alongside transcription, certain properties of the gene transcriptional environment can also regulate alternative splicing. This is partly determined by indirect effects, such as the impact of transcription rate on the three-dimensional folding of the pre-mRNA transcript. However, this same rate has also been shown to have an impact on SS recognition, with slower rates leading to increased splicing at weaker splice sites, for instance, and faster rates favoring splicing at strong splice sites instead
[10][11]. Such considerations have been termed the “kinetic model” of alternative splicing
[37][27].
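The qualitative prediction of the kinetic model can be illustrated with a toy "window of opportunity" calculation. The sketch assumes that a weak upstream site is transcribed first and can commit to splicing, at a constant rate, only until a competing strong downstream site emerges from RNAP II. The function name, the exponential commitment assumption, and all parameter values are hypothetical and chosen for illustration; they are not measurements from the cited work.

```python
import math


def weak_site_usage(elongation_nt_per_s, distance_nt, k_commit_per_s=0.05):
    """Probability that a weak upstream site commits to splicing before
    a strong downstream competitor has been transcribed.

    Toy assumptions: commitment is a first-order process with rate
    k_commit_per_s, and the window closes once the polymerase has
    travelled distance_nt at a constant elongation rate.
    """
    if elongation_nt_per_s <= 0 or distance_nt < 0:
        raise ValueError("rate must be positive and distance non-negative")
    window_s = distance_nt / elongation_nt_per_s
    return 1.0 - math.exp(-k_commit_per_s * window_s)


# Slower elongation lengthens the window and favors the weak site.
for rate in (10, 25, 50):
    print(rate, round(weak_site_usage(rate, distance_nt=1000), 3))
```

Under these assumptions, slowing elongation raises the use of the weak site and speeding it up favors the strong downstream site, which matches the direction of the effect described in the text.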
A “recruitment model”, which encompasses the direct recruitment of RBPs and other factors to the splicing environment, is also involved in the regulation of alternative splicing by transcriptional dynamics. Direct interaction between RNAP II and splicing factors, for instance, has been proposed as one model of modulating the splicing environment
[38][28]. Moreover, this direct recruitment activity by RNAP II has been shown to affect transcription rates, as splicing machinery and various related factors are recruited to the pre-mRNA. As such, the relationship between transcription, splicing, timing, and the factors present is highly dynamic, with many opportunities for cross-regulation and selectivity in determining the ultimate mRNA product to be translated
[39][29].
Epigenetic factors on DNA have also been shown to influence alternative splicing. While some of this is due to their influence on previously discussed mechanisms (nucleosome positioning, for example, has an impact on the transcriptional rate and can cause RNAP II pausing
[40][30]), interactions between splicing-related factors and epigenetic histone marks as well as nucleosomes themselves can contribute to the determination of which splicing factors are present for pre-mRNA processing and which are absent
[41][42][43][31,32,33]. It follows that factors involved in these epigenetic marks, such as HDAC or even DNA modification, also play a role in alternative splicing
[44][34]. DNA-binding proteins (DBPs) and influence over transcription (such as alternative reading frames) have also been seen to affect splicing, potentially through these same mechanisms, although work is still underway to more thoroughly explore this
[38][45][46][28,35,36].
Finally, the spliceosome itself has been proposed to have a regulatory function in alternative splicing. Different points of control include spliceosome formation, the concentration of snRNP isoforms, and, perhaps most interestingly, kinetic proofreading
[47][37]. (Indeed, snRNP differential expression among different tissues may be a clue as to the importance of this core regulatory function of the spliceosome
[48][38].) Kinetic proofreading involves spliceosome rejection of an initially recognized SS: downstream catalytic steps within splicing are able to cancel the overall process based on chemical timing (most often by timing intrinsic ATPase activity against catalytic activity)
[49][39]. Overall, the ability of the spliceosome to self-regulate presents an interesting consideration within the larger discussion of alternative splicing and a tantalizing area for further work.
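The timing logic behind kinetic proofreading can be captured by a simple partition between two competing first-order steps: chemistry proceeds at one rate, while an ATPase-driven rejection step proceeds at another. The sketch below is a generic kinetic partitioning calculation under that assumption; the function name and rate values are hypothetical and are not drawn from the cited studies.

```python
def fraction_spliced(k_chemistry, k_reject):
    """Fraction of substrates that complete catalysis before being
    rejected, assuming two competing first-order steps."""
    if k_chemistry < 0 or k_reject < 0 or k_chemistry + k_reject == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return k_chemistry / (k_chemistry + k_reject)


# Same rejection clock, different catalytic speeds (arbitrary units).
print(fraction_spliced(k_chemistry=9.0, k_reject=1.0))  # optimal site
print(fraction_spliced(k_chemistry=1.0, k_reject=9.0))  # suboptimal site
```

In this picture, a substrate that reacts slowly relative to the rejection step is discarded most of the time, so fidelity comes from relative timing rather than from initial recognition alone.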