1. Introduction
Cancer-associated gene fusions, also known as oncofusions (OFs), are hybrid genes formed when two previously independent genes become juxtaposed (
Figure 1). The formation of these fusions can stem from structural rearrangements, transcription read-through of neighboring genes, or the
trans- and
cis-splicing of pre-mRNAs. Oncofusions often act as driver mutations in different cancers
[1]. The identification of the first oncofusion marked a landmark in the study of cancer
[2]. Since then, methodological advances have accelerated the discovery of new oncofusions in cancer samples, and the pace rose sharply once massively parallel sequencing (MPS) became viable. Oncofusions have been identified extensively in solid tumors, leukemias, and lymphomas, in both individual studies
[3] and systematic sequencing studies
[4][5].
Figure 1. Chromosomal instability produces gene fusions.
Top panel: chromosomal rearrangements leading to fusion mutations.
Bottom panel: types of gene fusions. Depending on the specific fusion sites in both genes, the result can be either a hybrid gene fusion or a promoter- or enhancer-hijacking fusion. In both cases, the produced proteins can be either in-frame or out-of-frame. Images were created using
BioRender.com.
Oncofusions drive cancer development primarily through two mechanisms: deregulation and the creation of hybrid genes. Deregulation occurs when one gene becomes linked to another’s regulatory region, typically resulting in the overexpression of an otherwise normal gene. The first conclusive evidence for the deregulation mechanism was provided by analyses of Burkitt lymphoma (BL) and chronic myeloid leukemia (CML). In BL, oncofusions link the coding region of the
MYC oncogene to the regulatory region of an immunoglobulin gene, leading to
MYC overexpression and subsequently to oncogenesis
[6][7][8]. The second oncogenic mechanism involves the fusion of the coding regions of two different genes to generate a hybrid gene, resulting in a functional chimeric protein such as
BCR::ABL [2]. This oncofusion results in a fully functional protein with aberrant kinase activity, and is the best-described model example of abnormal protein function resulting from an oncofusion
[9][10][11][12][13][14]. Indeed, oncofusions are known to disturb multiple critical cellular signaling pathways regulating proliferation, differentiation, and survival (
Figure 2).
Figure 2. Hybrid oncofusions disrupt many receptor-dependent signaling pathways, whether downstream or upstream. Common fusion-affected pathways are highlighted along with well-known fusions affecting the corresponding pathways. BA: BCR::ABL, E6N3: ETV6::NTRK3, NA: NPM::ALK, T2E: TMPRSS2::ERG. Image was created using
BioRender.com.
While oncofusions are typically somatic mutations, a few hereditary exceptions have been described
[15]. Guided genomic methods such as quantitative real-time PCR (qPCR) and fluorescence in situ hybridization (FISH), together with extensive genomic research efforts like The Cancer Genome Atlas (TCGA) project and the more recent Japanese Cancer Genome Atlas (JCGA), have contributed significantly to the identification of tens of thousands of oncofusions since the 1960s and 1970s
[3][4][5]. The TCGA project has been particularly influential in shedding light on the scale, prevalence, and impact of oncofusions. The project propelled forward the development of both identification strategies and treatment options by molecularly characterizing over 20,000 cancer samples from 33 different cancer types.
The combined knowledge of all known oncofusions supports the significance of deregulation and hybrid gene formation in fusion oncogenicity, although other mechanisms may also play a role. Some oncofusions, for instance, can drive cancer development by inactivating tumor suppressor genes, such as by coupling them to regulatory elements of genes that are not usually expressed
[16][17]. The major challenge with vast numbers of possible, probable, and confirmed oncofusions is validation and relevance: many could be functionally irrelevant or barely expressed passenger mutations, and most have been described as random mutations arising from chance events
[18]. However, as several studies highlight, non-driver mutations can still be very valuable for cancer treatment due to their role in producing neoantigens
[19][20].
2. Oncofusion Formation
Integrated analysis of transcriptomic and genomic data suggests that approximately two thirds of fusion mutations stem from erroneous repair of DNA double-strand breaks (DSBs)
[18]. Erroneous DNA DSB repair can lead to genome instability, inducing mutagenesis and genomic rearrangements such as deletions, inversions, duplications, and translocations. All of these genomic rearrangements can result in the formation of oncofusions
[21].
Although the majority of fusions occur in “gene-rich” areas of the genome
[18], intragenic fusions have been assessed to constitute only 38% of fusion events
[22]. These involve the joining of one gene’s coding sequence with the coding sequence of another, resulting in an in-frame gene fusion such as
BCR::ABL1 or
TMPRSS2::ERG. A large proportion of oncofusions include intergenic regions from at least one fusion partner (
Figure 3A), but can sometimes still produce proteins where the 3′ fusion partner is also in-frame
[22][23]. Some intergenic oncofusions are the result of a gene with an otherwise weak promoter fusing to a stronger promoter or enhancer (e.g.,
IGH::MYC in BL)
[24]. These events are known as promoter or enhancer swapping
[22].
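The in-frame versus out-of-frame distinction comes down to codon phase at the junction. The sketch below is a simplified illustration, assuming the breakpoint falls inside the coding sequence of both partners and ignoring splicing and exon-boundary details. The function name is_in_frame and its two parameters are hypothetical and are not taken from any fusion caller or from the tumorfusions.org pipeline.

```python
def is_in_frame(cds_bases_5p: int, cds_offset_3p: int) -> bool:
    """Return True if the junction preserves the 3' partner's reading frame.

    cds_bases_5p:  coding bases retained from the 5' partner, counted from
                   its start codon up to the breakpoint (assumed input).
    cds_offset_3p: coding bases of the 3' partner that lie upstream of the
                   breakpoint and are therefore lost (assumed input).
    """
    if cds_bases_5p < 0 or cds_offset_3p < 0:
        raise ValueError("base counts must be non-negative")
    # The 3' partner keeps its original codons only when both counts
    # leave the same remainder after division by 3.
    return cds_bases_5p % 3 == cds_offset_3p % 3


# Hypothetical coordinates, chosen only to show the three phase cases.
examples = [(300, 150), (301, 150), (301, 151)]
results = [is_in_frame(five, three) for five, three in examples]
```

Real frame prediction also has to account for transcript choice, untranslated regions, and breakpoints that land in introns, which this sketch leaves out.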
Figure 3. General assessment of fusions from tumorfusions.org database. (A) Count of fusions based on frame prediction. (B) Fusion distribution based on gene type. N/A denotes genes which could not be categorized definitively as protein kinases, transcription factors, tumor suppressors, or other oncogenes. (C) Distribution of fusions based on frame prediction and gene type. Only fusions that could be assigned to one of the four gene groups are included.
For a general assessment of fusions and the genes participating in them, researchers downloaded and analyzed data from tumorfusions.org. Although these data reflect only TCGA and different analyses can produce different results, they were deemed sufficiently accurate for an overall representation of oncofusions. The dataset was cross-verified and annotated with information from the COSMIC, ChimerDB, and UniProt databases. The genes were then categorized into four main groups for in-depth analysis: kinases, transcription factors, tumor suppressors that are neither kinases nor transcription factors, and other oncogenes. The majority of genes participating in fusions, such as VMP1 and TACC3, do not belong to any of these groups (Figure 3B). Among the annotated genes in in-frame fusions, kinases and transcription factors are equally common, and thus it is no surprise that much research has focused on them in recent years (Figure 3C).
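The grouping step described above can be sketched as a lookup followed by a tally. This is only an illustration under stated assumptions: the GENE_TYPE table is a hand-written stand-in for annotations that would come from COSMIC, ChimerDB, and UniProt, the record layout (5′ gene, 3′ gene, frame label) is assumed, and the toy records and their frame labels are invented for the example rather than drawn from tumorfusions.org.

```python
from collections import Counter

# Hypothetical annotation table; unlisted genes fall into "N/A".
GENE_TYPE = {
    "FGFR3": "kinase",
    "NTRK3": "kinase",
    "ERG": "transcription factor",
    "ETV6": "transcription factor",
}


def tally_by_frame_and_type(fusions):
    """Count participating genes per (frame, gene type) combination."""
    counts = Counter()
    for gene_5p, gene_3p, frame in fusions:
        for gene in (gene_5p, gene_3p):
            counts[(frame, GENE_TYPE.get(gene, "N/A"))] += 1
    return counts


# Toy records for illustration only (GENE_A and GENE_B are placeholders).
toy_fusions = [
    ("FGFR3", "TACC3", "in-frame"),
    ("TMPRSS2", "ERG", "in-frame"),
    ("ETV6", "NTRK3", "in-frame"),
    ("GENE_A", "GENE_B", "out-of-frame"),
]
counts = tally_by_frame_and_type(toy_fusions)
```

Counting each partner gene separately is a design choice made here for simplicity; how the original analysis assigned a fusion whose two partners belong to different groups is not stated.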
On the kinase side, the majority of the in-frame fusions involved FGFR3, BRAF, RET, NF1, or FGFR2 (36, 33, 28, 18, and 18 fusions, respectively) combined with a variety of partners; FGFR3 in particular was paired with TACC3 (Figure 4A). Receptor tyrosine kinases (RTKs), especially the FGFR and NTRK subfamilies as well as RET and ALK, have often been identified as major contributors to fusions and to cancers driven by other mutations.
Figure 4. Representation of recurring fusions from tumorfusions.org data. (A) Count of recurring and non-recurring fusions per gene in each gene type category. Included are genes present in at least 5 recurring fusions. (B) Scatter plot of all in-frame fusions. 5′ genes are on the x axis and 3′ genes are on the y axis. Size and color of the bubbles correspond to the number of fusions in the tumorfusions.org dataset. Gene pairs with 5 or more recurring fusions identified are highlighted. Inset: highlighted fusions in bar plot format.
While kinases hold significant attention in oncofusion research, transcription factors also play a considerable role. The transcription factor genes most commonly involved in oncofusions were ERG, RARA, TFE3, and ETV6, with 65, 28, 18, and 17 fusions, respectively. The group as a whole was dominated by the TMPRSS2::ERG fusion, which was the most common fusion overall, with 60 in-frame fusions identified in the tumorfusions.org TCGA dataset.
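Recurrence counts of this kind can be reproduced by tallying ordered gene pairs. The sketch below is a minimal illustration, not the authors' method: the function name recurrent_pairs is hypothetical, the input is assumed to be a list of (5′ gene, 3′ gene) tuples with one entry per sample in which the fusion was called, and the toy counts are invented. The default threshold of 5 mirrors the cutoff used for highlighting in Figure 4.

```python
from collections import Counter


def recurrent_pairs(fusion_calls, min_count=5):
    """Keep ordered (5' gene, 3' gene) pairs seen at least min_count times."""
    pair_counts = Counter(fusion_calls)
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Toy calls for illustration only; the numbers do not reflect real data.
toy_calls = (
    [("TMPRSS2", "ERG")] * 6
    + [("FGFR3", "TACC3")] * 5
    + [("GENE_A", "GENE_B")] * 2
)
recurrent = recurrent_pairs(toy_calls)
```

Keeping the pair ordered matters because the 5′ and 3′ positions determine which promoter and which coding region end up in the fusion.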
Genomic rearrangements are present in roughly half of hematopoietic cancers as well as up to 90% of solid tumors
[25], many of which harbor oncofusions
[26]. The products of these oncofusions play pivotal roles in tumor evolution and progression
[27][28].
3. Oncofusions’ Role in Cancer Development
Cancer development has historically been viewed as a slow accumulation of mutations which increases genomic instability, ultimately resulting in both driver and passenger mutations
[29]. Instability takes place at the nucleotide level, causing small-scale changes such as point mutations and minor deletions, as well as at the chromosomal level, prompting more substantial rearrangements of entire chromosomal segments
[30].
Following the first large-scale identification of oncofusions in MPS studies, it was estimated in 2007 that fusions accounted for roughly 20% of cancer morbidity and represented early events in cancer genesis
[28]. More recently, during the JCGA project, known oncofusions from a limited panel of 491 were identified as driver mutations in 12.9% of cancer specimens
[5]. Prevalence varies by cancer type and is notably high in specific cancers: fusions are identified in over 90% of all lymphomas
[31] and more than 30% of soft-tissue tumors
[32]. Other research has suggested that fusions drive 16.5% of human cancers and act as solitary drivers in 1%
[33]. Taken together, these independent estimates fall in a similar range (roughly 13% to 20%), which suggests that they are reasonably accurate.
Prominent Oncofusions in Cancer
Various kinase and transcription factor fusions are frequently emphasized as driver oncofusions
[34][35][36][37][38][39][40][41][42]. In the case of transcription factor fusions, prevalent mechanisms of cancer formation revolve around the DNA-binding domain. Kinase oncofusions often lead to the loss of kinase domain regulation and the acquisition of an alternative dimerization mechanism, such as the coiled-coil domain of
TACC3 in the
FGFR3::TACC3 oncofusion
[43].
In addition to gene groups, specific oncofusions have been highlighted as crucial early steps in the initiation of tumorigenesis, or as crucial contributors to tumor morbidity
[44][45][46][47]. Furthermore, many oncogenes seem to require the occurrence of specific fusion events to unleash their oncogenic potential. In such instances, there is minimal variability in how the fusion occurs, and they are frequently recognized as recurrent oncofusions in large-scale studies. For example, the oncogenic potential of the
IGH::BCL2 oncofusion is derived from the combination of the anti-apoptotic
BCL2 with the highly expressed immunoglobulin locus of
IGH [48][49]. A similar mechanism is often present in prostate cancer, where the
TMPRSS2::ERG oncofusion is found in roughly 50% of tumors and
MAN2A1::FER,
MTOR::TP53BP1, and
SLC45A2::AMACR occur at lower frequencies. The common factor uniting these fusions is the association of an oncogene with a more active promoter region
[50][51][52][53].