Colorectal cancer (CRC) is the second leading cause of cancer-related deaths in both sexes globally and presents different clinical outcomes that are shaped by a range of genomic and epigenomic alterations. Despite advances in CRC screening and treatment strategies, the prognosis of CRC remains dismal. In the last two decades, molecular biomarkers predictive of prognosis have been identified in CRC, although biomarkers predictive of treatment response are only available for specific biological drugs used in stage IV CRC. CRC staging is based on the TNM system (T: primary tumor; N: regional lymph node status; M: distant metastases).
1. TNM Staging, a Pathological Evaluation of CRC
Colorectal cancer (CRC) is a molecularly heterogeneous malignancy with various clinical outcomes. This heterogeneity makes CRC classification challenging [6][1]. The management of CRC still relies mainly on the tumor location and on the stage of disease, which is defined according to the American Joint Committee on Cancer (AJCC), with TNM staging (tumor, nodes, metastasis) as the main predictive factor of prognosis
[7,8,9][2][3][4]. TNM staging provides prognostic information as a post-surgical stratification based on the pathologist’s evaluation of the resected tumor, assists in decision making for adjuvant treatment, if needed, and confirms the presence or absence of metastasis at the time of diagnosis. The main weakness of the TNM classification is its inability to discriminate between “good” and “poor” prognosis within the same stage. Inadequate outcome prediction is concerning, particularly in stages II and III, because nearly 20% of patients classified as stage II may still die due to recurrence
[10][5]. Patients classified within the same TNM stage may also experience different outcomes owing to the genotypic and phenotypic differences that exist among CRC patients
[11][6]. Accordingly, unnecessary treatment or over-treatment may occur through inaccurate discrimination of the disease. The TNM system, widely used in cancer staging, places considerable emphasis on the lymph node status. This aspect has been the subject of significant controversies, given the imprecise nature of the system. Consequently, researchers are actively exploring alternative or supplementary features that could enhance the precision of cancer staging
[10][5]. Therefore, refinements of biological parameters to improve the stratification of patients for tumor treatment or surveillance strategies are under investigation.
2. Molecular Staging, Genetic and Epigenetic Characteristics of CRC
Applying molecular subtyping to CRC, a heterogeneous disease, is potentially beneficial [12][7]. Molecular classifications based on genetic and epigenetic characteristics of CRC rely on mutations, microsatellite instability (MSI), the CpG island methylator phenotype (CIMP), chromosomal instability (CIN), somatic copy-number alterations (SCNA), and key pathways that affect CRC initiation and progression, such as WNT and MYC
[8,13][3][8].
Differences in tumor biology are reflected by specific supervised expression signatures
[11][6]. Several expression-based assays (e.g., Oncotype DX Colon, ColoPrint, and ColDx/GeneFX) with prognostic value for CRC are commercially available, but none of these signatures are yet validated and recommended for clinical use
[5,14][9][10].
Developments in genomic, transcriptomic, and big-data technologies enable investigators to explore the molecular characteristics of tumors and define their clinical relevance. The integrative omics analyses reveal the potential of incorporating different biological levels in which the transcriptome may play a valuable role
[14][10].
These items are discussed in more detail below.
- (i) CRC develops from benign to malignant lesions through a series of mutations in key driver genes that accumulate over time. Among these driver genes, mutations in adenomatous polyposis coli (APC) confer a growth advantage on epithelial cells and result in the formation of a small adenoma. Later, mutations in KRAS and BRAF drive a second round of clonal expansion, with progression to a large adenoma. Finally, mutations in PIK3CA, SMAD4, and p53 give rise to a malignant tumor with the potential for invasion and metastasis. It remains unclear which mutations in these key driver genes are involved in CRC metastasis [15][11].
- (ii) MSI is caused by mutations in DNA mismatch repair genes (MLH1, MSH2, MSH6, and PMS2) and EPCAM. Mutations in the MLH1 or MSH2 genes lead to an increased risk (70–80%) of developing cancer, while mutations in the MSH6 or PMS2 genes carry a comparatively lower risk (25–60%) of cancer development
[16][12]. Nearly 15–20% of primary CRCs have the MSI phenotype, whereas the remainder are microsatellite stable (MSS). MSI tumors show multiple single nucleotide variants (SNVs) and insertions/deletions (indels). A standard panel of five microsatellite markers is used to assess the MSI status of a cancer: two mononucleotide repeats (BAT25 and BAT26) plus three dinucleotide repeats (D2S123, D5S346, and D17S250), as defined by the Bethesda Guidelines
[17][13]. A subset of tumors with unstable loci in ≥30% markers are defined as “microsatellite high” (MSI-H), a subset of tumors with 10–29% unstable loci are classified as “microsatellite low” (MSI-L), and “microsatellite stable” (MSS) tumors are marked with no unstable markers
[16][12]. In a study conducted by Mori et al.
[18][14], a comprehensive genomic screening of microsatellite coding regions was performed, revealing mutations in nine loci (TGF-βR2, Bax, MSH3, ActRIIB, SEC63, AIM2, NADH–ubiquinone oxidoreductase, COBLL1, and EBP1) in over 20% of tumors. MSI-H status in CRC is associated with a superior anti-tumor immune response, inhibition of tumor cell growth, and an improved prognosis compared with MSI-L or MSS status. MSI status is therefore a plausible predictor for treatment decisions in MSI-H and MSI-L tumors. Furthermore, MSI tumors are more frequently located in the proximal colon and present as more poorly differentiated cancers
[19][15]. MSI is rarely found in polyps, except in Lynch syndrome, which is caused by germline mutations in one of the mismatch repair (MMR) genes. MSI is the hallmark of hereditary nonpolyposis colorectal cancer (HNPCC), or Lynch syndrome, and occurs in >95% of HNPCC cases
[16][12].
- (iii) CIN, also termed the suppressor pathway phenotype, is observed in 70–85% of CRC tumors and is often considered equivalent to MSS status
[20][16]. CIN arises from the occasional gain or loss of whole chromosomes during mitosis. Accordingly, CIN tumors are often aneuploid, with structural or numerical aberrations. Furthermore, several significant events contribute to the development of CIN, including mutations in genes such as APC, KRAS, TP53, CTNNB1, and PIK3CA, and loss of heterozygosity (LOH) at chromosome 18q, a region that harbors tumor suppressor genes such as SMAD2, SMAD4, and DCC. Traditionally, CRC develops through the adenoma to carcinoma pathway. APC gene mutation and inactivation are early events; they are followed by the mutational activation of the KRAS oncogene in the adenomatous stage, and then by the inactivation of TP53 on chromosome 17p and the deletion of chromosome 18q, leading to metastatic carcinoma
[21][17]. APC mutations are present at the earliest stages of neoplasia and are mainly linked with the classic tubular adenoma pathway and CIN cancers
[22][18].
- (iv) CIMP is an epigenetic event that has been observed to precede the onset of cancer. The process involves an increase in methylation levels within the promoter region, which can lead to the silencing of tumor-suppressor genes. Conversely, global hypomethylation has been linked to genomic instability and chromosomal abnormalities. In CRC, epigenetic instability manifests as hypermethylation of CpG islands, often in tandem with global DNA hypomethylation. In CIMP-positive CRC, promoter regions of tumor suppressor genes are frequently hypermethylated, resulting in loss of function of these genes.
CIMP was described by Toyota et al. [23][19] as a tumorigenesis pathway in 1999. CIMP tumors overlap to a considerable extent with MSI tumors. CIMP is characterized by high promoter methylation of various genes, such as p16, MINT clones, THBS, and MLH1, and is correlated with several clinical, pathological, and molecular features, such as female sex, old age, right-sided tumors, high MSI, and BRAF V600E mutations
[24,25][20][21]. CIMP is a subclassification that is determined by the integration of genetic and epigenetic instability, and it is further divided into two categories: CIMP-low and CIMP-high. Analysis of DNA methylation profiles has demonstrated that roughly 20% of CRCs fall into the CIMP category
[16][12].
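The marker-based MSI categories described in item (ii) can be illustrated with a short sketch. It is only an illustration of the thresholds quoted above, not a clinical tool: the function name `classify_msi` and the input format are hypothetical, and the treatment of any non-zero fraction of unstable markers below 30% as MSI-L is an assumption, since the text does not define the range between 0% and 10%.

```python
# Five-marker reference panel named in the text (Bethesda Guidelines).
BETHESDA_PANEL = ("BAT25", "BAT26", "D2S123", "D5S346", "D17S250")


def classify_msi(unstable: dict) -> str:
    """Classify MSI status from per-marker instability calls.

    `unstable` maps marker name -> True if the locus is unstable.
    Thresholds follow the text: >=30% unstable markers -> MSI-H,
    no unstable markers -> MSS. Assumption: any other non-zero
    fraction is reported as MSI-L.
    """
    if not unstable:
        raise ValueError("at least one marker result is required")
    total = len(unstable)
    n_unstable = sum(1 for is_unstable in unstable.values() if is_unstable)
    # Integer arithmetic avoids floating-point issues at the 30% boundary.
    if n_unstable * 100 >= 30 * total:
        return "MSI-H"
    if n_unstable > 0:
        return "MSI-L"
    return "MSS"


def panel_result(unstable_markers=()):
    """Build a full five-marker result with the given markers set unstable."""
    return {marker: marker in unstable_markers for marker in BETHESDA_PANEL}
```

With the five-marker panel, one unstable locus corresponds to 20% (MSI-L under the assumption above) and two or more unstable loci correspond to at least 40% (MSI-H).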
3. Supervised Gene Expression Profiles for Prognosis Prediction of Early Stages CRC
Gene expression profiles (GEP) aim to predict the likelihood of tumor recurrence after surgery in the early stages of CRC and could also provide predictive information about the benefit of pharmacological treatment
[26][22]. The commercially available gene expression profile assays include the following (Table 1):
- The Oncotype DX colon cancer assay (Genomic Health, Redwood City, CA, USA, http://www.genomichealth.com (accessed on 27 February 2023)) predicts recurrence in stage II colon cancer patients after surgical resection [27,28][23][24]. Oncotype DX is a 12-gene expression assay (7 cancer-related genes: BGN, C-MYC, FAP, GADD45B, INHBA, Ki-67, and MYBL2; and 5 reference genes: ATP5E, GPX1, PGK1, UBB, and VDAC2 [29][25]) based on reverse transcriptase-polymerase chain reaction (RT-PCR). It has been tested on archival formalin-fixed and paraffin-embedded (FFPE) tumor tissue specimens in the QUASAR trial [14][10].
- ColoPrint (Agendia, Amsterdam, The Netherlands, http://www.agendia.com (accessed on 27 February 2023)) is a test based on an 18-mRNA signature that predicts CRC relapse in the early stages of the disease. ColoPrint uses whole-genome expression data analysis and has been validated in several independent cohorts. In addition, it can predict the development of distant metastasis in stage II CRC patients and facilitates the identification of patients who may be safely managed without chemotherapy [30].
- A further prognostic gene expression microarray-based assay, ColDx, marketed as GeneFx Colon, was developed by Almac (Almac Group Ltd., Craigavon, UK) [31][27]. It is based on a 634-gene signature panel [32][28] and is performed on FFPE tumor samples [33][29]. This assay separates stage II tumors into low- and high-risk groups for disease recurrence [14][10].
Despite the availability of these platforms, their use in the clinic is not currently recommended by international guidelines due to unclear clinical utility for risk stratification and a lack of strong validation in predicting treatment benefits. In addition, there are contradictions among gene expression-based CRC classifications that need to be resolved to correlate cancer cell phenotypic features with clinical behavior and guide targeted treatments.
Table 1.
Available gene expression assays predictive of prognosis in early-stage CRC.
4. The Consensus Molecular Subtyping as a Transcriptome-Based Staging
In recent years, several unsupervised gene expression-based classifications have been developed thanks to advances in sequencing methods. Despite the availability of such classifications, their clinical utility has not yet been recognized due to technical problems (e.g., the lack of standard methodology, different data processing, and diverse gene expression values) and practical reasons. In 2015, Guinney et al. integrated previously available classifications to obtain a unified classification of CRC and thus tried to eliminate the discrepancies between previous subtyping systems [41][37]. The CRC subtyping consortium (CRCSC) normalized and integrated data from 6 previously available CRC subtyping classifications, covering approximately 4000 primary tumors from 18 CRC datasets, and obtained 4 distinct subtypes of CRC (CMS1–4) characterized by different prognoses, i.e., the “consensus molecular subtypes (CMS)” [41][37] (Table 2).
Table 2.
The Consensus molecular subtype characteristics.
For further information on Table 2 content refer to 10.3390/cancers15102746
The CRCSC was established to identify the core subtypes and their fundamental molecular features among the previously available gene expression-based classifications of CRC and to merge all accessible data sources, such as mutations, methylation status, copy-number variations, microRNAs, and proteomics, to examine whether CMS can be widely used in clinical practice [41][37]. The consortium then determined the biological and molecular features of each subtype and ultimately assessed the prognostic and clinical associations of the CMS subtypes. The CMS stratification, regarded as a “gold standard”, consists of CMS1 (immune subtype, 14%), CMS2 (canonical subtype, 37%), CMS3 (metabolic subtype, 13%), and CMS4 (mesenchymal subtype, 23%). Heterogeneous samples with mixed characteristics are classified as mixed or indeterminate samples (14%)
[41][37].
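As a rough illustration, the subtype frequencies quoted above can be kept in a small lookup table and applied to a cohort of a given size. This is a minimal sketch: the names `CMS_SUBTYPES` and `expected_counts` are hypothetical, and the percentages are copied as reported in the text, where they are rounded and therefore sum to 101 rather than 100.

```python
# Subtype label -> (description, reported frequency in percent).
# Values are taken from the text; being rounded, they sum to 101.
CMS_SUBTYPES = {
    "CMS1": ("immune", 14),
    "CMS2": ("canonical", 37),
    "CMS3": ("metabolic", 13),
    "CMS4": ("mesenchymal", 23),
    "mixed": ("mixed or indeterminate", 14),
}


def expected_counts(cohort_size: int) -> dict:
    """Approximate number of tumors per subtype in a cohort.

    Counts are rounded independently, so they need not add up
    exactly to `cohort_size`.
    """
    if cohort_size < 0:
        raise ValueError("cohort_size must be non-negative")
    return {
        label: round(cohort_size * percent / 100)
        for label, (_description, percent) in CMS_SUBTYPES.items()
    }


if __name__ == "__main__":
    for label, count in expected_counts(1000).items():
        print(f"{label:6s} {CMS_SUBTYPES[label][0]:24s} {count}")
```

Such a table only summarizes reported prevalence; assigning an individual tumor to a subtype requires a transcriptome-based classifier, which is outside the scope of this sketch.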
In terms of biological features and genomic aberrations, CMS1 comprised the largest number of MSI tumors, which showed a hypermethylated status. These tumors were also hyper-mutated, with a low prevalence of somatic copy-number alterations, and over-expressed proteins involved in DNA damage repair. CMS1 is the immune subtype, with high expression levels of genes involved in the immune response. In CMS1, enhanced expression of genes associated with diffuse immune infiltration, comprising TH1 and cytotoxic T cells, was accompanied by activation of immune evasion pathways. This subtype is widely hypermethylated and has a low prevalence of SCNAs
[42][38].
CMS2, or the “canonical subtype”, was characterized by epithelial differentiation and had the highest proportion of tumors with copy-number gains in oncogenes and copy-number losses in tumor suppressor genes. Marked activation of WNT and MYC signaling, typical of CRC carcinogenesis, was also present
[41][37].
CMS3 tumors were defined as CIN tumors with fewer SCNAs and higher CIMP status. In addition, almost 30% of CMS3 tumors were hyper-mutated, overlapping with the MSI phenotype
[41][37]. The CMS3 or “metabolic subtype” indicates epithelial features and metabolic deregulation.
Ultimately, CMS4, or the “mesenchymal subtype”, comprises mesenchymal-like tumors with high stromal infiltration in combination with normal cells
[43][39]. A significant up-regulation of genes involved in epithelial–mesenchymal transition (EMT) and of signatures related to TGF-β signaling activation, angiogenesis, matrix remodeling pathways, and the complement inflammatory system has been observed in CMS4
[41][37].
CMS4 relates to poorer patient prognosis and poor response to anti-EGFR drugs and routine chemotherapy regimens. Additionally, CMS4 is correlated with worse disease-free and overall survival, with the highest tendency to develop distant metastasis among other subtypes
[44][40].
Overall, CMS2–4 have revealed an increased level of CIN
[41][37].
The integrative analysis of mutations and copy-number variations based on The Cancer Genome Atlas (TCGA) data showed that BRAF mutations are frequently found in CMS1 and related to the MSI phenotype. KRAS mutations are frequently seen in CMS3. Additionally, the receptor tyrosine kinase (RTK) and mitogen-activated protein kinase (MAPK) pathways are generally activated in CMS1 and CMS3. However, none of these genetic aberrations exclusively belong to specific CMS subtypes. The observed heterogeneity in CMS status among CRCs with commonly accepted driver events underscores the significant variability in the biological behavior of such tumors. Additionally, this highlights the poor correlation between genotype and phenotype in CRC
[41][37].
In a molecular analysis of the AGITG MAX clinical trial, there was no significant difference in the proportion of RAS mutations across CMS groups, but the BRAF V600E mutation was most frequent in CMS1 (34%) compared to CMS2 and CMS4 (each <2%). MSI was not common in this CMS sub-study and was found in 7% of both CMS1 and CMS3 tumors. Finally, CMS1 included a high percentage (39%) of CIMP-high tumors in comparison with CMS2 (3%) and CMS4 (6%)
[45][41].
Several associations with clinical variables have been reported. CMS1 tumors are generally right-sided, present mainly in women, show a higher histopathological grade, and are associated with poor survival after recurrence. In contrast, CMS2 tumors are mostly left-sided, with a higher survival rate after relapse. CMS4 tumors are mainly diagnosed at AJCC stages III and IV, and patients with CMS4 CRC display poorer overall and relapse-free survival. Consequently, the poor prognosis after relapse is relevant for patients with MSI and BRAF mutations [41][37]. It has also been suggested that, during the transition from the adenoma stage to the advanced carcinoma phase, cells change from CMS1–3 to CMS4
[46][42]. Most peritoneal metastases have been observed in CMS4 due to colorectal peritoneal carcinomatosis and adverse histopathological features. These observations comprised a high stroma level in both primary and metastatic tumors, inadequate differentiation grade, and increased tumor budding in primary tumors
[47][43].
Individuals with inflammatory bowel diseases (IBD) are at an increased risk for CRC development. The signaling pathways involved in colitis-associated cancer are believed to be similar to those observed in sporadic CRCs but with a distinct sequence of events. Recently, the molecular signature of these cancers has been identified in a study that shed light on the lack of CMS2 tumors among IBD-CRCs, which were, instead, skewed toward CMS4
[48][44].
Generally, it is believed that the CMSs of CRC may better inform clinicians about prognosis, treatment response, and potential novel therapeutic strategies
[41][37].