2. Obstacles and Limitations to the Use of Biomarkers as a Screening Tool
Many serum biomarkers that are currently used for screening, diagnosis, or monitoring the progression of various cancers suffer from the same issues as those described for CEA: they have limited specificity, and some can be raised by various inflammatory processes within the body. For example, CEA can be raised in diverticulitis and inflammatory bowel disease [
39]. CA19-9 (UniProtKB ID Q969X2), also known as carbohydrate antigen 19-9, is a tetra-saccharide that attaches to the O-glycans on the cell surface; it is commonly used as a tumor marker for pancreatic cancer [
71]. CA19-9, however, can also be raised in other malignancies such as liver, gallbladder, and CRC [
72]. CA19-9 may also be raised in benign inflammatory conditions of the biliary system such as hepatitis, cholecystitis, and obstructive jaundice [
73]. CA125 (UniProtKB ID Q8WX17), also known as mucin 16, is a glycoprotein within the mucin family, and when levels are raised above 35 units/mL, there is an 80% chance of the presence of ovarian cancer in a patient [
74]. However, it can also be raised in various benign inflammatory conditions such as endometriosis and pelvic inflammatory disease and so has no role in screening for ovarian cancer [
75]. Thus, finding a biomarker that is specific to only one type of cancer has proven difficult over time. CA19-9 can be raised in multiple malignancies, and both CA19-9 and CA125 can be raised in benign inflammatory conditions. This reduces their usefulness as screening tools for cancer. This issue is shared by other tumor markers that are commonly associated with various cancers and remains the main obstacle yet to be overcome in the discovery of a serum biomarker that can be used as an accurate screening tool for cancer. Another representative example of a less-than-perfect marker is prostate-specific antigen (PSA) (UniProtKB ID P07288), a glycoprotein enzyme used as a biomarker in screening for prostate cancer. Levels above 3 ng/mL indicate up to a 60% likelihood of having prostate cancer, whereas a normal result of less than 3 ng/mL confers an 85% probability of not having prostate cancer [76]. PSA tests return many false-positive results, so PSA is not a perfect screening tool for prostate cancer. Trauma to the prostate during digital rectal examination or urinary catheterization can also lead to a transient rise in PSA, producing further false-positive results [
77].
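The effect of limited specificity on screening can be illustrated with a short calculation. The sketch below applies Bayes' rule to derive positive and negative predictive values from sensitivity, specificity, and disease prevalence. The function name and the input values (80% sensitivity, 90% specificity, and the three prevalence levels) are illustrative assumptions and are not taken from the studies cited above.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test using Bayes' rule.

    All three arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical marker: 80% sensitivity, 90% specificity (assumed values).
# The prevalence levels are also assumptions: a screening-like population,
# a higher-risk group, and a symptomatic referral population.
for prevalence in (0.005, 0.05, 0.30):
    ppv, npv = predictive_values(0.80, 0.90, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumed inputs, the positive predictive value falls sharply as prevalence decreases, which is one way to see why a marker that performs acceptably in symptomatic patients may still generate mostly false-positive results in population screening.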
In the USA, the American Cancer Society (ACS) and the Centers for Disease Control and Prevention guidelines recommend screening for CRC via either a FIT stool test, a multi-targeted DNA stool test, or colonoscopy [
36,
78]. In Germany, France, and Denmark, screening is also carried out using the FIT stool test and colonoscopy [
79,
80,
81]. In the United Kingdom, no biomarkers have been approved for use as a screening tool to detect early CRC, which in many cases will be asymptomatic. The National Institute for Health and Care Excellence (NICE) and the Association of Coloproctology of Great Britain and Ireland (ACPGBI) advocate the use of the FIT test to identify patients who need urgent endoscopic or radiological intervention [
82,
83]. Yet no guidelines recommend, and no approval has been given for, the use of biomarkers in screening and early detection of CRC; the use of CEA is limited to monitoring treatment and surveillance for recurrent disease.
3. Current Advances in CRC Biomarker Detection
3.1. DNA-Based Molecular Markers and Tests
Several recent studies have tested DNA in feces, looking specifically for biomarkers in cells originating from colonic neoplasms. Some of these studies have concentrated specifically on methylated DNA in the stool. One such study used a combined stool FIT and multi-targeted stool DNA test (mt-DNA). The mt-DNA test relied on quantitative real-time PCR of bisulfite-converted DNA to detect hypermethylated NDRG4 and BMP3 gene promoters and KRAS gene mutations, with β-actin used as an internal DNA reference. Regression analysis was then used to combine these data with the results of the stool hemoglobin component to yield a composite score, which was compared to a traditional stool FIT test that has been established for use in CRC screening [
84]. The study involved nearly 10,000 participants at average risk of CRC. The stool DNA test was significantly better at detecting CRC than FIT (92.3% vs. 73.8%,
p = 0.002) and advanced precancerous lesions including advanced polyps (42.4% vs. 23.8%,
p < 0.001). However, there were more false-positive results than with FIT [
84] (
Figure below summarizes the study findings). The mt-DNA test was approved for clinical use by the US FDA in 2014. A more recent retrospective cohort study [85] confirmed the ability of the mt-DNA test to detect early-stage cancers (18% tested positive, with fewer than 1% having colorectal cancer and 60% having adenomas), though the false-positive rate was high (39% deemed false-positive) [
85]. Other studies have also looked at testing stool for DNA, showing good potential for use in screening [
86,
87]. The multi-targeted stool DNA test has been adopted for use and forms part of the clinical guidelines for screening for CRC in the USA [
36,
78,
88].
Figure. Summary of the main results of a study comparing FIT stool test to methylated DNA stool test for the detection of colorectal cancer [
84].
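The way a regression model can merge several stool markers into one composite score, as described for the mt-DNA test, can be sketched as follows. This is a minimal illustration using a logistic function; the coefficients, intercept, cutoff, and marker names are hypothetical placeholders and do not reproduce the algorithm of the approved test or of the study in [84]. Marker inputs are assumed to be already normalized to the β-actin reference.

```python
import math

# Hypothetical coefficients for illustration only; these are NOT the
# values used by any approved test.
WEIGHTS = {
    "ndrg4_meth": 1.2,   # methylated NDRG4, normalized to beta-actin
    "bmp3_meth": 0.9,    # methylated BMP3, normalized to beta-actin
    "kras_mut": 0.7,     # mutant KRAS, normalized to beta-actin
    "hemoglobin": 1.5,   # stool hemoglobin component (scaled)
}
INTERCEPT = -4.0         # hypothetical
CUTOFF = 0.5             # hypothetical decision threshold


def composite_score(markers):
    """Combine marker values into a single score between 0 and 1."""
    linear = INTERCEPT + sum(WEIGHTS[name] * markers[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-linear))


def is_positive(markers, cutoff=CUTOFF):
    """Call the sample positive when the composite score reaches the cutoff."""
    return composite_score(markers) >= cutoff


# Hypothetical sample with elevated methylation and hemoglobin signals.
sample = {"ndrg4_meth": 2.0, "bmp3_meth": 1.5, "kras_mut": 0.0, "hemoglobin": 1.0}
print(round(composite_score(sample), 3), is_positive(sample))
```

In such a design, a strong signal in one component can be enough to push the score over the cutoff even when the others are low, which is consistent with a composite test gaining sensitivity at the cost of more false-positive results.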
Another test, Epi proColon®, was recently developed in the USA. This is a blood test that, with the aid of real-time PCR assays, detects the presence of methylated SEPT9 (mSEPT9) in serum, which is a known biomarker for CRC [
89]. The overall sensitivity of the test is relatively low at detecting CRC, at 68.2% across all stages, with a specificity of only 79.1% [
90]. That makes Epi proColon® inferior to both colonoscopy and FIT. Furthermore, the Epi proColon® test has a relatively high false-positive rate of around 12% and an overall poor sensitivity for precancerous adenoma lesions [
91]. The test has been approved by the FDA for use only in patients who refuse to partake in traditional screening and is not part of any clinical guidelines for the screening of CRC in the USA [
92]. mSEPT9 DNA has also been implicated in other malignancies, including those of the urinary tract [
93], brain [
94], ovaries [
95], breasts [
96], and in leukemia [
97]. Being minimally invasive, generally acceptable, and easy for patients, the mSEPT9-based serum test (Epi proColon®) has some advantages as a screening tool for CRC. From the patients’ perspective, Epi proColon® provides a more appealing option and seems no different from blood tests taken for any other reason, meaning some patients prefer this alternative to handling their stool samples for a FIT test; such hesitance about stool sampling invariably leads to a lower engagement rate. The use of Epi proColon® as an alternative testing procedure is better than not using any test, and therefore, the use of this test increases CRC screening rates and population coverage. However, given its relatively low specificity and its lower sensitivity for early CRC stages, the mSEPT9 test cannot be used as a sole tool for CRC screening, but would need to be used in conjunction with a detailed patient history and examination. Furthermore, patients with a negative test who still manifest symptoms suggestive of CRC, as well as patients with a positive mSEPT9 test, will still require endoscopic examination of the colon. Therefore, the existing versions of the Epi proColon® mSEPT9 test cannot replace other existing tools as a sole screening tool for CRC detection. However, combining mSEPT9 with FIT or fecal occult blood (FOB) testing does improve the diagnostic sensitivity, and in combination with colonoscopy, reduces CRC mortality.
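Why combining two tests raises sensitivity but lowers specificity can be shown with a simple calculation. The sketch below assumes an "either test positive" rule and, importantly, that the two tests err independently of each other given disease status, which is a simplifying assumption that may not hold in practice. The mSEPT9 figures (68.2% sensitivity, 79.1% specificity) and the FIT sensitivity (73.8%) are the values quoted earlier in this section from different studies; the FIT specificity of 95% is a hypothetical value chosen for illustration.

```python
def combine_parallel(sens_a, spec_a, sens_b, spec_b):
    """Sensitivity and specificity when a patient is called positive
    if EITHER test is positive.

    Assumes the two tests are conditionally independent given disease
    status (a simplifying assumption).
    """
    # A case is missed only if both tests miss it.
    sensitivity = 1.0 - (1.0 - sens_a) * (1.0 - sens_b)
    # A healthy person is cleared only if both tests are negative.
    specificity = spec_a * spec_b
    return sensitivity, specificity


# mSEPT9 values as quoted in the text; FIT specificity (0.95) is hypothetical.
sens, spec = combine_parallel(0.682, 0.791, 0.738, 0.95)
print(f"combined sensitivity={sens:.3f}, combined specificity={spec:.3f}")
```

With these inputs the combined sensitivity is higher than that of either test alone, while the combined specificity is lower than that of either test, so more patients would be referred for colonoscopy.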
Syndecan-2 (SDC2, UniProtKB ID P34741) is a transmembrane protein that is known to be involved in many cellular processes associated with carcinogenesis including cell proliferation, angiogenesis, and cell migration [
98]. Aberrant methylation of the SDC2 gene has also been shown to be involved in the pathogenesis of CRC. Its detection is possible from a tissue, blood, or stool sample, and the marker has been shown largely in late-stage III/IV disease [
99]. A serum test that assesses the methylation of a combination of SEPT9 and SDC2 has been developed for the early detection of CRC and is still awaiting approval. Studies have shown promising results, with an overall sensitivity of up to 80% and specificity of 92% [
100]. Further improvements to the sensitivity of CRC detection with SDC2 methylation assays could be achieved by combined detection of hypermethylated TFPI2 and hypomethylated SDC2 [
99,
101]. Other DNA methylation markers linked to CRC include SFRP2 (obtainable from stool samples, with a sensitivity and specificity of 77%, [
102]), VIM (obtainable from the serum, with a sensitivity of detection of 36.1%, 45.2%, 55.4%, and 85.7% for CRC stages 1 to 4, respectively, when used in combination with traditional CEA analysis, [
103,
104]), FBN2, and TCERG1 (sensitivities of 86% and 99%, respectively, if detected from tumor tissue) [
105]. Over the last decade, many research publications have reported other promising methylated DNAs detectable in the blood as diagnostic, prognostic, and predictive markers of CRC (reviewed in [
106]). Whilst DNA methylation is a phenomenon common to many cancers, is detectable using a modified PCR-based approach, and provides a more stable type of molecular marker than, e.g., circulating RNA, all such normally intracellular molecules are released into the circulation only when malignant cells rupture following necrosis or apoptosis. Therefore, the mSEPT9 test and any other similar future tests aiming to detect methylated ctDNA will inevitably have limited sensitivity to early-stage CRC and pre-cancerous states such as advanced adenomas.
3.2. Circulating Tumor Cells
Circulating tumor cells (CTCs) provide another promising avenue explored with a view to early detection of CRC. However, one limitation of using CTCs is their low abundance (of the order of 10^9-fold fewer than red blood cells) and the consequent need for their enrichment and capture, as well as their physical and biochemical heterogeneity [
107]. Whilst a wide range of methods relying on the physical properties of CTCs have been reported (density, size, deformability, electrophoretic properties), affinity-based capture, such as positive selection for EpCAM or negative selection for CD45, remains the preferred method of CTC capture and enrichment [
108]. Among the advantages of relying on CTC analysis is the ability to generate insights into the complete transcriptome of individual CTCs, to better understand their unique molecular phenotypes and accurately identify their molecular pathological subtype [
109], chemoresistance, or metastatic progression [
110]. One avenue explored thus far concerns CTCs carrying KRAS gene mutations that arise within CRC cells. These mutations occur in around 45% of all cases of CRC, and their detection can be achieved using, for example, droplet digital PCR (ddPCR) techniques on serum samples [111]. KRAS mutations are important in CRC because KRAS is one of the downstream effectors of the EGFR pathway, which is known to be involved in the pathogenesis of CRC [
112]. CTC detection provides a sensitivity of around 83% for the mutations found in the serum compared to those found in the actual tumor, meaning there is potential for future opportunities for early detection and monitoring of patients with CRC [
111].
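The quantification step behind ddPCR can be sketched briefly. Because template molecules distribute randomly among droplets, the mean number of copies per droplet is usually estimated from the fraction of positive droplets using Poisson statistics. The function names and droplet counts below are hypothetical and are not taken from [111]; the sketch also treats the mutant and wild-type channels as independent counts.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson estimate of mean template copies per droplet.

    lambda = -ln(1 - p), where p is the fraction of positive droplets.
    """
    if total <= 0 or positive < 0 or positive >= total:
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


def mutant_allele_fraction(mutant_positive, wildtype_positive, total):
    """Fraction of mutant KRAS copies among all KRAS copies detected."""
    lam_mut = copies_per_droplet(mutant_positive, total)
    lam_wt = copies_per_droplet(wildtype_positive, total)
    if lam_mut + lam_wt == 0:
        raise ValueError("no template detected in either channel")
    return lam_mut / (lam_mut + lam_wt)


# Hypothetical run: 15,000 accepted droplets, 10 positive for mutant KRAS
# and 5,000 positive for wild-type KRAS.
fraction = mutant_allele_fraction(10, 5000, 15000)
print(f"estimated mutant allele fraction: {fraction:.4%}")
```

The Poisson correction matters most when many droplets are positive, since a positive droplet may then contain more than one template copy; for rare mutant targets the correction is small, and the limiting factor becomes the number of droplets analyzed.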
3.3. microRNAs and Other Non-Coding RNAs
Along with DNA methylation, microRNAs (miRNA) represent one of the key existing epigenetic mechanisms responsible for the regulation of gene expression, and therefore, gene function. There has been vast interest in the potential use of miRNA markers in the early screening and diagnosis of CRC, although these remain in the early stages of trials. miRNAs are typically detected and quantified via reverse transcription-quantitative PCR (RT-qPCR), RNA sequencing (RNA-Seq), or using microarrays. Recent studies have shown that the overall sensitivities can be around 76%, with a similar specificity level, which means there is potential for future use in screening and early detection. There have been numerous candidate miRNAs including miR-21 [
113] and miR-23a [
114]. Another miRNA candidate, miR-378, has been found to affect signaling pathways that control processes such as cell proliferation and apoptosis, specifically in stage II CRC [115]. Achieving detection of stage II cancer is an impressive accomplishment, but the test is likely to miss patients with stage I CRC at the time of testing, which would undermine the advantage of early detection of CRC via screening. Other miRNAs of interest are miR-135a and miR-135b, which affect APC gene expression and Wnt pathway activity, both of which play a role in the pathogenesis of CRC [
116]. In another study, a panel of six miRNAs was developed for studying CRC recurrence. Three miRNAs were significantly decreased (miR-93, miR-195, and let-7b) and three were significantly increased (miR-7, miR-141, and miR-494) in patients with early relapse and were also associated with decreased survival rates [
117]. Another recent study looked at serum miR-92a-1, which showed a sensitivity of 81.8% and specificity of 95.6% [
118]. This represents great potential; however, this was a relatively small study of 148 patients, and thus, there is not yet enough evidence for it to be used clinically on a wide scale. Another miRNA, miR-30a-5p, has also been described; it was shown to be downregulated in patients with CRC, with a relatively high area under the receiver operating characteristic (ROC) curve, giving it high potential to be used as a screening tool in the future. However, sensitivity and specificity tests have yet to be carried out [
119].
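For readers less familiar with ROC analysis, the area under the ROC curve (AUC) can be computed directly as the probability that a randomly chosen patient scores higher than a randomly chosen control, with ties counted as one half. The sketch below uses this rank-based definition. The expression values are invented for illustration and do not come from any cited study; because the marker in this example is assumed to be downregulated in CRC, the expression values are negated so that a higher score indicates disease.

```python
def roc_auc(case_scores, control_scores):
    """AUC as P(case score > control score) + 0.5 * P(tie)."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical relative expression levels of a downregulated miRNA.
crc_expression = [0.21, 0.35, 0.40, 0.55, 0.62]
control_expression = [0.50, 0.70, 0.85, 0.90, 1.10]

# Negate so that lower expression maps to a higher "disease" score.
auc = roc_auc([-x for x in crc_expression], [-x for x in control_expression])
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to a marker with no discriminating ability and 1.0 to perfect separation. A high AUC alone does not fix a clinical cutoff, which is why sensitivity and specificity at a chosen threshold still need to be reported.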
Another group of circulating markers includes other non-coding RNAs, such as long non-coding RNAs (lncRNAs). As an example, lncRNA differentiation antagonizing non-protein coding RNA (DANCR) was upregulated in CRC serum samples, and its level correlated with the clinicopathological features of the CRC patients [
120]. Another example of circulating lncRNA is serum NEAT1, which was identified as an independent prognostic factor for CRC and also as a marker to help differentiate metastatic CRC from non-metastatic CRC [
121]. Other known examples of lncRNAs in CRC were characterized from cancer cells and tissues, e.g., CCAT1, CCAT2, CASC11, CRNDE, GAS5, H19, HOTAIR, PCAT1, RAMS11, and UCA1. Circular RNAs (circRNAs) represent another relatively stable molecular species detectable in serum. Due to their covalently closed loop structure, these single-stranded non-coding RNAs are particularly stable and provide potentially useful markers, such as circ-PNN (hsa_circ_0101802) [
122].
3.4. Differential Gene and Protein Expression in CRC
Another area of marker research that has attracted much attention concerns the analysis of transcriptome alterations in CRC to identify differentially expressed genes and proteins. Identification of differentially expressed genes has the potential to reveal molecular markers, both at the mRNA and protein levels, involved in tumor development and progression, as well as markers suitable for cancer detection. Increasing numbers of promising CRC molecular markers and targets are being discovered and reported in the literature, and these can be generally divided into four major categories: (1) markers associated with a poor or favorable prognosis; (2) markers associated with a high relapse rate in CRC; (3) markers for CRC resistance to treatment modalities; and (4) potential targets for treatment. A systematic search of the recent literature yielded over 100 differentially expressed CRC molecular markers and targets, the vast majority of which are overexpressed in CRC, though a smaller number of markers are downregulated. In terms of function, these ~100 genes are involved in over 1000 different biological pathways, but some pathways are strongly overrepresented in this selection. These include the cell division pathway, pathways representing regulation of gene expression, regulation of cell proliferation, positive regulation of transcription, G-protein coupled receptor signaling, the inflammatory response, signal transduction, and chemokine-mediated signaling, as well as negative regulation of apoptosis. All of the ~80 upregulated gene markers listed in
Supplementary Tables S1 and S2 are potentially suitable for molecular detection of CRC, and in the majority of these genes, association with a poor prognosis has been reported. In addition, many of the overexpressed proteins in CRC have been suggested as potential treatment targets. For the 14 genes and their product proteins reported to be downregulated in CRC, their lower expression levels correlate with a poor prognosis, while a lesser degree of downregulation is linked to a better prognosis (
Tables S2 and S3).
Metastasis and the relapse rate are vital in cancer diagnosis, and any biomarker that could provide an index for these factors would prove highly beneficial. Of particular interest might be stromal cell-derived factor 1 (CXCL12), cyclin-dependent kinases regulatory subunit 2 (CKS2), metalloproteinase inhibitor 1 (TIMP1), centrosomal protein of 55 kDa (CEP55), and guanine nucleotide-binding protein G(I)/G(S)/G(O) subunit gamma-2 (GNG2), for which overexpression has been linked to higher relapse rates [
123,
124,
125,
126,
127,
128,
129,
130]. There are several differentially expressed genes in CRC that are believed to be associated with the epigenetics of DNA and miRNA. For example, the upregulation of SLC10A1, MAPT, SHANK2, PTH1R, and C2, and the downregulation of CAB39, CFLAR, CTSC, THBS1, and TRAPPC3 have been proposed as markers of CRC metastasis in the liver [
131].
Molecular biomarkers may also support quantitative analysis of CRC resistance to treatment modalities. Chemoresistance and radio-resistance reduce the effectiveness of treatment regimens and are difficult to anticipate in patients. Therefore, introducing a biomarker that indicates certain cancer sensitivities and guides the response to treatment is imperative. As an example, protein tyrosine kinase 6 (PTK6) is a protein kinase that in normal cells, functions as a cytoplasmic signal transducer. However, in CRC, the interaction between PTK6 and Janus kinase 2 (Jak2) promotes chemoresistance, and it has been proposed that adding a PTK6 inhibitor to the chemotherapy regimen may improve the chemosensitivity of CRC [
132]. Enoyl-CoA hydratase 1 (ECHS1) is an enzyme that promotes the glycosylation of ceramide, which is believed to be a key step in chemotherapy resistance. Monitoring this marker may assist in the selection of appropriate patients for chemotherapy [
133]. Another marker, N-MYC downstream-regulated gene 1 (NDRG1), a key regulator of a variety of cell growth regulatory processes and signaling pathways, was shown to enhance chemosensitivity by modulating EGFR trafficking in metastatic CRC [
134]. DNA topoisomerase 2-alpha (TOP2A) is a nuclear decatenating enzyme that alters the DNA topology. Altered TOP2A expression and TOP2A mutations are associated with more advanced CRC and with an altered cancer response to chemotherapy [
135].
Overexpressed CRC proteins may provide convenient targets for CRC treatment. As an example, epiregulin (EREG) is a peptide hormone, a member of the epidermal growth factor (EGF) family. EREG is associated with the demethylation of two promoter locations, which, in turn, leads to upregulation of the EGF receptor’s phosphorylation, resulting in the development of adenocarcinoma. Upregulation of EGF crypt-cell-to-CRC-transformation is one of the steps that occur during the adenoma-carcinoma transition stage [
136]. Yes-associated protein 1 (YAP1) has been found to have the same effect and is one of the main effectors of the Hippo pathway. It is known to have an association with several cancers, including CRCs, and the levels of YAP1 in the cytoplasm of CRC cells are believed to be linked to patient survival. The higher the levels, the poorer the prognosis [
137]. Another recently reported serum marker is angiogenin. It has a sensitivity of 66.2% and specificity of 64.9% at ruling out CRC [
138]. However, the specificity rate remains too low for it to be used as a screening tool for the early detection of CRC.