3.2. Looking towards New EMT Biomarkers in Primary Tumors by Using Proteomics
Although in vitro models are useful for exploring specific mechanisms that drive or block the EMT process, they usually lack the intratumor and intertumor heterogeneity observed in real cancers. In addition, representing the interaction between cancer cells and non-cancer cells within the tumor microenvironment is a complex task. Characterizing the proteome of a single cell type may therefore give a misleading picture of its real significance. In this context, understanding the contribution of the EMT to the progression of real tumors may demand a careful analysis of the proteome of tumor samples. Still, because confounding factors are more difficult to isolate in tumor samples than in vitro, additional considerations should be kept in mind, such as the distinctive molecular subtypes and clinicopathological characteristics of the patients included in the study.
Considering that most proteomic techniques are expensive and time-consuming, studies evaluating the proteome of human cancers are rarely designed exclusively to evaluate EMT-related effectors or biomarkers. Instead, such proteins may emerge among the set of differentially expressed molecules, particularly considering their well-established relevance in cancer invasion and metastasis, as discussed before. For instance, Moreira et al. (2004) observed this pattern of differentially expressed proteins when comparing the proteome of bladder specimens derived from normal tissues and transitional cell carcinomas (TCCs)
[97][34]. Sixty percent of the tumors analyzed expressed high levels of Vimentin and PGP9.5 [also known as Ubiquitin C-terminal hydrolase L1 (UCHL1)], suggesting the presence of cancer cells undergoing EMT
[97][34]. Additionally, invasive tumors showed lower levels of the epithelial protein 14-3-3σ (also known as Stratifin) than normal tissues and non-invasive tumors
[97][34]. Interestingly, the evaluation of tumors exhibiting heterogeneous staining for 14-3-3σ demonstrated the progressive loss of its expression, with noteworthy negative staining in invasive areas
[97][34]. Similarly, Sun and colleagues reported that the mesenchymal marker Vimentin was consistently overexpressed in hepatocellular carcinomas (HCCs) compared with cirrhotic and normal liver tissues, reinforcing the association between EMT and cancer progression
[98][35].
As discussed before for in vitro studies, the presence of highly expressed proteins can mask the presence of less abundant molecules able to play critical roles in the biological process evaluated. Although sub-proteomes of human tumors have not been frequently evaluated, some examples highlight their relevance. The analysis of the cytosolic fraction of BCs, for example, associated higher levels of Ferritin light chain (FTL) with reduced metastasis-free survival
[104,105][36][37]. Interestingly, histological analyses revealed that FTL was mostly expressed by stromal cells and that its levels correlated with the expression of CD138 (also known as Syndecan-1)
[104][36]. Because CD138 is a typical mesenchymal marker, its expression in both stromal cells and cancer cells suggested the occurrence of EMT in at least part of these breast cancers
[104][36].
Altogether, these studies confirm that many EMT-related proteins identified in vitro show potential use as biomarkers for different types of cancer, being associated with cancer recurrence, lymph node metastasis, and distant metastasis. Moreover, while important EMT-related alterations have been characterized in vitro, many proteins are regulated by the interactions between cancer cells and stromal cells. This observation reinforces the importance of integrative studies analyzing the proteome of immortalized cancer cells and human cancer samples. Still, because analyzing proteins from tumor samples demands highly invasive approaches to obtain biopsies and resected samples, this may be a problem when monitoring cancer patients before/after treatment. In the next section, studies focused on overcoming this limitation by investigating and establishing biomarkers in biological fluids are discussed.
3.3. Biological Fluids: An Easier Access to EMT-Related Biomarkers
The analysis of tumor samples, particularly their proteomes, requires invasive procedures to obtain enough tissue to detect low-abundance proteins. In contrast, the evaluation of biomarkers in biological fluids such as blood, saliva, and urine relies on less invasive methodologies that enable a closer and easier follow-up of patients. Moreover, analysis of patient-derived data obtained over time (e.g., before, during, and after therapeutic intervention) may help to characterize more accurately the dynamic networks that regulate a process as fluid as the EMT.
Aiming to establish biomarkers for bladder cancers, research led by Celis
[106][38] and Ostergaard
[107][39] reported increased levels of Psoriasin (also known as S100A7) in squamous cell carcinomas (SCCs) and urine samples from cancer patients. Interestingly, Psoriasin was not observed among the serum proteins of SCC patients, indicating its specific use as a biomarker to be screened in the urine of these patients
[106,107][38][39]. In another example, Sun and colleagues reported Vimentin as a sensitive and specific biomarker when analyzing the proteome of serum samples from HCC patients and non-neoplastic controls
[98][35]. In this study, Vimentin levels were used to distinguish even patients with small HCCs (<2 cm) from non-neoplastic controls
[98][35].
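The terms "sensitive" and "specific" used above have precise definitions that are easy to make concrete. The following is a minimal sketch, assuming a binary marker call per subject; the function name and the example calls are hypothetical and are not data from the cited studies.

```python
def sensitivity_specificity(marker_positive, has_disease):
    """Return (sensitivity, specificity) for a binary biomarker call.

    sensitivity = TP / (TP + FN): fraction of patients the marker detects.
    specificity = TN / (TN + FP): fraction of controls the marker clears.
    """
    pairs = list(zip(marker_positive, has_disease))
    tp = sum(1 for m, d in pairs if m and d)
    fn = sum(1 for m, d in pairs if not m and d)
    tn = sum(1 for m, d in pairs if not m and not d)
    fp = sum(1 for m, d in pairs if m and not d)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient and one control")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical calls for 4 patients followed by 4 controls (illustration only).
marker_positive = [True, True, True, False, False, False, True, False]
has_disease = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(marker_positive, has_disease)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In practice a continuous marker level is first turned into a binary call by choosing a threshold, and both values change as that threshold moves.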
Whereas the detection of cancer proteins in biological fluids shows clear benefits, it also incurs a methodological issue associated with the presence of highly expressed proteins that may mask less abundant molecules. To improve the detection of relevant molecules, particularly from serum and plasma samples, the depletion of excessively abundant proteins is recommended. In BC patients, proteomic analysis of albumin-depleted serum samples demonstrated the association between LN metastasis and several proteins related to the cytoskeleton and ECM structure or remodeling, including collagen α4I, Serpin C1, Fibrinogen gamma chain (FGG), and Tenascin XB (TNXB)
[108][40]. Moreover, in this study, TNXB was detected in the serum samples of all patients diagnosed with benign breast diseases and LN-negative cancer patients, but not in LN-positive patients, indicating that loss of circulating TNXB could be used as a biomarker of LN metastasis
[108][40].
Several cancer types lack specific diagnostic or prognostic biomarkers. Even for better-characterized cancers (e.g., breast, colorectal, and lung cancers), only a small number of biomarkers exist, and their use is restricted by limited sensitivity and specificity. Moreover, many clinically relevant conditions cannot be determined by current biomarkers, including the progression toward resistance to anti-cancer therapies and the development of locoregional and distant recurrence. As shown by the studies discussed here, EMT-related proteins could be used as cancer biomarkers, particularly considering their typical association with cancer progression and metastasis. Still, the small number of publications evaluating cohorts specifically grouped according to EMT-associated outcomes limits the generalization of the conclusions obtained. For instance, additional proteome-based studies including patients who progressed to LN metastasis or distant metastasis may confirm the reliability of EMT-related proteins for this purpose. Investigations focused on recurrence and resistance to therapy have also been overlooked, and cancer types with lower incidence have often been ignored in this kind of evaluation. In addition to an experimental design focused on EMT-associated modifications, improved technologies may allow the characterization of cancer protein profiles, as is currently done for cancer transcriptomes.
4. Integration of Multiomics and Spatio-Temporal Analyses for a Comprehensive Understanding of EMT-Driven Cancer Progression
In contrast to methods that study changes at the DNA/RNA level, current proteomic methods cannot satisfactorily cover the entire diversity of proteins within cancer cells or the tumor mass. Additionally, evaluating the protein profile of multiple samples is highly expensive and time-consuming. For these and other reasons, our understanding of how modifications in the epigenome and transcriptome impact biological processes has advanced, while the description of global changes in protein levels has lagged behind. But to what extent should we rely on a single omics layer when translating EMT-related findings to the clinic? Although several studies have individually demonstrated global alterations in the epigenome, the transcriptome, or the proteome of cells undergoing EMT, the direct comparison of these results is still an issue. Besides analyzing different cancer types, in vitro studies frequently induce EMT with only one EMT inducer at a time. In addition to introducing differences across studies, this strategy limits the investigation of possible crosstalk between multiple signaling pathways. Therefore, multiomics may provide a more reliable basis for comparing the many regulatory mechanisms impacting cancer cells at multiple levels during EMT, particularly when analyzing cancer patient samples that are inherently affected by several EMT regulators.
In fact, seminal studies published over the last decade have already begun to adopt multiomics to analyze alterations in cancer samples that simultaneously impact DNA, RNA, and protein levels. Although most of these studies investigated different cancer types, some similarities are worth mentioning. Among them is the common divergence between transcriptome- and proteome-derived data, such as that reported in colorectal
[109][41], breast
[110,111][42][43], ovarian
[112][44], gastric
[113][45], and lung
[114][46] cancers. Besides methodological parameters influencing this correlation, spatio-temporal changes in RNA species largely affect their localization and availability for translation, thus impacting protein levels
[115,116,117,118][47][48][49][50]. Importantly, this effect is reported to change in pathological conditions, being increased in cancers when compared with normal tissues, and particularly enhanced with cancer progression
[115][47]. Moreover, copy number alterations (CNAs) and post-translational modifications are also shown to significantly impact gene expression in a way that is not necessarily translated into protein modifications
[110,114][42][46]. In addition to reinforcing an important difference in the mechanisms that impact cancers at the molecular level, this analytical divergence has a profound impact on the ability to determine patient prognosis. Remarkably, depending on the method of choice, patients showing significantly different probabilities of survival cannot be distinguished by such omic analysis. For example, in CRCs, only a proteome-based clustering—but not other types of analysis—revealed a typical EMT signature correlated with poor prognosis
[109][41]. Furthermore, the identification of a molecular subtype associated with cell invasion and poor survival rate in early-onset gastric cancers (EOGCs) required an integrative clustering using global mRNA, proteome, phosphoproteome, and N-glycoproteome
[113][45]. These observations highlight a critical problem: a single omic method may not suffice to accurately describe the myriad alterations within a tumor, which poses an obvious issue when determining therapeutic approaches.
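The divergence between transcriptome- and proteome-derived data discussed above is commonly summarized as a per-gene rank correlation between mRNA and protein abundance across samples. The following is a minimal sketch of that calculation, assuming matched samples; the gene names and values are hypothetical, and the cited studies may have used different statistics and preprocessing.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical abundances for two genes across five matched samples.
mrna = {"GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0], "GENE_B": [1.0, 2.0, 3.0, 4.0, 5.0]}
protein = {"GENE_A": [10.0, 12.0, 15.0, 30.0, 31.0], "GENE_B": [3.0, 1.0, 5.0, 2.0, 4.0]}

rho = {gene: spearman(mrna[gene], protein[gene]) for gene in mrna}
for gene, value in rho.items():
    print(f"{gene}: rho={value:.2f}")
```

A gene such as the hypothetical GENE_B, whose transcript level ranks samples differently from its protein level, would sort patients into different groups depending on which layer is used for clustering.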
Although the integration of multiple omics helps to overcome limitations imposed by the individual use of each approach, spatial alterations are often masked in bulk analysis, and detecting temporal modifications remains unfeasible, especially in a clinical context. As mentioned before, the increasing combination of deconvolution strategies and transcriptome-focused methods with single-cell resolution has been instrumental in depicting the contribution of different tumor compartments to EMT-related alterations. Tissue microdissection also partially helps to overcome this limitation by allowing molecular signatures associated with either tumor epithelium or stroma to be investigated individually. For instance, in microdissected prostate cancers, a gradual decline in the levels of phosphorylated (p-) extracellular signal-regulated kinase (ERK), a mitogen-activated protein kinase, and a concurrent increase in p-AKT levels have been associated with cancer progression
[119][51]. Similar results were also reported in microdissected CRCs, where decreased p-ERK and p-p38 levels were observed in cancer tissues compared with uninvolved mucosa
[120][52]. Thus, integrating tissue microdissection and tissue microarray may increase the understanding of spatial modifications otherwise overlooked by the analysis of bulk samples.
Further, a combined approach can also help to characterize alterations in rare samples that are not located within the cancer mass but have been shed by the tumor and may be scattered throughout the body, such as CTCs. For instance, in a study simulating CTCs by spiking immortalized cancer cells into blood samples, 4000 proteins were identified by one-dimensional high-resolution porous layer open tubular liquid chromatography (LC)-MS in samples spiked with 100–200 MCF7 breast cancer cells
[121][53]. Impressive results were also described by using nano-LC-MS for the analysis of 1–5 LNCaP prostate cancer cells spiked and recovered from blood samples
[122][54]. In HNSCC patient samples, the use of mass cytometry and unsupervised clustering allowed the identification of epithelial and EMT sub-groups of CTCs, where the latter accounted for more than 80% of all CTCs
[123][55]. Interestingly, the expression of immune checkpoint proteins (e.g., PD-L1 and CTLA4) was lower in CTCs with an EMT phenotype when compared with epithelial counterparts
[123][55]. Further, analysis of molecular pathway activity revealed that CTCs expressing EMT traits were also enriched in p-CREB and p-ERK proteins, but showed reduced levels of other intracellular effectors, such as p-STAT3, p-STAT5, p-PARP, and p-AKT
[123][55]. Establishing the profile and significance of immune checkpoints and intracellular effectors in CTCs may have a profound impact on the development of therapeutic strategies focused on overcoming metastatic progression and resistance to therapy. Notably, while methodological improvement is still required to analyze the proteome of patient CTCs, innovative studies have already begun to characterize the genome, transcriptome, and metabolome of these shed cells
[124,125,126,127,128,129,130,131,132,133,134][56][57][58][59][60][61][62][63][64][65][66]. Such analyses have not only helped to elucidate how mutations and molecular programs impact the dissemination of cancer cells but have also validated the perspective of employing a multiomic strategy to improve our knowledge of cancer progression.
As for CTCs, few studies have comprehensively investigated the protein profile of EVs isolated from cancer patient biofluids. Nevertheless, initial studies in circulating EVs have already demonstrated an association between HCC progression and increased Galectin-3-binding protein (LG3BP) levels
[135][67]. In CRC, Transferrin receptor protein 1 (TFR1) was reported to be enriched in circulating EVs from non-metastatic patients
[136][68]. In BC, the establishment of protein signatures for circulating EVs (including EGFR, P-cadherin, and fibronectin) enabled cancer patients to be differentiated from healthy subjects and was further associated with cancer progression and relapse
[137][69]. Moreover, while the isolation and characterization of EVs from biofluids remains challenging, the development of microfluidic devices for the isolation and enrichment of such membranous particles brings interesting possibilities for the diagnosis and monitoring of cancer patients. For example, it has been recently reported that the analysis of epithelial and mesenchymal markers on plasma EVs captured by microfluidic devices can be successfully used to establish the prognosis of patients with pancreatic cystic lesions
[138][70]. This strategy is particularly important as it uses tumor-derived EVs to monitor EMT dynamics in pancreatic cells and to inform whether these patients should undergo surgery
[138][70]. Similarly, quantification of the EMT markers in melanoma-EVs through microfluidic devices has been described as an innovative strategy to monitor disease progression. In this context, increased levels of mesenchymal markers (N-cadherin and ABCB5) compared to epithelial markers (E-cadherin and THBS1) characterized a shift in the serum EVs of melanoma patients that was also correlated with the development of drug resistance
[139][71]. Overall, although limited in number, the significance of these studies is remarkable and may be increased if combined with those where DNA and RNA species transported by cancer patient EVs are analyzed and also associated with diagnostic or prognostic potential
[140,141,142][72][73][74]. Furthermore, since EVs and CTCs can both be obtained from blood samples and separated based on physicochemical properties, new methods are emerging to optimize their sequential isolation and analysis in a parallelized multidimensional analytic framework
[143,144][75][76]. Importantly, such methods must not be understood simplistically as additional strategies for the discovery of EMT-related biomarkers. Rather, innovations improving the analysis of rare samples in liquid biopsies (e.g., CTCs and EVs) are paramount to generate a holistic view of the signaling pathways underlying EMT while also providing information on its dynamic regulation during metastasis. In this scenario, biomarkers emerge from an in-depth understanding of the molecular machinery that drives disease progression. Consequently, the translation of such biomarkers into clinical practice will improve existing diagnostic and monitoring methods due to enhanced specificity and sensitivity.
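The shift from epithelial to mesenchymal markers on circulating EVs described above for melanoma can be expressed as a single score that is tracked over time. The following is a minimal sketch, assuming positive signal intensities per marker; the marker groups follow the text, but the scoring formula, function name, and values are hypothetical and do not reproduce the method of the cited study.

```python
import math

# Marker groups as named in the text for melanoma-derived EVs.
MESENCHYMAL = ("N-cadherin", "ABCB5")
EPITHELIAL = ("E-cadherin", "THBS1")


def emt_score(signal):
    """Mean log2 mesenchymal signal minus mean log2 epithelial signal.

    Positive scores indicate a mesenchymal-dominated EV profile.
    Signals must be strictly positive; math.log2 raises ValueError otherwise.
    """
    def mean_log2(markers):
        return sum(math.log2(signal[m]) for m in markers) / len(markers)

    return mean_log2(MESENCHYMAL) - mean_log2(EPITHELIAL)


# Hypothetical EV marker intensities at two time points (arbitrary units).
baseline = {"N-cadherin": 2.0, "ABCB5": 2.0, "E-cadherin": 8.0, "THBS1": 8.0}
follow_up = {"N-cadherin": 8.0, "ABCB5": 8.0, "E-cadherin": 2.0, "THBS1": 2.0}

score_baseline = emt_score(baseline)
score_follow_up = emt_score(follow_up)
print(f"baseline={score_baseline:+.1f}, follow-up={score_follow_up:+.1f}")
```

Working on the log scale keeps the score symmetric, so a fourfold gain in mesenchymal signal and a fourfold loss in epithelial signal contribute equally to the shift.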