Lung cancer is an aggressive neoplasm and is the leading cause of cancer-related deaths worldwide, with an estimated 1.8 million deaths
[1]. The five-year survival rate is associated with the stage of the disease—67% for stage I and 23% for stage III—and the mortality is also strongly associated with late diagnosis
[2]. This scenario is aggravated by the absence of a noninvasive screening test comparable to mammography and the fecal occult blood test, which are currently in use for other aggressive neoplasms such as breast cancer and colorectal cancer (survival rates of 60–80%). Although low-dose computed tomography (LDCT) has shown a 20% reduction in mortality
[3], its application remains limited to the high-risk population (heavy smokers aged 50–80 years), excluding the growing number of young individuals (<50 years) diagnosed with advanced-stage lung cancer
[4][5]. Furthermore, the prevalence of false positives leading to unnecessary invasive diagnostic procedures, coupled with the high costs of the methodology, renders it unsuitable for integration into screening initiatives in low-income developing countries
[6]. In clinical practice, there is a pressing need for an alternative solution that addresses key issues such as noninvasiveness and test reliability while favoring easily obtainable biological samples that can be analyzed with cost-effective tools and reagents, making adoption feasible even in less industrialized countries. According to the National Institutes of Health (NIH), a biomarker is defined as “a characteristic used to measure and evaluate objectively normal biological processes, pathogenic processes, or pharmacological responses to a therapeutic intervention”
[7]. In this regard, during the last decade, a considerable number of studies have investigated new technologies for identifying biomarkers suitable for mass screening, tackling the biological and histological heterogeneity of lung cancer. Several biological molecules such as proteins, microRNAs (miRNAs), circulating tumor cells (CTCs), circulating tumor DNA (ctDNA), and volatile organic compounds (VOCs) have been investigated to understand their predictive value. Another key point in early detection is the choice of sample. Body fluids such as blood (serum and plasma), urine, stools, exhaled breath, sputum, and saliva meet clinical needs because of their simplicity of collection and noninvasiveness
[8][9].
A comprehensive analysis of the reviews published in the last year was carried out using the keywords “Lung Cancer” AND “early diagnosis” AND “biomarkers”. Following the removal of duplicates and further screening, six reviews were selected for the final text. The ultimate goal was to identify which studies are currently in the validation phase and which biomarkers hold future potential as predictive elements for lung cancer.
Circulating Blood Proteins and Autoantibodies
Circulating proteins can stem from various sources, including overexpression by cancer cells, increased secretion from diseased tissue, or inflammation linked to malignancy. The proteome has been widely studied in the oncological field to identify serum proteins as potential biomarkers for the early diagnosis of lung cancer. Among the most interesting studies conducted in the last 10 years, CancerSEEK reported a panel of eight proteins (CA-125, CA19-9, CEA, HGF, myeloperoxidase, OPN, prolactin, and TIMP-1) effective in distinguishing lung cancer patients from healthy controls
[10]. Moreover, the combination with cfDNA increases the sensitivity of this protein panel
[10][11]. In another study, a panel of three proteins and one autoantibody (NY-ESO-1) was assessed, and a sensitivity of 71% and specificity of 88% were observed
[12]. Mazzone et al. performed a separate clinical trial with the same test (PAULA), demonstrating a sensitivity and specificity of 49% and 96%, respectively
[13]. A prospective proteomic study based on two proteins (LG3BP and C163A) integrated with clinical and imaging features showed a sensitivity of 97% and a specificity of 44%
[14]. A more recent project involves the development of a 36-protein multiplex assay for the risk assessment of lung cancer. However, more studies should be conducted to demonstrate that these approaches are suitable to implement in clinical practice
[9]. Cancer cells stimulate the immune system through the release of proteins, inducing the production of circulating autoantibodies against tumor-associated antigens (TAAs). EarlyCDT, a panel of seven autoantibodies (p53, NY-ESO-1, CAGE, GBU4-5, HuD, MAGEA4, and SOX2), is commercially available as a blood test to assess the risk of malignancy in people with solid pulmonary nodules
[15]. A clinical trial with EarlyCDT on symptomatic lung cancer patients showed a sensitivity of 41% and a specificity of 91%, and a follow-up study on a high-risk cohort revealed a sensitivity and a specificity of 37% and 91%, respectively
[16]. Moreover, Du et al. tested p53, PGP9.5, SOX2, GAGE7, GBU4-5, MAGEA1, and CAGE and found no statistically significant difference between stages I/II and III/IV, concluding that the test is capable of detecting both early and advanced stages. This phenomenon could be related to the amplification of the immune response. Further studies will be needed to understand the potential prognostic power of proteins and TAAs
[17][18]. Their stability in the serum allows them to be detected via immunoenzymatic assays (ELISAs) and makes TAAs possible biomarkers for the early diagnosis of lung cancer
[19].
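The trade-off between the sensitivity and specificity values reported above becomes clearer when they are translated into predictive values. The following minimal Python sketch applies Bayes' rule to three of the sensitivity/specificity pairs quoted in this section. The prevalence (`ASSUMED_PREVALENCE`) is a hypothetical value chosen only for illustration and is not taken from any of the cited studies; the function name is likewise illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical prevalence for illustration only (an assumption, not study data).
ASSUMED_PREVALENCE = 0.02

# (sensitivity, specificity) pairs as quoted in the text above.
panels = {
    "two-protein classifier [14]": (0.97, 0.44),
    "PAULA trial [13]": (0.49, 0.96),
    "EarlyCDT, high-risk cohort [16]": (0.37, 0.91),
}

for name, (se, sp) in panels.items():
    ppv, npv = predictive_values(se, sp, ASSUMED_PREVALENCE)
    print(f"{name}: PPV = {ppv:.3f}, NPV = {npv:.4f}")
```

Under this assumed prevalence, the high-specificity panel yields a markedly higher positive predictive value, whereas the high-sensitivity classifier is better suited to ruling out disease. The actual figures depend entirely on the prevalence in the screened population.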
MicroRNAs (miRNAs)
MiRNAs are small noncoding RNAs that regulate gene expression at the post-transcriptional level. They can be aberrantly expressed in many pathological processes, including cancer. MiRNAs can be detected in different body fluids such as urine, sputum, and blood (serum and plasma)
[20]. In 2002, Calin et al. reported the involvement of microRNAs in lung cancer pathogenesis
[21]. They preserve their stability from initial development to metastasis formation, making them appealing biomarkers for the diagnosis and prognosis of lung cancer
[8][9][20]. An early study conducted on lung tissue detected 12 miRNAs differentially expressed between lung cancer tissue and benign lung tissue
[22]. In addition, studies on miRNAs in sputum have shown that the combination of multiple miRNAs can differentiate lung cancer patients from healthy individuals with a sensitivity of 73% to 80% and a specificity of 91% to 96%
[23]. Two further studies compared different miRNA panels in lung cancer patients before and after lung cancer resection and in healthy controls. Le et al. showed increased serum expression of miR-21, miR-205, miR-30d, and miR-24 before lung cancer surgery. The same miRNAs were upregulated in the serum of early-stage lung cancer patients in comparison to healthy subjects, suggesting a role as screening biomarkers as well as markers of postoperative disease relapse
[24]. Moreover, an 18-month postsurgery follow-up conducted by Leidinger et al. demonstrated a significant reduction in the expression levels of miRNA over time after the surgery
[25]. Currently, the miR-Test
[26] and MSC (microRNA signature classifier)
[27] are undergoing validation. The serum miRNA signature identified in high-risk subjects enrolled in an LDCT screening program showed a sensitivity and specificity of 77.8% and 74.8%, respectively
[26]. Sozzi et al. stratified the population into low, intermediate, or high risk of lung cancer based on 24 miRNA expression ratios
[28]. Their study revealed 87% sensitivity and 81% specificity. Both studies exhibited a reduction in the LDCT false-positive rate
[26][27][28].
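One way to see why adding a blood-based miRNA test can lower the LDCT false-positive rate is to model the two tests as a serial (confirmatory) strategy in which a subject is called positive only if both tests are positive. The sketch below is a simplified illustration and not the analysis used in the cited studies: it assumes the two tests are conditionally independent, and the LDCT sensitivity and specificity (`LDCT_SE`, `LDCT_SP`) are hypothetical placeholders. Only the miRNA values (87% and 81%) come from the text above.

```python
def serial_confirmation(se_first, sp_first, se_second, sp_second):
    """Combine two tests with an AND rule (positive only if both are positive).

    Assumes the tests are conditionally independent given disease status.
    """
    sensitivity = se_first * se_second
    specificity = 1 - (1 - sp_first) * (1 - sp_second)
    return sensitivity, specificity


# Hypothetical LDCT performance, for illustration only (an assumption).
LDCT_SE, LDCT_SP = 0.90, 0.75

# miRNA classifier values as quoted in the text [28].
MIRNA_SE, MIRNA_SP = 0.87, 0.81

se, sp = serial_confirmation(LDCT_SE, LDCT_SP, MIRNA_SE, MIRNA_SP)
print(f"LDCT alone:   false-positive rate = {1 - LDCT_SP:.3f}")
print(f"LDCT + miRNA: false-positive rate = {1 - sp:.4f}, sensitivity = {se:.3f}")
```

The AND rule always raises specificity at the cost of some sensitivity, which is why such combinations are usually considered for confirming LDCT-positive findings and not as a replacement for imaging.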
Circulating Tumor Cells (CTCs) and Circulating Tumor DNA (ctDNA)
CTCs are cells that detach from the primary tumor mass and enter the bloodstream. CTCs were evaluated in a group of 168 patients with chronic obstructive pulmonary disease (COPD) followed with annual CT scans for 4 years. COPD patients who tested positive for CTCs during the annual CT screening program developed lung nodules 1–4 years later. These findings suggest that CTCs could be used for early diagnosis
[29][30]. Another study showed that the sensitivity and specificity of CTCs for diagnosing lung cancer were 73.2% and 84.1%, respectively
[31], while Wang et al. obtained a sensitivity of 77.7% and a specificity of 89.5%. The comparison between the sensitivity of stage I and stage II revealed that the two values almost overlapped (69.8% and 72.2%)
[32]. A study of a larger lung cancer patient cohort demonstrated sensitivity and specificity values similar to other studies, but with the combination of CEA and additional biomolecules, these values could be increased to 84.21% and 88.78%, respectively
[31]. Emerging research using negative enrichment combined with fluorescence in situ hybridization (FISH) demonstrated increased sensitivity and specificity (89–100%)
[30][32][33].
ctDNA is the fraction of cell-free DNA derived from tumor cells. The proportion of ctDNA in plasma cell-free DNA varies from 0.01% to 90%
[34]. Newman et al. observed a 100% ctDNA detection rate in patients with stage II-IV lung cancer, while a 50% rate was observed in early-stage patients
[35]. The combination of ctDNA with protein markers showed a specificity of 99% and a sensitivity of 59%
[10][11]. Chabon et al. used cancer personalized profiling by deep sequencing (CAPP-Seq) to analyze ctDNA. This approach demonstrated that ctDNA levels were low in early-stage lung cancer. The same research group developed and validated a machine learning method (Lung-CLiP) using the findings described above in conjunction with other molecular features, and a specificity of 96% was achieved
[36]. Ponomaryova et al. demonstrated that in lung cancer patients, the concentration of ctDNA is eight times higher than that in healthy individuals
[37]. Furthermore, studies report that high concentrations of ctDNA are correlated with a worse clinical outcome
[33]. However, ctDNA has demonstrated poor sensitivity, and most patients have levels below 0.1%, which are challenging to detect in the blood
[9].
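The difficulty of detecting ctDNA fractions below 0.1% can be illustrated with a simple sampling argument: if a blood draw contains a limited number of genome copies, the expected number of tumor-derived copies of any one locus may be close to zero. The sketch below models the number of mutant copies as a Poisson variable. The input of 3000 haploid genome equivalents (roughly 10 ng of cell-free DNA) is an assumption made for illustration, and the model ignores sequencing errors and library losses, so it gives an optimistic upper bound and does not describe any specific assay.

```python
import math


def prob_at_least_k_mutant(genome_equivalents, tumor_fraction, k=1):
    """P(X >= k) for X ~ Poisson(genome_equivalents * tumor_fraction)."""
    lam = genome_equivalents * tumor_fraction
    prob_below_k = sum(
        math.exp(-lam) * lam**i / math.factorial(i) for i in range(k)
    )
    return 1 - prob_below_k


# Assumed input: about 3000 haploid genome equivalents (illustrative only).
ASSUMED_GENOME_EQUIVALENTS = 3000

for fraction in (0.01, 0.001, 0.0001):
    p1 = prob_at_least_k_mutant(ASSUMED_GENOME_EQUIVALENTS, fraction, k=1)
    p2 = prob_at_least_k_mutant(ASSUMED_GENOME_EQUIVALENTS, fraction, k=2)
    print(f"ctDNA fraction {fraction:.2%}: P(>=1 copy) = {p1:.3f}, "
          f"P(>=2 copies) = {p2:.3f}")
```

Under these assumptions, the chance of sampling even a single mutant copy of a given locus falls sharply once the tumor fraction drops toward 0.01%, which is consistent with the poor sensitivity reported for early-stage disease and motivates approaches that track many loci at once.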
Future Directions and Challenges: Volatile Organic Compounds (VOCs)
Since the 1970s, volatile organic compounds have been used in the field of medicine
[38]. Lung cancer studies emphasize the presence of VOCs in exhaled breath
[39]. The most widely used approach for the analysis of respiratory VOCs is gas chromatography combined with mass spectrometry (GC/MS)
[40]. This method has shown a discriminatory power to detect the specific volatile compounds of lung cancer patients. In one study, GC/MS combined with artificial neural networks showed a sensitivity of 80% and specificity of 91%
[41]. In a prospective pilot study, Peled et al. demonstrated the potential of breath analysis to distinguish malignant nodules from benign nodules in high-risk subjects
[42]. Another promising measurement device in the field of early diagnosis is the electronic nose (e-nose). This emerging technology is based on the binding of VOCs to different sensors or sensor arrays within handheld devices. In one study, investigators analyzed 214 breath samples using an e-nose with 11 gas sensors. The experimental results revealed an accuracy of 95.75%, a sensitivity of 94.78%, and a specificity of 96.96%
[43]. Shlomi et al. compared patients with benign lung nodules and patients with lung cancer. The lung cancer group was further divided into two subgroups: patients who harbored the EGFR mutation and patients with wild-type EGFR. This study showed that the e-nose can distinguish early-stage lung cancer from benign nodules, with 87% accuracy
[44]. Two other studies used an e-nose to detect a specific lung cancer signature (in lung cancer patients vs. high-risk healthy controls) with a sensitivity of 81% and specificity of 91%
[45]. Moreover, Gasparri et al. demonstrated that an e-nose with 12 sensors has a greater sensitivity for stage I lung cancer than for stages II/III/IV (92% and 58%, respectively)
[46]. Additionally, a recent multicentric case–control study yielded a sensitivity of 95% and a specificity of 49%
[47].
So far, more than 100 volatile urinary biomarkers have been suggested as being related to cancer. Urinary VOC patterns in cancer patients are often different from those found in the urine samples of control subjects, and these differences also depend on cancer type and stage
[48]. In 2023, investigators identified for the first time five VOCs specific to early-stage lung cancer (stages I/II), with a specificity and sensitivity of 85% and 90%, respectively
[49]. More robust results are needed before these approaches can be fully integrated into workflows or incorporated into clinical guidelines.
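The need for more robust results can be made concrete by looking at the statistical uncertainty of a sensitivity estimate obtained from a small cohort. The sketch below computes a Wilson score confidence interval for a proportion. The counts used (19 of 20 and 190 of 200 detected cases) are hypothetical and chosen only to show how the interval narrows with cohort size; they are not data from the cited studies.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts: the same 95% point estimate from two cohort sizes.
for detected, total in ((19, 20), (190, 200)):
    low, high = wilson_interval(detected, total)
    print(f"{detected}/{total}: sensitivity = {detected / total:.2f}, "
          f"95% CI = ({low:.3f}, {high:.3f})")
```

With only 20 cases, a 95% point estimate is compatible with a true sensitivity well below 80%, which is one reason why results from small case-control studies need confirmation in larger prospective cohorts.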
All suitable biomarkers are shown in Table 1.
Table 1. Selected lung cancer biomarkers.
| Study | Population | Method | Biomarkers | Main Results |
|---|---|---|---|---|
| Xu BJ [10] | 40 LC, 8 HR | MALDI-MS | Proteins | 75% accuracy |
| Doseeva V [12] | 75 LC, 75 HR | Immunoassay (xMAP) | Proteins and autoantibody | 77% sensitivity, 80% specificity |
| Mazzone PJ [13] | 155 LC, 245 HR | Immunoassay (MAGPIX) | Proteins and autoantibody | 74% sensitivity, 80% specificity |
| Silvestri GA [14] | 29 LC, 149 HR | MS | Proteins | 97% sensitivity, 44% specificity |
| Chapman CJ [16] | 235 LC, 266 HR | ELISA | Autoantibodies | 92% accuracy |
| Du Q [17] | 305 LC, 74 HR | ELISA | Autoantibodies | 56.53% sensitivity, 91.60% specificity |
| Yu L [23] | 64 LC, 58 HR | qRT-PCR | miRNA | 80.6% sensitivity, 91.7% specificity |
| Montani F [26] | 74 LC, 115 HR | NA | miRNA | 77.8% sensitivity, 74.8% specificity |
| Sozzi G [28] | 69 LC, 870 HR | PCR | miRNA | 87% sensitivity, 81% specificity |
| Yu Y [31] | 153 LC, 93 H | RT-PCR + FISH | CTCs | 67.2% sensitivity for stage I, 84.1% specificity |
| Katz RL [32] | 107 LC, 100 H | FISH | CTCs | 89% sensitivity, 100% specificity |
| Newman AM [35] | 13 LC, 13 H | CAPP-Seq | ctDNA | 96% specificity |
| Ponomaryova AA [37] | 60 LC, 32 H | TaqMan PCR (MSP) | cirDNA | 87% sensitivity, 75% specificity |
| Rudnicka J [41] | 86 LC, 41 H | GC/MS | VOCs | 80% sensitivity, 91.23% specificity |
| Shlomi D [44] | 89 LC, 30 H | e-nose | VOCs | 83% accuracy, 79% sensitivity, 85% specificity |
| McWilliams A [45] | 25 LC, 166 H | e-nose | VOCs | 80% accuracy |
| Gasparri R [46] | 70 LC, 76 H | e-nose | VOCs | 81% sensitivity, 91% specificity |
| Hanai Y [48] | 20 LC, 20 H | GC/MS | VOCs | 95% sensitivity, 70–100% specificity |
| Gasparri R [49] | 46 LC, 81 H | GC/MS | VOCs | 85% sensitivity, 90% specificity |