3. A Roadmap to the Successful Development of Blood-Based Biomarkers for Lung Cancer Early Detection
The bottleneck in translating biomarkers to clinical use generally lies in suboptimal standardization at each step of the biomarker pipeline, including discovery, prioritization, and clinical validation. We prepared a summary of the main issues and best practices in biomarker development (Figure 1B). The first fundamental step in biomarker discovery is establishing a high-quality study design, which includes making explicit hypotheses about the potential application of the biomarker in, or its integration into, currently recommended screening programs, as well as adopting enrollment protocols with clear inclusion and exclusion criteria for patients and controls. Moreover, heterogeneity (epidemiological, biological and molecular) needs to be considered when determining an adequate sample size. Indeed, published studies often lack an acceptable sample size with respect to the numerous phenotypic features that should be considered to broadly represent the screening population, and to the number of variables that should be analyzed to deconvolute the high level of genetic heterogeneity of lung cancer. To limit self-selection bias, instead of convenience selection of subjects (based on easy availability of the sample), control populations should be identified based on matching criteria with the patients’ cohort, and should faithfully represent the actual incidence and prevalence of lung cancer in the screening population.
In the absence of standards for specimen handling (collection, storage and processing) and of controls for pre-analytical factors, randomization and blinding should be applied to reduce bias in the experimental analysis. Indeed, the quality and reproducibility of biomarkers can be influenced by uncontrolled pre-analytical conditions (e.g., fasting, lipemia, partial hemolysis) and by sample collection bias, especially when the biomarker is labile or sensitive to temperature fluctuations or handling conditions (e.g., type of collection tubes, centrifugation steps, long-term or short-term storage, freeze/thaw cycles). We therefore suggest performing initial pilot experiments to measure the stability of circulating biomarkers, for example by: (i) testing different sample collection strategies, using different collection tubes for serum or plasma collection; (ii) quantifying how much hemolysis (partial or hidden) can influence biomarker concentration; (iii) checking whether analyte concentration is influenced by fasting status; and (iv) testing whether different storage conditions (short-term vs. long-term; +4 °C, −20 °C, −80 °C or liquid nitrogen) can alter biomarker quantity and quality. After such analyses, a standard operating procedure (SOP) for sample collection and handling should be defined and rigorously applied to the specific biomarker screening study.
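As an illustration of how the results of such a stability pilot might be summarized, the following minimal sketch compares paired measurements under each storage or handling condition against immediately processed aliquots from the same donors. The function names, the condition labels, the example concentrations and the 10% tolerance are all hypothetical assumptions chosen for illustration; an appropriate tolerance would depend on the assay's own analytical variability.

```python
from statistics import mean


def percent_change_from_baseline(baseline, treated):
    """Mean paired percent change of treated vs. baseline measurements."""
    if len(baseline) != len(treated):
        raise ValueError("paired measurements are required")
    changes = [100.0 * (t - b) / b for b, t in zip(baseline, treated)]
    return mean(changes)


def flag_unstable_conditions(baseline, conditions, tolerance_pct=10.0):
    """Return {condition: mean % change} for conditions outside the tolerance.

    tolerance_pct is an assumed, illustrative threshold, not a standard.
    """
    flagged = {}
    for name, values in conditions.items():
        change = percent_change_from_baseline(baseline, values)
        if abs(change) > tolerance_pct:
            flagged[name] = round(change, 1)
    return flagged


# Hypothetical concentrations (arbitrary units) for the same four donors.
baseline = [100.0, 80.0, 120.0, 90.0]  # aliquots processed immediately
conditions = {
    "4C_24h": [98.0, 79.0, 118.0, 91.0],
    "minus20C_1mo": [85.0, 70.0, 100.0, 75.0],
    "three_freeze_thaw": [70.0, 60.0, 90.0, 63.0],
}

unstable = flag_unstable_conditions(baseline, conditions)
print(unstable)
```

With these made-up values, short-term storage at +4 °C stays within the tolerance, while the other two conditions are flagged and would be candidates for exclusion from the SOP.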
Nowadays, high-throughput data allow the identification of many biomarkers acting jointly on the risk of lung cancer, and these markers can be easily combined in a single multivariable statistical model. However, to avoid the resulting risk of overfitting (i.e., capturing noise instead of the true underlying data structure), machine learning approaches with sample-splitting or cross-validation should be considered. The performance of a new biomarker for the early detection of cancer is readily measured by true-positive and false-positive rates, and summarized through receiver operating characteristic (ROC) curves. However, the literature often presents only the “average” performance, with the ROC curve calculated across all study subjects, whereas subgroup and/or multivariable analyses would better reveal the utility of biomarker testing in specific groups (e.g., by tumor stage, nodule density or histotype).
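The difference between "average" and subgroup performance can be sketched as follows. This is a minimal illustration with hypothetical biomarker scores, not an analysis from any cited study. It computes the area under the ROC curve (AUC) as the probability that a randomly chosen case scores higher than a randomly chosen control, first across all subjects and then separately for cases of each tumor stage against the same controls. The function names and data are assumptions made for this example.

```python
from collections import defaultdict


def roc_auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties count as 0.5, which is equivalent to the Mann-Whitney U statistic
    divided by (number of cases * number of controls).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def auc_by_case_subgroup(records):
    """records: (score, is_case, subgroup); subgroup is None for controls.

    Each case subgroup is compared against the full set of controls.
    """
    control_scores = [s for s, is_case, _ in records if not is_case]
    by_group = defaultdict(list)
    for score, is_case, group in records:
        if is_case:
            by_group[group].append(score)
    result = {}
    for group, case_scores in by_group.items():
        scores = case_scores + control_scores
        labels = [1] * len(case_scores) + [0] * len(control_scores)
        result[group] = roc_auc(scores, labels)
    return result


# Hypothetical biomarker scores: four controls, three stage I and three stage III cases.
records = [
    (0.10, False, None), (0.20, False, None), (0.30, False, None), (0.40, False, None),
    (0.25, True, "I"), (0.35, True, "I"), (0.50, True, "I"),
    (0.60, True, "III"), (0.70, True, "III"), (0.90, True, "III"),
]

overall = roc_auc([r[0] for r in records], [1 if r[1] else 0 for r in records])
by_stage = auc_by_case_subgroup(records)
print(overall, by_stage)
```

In this toy data set the pooled AUC is pulled upward by the late-stage cases, while the stage I AUC, the figure most relevant to early detection, is noticeably lower. In a real study, AUC estimates for a multivariable model would additionally be obtained on held-out data (sample-splitting or cross-validation) rather than on the data used to fit the model.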
Exploring biomarker performance in subgroups could also help rank the selected candidates by clinical relevance. Moreover, when a new biomarker study is published, the biological function of the candidates is usually discussed only briefly, and assay/platform reproducibility and standardization are frequently lacking (see below). In our experience, an in-depth analysis of the technical and biological variables that might affect the detection and quantification of selected biomarkers should also be performed. For example, uncontrolled environmental conditions during sample processing could influence the quantification of the biomarkers of interest. Marzi et al. showed, using an automated spin-column-based system for nucleic acid purification, that miRNA extraction efficiency decreased as temperature increased during daily runs. Similar findings were also described by other research groups.
When multiple biomarker types are analyzed (e.g., DNA, RNA and protein), the collected samples (whole blood, plasma, serum) can be split into several aliquots, which can be prioritized for processing according to the stability of the biomarkers of interest; RNA, which is more labile, can be processed immediately from its aliquot, while the aliquots for other biomarker types can be processed subsequently. Likewise, the use of different extraction kits, with or without additional centrifugation steps, could affect the quantities and species of the biomarkers of interest. Cheng et al. showed that plasma samples can be contaminated by residual platelets, which affect most miRNA measurements (~70%); the authors therefore suggested adding pre- or post-storage centrifugation steps to remove residual platelet contamination. Furthermore, miRNA quantities may vary depending on the kit used for extraction.
To keep track of the impact of these pre-analytical and analytical variables, we strongly recommend using endogenous and exogenous controls. In circulating miRNA biomarker analysis, measuring both endogenous controls (e.g., RNU6, RNU44, miR-16) and exogenous controls (e.g., synthetic miRNAs from other organisms, such as ath-miR-159a and/or cel-miR-39) allows monitoring of sample degradation, extraction efficiency and the performance of miRNA detection across different screening platforms (e.g., qRT-PCR, ddPCR, microarray, NGS).
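One way such controls might be used in a qRT-PCR setting is sketched below. This is an illustrative assumption, not a procedure prescribed here: the exogenous spike-in Ct of each sample is compared with the batch median to flag possible extraction problems, and the target miRNA is expressed relative to the spike-in with the common 2^(−ΔCt) transformation, which assumes roughly 100% amplification efficiency. The sample identifiers, Ct values and the one-cycle threshold are hypothetical.

```python
from statistics import median


def delta_ct(target_ct, reference_ct):
    """Difference between target and reference (control) Ct values."""
    return target_ct - reference_ct


def relative_quantity(target_ct, reference_ct):
    """2^-dCt, assuming ~100% PCR efficiency (one doubling per cycle)."""
    return 2.0 ** -delta_ct(target_ct, reference_ct)


def flag_extraction_outliers(spike_in_ct, max_shift=1.0):
    """Samples whose spike-in Ct departs from the batch median by > max_shift cycles.

    max_shift is an assumed, illustrative threshold.
    """
    center = median(spike_in_ct.values())
    return sorted(s for s, ct in spike_in_ct.items() if abs(ct - center) > max_shift)


# Hypothetical Ct values for an exogenous spike-in (e.g., cel-miR-39)
# and for one target miRNA measured in the same four samples.
spike_ct = {"S1": 20.0, "S2": 20.5, "S3": 23.0, "S4": 19.5}
target_ct = {"S1": 26.0, "S2": 27.5, "S3": 29.0, "S4": 24.5}

outliers = flag_extraction_outliers(spike_ct)
normalized = {
    s: relative_quantity(target_ct[s], spike_ct[s])
    for s in target_ct
    if s not in outliers
}
print(outliers, normalized)
```

Here sample S3 recovers the spike-in almost three cycles later than the rest of the batch and is set aside before normalization. The same pattern could be applied to an endogenous control to monitor sample degradation.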
Lastly, analytical translation to a clinically applicable platform and validation in a large prospective trial are both needed to complete the validation of candidate biomarkers. Industrial and clinical partners could facilitate these phases by providing funding support and know-how in large-scale test production, regulatory affairs and commercialization. A major issue in validating biomarkers for lung cancer early detection is proving their benefit in the context of screening programs, which are particularly subject to lead-time and length-time biases and to overdiagnosis. The choice of endpoint is therefore essential and, although biases could occur in interpreting causes of death, reduction in lung cancer mortality should be the primary endpoint, followed by the evaluation of overall mortality.