1. Introduction
Oral cancer (OC) is a blanket term used to describe any cancer occurring in the oral cavity. In 2018, more than 350,000 new cases of OC and 170,000 deaths were recorded worldwide
[1]. Tobacco usage, alcohol consumption, and human papillomavirus infection are the major risk factors for OC
[2][3][4]. A recent study compared the incidence of OC in the 10 most populous countries over the past 30 years and reported declining trends in the annual age-standardized incidence rate of OC in Bangladesh, Brazil, Mexico, and the United States; however, increasing trends were observed in China, Indonesia, Pakistan, India, and Japan
[5]. The 5-year overall survival rate of OC is approximately 50%
[6]. To improve the prognosis and quality of life of patients, early detection of OC is essential
[7].
The underlying epigenetic mechanisms and major risk factors of OC vary across countries. India accounts for one-third of the global OC cases, with 77,000 new OC cases and 52,000 related deaths annually
[8]. Tobacco consumption is the main etiological factor. Most OC cases are diagnosed at advanced stages owing to delays in reporting to healthcare professionals
[9]. Approximately 60–80% of patients with OC are diagnosed at a late stage. Early detection can improve treatment efficacy and prognosis. Although various methods are available for screening, visual examination is the most commonly used owing to its low cost
[10]. However, diagnosing lesions in the initial stage and differentiating them from inflammatory conditions remain challenging.
Despite the declining trend of tobacco use in Japan, the incidence of OC has increased
[11]. Similar to the global trend, many patients are diagnosed with late-stage OC. In Japan, nationwide screening of five cancers (gastric, colon, and lung cancers for both sexes and breast and cervical cancers for women) is conducted annually or every other year. Insufficient screening is among the reasons for the increasing trend of OC. Therefore, the development of a new cost-effective screening system for OC is necessary.
Saliva is a mixture of biofluids and plays vital roles in oral homeostasis. Other functions of saliva include lubrication, digestion, buffering, taste, tooth protection, and immune defense by protecting against bacteria, viruses, and fungi. Saliva consists of various cellular and molecular components, such as transudate of the oral mucosa, desquamated oral epithelial cells, blood cells, oral bacteria, proteins, metabolites, and inorganic ions (Figure 1). Furthermore, it is mainly secreted from three major salivary glands (parotid, submandibular, and sublingual glands) and other minor glands. It also contains various components which originate from other sources, such as gingival crevicular fluid. Overall, these components make saliva an ideal biofluid for detecting various diseases.
There are several advantages of using saliva for cancer detection. First, a positive correlation has been reported between salivary and plasma metabolite levels, such as those of glucose, pyruvate, and lactate
[12][13], indicating that salivary metabolites provide biological information. Second, saliva is the most readily available biofluid, and its collection requires minimal training
[14]. Third, analysis of saliva samples is convenient owing to the noninfectious collection process, easy transportation, and disposable nature
[15]. Fourth, the saliva metabolite profile of each individual is less affected by diet than that of urine collected from the same individuals
[16]. Therefore, several cancer biomarkers have been identified using salivary omics technologies.
2. Metabolite Measurement Technologies
The word metabolomics was coined by merging two terms—omics and metabolites. Therefore, it is expected to refer to an analytical method that measures all metabolites. However, no single method can be used to analyze all metabolites because of the large diversity of chemical structures of metabolites in biological samples
[17]. Therefore, various methods have been developed, and each technique has its own advantages and disadvantages. Various metabolite separation and detection systems have also been used to analyze metabolites in saliva samples.
Nuclear magnetic resonance (NMR) is the most frequently used method
[18]. Compared to mass spectrometry (MS), NMR offers higher reproducibility and requires minimal sample preparation for any sample type
[19]. Pretreatment of the saliva, a viscous liquid, is also a simple process
[20]. This feature is a definitive advantage as it minimizes the chance of introducing unexpected errors. NMR has enabled the identification of pattern changes in salivary metabolomic profiles, i.e., metabolic signatures, that distinguish patients with cancer from healthy controls (HCs). Some applications of salivary metabolomics explored using NMR include the detection of oral squamous cell carcinoma (OSCC)
[21][22], head and neck squamous cell carcinoma
[23], and glioblastoma
[24]. In addition to cancer, hepatitis B infection
[25], Parkinson’s disease
[26], and Alzheimer’s disease
[27] have been analyzed.
MS is another major metabolite detection system with high sensitivity. It requires only a small sample volume and enables the identification and quantification of hundreds of metabolites simultaneously
[19]. However, direct injection into MS cannot separate metabolites with the same m/z (mass divided by charge number) value, such as leucine and isoleucine; therefore, a separation system is usually used before MS. Gas chromatography (GC)-MS allows the quantification of volatile compounds and the profiling of non-volatile metabolites by derivatization, and was used to analyze OSCC samples
[28]. Liquid chromatography (LC)-MS has been used for both non-targeted and targeted analyses of salivary metabolites. In non-targeted analyses, hydrophilic metabolites, such as γ-aminobutyric acid, phenylalanine, valine, and lactic acid, were analyzed in saliva samples of patients with OC
[21]. A wide variety of metabolites, such as oligopeptides, phosphatidylcholine, and glycerophospholipids, were analyzed in the saliva samples of patients with breast cancer
[29][30]. For targeted analyses, salivary OSCC biomarkers, such as choline, betaine, pipecolinic acid, and carnitine, were quantified
[31]. Salivary polyamines were also analyzed as known biomarkers for breast cancer
[32]. Capillary electrophoresis (CE)-MS was used for hydrophilic metabolite profiling of saliva samples of OC
[33][34][35][36][37], breast cancer
[38], and pancreatic cancer (PC)
[39].
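The leucine and isoleucine case can be illustrated with a short calculation. The following is a minimal sketch, assuming singly protonated ions ([M+H]+) and standard monoisotopic masses; the function names are illustrative and are not taken from any cited study. Because both amino acids share the elemental formula C6H13NO2, they yield the same m/z (approximately 132.10), which is why a separation step before MS is needed.

```python
# Monoisotopic masses (in unified atomic mass units) of the most abundant isotopes.
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON_MASS = 1.00727647


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses for an elemental formula given as {element: count}."""
    return sum(MONOISOTOPIC_MASS[element] * count for element, count in formula.items())


def mz_protonated(formula, charge=1):
    """m/z of the ion formed by adding `charge` protons to the neutral molecule."""
    return (monoisotopic_mass(formula) + charge * PROTON_MASS) / charge


# Leucine and isoleucine are structural isomers with the same elemental formula.
leucine = {"C": 6, "H": 13, "N": 1, "O": 2}
isoleucine = {"C": 6, "H": 13, "N": 1, "O": 2}

print(f"leucine    [M+H]+ m/z: {mz_protonated(leucine):.4f}")
print(f"isoleucine [M+H]+ m/z: {mz_protonated(isoleucine):.4f}")
```

Since the two values are identical, only a difference in retention or migration time (GC, LC, or CE) can resolve the two compounds.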
Comparisons of NMR and MS for analyzing saliva samples for OC biomarker discoveries have been previously conducted
[40][41]. Both reviews emphasized the need to standardize sample collection and the processing of measurement data. The simultaneous use of NMR and LC-MS to analyze salivary metabolites expanded the coverage of observed metabolites
[42], enhancing the opportunity to find biomarkers related to the phenotype of interest.
3. Discrimination Methods
To identify biomarkers, conventional univariate analyses, such as the Student’s t-test and the Mann–Whitney test for two-group comparisons, have been used in previous studies. Additionally, multivariate analyses have frequently been used to analyze the similarities and differences in overall metabolite profiles. As unsupervised methods, principal component analysis (PCA) and hierarchical clustering analysis have been performed. For example, PCA was used to assess the relative strength of the effects of multiple factors, such as inter- and intraday variations, on salivary metabolomics
[43]. Clustering was used to find new subgroups of a disease group based on the observed metabolomic profiles
[44]. These methods help find outliers, assess the quality of samples, and form the groups used in subsequent analyses.
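The two-group univariate comparisons mentioned above can be sketched in a few lines. The example below is an illustration only, assuming the equal-variance form of Student's t statistic and a pair-counting form of the Mann–Whitney U statistic; the concentration values are hypothetical and p-values are omitted for brevity.

```python
from math import sqrt
from statistics import mean, variance


def student_t(x, y):
    """Student's t statistic (equal-variance form) for two independent groups."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))


def mann_whitney_u(x, y):
    """Mann-Whitney U for group x: each pair with x > y counts 1, each tie counts 0.5."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)


# Hypothetical concentrations of one salivary metabolite (arbitrary units).
oc_group = [12.1, 15.3, 14.0, 18.2, 16.5]
hc_group = [8.4, 9.9, 11.2, 10.1, 7.6]

print("t statistic:", round(student_t(oc_group, hc_group), 3))
print("U statistic:", mann_whitney_u(oc_group, hc_group))
```

The t statistic depends on the magnitudes of the values, whereas U depends only on their ranks, so the Mann–Whitney test is the usual choice when concentrations are skewed or contain outliers.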
To discriminate a disease group from other groups, such as OC from HC, a combination of multiple metabolite concentration patterns was used
(Table 1). Multiple logistic regression (MLR) is one of the conventional multivariable methods. It uses a minimal set of independent metabolites, thereby avoiding multicollinearity problems
[34][36][45]. A Lasso regression model has also been used to address the multicollinearity problem
[46]. Partial least squares-discriminant analysis (PLS-DA) is also frequently used to discriminate among multiple groups, enabling the ranking of each metabolite’s contribution to the discrimination. For example, discrimination among OSCC, oral leukoplakia (OLK), and HC was conducted using this method
[21]. Random forest, a classification machine learning (ML) model that leverages multiple decision trees
[47], and alternative decision trees
[38] have also been used to discriminate a group from the others. The most important concern is rigorous validation of the generalization ability to avoid overfitting. Cross-validation is commonly used within a single cohort, whereas accuracy evaluation using an independent cohort provides a more rigorous validation
[46].
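Cross-validation within a single cohort can be sketched as follows. This is a minimal illustration and not the procedure of any cited study: a nearest-centroid classifier stands in for MLR, PLS-DA, or random forest so that the example needs no external library, and the two-metabolite data are hypothetical. The essential point is that each sample is predicted by a model that never saw it during training.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, label in enumerate(labels) if label == cls]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def centroid(rows):
    """Per-metabolite mean of a list of samples."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def predict(sample, centroids):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda cls: squared_distance(sample, centroids[cls]))


def cross_validated_accuracy(X, y, k=3, seed=0):
    """Fraction of samples classified correctly when held out of training."""
    correct = 0
    for test_indices in stratified_folds(y, k, seed):
        held_out = set(test_indices)
        train = [i for i in range(len(y)) if i not in held_out]
        centroids = {
            cls: centroid([X[i] for i in train if y[i] == cls])
            for cls in {y[i] for i in train}
        }
        correct += sum(predict(X[i], centroids) == y[i] for i in test_indices)
    return correct / len(y)


# Hypothetical concentrations of two metabolites (arbitrary units).
X = [[1.0, 0.9], [1.2, 1.1], [0.8, 1.0], [3.0, 2.9], [3.2, 3.1], [2.8, 3.0]]
y = ["HC", "HC", "HC", "OC", "OC", "OC"]
print("cross-validated accuracy:", cross_validated_accuracy(X, y, k=3))
```

Evaluation on an independent cohort follows the same pattern, except that the model is trained once on the whole discovery cohort and applied unchanged to samples collected separately.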
4. Standard Operating Protocols (SoP)
The discovery and validation of biomarkers are the initial steps to establish new screening methods. A standard protocol for working with saliva samples should be determined to enhance reproducibility. The methods for preconditioning, sample collection, storage, preprocessing, measurement, and data analysis should be standardized
[48]. The effect of inter-day and intra-day variations on salivary metabolomics has been analyzed
[43], and no significant difference in salivary flow was observed in the comparisons. Stimulated saliva showed larger variations in metabolomic profiles than unstimulated saliva. The time between the last meal and sample collection also affected salivary metabolomic profiles
[37]. As expected, longer fasting periods before sample collection improved the discrimination ability of the OC biomarkers. Normalization of the overall concentration by the total amino acid content decreased the variations due to fasting conditions. Stimulation of the oral cavity, for example, by tobacco use or mouthwash, also affected the final results
[49], with stimulated and unstimulated saliva having different metabolomic profiles
[50]. Taken together, a longer fasting period and the consistent use of either stimulated or unstimulated saliva samples are desirable. Thus, restrictions on activities affecting the oral cavity should be defined as part of the SoP.
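Normalization by total amino acid content, as mentioned above, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in [37]: the amino acid subset, metabolite names, and concentrations are hypothetical. The sketch shows that two samples differing only in overall dilution give identical normalized profiles.

```python
# Hypothetical subset of amino acids used as the normalization reference.
AMINO_ACIDS = {"alanine", "glycine", "valine"}


def normalize_by_total_amino_acids(sample, amino_acids=AMINO_ACIDS):
    """Divide every metabolite concentration by the summed amino acid concentration."""
    total = sum(conc for name, conc in sample.items() if name in amino_acids)
    if total <= 0:
        raise ValueError("total amino acid concentration must be positive")
    return {name: conc / total for name, conc in sample.items()}


# Two hypothetical samples: the second is the first diluted twofold.
concentrated = {"alanine": 40.0, "glycine": 60.0, "valine": 20.0, "choline": 12.0}
diluted = {name: conc / 2 for name, conc in concentrated.items()}

print(normalize_by_total_amino_acids(concentrated)["choline"])
print(normalize_by_total_amino_acids(diluted)["choline"])
```

A ratio of this kind removes variation in overall salivary concentration but cannot correct for changes in the relative composition of the metabolites.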
The effect of storage conditions on the quantified concentration of metabolite biomarkers was also analyzed
[31]. Variations between short-term storage at room temperature (up to 24 h) and long-term storage at −35 °C (up to 1 month) of four OC biomarkers, such as choline and betaine, were detected. Storage and preprocessing also affected the polyamine profiles in saliva
[51]. Artificial noise generated according to the maximum variations observed during storage and preprocessing enabled estimation of the possible deterioration in the discrimination ability of the biomarkers. Such analyses would assist in the establishment of an SoP in clinical settings.
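The noise-based estimation described above can be sketched as follows. This is an illustration and not the method of [51]: it assumes uniformly distributed multiplicative noise bounded by a maximum relative change, uses the area under the receiver operating characteristic curve (AUC) as the measure of discrimination ability, and relies on hypothetical concentrations and a hypothetical 30% bound.

```python
import random


def auc(cases, controls):
    """AUC as the fraction of case-control pairs ranked correctly (ties count 0.5)."""
    wins = sum((c > h) + 0.5 * (c == h) for c in cases for h in controls)
    return wins / (len(cases) * len(controls))


def add_noise(values, max_rel_change, rng):
    """Perturb each value by a uniform relative change within +/- max_rel_change."""
    return [v * (1 + rng.uniform(-max_rel_change, max_rel_change)) for v in values]


def mean_auc_under_noise(cases, controls, max_rel_change, n_trials=200, seed=0):
    """Average AUC over repeated noise simulations with a fixed random seed."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        noisy_cases = add_noise(cases, max_rel_change, rng)
        noisy_controls = add_noise(controls, max_rel_change, rng)
        total += auc(noisy_cases, noisy_controls)
    return total / n_trials


# Hypothetical biomarker concentrations (arbitrary units).
oc_values = [12.0, 15.0, 14.0, 18.0, 16.0]
hc_values = [8.0, 9.0, 11.0, 10.0, 7.0]

print("AUC without noise:", auc(oc_values, hc_values))
print("mean AUC with up to 30% noise:", mean_auc_under_noise(oc_values, hc_values, 0.30))
```

Comparing the two values indicates how much of the apparent discrimination ability could be lost to handling-related variation alone.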
Although not limited to salivary metabolomics, MS-based metabolomics requires stricter quality control than NMR-based metabolomics
[19]. Therefore, quality assessment for each processing step and a control method were developed to normalize the quantified data to ultimately eliminate unexpected bias in multiple batch measurements
[52]. Along with these standardizations, the development of an automatic pipeline is also a reasonable approach
[46]. The establishment of a rigorous protocol will likely yield reproducible results; however, it may hinder the widespread use of salivary tests (Figure 2).
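One simple way to reduce bias across measurement batches is to rescale each batch using a pooled quality control (QC) sample measured in every batch. The sketch below is an assumption made for illustration and is not the control method developed in [52]: each batch is scaled so that its median QC value matches the median of all QC values, and the record layout, labels, and numbers are hypothetical.

```python
from statistics import median


def qc_scale_factors(records, qc_label="QC"):
    """Per-batch factor that maps the batch QC median onto the overall QC median."""
    qc_by_batch = {}
    for batch, sample_id, value in records:
        if sample_id == qc_label:
            qc_by_batch.setdefault(batch, []).append(value)
    target = median(v for values in qc_by_batch.values() for v in values)
    return {batch: target / median(values) for batch, values in qc_by_batch.items()}


def correct_batches(records, qc_label="QC"):
    """Apply the per-batch scale factor to every measurement in that batch."""
    factors = qc_scale_factors(records, qc_label)
    return [(batch, sample_id, value * factors[batch]) for batch, sample_id, value in records]


# Hypothetical records: (batch, sample identifier, measured concentration).
# Batch 2 reads twice as high as batch 1 for the same pooled QC sample.
records = [
    (1, "QC", 10.0), (1, "QC", 10.0), (1, "S1", 8.0),
    (2, "QC", 20.0), (2, "QC", 20.0), (2, "S2", 16.0),
]

for row in correct_batches(records):
    print(row)
```

After correction, the two study samples, which differed only because of the batch effect, have the same value.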