Nasopharyngeal carcinoma (NPC) is one of the most common malignant tumours of the head and neck, and improving the efficiency of its diagnosis and treatment strategies is an important goal. As artificial intelligence (AI) has been increasingly combined with medical imaging in recent years, a growing number of studies have applied AI tools, especially radiomics and artificial neural network methods, to the image analysis of NPC.
1. Introduction
Nasopharyngeal carcinoma (NPC) is an epithelial carcinoma arising from the nasopharyngeal mucosal lining [1]. According to data from the International Agency for Research on Cancer, the number of new cases of NPC in 2020 was 133,354, of which 46.9% were diagnosed in China, showing an extremely uneven geographical distribution [2] (Figure 1).
Radiotherapy for early NPC and concurrent chemoradiotherapy for advanced NPC are recommended by the National Comprehensive Cancer Network [3]. Optimum imaging is crucial for staging and radiotherapy planning for NPC [4]. Several imaging examinations are routinely used for NPC, including computed tomography (CT), magnetic resonance imaging (MRI), and electronic endoscopy. Compared with CT, MRI is the preferred method for primary tumour delineation because of its higher soft-tissue resolution [5][6].
Figure 1. Estimated age-standardised incidence rates (World) in 2020, nasopharynx, both sexes, all ages. ASR: age-standardised rates. Data source: GLOBOCAN 2020. Graph production: IARC (http://gco.iarc.fr/today, accessed on 31 March 2021.) World Health [2].
In recent years, artificial intelligence (AI) has been rapidly integrated into the field of medicine, especially into medical imaging. In the most recent review published in 2019, 18 research questions on NPC that remain to be answered were proposed, and two of them were about AI and NPC: ‘How can reliable radiomics models for improving decision support in NPC be developed?’ and ‘How can artificial intelligence automation for NPC treatment decisions (radiotherapy planning, chemotherapy timing, and regimens) be applied?’. Subsequently, many articles in this area have emerged, and a large number of studies have reported on tumour detection, image segmentation, prognosis prediction, and chemotherapy efficacy prediction in NPC. In these studies, radiomics and deep learning (DL) have gradually become the most important AI tools.
2. Pipeline of Radiomics
Radiomics, which was first proposed by Lambin in 2012 [7], is a relatively ‘young’ concept and is considered a natural extension of computer-aided diagnosis and detection systems [8]. It converts imaging data into a high-dimensional mineable feature space using a large number of automatically extracted data-characterization algorithms to reveal tumour features that may not be recognized by the naked eye and to quantitatively describe the tumour phenotype [9][10][11][12]. These extracted features are called radiomic features and include first-order statistics features, intensity histograms, shape- and size-based features, texture-based features, and wavelet features [13]. Conceptually speaking, radiomics belongs to the field of machine learning, although human participation is needed. The basic hypothesis of radiomics is that the constructed descriptive model (based on medical imaging data, sometimes supplemented by biological and/or medical data) can provide predictions of prognosis or diagnosis [14]. A radiomics study can be structured in five steps: data acquisition and pre-processing, tumour segmentation, feature extraction, feature selection, and modelling [15][16] (Figure 2).
Figure 2. Five steps of the pipeline of radiomics.
The collection and pre-processing of medical images is the first step in the implementation of radiomics. Radiomics relies on well-annotated medical images and clinical data to build target models. CT was the first modality used when radiomics was proposed [7]. Subsequently, MRI [17], positron emission tomography/computed tomography (PET/CT) [18], and ultrasound [19][20] have been widely used for image analysis of different tumours using radiomics. Image pre-processing (filtering and intensity discretization) is essential because these images often come from different hospitals or medical centres, which leads to differences in image parameters that may have unexpected effects on the model.
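The two pre-processing operations named above can be illustrated with a minimal sketch. It assumes the region of interest is already available as a flat list of intensities; the function names, the z-score normalisation step, and the fixed-bin-count discretization scheme are illustrative choices, not the procedure of any study cited here.

```python
import math

def zscore_normalise(values):
    """Rescale intensities to zero mean and unit variance (one common way to
    reduce scanner-dependent intensity differences; an assumption here)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var) or 1.0  # guard against constant images
    return [(v - mean) / std for v in values]

def discretise_fixed_bin_count(values, n_bins=4):
    """Map each intensity to a bin index in 1..n_bins (fixed bin count)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1] * len(values)
    width = (hi - lo) / n_bins
    # The maximum intensity would land in bin n_bins + 1, so clamp it.
    return [min(int((v - lo) / width) + 1, n_bins) for v in values]

# Hypothetical ROI intensities, for illustration only.
roi = [10, 12, 15, 20, 30, 50, 80, 90]
bins = discretise_fixed_bin_count(roi, n_bins=4)
normalised = zscore_normalise(roi)
print(bins)
```

Discretization of this kind is usually applied before texture features are computed, so that features from different centres are derived from a comparable number of grey levels.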
Image segmentation is a distinctive feature of radiomics. The methods of image segmentation generally include manual segmentation and semi-automatic segmentation [21].
Feature extraction is a technical step in the pipeline of radiomics, which is implemented in software such as MATLAB. The essence of radiomics is to extract, from images, high-throughput features that connect medical images with clinical endpoints. Because feature extraction is affected by the algorithm, methodology, and software parameter settings, these details must be reported in the article [22][23]. The current radiomics pipeline typically incorporates approximately 50–5000 quantitative features, and this number is expected to increase [24].
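As a minimal sketch of what a few first-order statistics features look like in practice, the following computes mean, variance, energy, and histogram entropy from the discretized intensities of a segmented region. The function name and the toy input are hypothetical, and real radiomics toolkits compute many more features with carefully specified definitions.

```python
import math
from collections import Counter

def first_order_features(intensities):
    """Compute a handful of first-order features from ROI intensities."""
    n = len(intensities)
    mean = sum(intensities) / n
    variance = sum((v - mean) ** 2 for v in intensities) / n
    energy = sum(v ** 2 for v in intensities)
    # Shannon entropy of the intensity histogram, in bits.
    counts = Counter(intensities)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "variance": variance, "energy": energy, "entropy": entropy}

# Hypothetical discretized ROI with four equally frequent grey levels.
features = first_order_features([1, 1, 2, 2, 3, 3, 4, 4])
print(features)
```

With four equally frequent grey levels the entropy is 2 bits, the maximum for a four-level histogram; a more homogeneous region would give a lower value.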
The process of modelling entails finding the best algorithm to link the selected image features with the clinical endpoints. Supervised and unsupervised learning are common strategies [14][25]. The modelling strategy has been proven to affect the performance of the model [26].
3. The Principle of DL
For a better understanding of DL, it is necessary to clarify the two terms of AI and machine learning, which are often accompanied by and confused with DL [27] (Figure 3). The concept of AI was first proposed by John McCarthy, who defined it as the science and engineering of intelligent machines [28]. In 1956, the AI field was first formed in a Dartmouth College seminar [29]. Currently, the content of AI has become much richer to include knowledge representation, natural language processing, visual perception, automatic reasoning, machine learning, intelligent robots, automatic programming, etc. The term AI has become an umbrella term [30]. Machine learning is a technology used to realize AI. Its core idea is to use algorithms to parse and learn from data, then make decisions and predictions about events in the real world [31], which is different from traditional software programs that are hard-coded to solve specific tasks [32]. The algorithm categories include supervised learning algorithms, such as classification and regression methods [33], unsupervised learning algorithms, such as cluster analysis [34], and semi-supervised learning algorithms [35]. DL is an algorithm tool for machine learning [36]. It is derived from an artificial neural network (ANN), which simulates the mode of human brain processing information [37], and uses the gradient descent method and back-propagation algorithm to automatically correct its own parameters, making the network fit the data better [38][39]. Compared with the traditional ANN, DL has more powerful fitting capabilities owing to more neuron levels [40]. According to different scenarios, DL includes a variety of neural network models, such as convolutional neural networks (CNNs) with powerful image processing capabilities [41], recurrent neural networks (RNNs), which primarily process time-series samples [42][43], and deep belief networks (DBNs), which can deeply express the training data [44]. In recent years, CNN-based methods have gained popularity in the medical image analysis domain [36][45][46]. In the studies of NPC imaging using DL models, CNN was adopted in almost all studies.
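The way gradient descent and back-propagation correct a network's parameters can be sketched on the smallest possible case, a single sigmoid neuron. This is an illustrative toy under stated assumptions (cross-entropy loss, batch updates, a logical-OR dataset, and hypothetical function names), not the training procedure of any network discussed in this entry.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(samples, labels, lr=0.5, epochs=200):
    """Fit one sigmoid neuron with batch gradient descent.
    Returns the weights, the bias, and the mean loss per epoch."""
    w = [0.0] * len(samples[0])
    b = 0.0
    n = len(samples)
    losses = []
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        loss = 0.0
        for x, y in zip(samples, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            # Back-propagated error: for cross-entropy loss with a sigmoid
            # output, the derivative with respect to the pre-activation is p - y.
            err = p - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Gradient descent step: move parameters against the mean gradient.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
        losses.append(loss / n)
    return w, b, losses

X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0, 1, 1, 1]  # logical OR, a toy target
w, b, losses = train_neuron(X, y)
print(losses[0], losses[-1])
```

A deep network repeats the same idea layer by layer: the error is propagated backwards through each layer by the chain rule, and every weight is nudged in the direction that lowers the loss.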
Figure 3. Relationship between artificial intelligence, machine learning, neural network, and deep learning. MLP: multilayer perceptron; CNN: convolutional neural network; RNN: recurrent neural network; DBN: deep belief network; GAN: generative adversarial network.
There are four key ideas behind CNNs that take advantage of the properties of natural signals: local connections, shared weights, pooling, and the use of many layers [37]. The structure of a CNN, which is mainly composed of an input layer, hidden layer, and output layer, is shown in Figure 4.
Figure 4. Image processing principle of a convolutional neural network.
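Three of the four ideas (local connections, shared weights, and pooling) can be seen in a small sketch of one convolution followed by max pooling. The image, the 2×2 edge kernel, and the function names are hypothetical and chosen only for illustration; a real CNN learns its kernels and stacks many such layers.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation. Each output value depends only on a local
    patch (local connections), and the same kernel is reused at every
    position (shared weights)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]

# Toy 5x5 image containing a bright vertical stripe.
image = [[0, 0, 1, 1, 0] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to left-to-right intensity increases
feature_map = conv2d(image, edge_kernel)
pooled = max_pool2d(feature_map)
print(feature_map[0], pooled)
```

The feature map is positive at the left edge of the stripe and negative at the right edge, and pooling keeps the strongest response in each 2×2 window while halving the spatial size.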
Because deep learning and radiomics rest on different principles, they differ in the tasks they suit and in the advantages of their implementation. Radiomics requires manual segmentation of the lesion area to capture radiomic features, so it is more often used for diagnosis prediction, assessment of tumour metastasis, and prediction of therapeutic effect. Deep learning models are often based on the whole image, which contains information on the relationship between the tumour and the surrounding tissues. Therefore, image synthesis, lesion detection, prognosis prediction, and image segmentation are more commonly regarded as tasks suited to deep learning methods. Because the input of most deep learning tasks is a full image, which also contains noise from around the lesion, deep learning models have so far not performed as well as radiomics on the same dataset. However, radiomics retains the fundamental disadvantage that the region of interest must be defined manually, which demands considerable human labour. Deep learning does not require this step, so the datasets available for deep learning tasks can be much larger than those for radiomics tasks. In addition, with the rapid development of deep neural network algorithms, the performance of deep learning is gradually improving and now exceeds that of radiomics in many tasks.
5. Studies Based on Radiomics
5.1. Prognosis Prediction
Prognosis prediction includes tumour risk stratification and recurrence/progression prediction. Among the 31 radiomics-based studies retrieved, 17 were on this topic (Table 1).
Table 1. Studies predicting the prognosis of nasopharyngeal carcinoma (NPC) using radiomics.
| Author, Year, Reference | Image | Sample Size (Patient) | Feature Selection | Modeling | Model Evaluation |
| --- | --- | --- | --- | --- | --- |
| Zhang, B. (2017) [13] | MRI | 108 | LASSO | CR, nomograms, calibration curves | C-index 0.776 |
| Zhang, B. (2017) [47] | MRI | 110 | L1-LOG, L1-SVM, RF, DC, EN-LOG, SIS | L2-LOG, KSVM, AdaBoost, LSVM, RF, Nnet, KNN, LDA, NB | AUC 0.846 |
| Zhang, B. (2017) [48] | MRI | 113 | LASSO | RS | AUC 0.886 |
| Ouyang, F.S. (2017) [49] | MRI | 100 | LASSO | RS | HR 7.28 |
| Lv, W. (2019) [50] | PET/CT | 128 | Univariate analysis with FDR, SC > 0.8 | CR | C-index 0.77 |
| Zhuo, E.H. (2019) [51] | MRI | 658 | Entropy-based consensus clustering method | SVM | C-index 0.814 |
| Zhang, L.L. (2019) [52] | MRI | 737 | RFE | CR and nomogram | C-index 0.73 |
| Yang, K. (2019) [53] | MRI | 224 | LASSO | CR and nomogram | C-index 0.811 |
| Ming, X. (2019) [54] | MRI | 303 | Non-negative matrix factorization | Chi-squared test, nomogram | C-index 0.845 |
| Zhang, L. (2019) [55] | MRI | 140 | LR-RFE | CR and nomogram | C-index 0.74 |
| Mao, J. (2019) [56] | MRI | 79 | Univariate analyses | CR | AUC 0.825 |
| Du, R. (2019) [57] | MRI | 277 | Hierarchical clustering analysis, PR | SVM | AUC 0.8 |
| Xu, H. (2020) [58] | PET/CT | 128 | Univariate CR, PR > 0.8 | CR | C-index 0.69 |
| Shen, H. (2020) [59] | MRI | 327 | LASSO, RFE | CR, RS | C-index 0.874 |
| Bologna, M. (2020) [60] | MRI | 136 | Intra-class correlation coefficient, SCC > 0.85 | CR | C-index 0.72 |
| Feng, Q. (2020) [61] | PET/MR | 100 | LASSO | CR | AUC 0.85 |
| Peng, L. (2021) [62] | PET/CT | 85 | W-test, Chi-square test, PR, RA | SFFS coupled with SVM | AUC 0.829 |
Least absolute shrinkage and selection operator (LASSO), L1-logistic regression (L1-LOG), L1-support vector machine (L1-SVM), random forest (RF), distance correlation (DC), elastic net logistic regression (EN-LOG), sure independence screening (SIS), L2-logistic regression (L2-LOG), kernel support vector machine (KSVM), linear-SVM (LSVM), adaptive boosting (AdaBoost), neural network (Nnet), K-nearest neighbour (KNN), linear discriminant analysis (LDA), and naive Bayes (NB).
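Many of the studies in Table 1 report a concordance index (C-index). As a minimal sketch of what this number measures, the following computes Harrell's C-index on a toy cohort: among all comparable patient pairs, it is the share in which the patient with the earlier observed event was assigned the higher risk. The function name and the data are hypothetical, and ties in follow-up time are simply ignored here.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index. events[i] is 1 for an observed event, 0 if censored."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical follow-up times (months), event indicators, and model risks.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.1, 0.2]
c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the C-index values in Table 1 should be read.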
5.2. Assessment of Tumour Metastasis
The authors in [63] developed an MRI-based radiomics nomogram for the differential diagnosis of cervical spine lesions and metastasis after radiotherapy. A total of 279 radiomic features were extracted from the enhanced T1-weighted MRI, and eight radiomic features were selected using LASSO to establish a classifier model that obtained an AUC of 0.72 with the validation set.
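LASSO selects features by shrinking the coefficients of uninformative features to exactly zero. The sketch below shows the mechanism with a plain coordinate-descent solver for the objective (1/2n)·‖y − Xw‖² + α‖w‖₁. The solver, the toy design matrix, and the value of α are illustrative assumptions and do not reproduce the settings of [63].

```python
def soft_threshold(rho, alpha):
    """Shrink rho towards zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 one coefficient at a time."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0
            z = 0.0
            for row, target in zip(X, y):
                # Prediction using every feature except feature j.
                pred_without_j = sum(w[k] * row[k] for k in range(d) if k != j)
                rho += row[j] * (target - pred_without_j)
                z += row[j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w

# Toy data: three mutually orthogonal features; the target depends strongly on
# feature 0, very weakly on feature 1, and not at all on feature 2.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.05, 1.95, -1.95, -2.05]
w = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print(w, selected)
```

Only feature 0 survives the penalty; in a radiomics study the same effect reduces hundreds of extracted features to the handful that enter the final model.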
In [64], the authors explored whether radiomic features derived from recurrent and non-recurrent regions within the tumour differ. Seven histogram features and 40 texture features were extracted from the MRI images of 14 patients with T4NxM0 NPC. The authors reported that seven features differed significantly between the recurrent and non-recurrent regions.
The 2021 study in [62], which was introduced in the section on prognosis prediction, also established a model for the assessment of tumour metastasis. The best AUC for predicting tumour metastasis was 0.829 (Table 2).
Table 2. Studies for assessing tumour metastasis using radiomics.
| Author, Year, Reference | Image | Sample Size | Feature Selection | Modeling | Model Evaluation |
| --- | --- | --- | --- | --- | --- |
| Zhang, L. (2019) [65] | MRI | 176 | LASSO | LR | AUC 0.792 |
| Zhong, X. (2020) [63] | MRI | 46 | LASSO | Nomogram | AUC 0.72 |
| Akram, F. (2020) [64] | MRI | 14 | Paired t-test and W-test | Shapiro-Wilk normality tests | p < 0.001 |
| Zhang, X. (2020) [66] | MRI | 238 | MRMR combined with 0.632+ bootstrap algorithms | RF | AUC 0.845 |
| Peng, L. (2021) [62] | PET/CT | 85 | W-test, PR, RA, Chi-square test | SFFS coupled with SVM | AUC 0.829 |
Maximum relevance minimum redundancy (MRMR).
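Most classifiers in these tables are evaluated by the area under the receiver operating characteristic curve (AUC). A minimal sketch, assuming binary labels and continuous model scores, computes the AUC through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name and the data are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    tied scores count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos for q in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = metastasis) and predicted probabilities.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

Here five of the six positive-negative pairs are ordered correctly, so the AUC is 5/6; the quadratic pair loop is fine for small cohorts such as those in Table 2.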
5.3. Tumour Diagnosis
Lv [67] established a diagnostic model to distinguish NPC from chronic nasopharyngitis using logistic regression with leave-one-out cross-validation. A total of 57 radiomic features were extracted from the PET/CT images of 106 patients, and AUCs between 0.81 and 0.89 were reported.
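Leave-one-out cross-validation holds out one patient at a time, trains on the rest, and tests on the held-out patient. The sketch below shows the loop on toy data. To stay dependency-free it uses a nearest-centroid rule as a stand-in for the logistic regression used in [67], and all names and values are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, lab in zip(train_x, train_y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab], x))

def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, fit on the others, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)

# Hypothetical two-feature vectors: label 0 = nasopharyngitis, 1 = NPC.
xs = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [3.0, 3.2], [3.1, 2.9], [2.9, 3.0]]
ys = [0, 0, 0, 1, 1, 1]
accuracy = leave_one_out_accuracy(xs, ys)
print(accuracy)
```

Because every patient serves as a test case exactly once, the scheme makes the most of small cohorts, at the cost of fitting the model as many times as there are patients.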
In [68], 76 patients were enrolled, including 41 with local recurrence and 35 with inflammation, as confirmed by pathology. A total of 487 radiomic features were extracted from the PET images. The performance was investigated for 42 cross-combinations derived from six feature selection methods and seven classifiers. The authors concluded that diagnostic models based on radiomic features showed higher AUCs (0.867–0.892) than traditional clinical indicators (AUC = 0.817).
5.4. Prediction of Therapeutic Effect
In [69], 108 patients with advanced NPC were included to establish the dataset. The ANOVA/Mann–Whitney U test, correlation analysis, and LASSO were used to select texture features, and multivariate logistic regression was used to establish a predictive model for the early response to neoadjuvant chemotherapy. Finally, an AUC of 0.905 was obtained for the validation cohort.
5.5. Predicting Complications
In [70], a radiomics model for predicting early acute xerostomia during radiation therapy was established based on CT images. Ridge CV and recursive feature elimination were used for feature selection, whereas linear regression was used for modelling. However, the study’s test cohort included only four patients with NPC, so the evidence remains insufficient despite the reported precision of 0.922.
The authors in [71] established three radiomics models for the early diagnosis of radiation-induced temporal lobe injury based on the MRIs of 242 patients with NPC. The feature selection in the study was achieved by the Relief algorithm, which is different from other studies. The random forest algorithm was used to establish three early diagnosis models. The AUCs of the models in the test cohort were 0.830, 0.773, and 0.716, respectively.
6. Studies Based on DL
6.1. Prognosis Prediction
Yang [72] established a weakly-supervised, deep-learning network using an improved residual network (ResNet) with three input channels to achieve automated T staging of NPC. The images of multiple tumour layers of patients were labelled uniformly. The model output a predicted T-score for each slice and then selected the highest T-score slice for each patient to retrain the model to update the network weights. The accuracy of the model in the validation set was 75.59%, and the AUC was 0.943.
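The slice-selection step described above, in which the highest-scoring slice per patient is kept for retraining, can be sketched as follows. The data structure, patient identifiers, and scores are hypothetical, and the sketch covers only the selection, not the network in [72] or its retraining.

```python
def select_top_slices(slice_scores):
    """For each patient, return (slice_index, score) of the slice with the
    highest predicted T-score."""
    return {
        patient: max(enumerate(scores), key=lambda pair: pair[1])
        for patient, scores in slice_scores.items()
    }

# Hypothetical per-slice T-scores predicted by a model for two patients.
predicted = {
    "patient_a": [1.2, 2.8, 2.1],
    "patient_b": [3.4, 3.9, 3.1, 2.0],
}
top = select_top_slices(predicted)
print(top)
```

Selecting one representative slice per patient is what makes the scheme weakly supervised: only a patient-level label is needed, and the model itself decides which slice best supports it.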
In [73], a DL model based on ResNet was established to predict the distant metastasis-free survival of locally advanced NPC. In contrast to the studies published in 2020, the authors of this study removed the background noise and segmented the tumour region as the input image of the DL network. Finally, the optimal AUC of the multiple models combined with the clinical features was 0.808 (Table 7).
6.2. Image Synthesis
Tie [74] used a multichannel multipath conditional generative adversarial network to generate CT images from MRI. The network was developed based on a 5-level residual U-Net with an independent feature extraction network. The highest structural similarity index of the network was 0.92.
In [75], a generative adversarial network was used to generate CT images based on MRIs to guide the planning of radiotherapy for NPC. The 2%/2 mm gamma passing rates of the generated CT images reached 98.68% (Table 8).
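The structural similarity index used to compare synthetic and real CT images combines agreement in mean intensity, contrast, and structure. The following is a simplified sketch that treats the whole image as a single window; published evaluations normally average the index over local sliding windows. The constants follow the common defaults (0.01 and 0.03 times the data range), and the function name and intensities are hypothetical.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equally sized images given as flat lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Small constants that stabilise the ratio when means or variances are near zero.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

# Hypothetical intensities of a real CT patch and its synthetic counterpart.
real_ct = [100, 120, 130, 90, 80, 110]
synthetic_ct = [102, 118, 133, 88, 83, 108]
score = global_ssim(real_ct, synthetic_ct)
print(score)
```

Identical images give an index of 1, and the value falls as the synthetic image drifts from the reference in brightness, contrast, or spatial pattern.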
6.3. Detection and/or Diagnosis
Two similar studies, [76][77], based on pathological images were conducted. The authors in [76] used 1970 whole slide pathological images of 731 cases: 316 cases of inflammation, 138 cases of lymphoid hyperplasia, and 277 cases of NPC. The second study used 726 nasopharyngeal biopsies consisting of 363 images of NPC and 363 of benign nasopharyngeal tissue [77]. In [76], Inception-v3 was used to build the classifier, while ResNeXt, a deep neural network with a residual and inception architecture, was used to build the classifier in [77]. The AUCs obtained in [76][77] were 0.936 and 0.985, respectively.
6.4. Segmentation
Radiotherapy is the most important treatment for NPC. However, because radiotherapy itself causes collateral damage, the nasopharyngeal tumour volume and the organs at risk must be accurately delineated in the images. Therefore, segmentation is particularly relevant to DL in NPC imaging.
Li [78] proposed and trained a U-Net to automatically segment and delineate tumour targets in patients with NPC. A total of 502 patients from a single medical centre were included, and CT images were collected and pre-processed as a dataset. The trained U-Net finally obtained Dice similarity coefficients (DSCs) of 0.659 for lymph nodes and 0.74 for primary tumours in the testing set.
Bai [79] fine-tuned a pre-trained ResNeXt-50 U-Net, which uses the recall preserved loss, to produce a rough segmentation of the gross tumour volume of NPC. The well-trained ResNeXt-50 U-Net was then applied to refine the boundary of the gross tumour volume at a fine-grained level. The study obtained a DSC of 0.618 for online testing (Table 10).
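The segmentation studies above report the Dice similarity coefficient (DSC), which measures the overlap between a predicted mask and a reference mask as 2|A ∩ B| / (|A| + |B|). A minimal sketch, assuming both masks are flattened to 0/1 sequences of equal length (the function name and masks are hypothetical):

```python
def dice_coefficient(pred_mask, true_mask):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal length."""
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Hypothetical flattened masks: 1 marks a voxel labelled as tumour.
predicted = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
dsc = dice_coefficient(predicted, reference)
print(round(dsc, 3))
```

Two of the three voxels in each mask overlap, giving a DSC of about 0.667, which is in the same range as the values reported for NPC tumour and lymph node segmentation.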
7. Deep Learning-Based Radiomics
DL has shown great potential to dominate the field of image analysis. It has achieved good results in region of interest (ROI) delineation [80] and feature extraction [81][82], both of which are steps in the radiomics pipeline. Once model training is complete, DL can analyse images automatically, which is one of its greatest strengths compared with radiomics. Many researchers have introduced DL into radiomics (termed deep learning-based radiomics, DLR) and achieved encouraging results [83]. This may be a trend for the application of AI tools in medical imaging in the future.
7.1. Studies Based on Deep Learning-Based Radiomics (DLR)
In [84], Zhang combined the clinical features of patients with NPC, radiomic features based on MRIs, and a deep convolutional neural network (DCNN) model based on pathological images to construct a multi-scale nomogram to predict the failure-free survival of patients with NPC. Compared with the clinical model, the nomogram showed a consistent, significant improvement in predicting treatment failure in the internal test (C-index: 0.828 vs. 0.602, p < 0.050) and external test (C-index: 0.834 vs. 0.679, p < 0.050) cohorts (Table 11).
8. Future Work
Research on radiomics and DL in NPC imaging has only started in recent years, so many issues still need further research: linking NPC imaging features with tumour genes/molecules to promote the development of precision medicine through non-invasive, rapid, and low-cost approaches; using multi-stage dynamic imaging to assess tumour response to drugs/radiotherapy and to predict the risk of radiation therapy to surrounding vital organs in order to guide treatment decisions; and bridging the gap between the AI tools established in studies and clinical applications. In addition, current studies based on nasal endoscopic images and pathological images are lacking. In particular, accurate and rapid screening of NPC is of great significance, considering that endoscopic images are usually the primary screening images for most patients. Further high-quality research in this regard is needed. Finally, NPC still lacks large-scale, comprehensive, and fully labelled datasets comparable to those available for lung and brain tumours. The establishment of large-scale public datasets is an important task for the future.
This entry is adapted from the peer-reviewed paper 10.3390/diagnostics11091523