Historically, brain tumor classification has been solely based on histopathological features
[5], whereas the latest editions incorporate genetic and epigenetic information, such as molecular markers (e.g., IDH mutation, 1p/19q codeletion, etc.) and DNA methylation profiles
[6][7]. The genetic and epigenetic makeups define the molecular signature, a “barcode” of the tumor, whose recognition is essential for clinical decision-making in the era of targeted therapies
[8]. Therefore, tissue sampling remains the gold standard for decoding the molecular landscape of most CNS tumors, especially for gliomas
[9]. Nevertheless, growing evidence has highlighted the powerful role of artificial intelligence in oncological neuroimaging through the extraction of quantitative information from routine radiological examinations
[10]. Alongside the molecular signature is the imaging signature, which offers complementary and ideally additional information for the characterization of the brain tumor, with a potential role in guiding the choice of the most appropriate therapy and clinical management
[11]. In this landscape, AI-assisted tools represent the bridge from precision diagnostics to precision therapeutics
[12].
AI can be defined as technology that mimics human cognitive processes, such as learning, reasoning, and problem-solving. Developed as a branch of computer science, present-day AI is a broad field of knowledge that welcomes contributions from different disciplines, such as statistics, informatics, and physics.
3. Lesion Detection and Differential Diagnosis
AI-powered tools can aid neuroradiologists in lesion detection and differential diagnosis.
Since gliomas are often diagnosed when they are large and symptomatic, the detection of glioma-like lesions on MRI may seem relatively trivial to an experienced neuroradiologist. Conversely, the early diagnosis of small brain metastases (BM) in oncological patients during follow-up is challenging, because sensitivity on MRI is variable, and many details of MRI acquisition can impact the performance
[19]. However, since stereotactic radiosurgery protocols and other therapeutic decisions are based on the number and location of even small metastases, early diagnosis is a real concern for neuroradiologists, given the high impact on the patient’s prognosis. For this reason, most of the computer-aided detection (CAD) tools available in the field of neuro-oncology focus primarily on the automated detection of brain metastases.
The proper tuning of CAD tools is essential to ensure diagnostic accuracy, lowering the risk of overdiagnosis, overtreatment, and unreasonable concern in patients
[20]. Generally speaking, if the detection threshold is set too low, the model can suffer from a high false-positive rate, for example, flagging vascular structures as small metastases; on the other hand, when the threshold is set too high, the model can fail to detect small (in particular, <3 mm) lesions
[21].
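This trade-off can be illustrated with a minimal sketch (the candidate scores and labels below are hypothetical, not from any cited study): lowering the score threshold recovers every lesion at the cost of vessel-like false positives, while raising it suppresses false positives but misses lesions.

```python
import numpy as np

def detection_metrics(scores, labels, threshold):
    """Sensitivity and false-positive count of a CAD candidate
    classifier at a given score threshold (toy illustration)."""
    preds = scores >= threshold
    tp = np.sum(preds & (labels == 1))
    fp = np.sum(preds & (labels == 0))
    fn = np.sum(~preds & (labels == 1))
    sensitivity = tp / (tp + fn)
    return sensitivity, int(fp)

# Hypothetical candidate scores: 1 = true metastasis, 0 = mimic (e.g., vessel)
scores = np.array([0.95, 0.80, 0.75, 0.60, 0.55, 0.40, 0.30, 0.20])
labels = np.array([1,    1,    0,    1,    0,    1,    0,    0  ])

# A permissive threshold catches every lesion but admits vessel-like mimics ...
sens_low, fp_low = detection_metrics(scores, labels, 0.25)
# ... while a strict threshold suppresses false positives but misses lesions.
sens_high, fp_high = detection_metrics(scores, labels, 0.70)
print(sens_low, fp_low)    # 1.0 3
print(sens_high, fp_high)  # 0.5 1
```

In practice, the operating point is chosen on a validation set so that the residual false positives are easy for the radiologist to dismiss.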
Park et al. have recently demonstrated how DL-based models significantly increase the diagnostic accuracy in the detection of small lesions by exploiting the integration of large amounts of MRI data: in particular, a DL model that combines 3D Black Blood and 3D GRE MRI sequences outperformed a DL model using only 3D GRE sequences in the detection of brain metastases (
p < 0.001), yielding a sensitivity of 93.1% versus 76.8%
[22].
Solitary BM and GBM can exhibit quite similar MRI features, such as post-contrast ring enhancement, necrotic core, and large peritumoral edema presenting with high signal on T2-weighted and FLAIR images
[23]. Differentiating these two entities is essential, considering they are the most common brain tumors in the adult population and have quite different treatments
[23]. Thus, several researchers have focused on this topic, showing the advantages of multiparametric MRI
[24][25] and, more recently, evaluating the performances of different AI-based classifiers compared to expert neuroradiologists.
For example, Swinburne et al. investigated whether an ML algorithm incorporating advanced MRI (advMRI) data from 26 patients could reliably differentiate between GBMs (n = 9), BM (n = 9), and primary central nervous system lymphoma (PCNSL) (n = 8). Their multilayer perceptron model performed well in discriminating between the three pathological classes. Using a leave-one-out cross-validation strategy, the model achieved a maximum accuracy of 69.2%, intermediate to that of two human readers (65.4% and 80.8%). However, applying the same model to cases where the human readers disagreed on the diagnosis increased the rate of incorrect diagnoses by 19.2%. No evaluation with an independent test cohort was carried out, which represents the main limitation of this study
[26].
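Leave-one-out cross-validation itself is straightforward to sketch: each case is held out once and predicted by a model fit on the remainder. The toy example below uses synthetic feature vectors and a 1-nearest-neighbour rule as a stand-in classifier (the cited study used a multilayer perceptron).

```python
import numpy as np

def loo_cv_accuracy(X, y):
    """Leave-one-out CV: each sample is held out once and predicted
    by a 1-nearest-neighbour rule fit on the remaining samples."""
    n = len(y)
    correct = 0
    for i in range(n):
        train_idx = [j for j in range(n) if j != i]          # all but sample i
        dists = np.linalg.norm(X[train_idx] - X[i], axis=1)  # distances to i
        pred = y[train_idx][np.argmin(dists)]                # closest sample's label
        correct += int(pred == y[i])
    return correct / n

# Hypothetical "radiomic" feature vectors for two well-separated classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (10, 4)), rng.normal(3, 0.3, (10, 4))])
y = np.array([0] * 10 + [1] * 10)
print(loo_cv_accuracy(X, y))  # 1.0 on this cleanly separable toy set
```

With only 26 patients, as in the study above, leave-one-out makes maximal use of the data but yields a high-variance accuracy estimate, which is why an independent test cohort remains important.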
Since contrast enhancement and local infiltration of white matter bundles are key features of high-grade gliomas (HGGs)
[27], most ML and DL algorithms exploit radiomic features extracted from post-contrast 3D T1-weighted images or from diffusion-weighted imaging (DWI) and related techniques, such as diffusion tensor imaging (DTI).
For example, a recent study based on DTI metrics, especially fractional anisotropy (FA) and ADC values, demonstrated that the peritumoral alterations differ between these two entities, with GBM showing greater heterogeneity owing to its infiltrative and aggressive nature
[1][28].
The combination of radiomic and non-radiomic features (clinical and qualitative imaging) has in some cases been shown to outperform radiomic features alone. For example, a study by Han et al. established the importance of adding clinically relevant data (e.g., age and sex) and routine radiological indices (tumor size, edema ratio, and location) when building an AI-driven model to differentiate between GBM and BM from the lungs and other sites using a logistic regression model; the integrated model was superior to the model based on radiomic features alone
[29].
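The integration described above can be sketched as simple feature concatenation before fitting a classifier. The example below is a toy logistic regression fitted by gradient descent on synthetic data; all feature names and values are hypothetical, not taken from Han et al.

```python
import numpy as np

def train_logreg(X, y, lr=0.1, steps=2000):
    """Plain logistic regression fitted by gradient descent on the log-loss."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # add intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))       # predicted P(y = 1)
        w -= lr * Xb.T @ (p - y) / len(y)       # gradient step on log-loss
    return w

def predict(X, w):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w >= 0).astype(int)            # 0.5 probability threshold

rng = np.random.default_rng(1)
n = 40
radiomic = rng.normal(0, 1, (n, 3))             # hypothetical texture features
age = rng.normal(60, 10, n)                     # clinical covariate
sex = rng.integers(0, 2, n).astype(float)       # clinical covariate
# Hypothetical label driven by both a radiomic feature and age
y = ((radiomic[:, 0] + 0.05 * (age - 60)) > 0).astype(int)

# Integrated model: radiomic features concatenated with clinical data
X = np.hstack([radiomic, age[:, None], sex[:, None]])
X = (X - X.mean(0)) / X.std(0)                  # standardize before fitting
w = train_logreg(X, y)
acc = (predict(X, w) == y).mean()
print(acc)  # high on this linearly separable toy problem
```

The key design point is that clinical covariates enter the model on the same footing as radiomic features after standardization, letting the classifier weight both sources of information.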
BM can be the first manifestation of a still unknown extracerebral malignancy; therefore, ML tools have been applied in the clinical scenario in which patients are found with brain metastases without a known primary site of cancer
[30]. Metastases coming from different primary cancers show differences in the local environments and consequently exhibit different radiomic features
Ortiz-Ramón et al. obtained good results in differentiating metastases from lung cancer, melanoma, and breast cancer by implementing an AI-driven model based on two- and three-dimensional texture analyses of post-contrast T1-weighted sequences within a nested cross-validation framework, quantizing the images with multiple numbers of gray levels to evaluate the influence of quantization
[31].
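Gray-level quantization and a co-occurrence (texture) matrix, the building blocks of such an analysis, can be sketched as follows. This is a minimal illustration on a toy 4×4 region with a single horizontal offset; real pipelines use multiple offsets, 3D neighbourhoods, and many derived statistics.

```python
import numpy as np

def quantize(image, levels):
    """Rescale an image to a fixed number of gray levels (0 .. levels-1),
    as done before texture-matrix computation."""
    lo, hi = image.min(), image.max()
    q = (image - lo) / (hi - lo) * levels
    return np.clip(q.astype(int), 0, levels - 1)

def glcm_horizontal(q, levels):
    """Gray-level co-occurrence matrix for horizontally adjacent pixels."""
    glcm = np.zeros((levels, levels), dtype=int)
    for row in q:
        for a, b in zip(row[:-1], row[1:]):
            glcm[a, b] += 1
    return glcm

# Hypothetical 4x4 "ROI" with arbitrary intensities
roi = np.array([[ 10,  20,  20, 200],
                [ 15,  25, 180, 210],
                [ 12, 170, 190, 205],
                [160, 175, 195, 220]], dtype=float)

q = quantize(roi, levels=4)        # coarse quantization: 4 gray levels
print(q)
print(glcm_horizontal(q, 4))
```

The number of gray levels controls the size and sparsity of the co-occurrence matrix, which is why quantization choices can measurably change the resulting texture features, as the study above investigated.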
Another challenging differential diagnosis is between GBM and PCNSL since these entities may show similar appearances on conventional MRI, especially when GBMs do not present a necrotic core and the enhancement is not confined to the peripheral area but is more homogeneous
Generally, PCNSLs are treated with chemotherapy and whole-brain radiotherapy, while GBM commonly undergoes surgical resection followed by chemo-radiotherapy
[34]; therefore, a proper diagnosis is mandatory.
A recent study by Stadlbauer et al.
[35] analyzed the effectiveness of a multiclass ML algorithm integrating radiomic features extracted from advanced MRI (including axial diffusion-weighted imaging and gradient echo dynamic susceptibility contrast (GE-DSC) perfusion sequences) and from a physiological MRI protocol comprising vascular architecture mapping (VAM) and quantitative blood-oxygenation-level-dependent (qBOLD) imaging to classify the most common enhancing brain tumors (GBM, anaplastic glioma, meningioma, PCNSL, or brain metastasis). When compared to the human readers, the AI-driven algorithm achieved a better overall performance, with superior accuracy (0.875 vs. 0.850), precision (0.862 vs. 0.798), F-score (0.774 vs. 0.740), and AUROC (0.886 vs. 0.813); however, the radiologists demonstrated higher sensitivity (0.767 vs. 0.750) and specificity (0.925 vs. 0.902).
The DL paradigm has evolved in recent years into a powerful engine for processing large volumes of data and has replaced many conventional algorithms in the field of image analysis as well. Furthermore, the development of open-source web platforms for programming DL models has expanded the frontiers of collaborative research in the development and validation of new DL-based tools. A good example is provided by Ucuzal et al., who developed web-based DL software for the differential diagnosis of brain tumors using the popular Python programming language and the dedicated Keras library. Their software accepts images in multiple formats, such as .jpeg, .jpg, and .png, and can classify the input MRI image datasets into three diagnostic classes: meningioma, glioma, and pituitary tumors
[36].
CNNs have a significant drawback in that they underutilize spatial relationships between the tumor and its surroundings, which is especially detrimental for classifying tumors. K. Adu and Y. Yu recently proposed a dilated capsule network model (CapsNet model), which is an extension of the traditional CNN, to address this issue
[37]. In this model, the “routing by agreement” layer of the dilated CapsNet architecture takes the place of the pooling layer of the conventional CNN architecture
[37]. Afshar et al. proposed a modified CapsNet architecture for classifying brain tumors that incorporates additional inputs into its pipeline from tissues surrounding the tumor, without detracting from the primary target, yielding satisfactory results
[38].
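The “routing by agreement” mechanism referenced above can be sketched in a few lines. This follows the standard dynamic-routing formulation (softmax coupling coefficients, squash non-linearity, iterative agreement updates) with hypothetical capsule dimensions, not the exact architecture of the cited papers.

```python
import numpy as np

def squash(v, axis=-1, eps=1e-9):
    """Capsule non-linearity: keeps vector orientation, maps norm into [0, 1)."""
    sq = np.sum(v ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * v / np.sqrt(sq + eps)

def routing_by_agreement(u_hat, iterations=3):
    """Dynamic routing between capsule layers.
    u_hat: (n_in, n_out, dim) prediction vectors from lower capsules."""
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                               # routing logits
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coeffs
        s = (c[:, :, None] * u_hat).sum(axis=0)               # weighted sum per output
        v = squash(s)                                         # output capsule vectors
        b += np.einsum('iod,od->io', u_hat, v)                # agreement update
    return v

rng = np.random.default_rng(0)
u_hat = rng.normal(size=(8, 3, 4))    # 8 lower capsules, 3 upper, 4-dim poses
v = routing_by_agreement(u_hat)
print(v.shape)                        # (3, 4)
print(np.linalg.norm(v, axis=1))      # each norm < 1 by construction
```

Unlike pooling, which discards the locations of activations, routing preserves pose information by letting lower capsules vote for upper capsules whose outputs agree with their predictions, which is the property that helps capture spatial relationships between a tumor and its surroundings.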
Most AI-based classification algorithms target supratentorial tumors. In the posterior fossa, on the other hand, the two most common lesions in the adult population are hemangioblastoma, a benign tumor of vascular origin with a good survival rate, and brain metastases
[39][40]. Obviously, discrimination between these entities is crucial for patient management since, once again, the therapeutic approach differs. In this field, the role of AI-based tools is not yet well defined; however, a recent study attempted the differential diagnosis of intra-axial lesions of the posterior fossa using different radiomic algorithms (CNN, SVM, etc.), with promising results
[1][41].
In some cases, even the differentiation between tumoral and non-tumoral processes is not simple. Tumefactive multiple sclerosis lesions, infections, inflammatory diseases (paraneoplastic syndromes and autoimmune diseases), cortical dysplasia, and even stroke may be confused with tumoral processes, and an accurate differential diagnosis based only on the radiological appearance is impossible due to overlapping radiological features
[42].
For example, tumefactive multiple sclerosis is a great mimicker of HGG on conventional MRI. The use of an AI-assisted tool can help the neuroradiologist to improve the differential diagnosis
[1]: a recent study by Verma et al. achieved good results in differentiating GBMs, PCNSLs, and tumefactive multiple sclerosis lesions using in-house software called dynamic texture parameter analysis (DTPA), which incorporates the analysis of quantitative texture parameters extracted from dynamic susceptibility contrast-enhanced (DSCE) sequences
[43]. A more recent study by Han et al. evaluated the performance of different radiomic signature models in differentiating between low-grade glioma (LGG) and inflammation using radiomic features extracted from T1-weighted (T1WI) and T2-weighted (T2WI) MRI images. The features were chosen after a
t-test and statistical regression (LASSO algorithm) to develop three radiomic models based on T1WI, T2WI, and their combination (T1WI + T2WI), using four, eight, and five radiomic features, respectively. The T2WI and combination models achieved better diagnostic efficacy in both the primary cohort and the validation cohort, significantly outperforming radiologist assessments
[44].
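The first filtering step of such a pipeline, ranking features by a two-sample t statistic, can be sketched as below on synthetic “radiomic” features; the subsequent LASSO regression step is omitted for brevity.

```python
import numpy as np

def t_statistics(X_a, X_b):
    """Per-feature two-sample t statistic (Welch's form) between two groups."""
    ma, mb = X_a.mean(0), X_b.mean(0)
    va, vb = X_a.var(0, ddof=1), X_b.var(0, ddof=1)
    return (ma - mb) / np.sqrt(va / len(X_a) + vb / len(X_b))

rng = np.random.default_rng(0)
# Hypothetical radiomic features: only feature 0 truly differs between groups
glioma = rng.normal(0, 1, (30, 6))
glioma[:, 0] += 2.0                   # injected group difference on feature 0
inflam = rng.normal(0, 1, (30, 6))

t = t_statistics(glioma, inflam)
top = np.argsort(-np.abs(t))[:2]      # keep the 2 most discriminative features
print(top)                            # feature 0 ranks first here
```

Filtering by univariate statistics before a sparsity-inducing regression keeps the feature count small relative to the cohort size, reducing the risk of overfitting in radiomic models.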
4. Tumor Characterization
In the era of molecular therapies, diagnostic neuroimaging should guide the diagnosis and treatment planning of brain tumors through a non-invasive characterization of the lesion, sometimes also called “virtual biopsy”, based on radiomic and radiogenomic approaches
[11].
To date, most studies have challenged ML models to address very general classification tasks for brain tumors, such as differentiating between GBM and brain metastases
[45][46]. However, more recently, researchers focused on the development of AI-driven tools, aiming to recognize the radiological signature of the tumor to provide a comprehensive analysis of the grading, genomic and epigenomic landscape of cerebral gliomas, which is extremely useful for decision-making towards a personalized medicine perspective. Therefore, several studies have been published in recent years where AI algorithms are challenged in increasingly specific classification tasks, such as differentiation within different subgroups of gliomas, for example, low-grade gliomas (LGGs) compared to high-grade gliomas (HGGs)
[47][48]; isocitrate dehydrogenase (IDH) wild-type (IDH(−)) vs. IDH-mutated (IDH(+))
[49]; 1p/19q chromosomal arm deletion
[50]; and others.
Several studies have focused on glioma grading. For example, Cho et al. used a radiomics approach to test the performance of various ML classifiers in determining the grading of 285 glioma cases (210 HGG, 75 LGG) obtained from the Brain Tumor Segmentation 2017 Challenge. The researchers extracted a large set of radiomic features from routine brain MRI sequences, including T1-weighted, T1-weighted contrast-enhanced, T2-weighted, and FLAIR. Three supervised ML classifiers showed an average AUC of 0.9400 for training cohorts and 0.9030 (logistic regression 0.9010, support vector machine 0.8866, and random forest 0.9213) for test cohorts
[51].
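The AUC values reported throughout these studies are areas under the ROC curve, which can be computed directly from classifier scores via the rank-sum (Mann–Whitney) identity; a minimal sketch with hypothetical scores:

```python
import numpy as np

def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum identity:
    AUC = P(score of a random positive > score of a random negative),
    counting ties as 1/2."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Hypothetical classifier scores for HGG (1) vs. LGG (0) cases
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2])
labels = np.array([1,   1,   0,   1,   1,   0,   0  ])
print(auroc(scores, labels))  # 0.8333...
```

Because AUC is threshold-independent, it is the natural summary metric when, as in glioma grading, the operating threshold of the classifier is chosen later on clinical grounds.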
In another study, Tian et al. investigated the role of radiomics in differentiating grade II gliomas from grades III and IV; they extracted radiomic features from conventional, diffusion, and perfusion arterial spin labeling (ASL) MRI. After multiparametric MRI preprocessing, high-throughput texture and histogram features were derived from the patients’ volumes of interest (VOIs). The support vector machine (SVM) classifier then showed good accuracy/AUC: 96.8%/0.987 for classifying LGGs versus HGGs and 98.1%/0.992 for classifying grade III versus grade IV. Furthermore, they proved that texture features were more effective than histogram parameters for non-invasively grading gliomas
[52].
Mzoughi et al. proposed a fully automatic deep multi-scale 3D CNN architecture for classifying gliomas on MRI into low-grade and high-grade gliomas, using the whole volumetric contrast-enhanced T1-weighted MRI sequence. For effective training, they used a data augmentation technique. After data augmentation and proper validation, the proposed approach achieved 96.49% accuracy, confirming that adequate MRI pre-processing and data augmentation can lead to an accurate classification model when exploiting CNN-based approaches
[53].
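Data augmentation for volumetric MRI can be sketched with simple label-preserving geometric transforms; the example below applies random flips and in-plane 90-degree rotations to a synthetic volume (real pipelines often add elastic deformations and intensity shifts).

```python
import numpy as np

def augment_volume(vol, rng):
    """Simple label-preserving augmentations for a 3D MRI volume:
    random left-right flips and 90-degree in-plane rotations."""
    if rng.random() < 0.5:
        vol = vol[:, :, ::-1]                 # left-right flip
    k = rng.integers(0, 4)                    # 0-3 quarter turns in-plane
    vol = np.rot90(vol, k, axes=(1, 2))
    return vol

rng = np.random.default_rng(0)
volume = rng.normal(size=(16, 32, 32))        # hypothetical (slices, H, W) volume
augmented = [augment_volume(volume, rng) for _ in range(8)]  # 8 extra samples
print(all(a.shape == volume.shape for a in augmented))       # shapes preserved
```

Each augmented copy presents the network with a geometrically plausible variant of the same tumor, effectively enlarging small neuro-oncology datasets without new acquisitions.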
Chang et al. used CNNs for the differential diagnosis between IDH-mutant and IDH wild-type gliomas on conventional MRI imaging, achieving 92% accuracy; these results were in line with prior hypotheses based on visual assessment and underlying pathophysiology, as IDH wild-type lesions are characterized by more infiltrative and ill-defined borders. Furthermore, the authors found that nodular and heterogeneous contrast enhancement and “mass-like FLAIR edema” could aid in the prediction of MGMT methylation status, with up to 83% accuracy
[54].
In another study, Kim et al. aimed to evaluate the added value of radiomic features extracted from MRI DWI and perfusion sequences in the prediction of IDH mutation and tumor grading in LGGs. For the IDH mutation, the model trained with multiparametric features showed similar performance to the model based on conventional sequences, but in tumor grading, it showed higher performance. This trend was confirmed in the independent validation set, demonstrating that DWI features and especially the apparent diffusion coefficient (ADC) map play a significant role in tumor grading
[49].
In one of the first studies in the field, Akkus et al. presented a non-invasive method to predict 1p/19q chromosomal arm deletion from post-contrast T1- and T2-weighted MR images using a multi-scale CNN. They found that increased enhancement, infiltrative margins, and left frontal lobe predilection are associated with 1p19q codeletion with up to 93% accuracy
[50].
In a larger, recent retrospective study, Meng et al. specifically targeted ATRX status in 123 patients diagnosed with gliomas (World Health Organization grades II–IV) using radiomics analysis, showing that radiomic features derived from preoperative MRI facilitate the efficient prediction of ATRX status in gliomas, achieving an AUC for ATRX mutation (ATRX(−)) of 0.84 (95% CI: 0.63–0.91) on the validation set, with a sensitivity, specificity, and accuracy of 0.73, 0.86, and 0.79, respectively
[55].
In another retrospective study by Ren et al., the researchers focused on the non-invasive prediction of molecular status for both IDH1 mutation and ATRX expression loss in LGGs, exploiting a radiomic approach based on high-throughput multiparametric MRI radiomic features. An optimal feature subset was selected using a support vector machine (SVM) algorithm, and ROC curve analysis was employed to assess performance in identifying IDH1(+) and ATRX(−) status. Using 28 optimal texture features extracted from multiple MRI sequences, the SVM predictive model achieved excellent performance in terms of accuracy/AUC/sensitivity/specificity/PPV/NPV for the prediction of IDH1(+) (94.74%/0.931/100%/85.71%/92.31%/100%, respectively) and ATRX(−) within LGGs (91.67%/0.926/94.74%/88.24%/90.00%/93.75%)
[56].
Recently, some more ambitious studies have investigated the diagnostic accuracy of a radiomic approach in evaluating both the grading and the complete molecular profile of cerebral gliomas
[57]. For instance, Habould et al. integrated clinical and laboratory data into a completely automated segmentation-based radiomics tool for the prediction of molecular status (ATRX, IDH1/2, MGMT, and 1p19q co-deletion), also distinguishing low-grade from high-grade gliomas. The system provided an AUC (validation/test) of 0.981 ± 0.015/0.885 ± 0.02 for the grading task. The prediction of ATRX(−) status had the best results, with an AUC of 0.979 ± 0.028/0.923 ± 0.045, followed by the prediction of IDH1/2(+), with an AUC of 0.929 ± 0.042/0.861 ± 0.023, while the model showed only moderate results for the prediction of 1p19q and MGMT status
[58].
In a similar study, Shboul et al. performed a non-invasive analysis of 108 pre-operative LGGs using imaging features to predict the status of MGMT methylation, IDH mutations, 1p/19q co-deletion, ATRX mutation, and TERT mutations, achieving good accuracy with AUCs of 0.83 ± 0.04, 0.84 ± 0.03, 0.80 ± 0.04, 0.70 ± 0.09, and 0.82 ± 0.04, respectively
[59].
A recent study focused on the detailed analysis of the tumor landscape within HGGs, highlighting the outstanding potential of DL algorithms in the extraction of new imaging markers, otherwise impossible to evaluate visually or with traditional radiomics approaches. Calabrese et al. retrospectively analyzed preoperative MRI data from 400 patients with WHO grade 4 glioblastoma or astrocytoma, who underwent resection and genetic testing to assess the status of nine key biomarkers: hotspot mutations of IDH1 or TERT promoter, pathogenic mutations of TP53, PTEN, ATRX, or CDKN2A/B, MGMT promoter methylation, EGFR amplification, and combined aneuploidy of chromosomes 7 and 10. An AI-driven model was tested in the prediction of biomarker status from MRI data using radiomics features, DL-based CNN features, and a combination of both. The results showed that the combination of radiomics and CNN features from preoperative MRI yields improved non-invasive genetic biomarker prediction performance in patients with WHO grade 4 diffuse astrocytic gliomas
[60].