Radiomic and Radiogenomic Pipelines

Please note this is a comparison between Version 1 by Marwa Ismail and Version 2 by Jason Zhu.

Advances in artificial intelligence have greatly impacted the field of medical imaging and vastly improved the development of computational algorithms for data analysis. In the field of pediatric neuro-oncology, radiomics, the process of obtaining high-dimensional data from radiographic images, has been recently utilized in applications including survival prognostication, molecular classification, and tumor type classification. Similarly, radiogenomics, or the integration of radiomic and genomic data, has allowed for building comprehensive computational models to better understand disease etiology.

medulloblastoma
radiomics
risk stratification
radiogenomics

1. Segmentation of Pediatric Medulloblastoma (MB) Tumors

Radiation treatment planning in MB tumors requires careful tumor delineation. Similarly, treatment response assessment in MB requires tumor delineation to reliably compute measurements in two perpendicular planes (bidirectional or 2D). Automated segmentation tools could substantially augment treatment planning in pediatric MB. Recently, deep-learning architectures, including Fully Convolutional Networks and U-Net [1][2][3][4], have allowed for the development of reliable and fully automated segmentation approaches for various types of solid tumors, including adult brain tumors [3]. These approaches focus on building fully convolutional encoder-decoder networks without fully connected layers to achieve end-to-end tumor segmentation [4]. However, deep learning has only recently been employed for the automated segmentation of pediatric brain tumors, and only in a handful of studies [1][2][5][6][7]. The reported Dice scores for the non-enhancing tumor, necrosis, and edema sub-compartments in these studies have been sub-optimal, which underlines the challenges of segmenting pediatric brain tumors. For instance, Peng et al. [5] developed a deep-learning network to automatically segment high-grade gliomas, MB, and other leptomeningeal diseases in pediatric patients on T1w contrast-enhanced and T2/FLAIR images. Similarly, the work in [1] employed a convolutional neural network (CNN)-based model to segment the sub-compartments of multiple pediatric brain tumors, primarily gliomas, and included a limited cohort of MB cases (n = 24). The model processed images at multiple scales simultaneously using a dual pathway. The first pathway kept the images at their normal resolution, while the second pathway down-sampled them. While the model was able to differentiate between the enhancing and non-enhancing tumor compartments of MB tumors, the reported Dice scores were relatively low (0.62 for enhancing tumor, 0.18 for edema, and 0.26 for non-enhancing tumor). Overall, pediatric MB tumor segmentation remains understudied, and more automated, reliable approaches are needed for more effective radiation therapy planning in pediatric MB.
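To make the Dice scores quoted above concrete, the following is a minimal sketch of how the overlap between a predicted and a reference mask can be computed. The function name `dice_score`, the flat 0/1 mask representation, and the toy masks are illustrative assumptions, not taken from any of the cited studies.

```python
def dice_score(pred, truth):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as tumor in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total number of tumor voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 1D "masks" standing in for flattened 3D label volumes (hypothetical data).
predicted = [0, 1, 1, 1, 0, 0, 1, 0]
reference = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_score(predicted, reference))
```

A score of 1 indicates perfect overlap and 0 indicates none, which is why values such as 0.18 for edema point to poor agreement with the reference delineation.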

2. Survival Prognostication in Pediatric MB Using Radiomic Approaches

2.1. Feature Extraction and Selection

The approaches found in the literature on MB risk stratification have utilized a wide range of radiomic features on different MRI protocols, including T1w, CE-T1w, Gadolinium-enhanced T1w (Gd-T1w), T2w, FLAIR images, and perfusion imaging. For instance, Grist et al. [8] and Yan et al. [9] utilized the Apparent Diffusion Coefficient (ADC) maps from Dynamic Susceptibility Contrast (DSC) MRI perfusion images to predict survival in MB. Specifically, Grist et al. utilized the ADC maps along with T2w images and diffusion-weighted images (DWI) to extract imaging features from 17 MB cases. The feature set included statistics from the ADC maps (e.g., kurtosis and mean), the mean of corrected Cerebral Blood Volume (CBV), and the mean of uncorrected CBV (uCBV), in addition to the postoperative tumor volume, to stratify patients into low- and high-risk groups. Similarly, Yan et al. [9] utilized the ADC maps along with multiple MRI protocols, including T1w, CE-T1w, T2w, and FLAIR, to extract 5929 radiomic features (shape features, first-order intensity features, and higher-order texture features). Next, the Intraclass Correlation Coefficient (ICC) was used for feature reduction before the features were fed into the statistical risk stratification model.

Liu et al. [10] and Zheng et al. [11] focused on routine MR imaging in their analysis. For instance, Liu et al. [10] constructed a radiomic model on multi-institutional data that comprised 253 pediatric MB patients, with a training cohort and two hold-out test sets. Specifically, a total of 1294 radiomic features, including size, shape, and textural features, were extracted from T1w images as well as contrast-enhanced T1w (CE-T1w) images (647 features from each modality). Feature selection was then conducted using Pearson’s correlation coefficient. Zheng et al. [11] constructed a radiomic model for risk stratification on a total of 111 children with pathologically confirmed MB. A total of 1132 radiomic features were extracted from CE-T1w images, including first-order statistics, volume, shape, gray-level co-occurrence matrix (GLCM), gray-level run-length matrix, gray-level size zone matrix, and gray-level dependence matrix features. Feature reduction was then conducted using ICC.
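As an illustration of correlation-based feature selection of the kind mentioned above, the sketch below greedily drops any feature whose absolute Pearson correlation with an already kept feature reaches a threshold. The 0.9 threshold, the greedy keep-first rule, the helper names, and the toy feature values are all assumptions made for illustration; the cited studies may have applied Pearson’s coefficient differently.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # assumption: a constant feature is treated as uncorrelated
    return cov / (sd_x * sd_y)


def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| with every previously kept feature is below threshold.

    `features` maps a feature name to its values across patients.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson_r(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table: one list entry per patient.
features = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "surface_area": [2.1, 4.0, 6.2, 7.9, 10.1],  # nearly proportional to volume
    "glcm_contrast": [0.9, 0.1, 0.7, 0.2, 0.5],
}
print(drop_correlated(features))
```

In this toy table, `surface_area` is removed because it is almost perfectly correlated with `volume`, while `glcm_contrast` is retained.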

Interestingly, Iyer et al. [12] explored radiomic features outside of the tumor that can help quantify the mass effect occurring in the healthy “brain around tumor” regions due to the tumor pushing and displacing the neighboring structures. Specifically, Gadolinium-enhanced T1-weighted (Gd-T1w) images of 88 MB patients were analyzed, and local tissue deformation heterogeneity features were extracted from the “brain around tumor” regions. These features were analyzed to identify differences between high-risk patients with highly heterogeneous tumors and low-risk patients with less heterogeneous tumors for the purpose of survival prediction.

2.2. Statistical Models for Survival Prognostication

Most works on MB survival prediction have utilized logistic regression models and Cox proportional hazards models to risk-stratify MB patients. For instance, Grist et al. [8] utilized Cox regression analysis along with iterative Bayesian survival analysis to select the top features extracted from the ADC maps and the other modalities for survival prognostication. Additionally, both unsupervised machine learning (using K-means clustering) and supervised machine learning (using the Bayesian features with a random forest (RF) classifier, a single-layer neural network, and a support vector machine (SVM) classifier) were employed for risk stratification with 10-fold cross-validation. The unsupervised clustering technique yielded an elevated Hazard Ratio (HR) of 5.6 (confidence interval: 1.6–20.1, p < 0.001) for the high-risk patients. The supervised approach that combined the Bayesian features with a single-layer neural network under 10-fold cross-validation provided an accuracy of 98% in risk stratification.

Iyer et al. [12] constructed their survival model from the deformation heterogeneity features, along with Chang’s stratification components for the MB subjects as well as their molecular subgroup information, using multivariate logistic regression models. The radiomic deformation features yielded significant differences between low- and high-risk patients ($p = 2.9 \times 10^{-4}$, Concordance Index (C-index) = 0.7). Interestingly, the deformation features combined with Chang’s classification and molecular stratification yielded the best results in risk-stratifying patients into low- and high-risk groups ($p = 0.005$, C-index = 0.75).

Liu et al. [10] employed Cox regression analysis and LASSO regression on their set of selected radiomic features to identify the features with the most prognostic value. A radiomic signature was constructed based on this set of features to predict progression-free survival (PFS) and overall survival (OS). Kaplan–Meier analysis and the log-rank test revealed that the prognostic model yielded C-indices of 0.71, 0.7, and 0.72 on the training set and hold-out test sets 1 and 2, respectively. Further, a radiomics nomogram that integrates the radiomic features, age, and metastasis was constructed and performed better than the nomogram incorporating only clinicopathological factors (C-index = 0.723 vs. 0.665 and 0.722 vs. 0.677 on hold-out test sets 1 and 2, respectively). Similarly, Yan et al. [9] employed LASSO regression and Cox proportional hazards regression to identify the top features for survival prediction. Clinicomolecular factors, comprising age, sex (female or male), Karnofsky Performance Status (KPS), molecular subgroup (WNT, SHH, Group 3, or Group 4), the extent of resection (complete or incomplete), radiation therapy (yes or no), and chemotherapy (yes or no), were also incorporated into the survival prediction models. The Wilcoxon test and chi-square test were used to assess differences in survival between the risk groups. Kaplan–Meier analysis, along with the log-rank test, revealed that the radiomics-clinicomolecular signature predicted OS (C-index = 0.762) and PFS (C-index = 0.697) better than either the radiomics signature (C-index: OS = 0.649; PFS = 0.593) or the clinicomolecular signature (C-index: OS = 0.725; PFS = 0.691) alone.
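Since the C-index is the main metric reported in these studies, a minimal sketch of Harrell’s concordance index for right-censored data may be helpful. The function name, the pairwise O(n²) formulation, and the toy cohort are illustrative assumptions; the cited works do not specify their implementation.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the higher-risk patient fails first.

    times       -- follow-up time for each patient
    events      -- 1 if the event (e.g., death) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort of four patients; the second one is censored.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risk_scores = [0.9, 0.4, 0.1, 0.2]
print(concordance_index(times, events, risk_scores))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported C-indices of roughly 0.7 to 0.76 indicate moderate discrimination.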

Zheng et al. [11] also employed multivariate Cox regression and LASSO models to create a radiomic signature for risk stratification and to obtain a radiomic score for each subject as a linear combination of the selected radiomic features weighted by their coefficients. Additionally, an integrative model combining radiomic features, clinical features, and conventional MRI features was constructed. The models were then evaluated using Kaplan–Meier analysis and C-indices. The radiomic features combined with clinical and conventional MRI features yielded the best results for predicting OS (C-index = 0.82) compared to using the radiomic signature alone (C-index = 0.7) in the training set. On the test set, C-indices were 0.78 and 0.75 using the integrative model and the radiomic model, respectively. This was observed in other works as well [9][10][12], where integrating the radiomic features with clinical and molecular parameters improved the performance of the risk stratification models compared with using any of these parameters alone.
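The radiomic score described above, a linear combination of selected features and their coefficients, can be sketched as follows. The feature names, coefficient values, patient data, and the median cutoff used to form risk groups are all hypothetical and chosen only for illustration; they are not the values or the cutoff rule reported by Zheng et al.

```python
from statistics import median


def radiomic_score(feature_values, coefficients):
    """Weighted sum of the selected features for one patient."""
    return sum(coefficients[name] * feature_values[name] for name in coefficients)


# Hypothetical coefficients, e.g., as they might come out of a LASSO-Cox fit.
coefficients = {"glcm_entropy": 0.8, "sphericity": -1.2, "intensity_kurtosis": 0.3}

# Hypothetical, already normalized feature values for four patients.
patients = {
    "P1": {"glcm_entropy": 1.0, "sphericity": 0.5, "intensity_kurtosis": 2.0},
    "P2": {"glcm_entropy": 2.0, "sphericity": 0.25, "intensity_kurtosis": 1.0},
    "P3": {"glcm_entropy": 0.5, "sphericity": 1.0, "intensity_kurtosis": 0.0},
    "P4": {"glcm_entropy": 1.5, "sphericity": 0.5, "intensity_kurtosis": 1.0},
}

scores = {pid: radiomic_score(f, coefficients) for pid, f in patients.items()}

# Assumed cutoff rule: patients above the cohort median are labeled high-risk.
cutoff = median(scores.values())
groups = {pid: "high" if s > cutoff else "low" for pid, s in scores.items()}
print(scores)
print(groups)
```

The resulting groups would then typically be compared with Kaplan–Meier curves and a log-rank test, as in the studies summarized above.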

3. Molecular Subgroup Identification in Pediatric MB Using Radiomic Approaches

3.1. Feature Extraction and Selection

In the context of identifying the four MB molecular subgroups, several models have utilized textural analysis on the tumor regions to identify differences between the subgroups [13][14][15][16][17][18][19][20]. For instance, Chang et al. [13] attempted to find the imaging surrogates of the four MB molecular subgroups using radiomic analysis in a study of 38 MB patients. Specifically, a total of 253 radiomic features, including tumor intensity, shape and size, and texture features, were extracted from five different imaging sequences (T1w, T2w, FLAIR, ADC, and CE-T1w). This was followed by applying different feature selection algorithms, including minimum redundancy maximum relevance, sequential backward elimination, and sequential forward selection, to obtain the best feature combination. Similarly, Iv et al. [14] developed a computational framework to predict the molecular subgroups of 109 MB patients collected from three different sites, where 590 radiomic features were extracted from T1w and T2w MR images. Namely, the features included intensity-based histograms, tumor edge-sharpness, Gabor features, and local area integral invariant features. A non-parametric Wilcoxon rank sum test was used for feature selection. Additionally, Saju et al. [15] employed texture analysis on the CE-T1w and T2w MR images of 38 MB patients, where first- and second-order GLCM features and shape features were extracted. Feature reduction was then conducted using LASSO regression.

In a similar fashion, Wang et al. [16] attempted to predict SHH and Group 4 subgroups on 95 MB patients (split 7:3 into training and test sets) using their T1w, T2w, CE-T1w, and FLAIR sequences in addition to their ADC maps. Specifically, 7045 radiomic features were extracted from the image sequences, including intensity statistics, texture features that quantify tumor heterogeneity (e.g., gray-level run-length and gray-level co-occurrence), shape and size, and high-order statistical features (using various filters such as exponential, logarithmic, square, square root, and wavelet). This was followed by feature reduction to remove redundant features using a variance threshold, SelectKBest, and the LASSO regression model. Yan et al. [17] developed a radiomic model to predict the molecular subgroups of 122 MB subjects. A total of 5529 radiomic features were extracted from T1w, CE-T1w, T2w, and FLAIR MR images, in addition to the ADC maps of these patients. Namely, tumor location, shape features, intensity-based features, and texture features were extracted. The texture and intensity features were extracted from both the MR images and the transform-domain images. Feature pruning was then employed using ICC to remove the redundant features. Finally, Zhang et al. [18] constructed a model to identify the four molecular subgroups of 263 MB patients using their CE-T1w and T2w MR images. Specifically, 1800 radiomic textural features were extracted and then reduced using LASSO regression.

Aside from textural analysis, Iyer et al. [12] took a statistical approach to identify differences between the four MB subgroups on the Gd-T1w images of 71 patients. Radiomic features that quantify the structural deformations occurring in the “brain around tumor” regions due to mass effect were extracted, followed by statistical analysis to distinguish the four subgroups. Also, Dasgupta et al. [19] conducted a study on 111 MB patients to predict their molecular subgroups from T1w, T2w, and diffusion imaging. Specifically, imaging features such as tumor location and size, diffusion characteristics, tumor margin, and T2w characteristics were extracted. A correlation between those individual features and the molecular subgroup was then established using statistical methods.

Interestingly, a deep learning approach has also been adopted for the task of MB molecular stratification. Specifically, Chen et al. [20] developed a multi-task CNN-based approach that utilizes different information, including genotyping and prognosis, to predict the molecular subgroup of 113 MB patients. Using the tumor mask, this multi-stage model comprised feature extraction from CE-T1w and T2w MR images using a ResNet model, region proposal, and subgroup prediction. The ResNet model used pyramid representations to construct feature pyramids, which were then used in the second stage to obtain region proposals that contain tumor lesions. Finally, each feature map of a region proposal was transformed into fixed spatial dimensions for the tasks of molecular subgroup prediction, prognosis, and tumor segmentation in a multi-task learning technique. In a 3-fold cross-validation scheme, the molecular subgroup prediction task, with the assistance of the tumor segmentation and prognosis tasks, achieved AUCs of 0.96, 0.96, 0.99, and 0.96 for WNT, SHH, Group 3, and Group 4 subgroups, respectively.

3.2. Statistical Models for Molecular Subgroup Identification

Most of the approaches utilized in the context of MB molecular subgroup identification have employed logistic regression along with machine learning classifiers [13][14][15][16][17][18]. For instance, Chang et al. [13] implemented an SVM classifier with nested leave-one-out cross-validation (LOOCV) to find the best model from their extracted set of texture features. Based on the selected set of features (8 GLCM features), a prediction model was constructed, which generated area under the curve (AUC) values of 0.82, 0.72, and 0.78 for WNT, Group 3, and Group 4, respectively. Similarly, Iv et al. [14] employed an SVM classifier for the molecular subgroup prediction task using a cross-validation strategy. From their set of 590 radiomic features, the tumor edge-sharpness feature was found to be the most discriminative feature between SHH and Group 4 molecular subgroups. It is noted that the scans were acquired from different scanner vendors with different imaging parameters, which may affect the model’s performance metrics. To account for these variations, the authors conducted extensive validation on their cohorts. Specifically, two predictive models were developed; one was based on a double 10-fold cross-validation scheme, where the subjects from the three datasets were combined, whereas the other model employed a three-dataset cross-validation strategy, where the model was trained using two datasets and tested on the third independent cohort. The 10-fold cross-validation model applied to the combined MRI modalities (T1w, T2w) yielded AUCs of 0.79, 0.7, and 0.83 for predicting SHH, Group 3, and Group 4 subgroups, respectively. Similarly, the three-dataset cross-validation strategy predicted the SHH group with an AUC of 0.7–0.73 and Group 4 with an AUC of 0.76–0.8. In the work by Saju et al. [15], an SVM classifier was also employed in a 10-fold cross-validation strategy for model development. The authors used both One-vs-One and One-vs-All multiclass classification approaches for evaluation. Multiple models were sequentially evaluated by the system using a combination of the selected features to find the best predictive model. The best model was obtained by using a combination of 30 GLCM and six shape features on CE-T1w MR images. A 10-fold cross-validation demonstrated AUCs of 0.93, 0.9, 0.93, and 0.93 in predicting WNT, SHH, Group 3, and Group 4 MB subgroups, respectively.
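The per-subgroup AUCs reported in these studies correspond to a One-vs-All evaluation, in which each subgroup is scored against the other three. The sketch below computes such AUCs with the rank-based (Mann-Whitney) formulation. The function names, the abbreviated labels `G3` and `G4`, and the predicted probabilities are hypothetical and serve only to illustrate the calculation.

```python
def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count positive-negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else (0.5 if p == n else 0.0) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(prob_by_class, true_labels):
    """AUC of each class against all other classes combined."""
    return {
        cls: auc(probs, [1 if y == cls else 0 for y in true_labels])
        for cls, probs in prob_by_class.items()
    }


# Hypothetical predicted probabilities for six patients.
true_labels = ["WNT", "SHH", "G3", "G4", "SHH", "G4"]
prob_by_class = {
    "WNT": [0.7, 0.1, 0.2, 0.1, 0.2, 0.1],
    "SHH": [0.1, 0.6, 0.3, 0.2, 0.5, 0.1],
    "G3": [0.1, 0.2, 0.3, 0.4, 0.1, 0.2],
    "G4": [0.1, 0.1, 0.2, 0.3, 0.2, 0.6],
}
per_class_auc = one_vs_rest_auc(prob_by_class, true_labels)
print(per_class_auc)
```

In this toy example, Group 3 obtains the lowest AUC because one Group 4 patient receives a higher Group 3 probability than the true Group 3 patient, mirroring the overlap between these two subgroups noted in several of the studies above.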

In the work by Wang et al. [16], based on the feature reduction step employed on the 7045 extracted features, a total of 17 optimal features were used to develop the classification model, which yielded AUCs of 0.96 and 0.75 in the training and the test cohorts, respectively. Interestingly, when combining the radiomic features with the location of the tumor, the pathological type, and the hydrocephalus status of the two molecular subgroups, the model performance improved, achieving AUCs of 0.965 and 0.849 in the training and the test cohorts, respectively. Yan et al. [17] also constructed a classification model, which was RF-based and used 11 optimal features out of the 5529 extracted to predict the molecular subgroups. This model yielded AUCs of 0.83, 0.67, 0.6, and 0.7 for WNT, SHH, Group 3, and Group 4, respectively, for the test cohort of 30 patients. Further, incorporating tumor location and hydrocephalus status into the radiomic model improved the AUCs for WNT and SHH subgroups to 0.84 and 0.83, respectively. Finally, adding age and gender information to the model further improved the AUCs to 0.91 and 0.87 for WNT and SHH subgroups, respectively, and the classification accuracies for Group 3 and Group 4 were 70% and 86.67%, respectively. Uniquely, Zhang et al. [18] developed a two-stage model that comprises a binary classifier in each stage for the WNT, SHH, and non-WNT/non-SHH classes. The first stage was used to distinguish WNT and SHH from Group 3/Group 4 subgroups, whereas the second stage was used to distinguish WNT from SHH. Six different classifiers, namely, SVM, logistic regression, k-nearest neighbor, RF, extreme gradient boosting, and neural network, were employed in each stage, and the overall performance was assessed for the combined stages. The final multiclass classifier was guided by maximizing the Dice Coefficient (DC), calculated as the harmonic mean of precision and recall. The combined, sequential classifier achieved a DC score of 88% and a binary score of 95%, specifically for the WNT subgroup. Additionally, a Group 3 versus Group 4 classifier achieved an AUC of 0.98.
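When used as a classification metric, the Dice coefficient coincides with the F1 score, that is, the harmonic mean of precision and recall. The short sketch below checks this identity on confusion-matrix counts. The function names and the counts are hypothetical, and the zero-count handling is an assumed convention.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and their harmonic mean (F1) from confusion counts."""
    if tp == 0:
        return 0.0, 0.0, 0.0  # assumed convention when nothing is correctly detected
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def dice_from_counts(tp, fp, fn):
    """Dice coefficient written directly in terms of confusion counts."""
    denominator = 2 * tp + fp + fn
    return 0.0 if denominator == 0 else 2 * tp / denominator


# Hypothetical counts for one subgroup treated as the positive class.
tp, fp, fn = 6, 2, 4
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(precision, recall, f1, dice_from_counts(tp, fp, fn))
```

Both expressions reduce to 2TP / (2TP + FP + FN), so maximizing one is equivalent to maximizing the other.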

Other statistical tests have also been employed in the context of MB molecular subgroup identification. For instance, Iyer et al. [12] utilized a multiclass ANOVA test, followed by multiple comparison of means, to identify significant differences between the four subgroups based on the deformation heterogeneity features extracted from the structures neighboring the tumor. Significant differences were observed between deformation magnitudes obtained for Group 3, Group 4, and SHH subgroups that occurred up to 60 mm outside the tumor edge. The skewness of deformation yielded a p-value of 0.028 for Group 3 vs. SHH and Group 4, and the median of deformation yielded a p-value of 0.05 for Group 3 vs. Group 4. Similarly, Dasgupta et al. [19] applied statistical tests, including the Pearson chi-square test, Fisher’s exact test, and Cohen’s Kappa statistic, to establish a correlation between the imaging features and molecular subgroups. Additionally, on the training cohort (N = 76), binary logistic regression was performed using different combinations of the significant MRI features to distinguish a certain molecular subgroup from the other three, and nomograms were constructed for the individual subgroups. The predictive accuracies for the subgroups were excellent for SHH (95%), acceptable for Group 4 (78%), but sub-optimal for Group 3 (56%) and WNT (41%) subgroups.
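Of the statistics listed above, Cohen’s Kappa is perhaps the least familiar; it measures agreement between two sets of categorical labels beyond what chance alone would produce. The following is a minimal sketch under the assumption of two hypothetical readers labeling tumor location; the function name, labels, and use case are illustrative and are not taken from the cited study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    n = len(rater_a)
    # Observed agreement: fraction of cases where both labels match.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # assumed convention: both raters used a single identical label
    return (observed - expected) / (1 - expected)


# Hypothetical tumor-location labels assigned by two readers to eight scans.
reader_1 = ["midline", "midline", "lateral", "midline",
            "lateral", "lateral", "midline", "lateral"]
reader_2 = ["midline", "midline", "lateral", "lateral",
            "lateral", "lateral", "midline", "midline"]
print(cohens_kappa(reader_1, reader_2))
```

Here the readers agree on six of eight scans (75%), but because 50% agreement is expected by chance, the kappa value is only 0.5.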