Deep Learning-Based Diagnosis of Alzheimer’s Disease: Comparison
Please note this is a comparison between Version 1 by Tausifa Jan Saleem and Version 2 by Vivi Li.

Alzheimer’s disease (AD), the most common form of dementia, is a severe concern in modern healthcare. In the United States, around 5.5 million people aged 65 and above have AD, and it is the sixth leading cause of death. AD is an irreversible, degenerative brain disorder characterized by a loss of cognitive function and has no proven cure. Deep learning techniques have gained popularity in recent years, particularly in the domains of natural language processing and computer vision. Since 2014, these techniques have attracted substantial attention in AD diagnosis research, and the number of papers published in this arena is rising rapidly. Deep learning techniques have been reported to be more accurate for AD diagnosis than conventional machine learning models. 

  • Alzheimer’s disease
  • deep learning
  • biomarkers
  • positron emission tomography
  • magnetic resonance imaging
  • mild cognitive impairment

1. Introduction

Alzheimer’s disease (AD) is the most widespread neurodegenerative disease, with a prodromal Mild Cognitive Impairment (MCI) stage in which memory loss is the primary symptom, gradually worsening with behavioral problems and impaired self-care [1]. However, not everyone diagnosed with MCI goes on to develop AD [2]. A small percentage of people with MCI develop non-AD dementia or remain stable in the MCI stage without advancing to dementia [2]. Even though there is no cure for AD, it is vital to correctly identify those in the MCI phase who will develop AD. At the same time, it would be ideal to correctly identify people in the MCI stage who will not progress to AD, so that they are spared unneeded pharmacologic therapies that at best may give little help and at worst may cause harmful side effects. As a result, much work has gone into developing early detection tools, particularly at pre-symptomatic phases, in an attempt to slow or thwart disease progression. Advanced neuroimaging techniques, such as Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET), have been employed to uncover the structural and molecular biomarkers of AD [3].
Rapid advances in neuroimaging have made the integration of large-scale, high-dimensional multi-modal neuroimaging data crucial [4]. As a result, computer-assisted machine learning methodologies for integrative analysis of neuroimaging data have attracted considerable attention. Well-known machine learning approaches such as Support Vector Machines (SVM), Linear Discriminant Analysis (LDA), and Decision Trees (DT) have been employed and show promise for early diagnosis and prediction of AD progression. However, appropriate pre-processing steps must be applied before using such approaches. Moreover, these approaches require feature extraction, feature selection, dimensionality reduction, and feature-based classification for classification and prediction. These steps necessitate specialist knowledge as well as several optimization stages, which are time-intensive [5]. To overcome these hurdles, deep learning (DL), an emerging branch of machine learning research that produces features from raw neuroimaging data through “on-the-fly” learning, is garnering substantial attention in the field of large-scale, high-dimensional neuroimaging analysis.

2. DL for AD Diagnosis

Figure 1 presents a framework for classification of AD using DL. The AD dataset is pre-processed first using techniques such as skull stripping, spatial normalization, smoothing, grayscale normalization, slicing, and resizing. Skull stripping segregates non-brain tissues from brain tissues. Spatial normalization maps images from diverse subjects onto a common template. Smoothing improves image quality by removing noise. Grayscale normalization maps pixel intensity values to a new, more suitable range. Slicing divides the image into multiple logical images. Finally, resizing is carried out to obtain the desired image size. The pre-processed data are then fed as input to the DL model, which performs feature extraction and classification. Finally, the model is evaluated using performance metrics such as accuracy, F1 score, area under the curve (AUC), and mean squared error (MSE). The following presents a thorough literature review of DL techniques for AD diagnosis. Table 1 presents a summary of these research works.
Figure 1. DL-based AD classification framework.
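As a rough, minimal sketch of three of the pre-processing steps above (grayscale normalization, slicing, and resizing), assuming toy NumPy arrays in place of real scans and purely hypothetical sizes:

```python
import numpy as np

def grayscale_normalize(img, lo=0.0, hi=1.0):
    """Linearly map pixel intensities onto [lo, hi]."""
    mn, mx = img.min(), img.max()
    return lo + (img - mn) * (hi - lo) / (mx - mn)

def resize_by_block_mean(img, factor):
    """Downsample by averaging non-overlapping factor x factor blocks."""
    h = (img.shape[0] // factor) * factor
    w = (img.shape[1] // factor) * factor
    return img[:h, :w].reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def axial_slices(volume):
    """Slice a 3-D volume into 2-D images along its first axis."""
    return [volume[i] for i in range(volume.shape[0])]

# toy 4 x 64 x 64 "volume" standing in for a skull-stripped MRI scan
vol = np.random.default_rng(0).random((4, 64, 64)) * 255.0
slices = [resize_by_block_mean(grayscale_normalize(s), 2) for s in axial_slices(vol)]
```

Real pipelines use dedicated tools (e.g., registration software for spatial normalization); this only illustrates the shape of the data flow.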

2.1. Feed-Forward DNN for AD Diagnosis

Feed-forward DNNs have been utilized in multiple studies for AD diagnosis. Amoroso et al. [6] proposed a method based on Random Forest (RF) and DNN for revealing the onset of Alzheimer’s in subjects with MCI. RF was used for feature selection, and the DNN performed the classification. The RF consisted of 500 trees and ran for 100 rounds, with 20 crucial features chosen in each round. The DNN consisted of 11 layers with 2056 input units and four output units; ReLU and tanh were used as activation functions, categorical cross-entropy as the loss function, and Adam as the optimizer. The authors compared the proposed approach with SVM and RF and showed that it outperforms both. Kim and Kim [7] proposed a DNN-based model for early-stage diagnosis of Alzheimer’s. The model takes a subject’s EEG as input and classifies it into two groups, MCI and healthy controls (HC); the model outperformed a shallow neural network. Rouzannezhad et al. [8] formulated a DNN-based technique for binary (MCI, CN) and multiclass (EMCI, LMCI, AD, CN) classification of subjects in order to detect AD at a premature stage. The authors fed multimodal data (MRI, PET, and standard neuropsychological parameters) to a DNN with three hidden layers, using the Adam optimizer and dropout to avoid over-fitting. Experiments demonstrated that the proposed technique performs better than single-modal scenarios in which only MRI or PET was fed to the model, and that fusing neuropsychological data with MRI and PET further enhanced performance. Fruehwirt et al. [9] formulated a Bayesian DNN model that predicts the severity of AD from EEG data. The model consisted of two layers with 100 units each, and the authors demonstrated that it is a good fit for predicting disease severity in clinical neuroscience. Orimaye et al. [10] proposed a hybrid model combining a DNN with deep language models (D2NNLM) to predict AD; experiments demonstrated that it predicts the conversion of MCI to AD with high accuracy. Ning et al. [11] formulated a neural network-based model for classifying subjects into AD and CN categories and for predicting the conversion of MCI subjects to AD, using MRI and genetic data as input; the model outperformed logistic regression (LR). Park et al. [12] proposed a DNN-based model that takes integrated gene expression and DNA methylation data as input and predicts AD progression. The integrated data yielded better accuracy than single-modal data, and the model outperformed existing machine learning models. The Bayesian method was used to choose optimal parameters; a DNN with eight hidden layers, 306 nodes per layer, a learning rate of 0.02, and a dropout rate of 0.85 attained the best performance. Benyoussef et al. [13] proposed a hybrid model combining K-Nearest Neighbors (KNN) and a DNN for classifying subjects into No-Dementia (ND), MCI, and AD based on MRI data. KNN assisted the DNN in separating easily diagnosable subjects from hard-to-diagnose ones. The DNN consisted of two hidden layers with 100 nodes each, and experimental results demonstrated that the model successfully classified the different AD stages. Manzak et al. [14] formulated a DNN-based model for early-stage detection of AD, with RF used for feature extraction. Albright [15] predicted the progression of AD using a DNN in two cases: subjects who were initially CN and later developed AD, and subjects with MCI who converted to AD. Suresha and Parthasarathy [16] proposed a DNN model with the rectified Adam optimizer for AD detection, utilizing the Histogram of Oriented Gradients (HOG) to extract crucial features from MRI scans; experiments showed that the model outperformed existing strategies by a good margin. Wang et al. [17] utilized gene expression data to study the molecular changes caused by AD, using a DNN model to identify the crucial molecular networks responsible for AD detection.
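The feed-forward pipelines above share a common shape: selected features in, ReLU hidden layers, a softmax output, and a categorical cross-entropy loss. A minimal NumPy sketch of such a forward pass, with all layer sizes hypothetical (not taken from any cited study):

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(x):
    return np.maximum(0.0, x)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))  # numerically stable
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # mean negative log-likelihood of the true class
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

# hypothetical sizes: 20 selected features -> two hidden layers -> 4 classes
sizes = [20, 64, 32, 4]
weights = [rng.normal(scale=0.1, size=(a, b)) for a, b in zip(sizes, sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

def forward(x):
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)                           # hidden layers use ReLU
    return softmax(h @ weights[-1] + biases[-1])      # class probabilities

x = rng.normal(size=(8, 20))            # batch of 8 subjects (toy data)
labels = rng.integers(0, 4, size=8)     # toy diagnostic labels
probs = forward(x)
loss = cross_entropy(probs, labels)
```

Training would then backpropagate this loss through the layers with an optimizer such as Adam, as most of the cited studies do.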

2.2. CNN for AD Diagnosis

The following studies utilized CNNs for AD diagnosis. Suk and Shen [18] proposed a hybrid model based on sparse regression networks and CNN. The model employed multiple sparse regression networks to generate multiple target-level representations, which a CNN then integrated to optimally identify the output label. Billones et al. [19] adapted the 16-layer VGGNet to classify subjects into three categories, AD, MCI, and HC, based on structural MRI scans. The study achieved good classification accuracy, and the authors noted that this was achieved without segmenting the MR images. Sarraf and Tofighi [20] utilized the LeNet architecture to separate AD subjects from healthy ones based on functional MRI, concluding that, thanks to its shift- and scale-invariant properties, CNN holds great promise for medical imaging. In another study, Sarraf and Tofighi [21] used the LeNet architecture on structural MR images, attaining an accuracy of 98.84%. In a further study, Sarraf and Tofighi [22] utilized the LeNet and GoogLeNet architectures for AD diagnosis based on both functional and structural MR images, demonstrating performance better than state-of-the-art AD diagnosis techniques. Gunawardena et al. [23] formulated a CNN-based method for early-stage diagnosis of AD using structural MRI; the CNN outperformed an SVM baseline, and the authors intend to incorporate the axial and sagittal MRI views in addition to the coronal view used in the study. Basaia et al. [24] developed a CNN model for AD diagnosis using structural MR images, implementing data augmentation and transfer learning to avoid over-fitting and improve computational efficiency. The authors claimed that the study overcomes a limitation of existing work, which usually focused on single-center datasets. Wang et al. [25] designed an eight-layer CNN model for AD diagnosis, comparing three activation functions (rectified linear unit (ReLU), sigmoid, and leaky ReLU) and three pooling functions (stochastic, max, and average pooling) to find the best configuration; the CNN with leaky ReLU and max pooling gave the best results. Karasawa et al. [26] proposed a 3D-CNN model based on ResNet, with 36 convolutional layers, a dropout layer, a pooling layer, and a fully connected layer, which outperformed several existing benchmarks on MR images. Tang et al. [27] proposed a 3D fine-tuning convolutional neural network (3D-FCNN) for AD diagnosis from MR images that outperformed several benchmarks in accuracy and robustness; it also performed better than a 2D-CNN in both binary and multi-class classification. Spasov et al. [28] proposed a multi-modal CNN framework using structural MRI, genetic measures, and clinical assessments. The framework had far fewer parameters than CNN models such as VGGNet and AlexNet, making it faster and less susceptible to over-fitting in scarce-data scenarios. Wang et al. [29] proposed a CNN model using two crucial MRI modalities, fMRI and Diffusion Tensor Imaging (DTI), classifying subjects into AD, amnestic MCI, and normal controls (NC); the model performed better on multi-modal MRI than on fMRI or DTI alone. Islam and Zhang [30] proposed a CNN-based model for early-stage AD diagnosis using MR images, trained on the imbalanced OASIS dataset with data augmentation used to handle the class imbalance; the model performed better than several state-of-the-art models, and the authors plan to apply it to other AD datasets. Yue et al. [31] proposed a CNN-based model using structural MR images that classified subjects into AD, EMCI, LMCI, and NC and outperformed several benchmarks. Jian et al. [32] proposed a transfer learning-based approach using structural MRI, in which VGGNet16 trained on ImageNet served as a feature extractor; the approach classified input into AD, MCI, and CN. Huang et al. [33] designed a multi-modal model based on 3D-VGG16 using MRI and FDG-PET. The model does not require segmentation of the input, and the authors showed that the hippocampus is a crucial Region of Interest (ROI) for AD diagnosis; they intend to include other modalities in the future. Goceri [34] proposed a 3D-CNN approach using MR images, with the Sobolev gradient as the optimizer, leaky ReLU as the activation function, and max pooling as the pooling function; this combination outperformed all others tested. Zhang et al. [35] utilized two independent CNNs to analyze MR and PET images separately, performed correlation analysis on their outputs to obtain an auxiliary diagnosis, and combined that result with the clinical psychological diagnosis for a comprehensive diagnostic output; the architecture is easy to implement and generates results close to the clinical diagnosis. Basheera and Ram [36] proposed a CNN model in which MR images were first divided into voxels, a Gaussian filter enhanced voxel quality, a skull-stripping algorithm removed irrelevant regions, and independent component analysis segmented the brain into different regions; segmented gray matter was then fed to the CNN, which outperformed several state-of-the-art models. Spasov et al. [37] proposed a parameter-efficient CNN model for predicting MCI-to-AD conversion using structural MRI, demographic data, neuropsychological data, and APOe4 genetic data, which performed better than several existing benchmarks. Ahmad and Pothuganti [38] performed a comparative analysis of SVM, regional CNN (RCNN), and fast RCNN for AD diagnosis, with fast RCNN outperforming the other techniques. Lopez-Martin et al. [39] proposed a randomized 2D-CNN model for early-stage AD diagnosis using MEG data, which outperformed classic machine learning techniques. Jiang et al. [40] proposed an eight-layer CNN model that implemented batch normalization, data augmentation, and dropout regularization to achieve high accuracy, outperforming several existing techniques. Nawaz et al. [41] proposed a 2D-CNN model using MRI data that classified input images into AD, MCI, and NC and outperformed the AlexNet and VGGNet architectures. Bae et al. [42] modified an Inception-v4 model pre-trained on ImageNet for AD classification using MRI data from subjects of two different ethnicities, demonstrating the model’s potential as a fast and accurate diagnostic tool. Jo et al. [43] proposed a CNN-based model for relating tau deposition in the brain to the probability of having AD; the study also identified the brain regions crucial for AD classification, including the hippocampus, parahippocampus, thalamus, and fusiform.
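At the core of all these CNN models are convolution and pooling. A minimal NumPy sketch of a single convolution + ReLU + max-pooling stage on a toy 2-D "slice" (the Sobel-like kernel and all sizes are illustrative, not taken from any cited model):

```python
import numpy as np

def conv2d(img, kernel):
    """Valid (no-padding) 2-D convolution of one channel."""
    kh, kw = kernel.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fm, size=2):
    """Non-overlapping size x size max pooling."""
    h = (fm.shape[0] // size) * size
    w = (fm.shape[1] // size) * size
    return fm[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

img = np.random.default_rng(1).random((32, 32))       # toy "MRI slice"
edge = np.array([[1., 0., -1.], [2., 0., -2.], [1., 0., -1.]])  # Sobel-like kernel
fm = np.maximum(conv2d(img, edge), 0.0)               # convolution + ReLU
pooled = max_pool(fm)                                 # downsampled feature map
```

A real CNN learns many such kernels per layer and stacks dozens of layers; 3D variants slide volumetric kernels over whole scans instead of slices.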

2.3. AE for AD Diagnosis

The following studies utilized AEs for AD diagnosis. Lu et al. [44] proposed an SAE-based model for predicting the progression of AD, named the Multi-scale and Multi-modal Deep Neural Network (MMDNN) because it integrated information from multiple brain areas scanned using MRI and FDG-PET; analyzing both MRI and FDG-PET gave better results than single-modal settings. Liu et al. [45] designed an SAE-based model for early-stage diagnosis of AD that performed well even with limited training data and outperformed both single-kernel and multi-kernel SVMs. Lu et al. [46] proposed an SAE-based DL model for discriminating pre-symptomatic AD from non-progressive AD in subjects with MCI, using metabolic features captured with FDG-PET. The model’s parameters were initialized using greedy layer-wise pre-training, and a softmax layer performed the classification; the model performed better than existing benchmark techniques that used FDG-PET to capture metabolic features.
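An autoencoder learns to reconstruct its input through a low-dimensional code; an SAE stacks several such layers, pre-trained greedily as described above. A minimal single-layer linear autoencoder trained by gradient descent on toy data (all sizes and rates hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))          # toy feature vectors

# encoder 10 -> 3 and decoder 3 -> 10 (hypothetical sizes)
W_enc = rng.normal(scale=0.1, size=(10, 3))
W_dec = rng.normal(scale=0.1, size=(3, 10))
lr = 0.05
losses = []
for _ in range(300):
    code = X @ W_enc                    # low-dimensional representation
    X_hat = code @ W_dec                # reconstruction
    err = X_hat - X
    losses.append((err ** 2).mean())    # mean squared reconstruction error
    # gradients of the MSE with respect to both weight matrices
    g_dec = code.T @ err * (2.0 / X.size)
    g_enc = X.T @ (err @ W_dec.T) * (2.0 / X.size)
    W_enc -= lr * g_enc
    W_dec -= lr * g_dec
```

In an SAE, the learned `code` of one layer becomes the input of the next, and the final codes feed a softmax classifier.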

2.4. RNN for AD Diagnosis

Lee et al. [47] proposed an RNN-based model that extracted temporal features from multi-modal data to forecast the conversion of MCI subjects to AD. Data were fused across modalities, including demographic information, MRI, CSF biomarkers, and cognitive performance. The model outperformed the existing benchmarks, and the multi-modal model outperformed the individual single-modal models.
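A recurrent model of this kind consumes one fused feature vector per clinical visit and carries a hidden state across time. A minimal vanilla-RNN forward pass over a toy sequence (all sizes hypothetical; the cited work's exact architecture differs):

```python
import numpy as np

rng = np.random.default_rng(7)
T, d_in, d_h = 5, 12, 8     # 5 visits, 12 fused features per visit (hypothetical)
Wx = rng.normal(scale=0.1, size=(d_in, d_h))   # input-to-hidden weights
Wh = rng.normal(scale=0.1, size=(d_h, d_h))    # hidden-to-hidden weights
b = np.zeros(d_h)
Wo = rng.normal(scale=0.1, size=(d_h, 2))      # converter vs. non-converter logits

x_seq = rng.normal(size=(T, d_in))             # toy longitudinal record
h = np.zeros(d_h)
for x_t in x_seq:                              # carry state across visits
    h = np.tanh(x_t @ Wx + h @ Wh + b)
logits = h @ Wo
p = np.exp(logits - logits.max())
p = p / p.sum()                                # softmax over the two outcomes
```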

2.5. DBN for AD Diagnosis

Ortiz et al. [48] proposed two DBN-based methods for the early diagnosis of AD, operating on fused functional and structural MRI scans. The first, named DBN-voter, consisted of an ensemble of DBN classifiers and a voter; four voting schemes were analyzed, namely majority voting, weighted voting, classifier fusion using SVM, and classifier fusion using DBN. The second, FEDBN-SVM, used DBNs as feature extractors and carried out classification using SVM. FEDBN-SVM outperformed both DBN-voter and the existing benchmarks, and among the DBN-voter variants, classifier fusion using SVM performed best.
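A DBN is a stack of restricted Boltzmann machines (RBMs), each trained unsupervised and then used as a feature extractor, as in FEDBN-SVM. A minimal NumPy sketch of a single RBM trained with CD-1 contrastive divergence on toy binary data (all sizes hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_vis, n_hid = 16, 4
W = rng.normal(scale=0.1, size=(n_vis, n_hid))
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)
lr = 0.1

V = (rng.random((32, n_vis)) > 0.5).astype(float)   # toy binary "voxel" data

for _ in range(50):                       # CD-1 contrastive divergence
    ph = sigmoid(V @ W + b_h)             # P(h=1 | v), positive phase
    h = (rng.random(ph.shape) < ph).astype(float)
    pv = sigmoid(h @ W.T + b_v)           # reconstruction of the visible units
    ph2 = sigmoid(pv @ W + b_h)           # negative phase
    W += lr * (V.T @ ph - pv.T @ ph2) / len(V)
    b_v += lr * (V - pv).mean(axis=0)
    b_h += lr * (ph - ph2).mean(axis=0)

features = sigmoid(V @ W + b_h)   # hidden activations, usable as SVM features
```

A full DBN would stack several such RBMs, feeding each layer's `features` into the next, before handing the final representation to a classifier.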

2.6. GAN for AD Diagnosis

Ma et al. [49] proposed a GAN-based model for the differential diagnosis of frontotemporal dementia and AD pathology. The model extracted multiple features from MR images for classification, and data augmentation was performed to avoid over-fitting caused by limited data. Experimental analysis showed promising results in this differential diagnosis, and the authors claimed that the model could also be used for differential diagnosis of other neurodegenerative diseases.
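A GAN pits a generator against a discriminator in alternating updates. A deliberately tiny NumPy sketch of that adversarial loop on toy 2-D "features" (a linear generator and a logistic-regression discriminator; nothing like the cited model's scale or imaging data):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy "real" feature vectors standing in for image-derived features
real = rng.normal(loc=2.0, scale=0.5, size=(64, 2))

G = rng.normal(scale=0.1, size=(2, 2))   # generator weights
g_b = np.zeros(2)                        # generator bias
d_w = rng.normal(scale=0.1, size=2)      # discriminator (logistic regression)
d_b = 0.0
lr = 0.05

for _ in range(200):
    z = rng.normal(size=(64, 2))                 # noise input
    fake = z @ G + g_b                           # generated samples
    # discriminator step: push real -> 1, fake -> 0 (binary cross-entropy)
    x = np.vstack([real, fake])
    y = np.concatenate([np.ones(64), np.zeros(64)])
    p = sigmoid(x @ d_w + d_b)
    grad = (p - y) / len(y)                      # dBCE/dlogit
    d_w -= lr * x.T @ grad
    d_b -= lr * grad.sum()
    # generator step: push discriminator output on fakes toward 1
    p_fake = sigmoid(fake @ d_w + d_b)
    d_logit = (p_fake - 1.0) / len(p_fake)       # dBCE(fake, label=1)/dlogit
    d_fake = np.outer(d_logit, d_w)              # chain rule through the logit
    G -= lr * z.T @ d_fake
    g_b -= lr * d_fake.sum(axis=0)
```

The generated `fake` samples are what a GAN-based pipeline can add to a scarce training set as augmentation.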

2.7. Hybrid DL Models for AD Diagnosis

The following studies utilized hybrid DL models for AD diagnosis. Zhang et al. [50] proposed a 3D Explainable Residual Self-Attention Convolutional Neural Network (3D ResAttNet) for AD diagnosis using structural MR images. The model is a CNN with a residual self-attention mechanism, and explainable gradient-based localization class activation mapping provided visual analysis of AD predictions. The self-attention mechanism modeled long-term dependencies in the input, while the residual connections dealt with the vanishing gradient problem; the model performed better than 3D-VGGNet and 3D-ResNet. Payan and Montana [51] formulated a 3D-CNN model for predicting AD from MR images, employing a sparse AE to pre-train the convolutional filters; the model outperformed existing benchmarks. Hosseini et al. [52] proposed a hybrid model consisting of an AE and a 3D-CNN for early-stage diagnosis of AD. The AE captured variations in the anatomical shapes of brain images, and the 3D-CNN carried out classification; the model outperformed existing benchmarks, and the authors plan to apply it to the diagnosis of other conditions such as autism, heart failure, and lung cancer. Vu et al. [53] proposed an AD detection system based on a High-Level Layer Concatenation Auto-Encoder (HiLCAE) and 3D-VGG16, with HiLCAE used as a pre-trained network to initialize the weights of 3D-VGG16. The system worked on fused MR and PET images and detected AD with good accuracy; the authors intend to develop deeper networks for both HiLCAE and VGG16 to further improve accuracy. Warnita et al. [54] proposed a gated CNN-based approach for AD diagnosis using speech transcripts, capturing temporal features from speech data and classifying based on the extracted features; the authors plan to apply the approach to other languages. Feng et al. [55] proposed a hybrid model consisting of a Stacked Bidirectional RNN (SBi-RNN) and two 3D-CNNs for early-stage diagnosis of AD. The CNNs extracted preliminary features from MRI and PET images, the SBi-RNN abstracted discriminative features from their cascaded output, and a softmax classifier generated the final output; the model outperformed state-of-the-art models. Li and Liu [56] proposed a framework combining a Bidirectional Gated Recurrent Unit (BGRU) and DenseNets for hippocampus-analysis-based AD diagnosis. The DenseNets were trained to capture the shape and intensity of MR images, the BGRU abstracted high-level features relating the right and left hippocampi, and a fully connected layer performed the final classification; the framework generated promising results. Oh et al. [57] proposed an end-to-end CNN model for the following classifications: AD versus NC, pMCI (probable MCI) versus NC, sMCI (stable MCI) versus NC, and pMCI versus sMCI. A convolutional auto-encoder was used for the AD versus NC classification, and transfer learning was implemented for pMCI versus sMCI; the model worked better than several existing benchmarks. Chien et al. [58] developed a system for assessing the risk of AD from speech transcripts, consisting of three components: a data collection component that fetched data from the subject, a feature sequence generator that converted the speech transcripts into features, and an AD assessment engine that determined whether the person had AD. The feature sequence generator was built using a deep convolutional RNN, and the assessment engine was realized with a bidirectional RNN with gated recurrent units; the system gave promising results. Kruthika et al. [59] proposed a hybrid model consisting of a 3D sparse AE, a CNN, and a capsule network for early-stage detection of AD, which worked better than a plain 3D-CNN. Basher et al. [60] proposed a combination of Hough CNN, Discrete Volume Estimation-CNN (DVE-CNN), and DNN for AD diagnosis using structural MR images. The Hough CNN localized the right and left hippocampi, the DVE-CNN extracted volumetric features from pre-processed 2D patches, and the DNN classified the input based on those features; the approach outperformed existing benchmarks by a good margin. Roshanzamir et al. [61] utilized a bidirectional encoder with logistic regression for early prediction of AD from speech transcripts, using data augmentation to deal with the limited-dataset problem; the approach outperformed existing benchmarks. Zhang et al. [62] proposed a densely connected CNN with an attention mechanism for AD diagnosis using structural MR images. The densely connected CNN extracted multiple features from the input data, and the attention mechanism fused features from different layers into complex features on which the final classification was performed; the model outperformed several existing benchmark models.
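Self-attention, as used in 3D ResAttNet and in the attention-based feature fusion above, reweights features by learned pairwise similarities. A minimal scaled dot-product self-attention step with a residual connection, applied to toy feature-map positions (all sizes hypothetical):

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over the rows of X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])   # pairwise similarities
    A = softmax(scores, axis=1)              # attention weights, rows sum to 1
    return A @ V, A

rng = np.random.default_rng(5)
n, d = 6, 16    # 6 spatial feature-map positions, 16 channels (hypothetical)
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))
out, A = self_attention(X, Wq, Wk, Wv)
out = out + X   # residual connection, as in a residual self-attention block
```

The attention matrix `A` is what lets such models relate distant brain regions in a single step, and the residual path keeps gradients flowing through deep stacks.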