Deep Learning Techniques for Prediction of Alzheimer’s Disease

Please note this is a comparison between Version 2 by Amina Yu and Version 1 by K Aditya Shastry.

Deep learning (DL) has become a prominent topic in the machine learning (ML) domain in the past few years. ML can be utilized to tackle problems in many sectors, and neuroscience is among them. Detecting malignancies and functional regions in cognitive systems has long been a major challenge for scientists. The standard approach of detecting variation in blood oxygen levels can be applied for this purpose; however, completing all the processes can take too long on certain occasions. One benefit of DL approaches over typical ML methods is that the reliability of DL techniques grows with the phases of learning. The efficiency of DL methods tends to rise greatly as more information is provided to them, and they outperform conventional techniques. This is similar to the human brain, which learns more as new information becomes available on a daily basis.

deep learning
health informatics
Alzheimer’s disease

1. Transformation from Machine Learning (ML) to Deep Learning (DL) Approaches for the Effective Prediction of Alzheimer’s Disease (AD)

During the last decade, ML has been employed to discover neuroimaging indicators of AD. Several ML technologies are now being used to enhance the diagnosis and prognosis of AD [5]^[1]. The authors of [6]^[2] used a “support vector machine (SVM)” to accurately categorize steady Mild cognitive impairment (MCI) vs. progressing MCI in 35 occurrences of control subjects and 67 MCI instances. In most ML procedures for bio-image identification, slicing is prioritized, but recovery of robust shape features has mostly been ignored. In several circumstances, however, extracting convincing qualities from a feature space could eliminate the necessity for image classification [7]^[3]. Most early studies relied on traditional shape features such as “Gabor filters” and “Haralick texture” attributes [8,9]^[4][5]. DL is defined as a novel domain of ML research that was launched with the purpose of bringing ML nearer to its initial objective: “artificial intelligence (AI)”. To interpret textual, voice, and multimedia files, the DL architecture often requires more abstraction and representation levels [10]^[6].

The authors of [11]^[7] provide a comparative analysis of classical ML and DL techniques for the early diagnosis of AD and the progression of mild cognitive impairment to Alzheimer’s disease. They examined sixteen techniques, four of which combined DL and ML, while twelve employed only DL. Using a combination of DL and ML, an accuracy rate of 96% was attained for feature selection and 84.2% for MCI-to-AD transformation. Utilizing CNN in the DL method, an attribute selection accuracy of 96.0% and a MCI-to-AD conversion predictive performance of 84.2% were obtained. In particular, it was discovered that classification ability could be enhanced by combining composite neuroimaging with serum biomarkers.

According to the study in [12]^[8], DL approaches for feature extraction combined with the ML strategy of classification using an SVM classifier are extremely effective for AD diagnosis and prediction. It has also been noted that prognosis and treatment based on many modalities fare better than those based on a single modality. Recent developments show a rise in the application of DL algorithms for the study of medical images, allowing for quicker interpretation and greater precision than a human clinician. Figure 1 shows that DL architectures can be placed into two groups: “generative architecture” and “discriminative architecture”.

Figure 1.

Types of DL architectures.

The “Recurrent Neural Network (RNN)”, “Deep Auto-Encoder (DAE)”, “Deep Boltzmann Machine (DBM)”, and “Deep Belief Networks (DBN)” are the four kinds of “generative architecture”, whereas the “Convolutional Neural Network (CNN)” and RNN are the two kinds of “discriminative architecture”. The structurally complex transformations and local derivative structures were recently discovered as current segmentation techniques for Phyto analytics by many scientists [11,12,13]^[7][8][9]. These descriptions are referred to as hand-crafted traits since they were created by people to extract characteristics from photos. A major aspect of employing these characteristics was to utilize vectors to locate a part of a picture, whereupon the created pattern is extracted. The SVM then receives the characteristics obtained by the customized approach [14]^[10] as a form of predictor. The best characteristics extract characteristics from a database. Several of the most widely used and concise descriptors rely on DL to achieve this [15,16]^[11][12]. As shown in Figure 2, the CNN is used to pull descriptions out of the images for this reason.

Figure 2.

The architecture of a generalized CNN.
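
As a rough illustration of how a CNN pulls descriptors out of an image, the sketch below chains the three operations that a generalized CNN typically repeats: convolution, a ReLU non-linearity, and max pooling. It is a minimal pure-Python sketch, not the architecture of any cited study. The 5x5 toy image and the vertical-edge kernel are assumptions chosen for illustration; in a trained network the kernel values are learned rather than written by hand.

```python
def conv2d(image, kernel):
    """'Valid' 2D cross-correlation, the operation DL libraries call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Keep positive responses and zero out the rest."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


# Toy image (assumption): dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-written vertical-edge filter (assumption); a CNN would learn such values.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = conv2d(image, kernel)       # strong response where the edge lies
features = max_pool(relu(feature_map))    # compact descriptor of the image
```

The feature map responds only where the dark-to-bright edge falls inside the filter window, and pooling then summarizes that response into a smaller descriptor that a classifier could consume.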

CNNs are particularly good at retrieving general features [17]^[13]. Various layers of approximations are formed when a deep network has been built on a large volume of imagery. The first-layer characteristics, for example, are like “Gabor filters” or color objects, which can be used for a wide range of picture issues and repositories [18]^[14]. “Deep neural networks (DNN)” can be employed on bio-image records; however, this method necessitates a large volume of data that is difficult to come by in many circumstances [19]^[15]. Data augmentation is an answer to this situation, as it modifies the original data in its own way to build additional training data. Reflecting, translating, and rotating the original images to generate alternative views are popular data augmentation processes [20]^[16]. Adjusting the picture’s luminosity, intensity, and brightness could also produce diverse images [21,22]^[17][18]. “Principal component analysis (PCA)” is another commonly utilized technique for data augmentation. Certain essential elements are inserted into a PCA once they have been scaled down to a smaller proportion [23,24]^[19][20]. The major goal of this procedure is to display only the picture’s most relevant features. “Generative adversarial networks” have been used in recent studies [25,26]^[21][22] to synthesize images that vary from the primary ones. This strategy necessitates the creation of a separate domain [27,28]^[23][24].
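
The geometric and intensity transformations mentioned above can be sketched in a few lines. The following is a minimal pure-Python illustration on a tiny grayscale image stored as a list of rows. The function names, the zero padding used for translation, and the 0 to 255 intensity range are assumptions made for this example and are not taken from the cited studies.

```python
def flip_horizontal(img):
    """Reflect the image left to right."""
    return [row[::-1] for row in img]


def translate_right(img, shift, fill=0):
    """Shift pixels right by a non-negative `shift`, padding the left with `fill`."""
    width = len(img[0])
    shift = min(shift, width)
    return [[fill] * shift + row[:width - shift] for row in img]


def rotate_90_clockwise(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def adjust_brightness(img, delta, lo=0, hi=255):
    """Add `delta` to every pixel and clip to the assumed [lo, hi] range."""
    return [[min(hi, max(lo, v + delta)) for v in row] for row in img]


def augment(img):
    """Return the original image plus four transformed copies."""
    return [
        img,
        flip_horizontal(img),
        translate_right(img, 1),
        rotate_90_clockwise(img),
        adjust_brightness(img, 40),
    ]


scan = [[10, 20, 30],
        [40, 50, 60]]
augmented = augment(scan)  # five training images derived from one original
```

Each transformed copy keeps the label of the original image, which is how a small labeled set can be enlarged before training.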

The images generated, however, are not reliant on modifications in the image database. As a result, different techniques may be applied depending upon the issue. For instance, element-wise computation was used to mimic random noise in radar altimeter imagery in [29]^[25]. Ductility was used in [30]^[26] to mimic the process of stretching in prostate chemotherapeutics. An alternative technique that takes advantage of DL is to adjust a pre-trained DL model, such as a CNN, on fresh data reflecting a different challenge. This method takes advantage of a pre-trained CNN’s shallow layers. Fine-tuning (also known as “tuning”) is a technique for extending the learning phase to a new image dataset. This strategy significantly decreases the computing expense of learning from new data and is suited to small datasets. Another advantage of fine-tuning is that it enables scientists to readily study CNN combinations because of lower processing expenses. Such configurations could be created with multiple pre-trained CNNs and a variety of hyperparameters.
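
The idea of reusing a pre-trained network’s early layers while training only a small part of the model on new data can be sketched as follows. This is a deliberately tiny, pure-Python illustration under stated assumptions: `FROZEN_W` stands in for pre-trained early-layer weights, the “head” is a single logistic unit trained by stochastic gradient descent, and the four toy samples are invented. It is not the fine-tuning procedure of any cited study.

```python
import math

# Hypothetical "pre-trained" early layer: two fixed filters over three inputs.
FROZEN_W = [[1.0, -1.0, 0.0],
            [0.0, 1.0, -1.0]]


def frozen_features(x):
    """ReLU features from the frozen layer; these weights are never updated."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(head_w, head_b, x):
    """Probability of class 1 from the trainable head on top of frozen features."""
    f = frozen_features(x)
    return sigmoid(sum(w * fi for w, fi in zip(head_w, f)) + head_b)


def mean_loss(head_w, head_b, data):
    """Mean binary cross-entropy over the dataset."""
    total = 0.0
    for x, y in data:
        p = predict_proba(head_w, head_b, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


def fine_tune_head(data, lr=0.5, epochs=200):
    """Train only the head; the frozen layer is used as a fixed feature extractor."""
    head_w, head_b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = frozen_features(x)
            grad = predict_proba(head_w, head_b, x) - y  # d(loss)/d(logit)
            head_w = [w - lr * grad * fi for w, fi in zip(head_w, f)]
            head_b -= lr * grad
    return head_w, head_b


# Invented toy dataset: (input vector, label).
DATA = [([1, 0, 0], 1), ([0, 1, 0], 0), ([1, 0, 1], 1), ([0, 1, 1], 0)]

loss_before = mean_loss([0.0, 0.0], 0.0, DATA)
head_w, head_b = fine_tune_head(DATA)
loss_after = mean_loss(head_w, head_b, DATA)
```

Because only three head parameters are updated, training is far cheaper than learning every layer from scratch, which mirrors the lower computing expense attributed to fine-tuning in the text.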

CNNs are also used as attribute extractors in certain investigations [31]^[27]. Support vector machine (SVM) with quadratic or regular kernels plus “logistic regression” and “extreme ML random forest” or “XGBoost” and “decision trees” are used for classifications [32]^[28]. Shmulev et al. [33]^[29] evaluated the findings acquired via the CNN technique to those obtained through alternative classifiers that only analyzed characteristics derived by CNN and determined that the latter works better than the former. Rather than being deployed explicitly for visual information, CNNs could be utilized on pre-extracted characteristics. This is particularly pertinent whenever a CNN is administered to the outcomes of different regression methods and whenever diagnostic ratings are matched across other model parameters and magnetic resonance characteristics.

CNNs could also be used to analyze non-Euclidean environments such as clinical charts or cerebral interface pictures. Morphological MRIs could be used with different designs. Various perceptron variants, such as a “probabilistic neural network” or a “stacked of FC layers,” were used in various studies. Several studies used both “supervised” (deep polynomial networks) and “unsupervised” (deep Boltzmann machine and AE) designs to retrieve enhanced interpretations of attributes, whereas SVMs are primarily used for classification [34]^[30]. Imagery parameters such as texturing, forms, trabecular bone, and environment factors are subjected to considerable pre-processing, which is common in non-CNN designs. Furthermore, to further minimize the dimensions, the integration or extraction of attributes is commonly utilized. On the other hand, DL-based categorization techniques are really not limited to cross-sectional structural MRIs. Observational research could combine data from various time frames while researching relatively similar topics.

In [35]^[31], the authors developed a kernel-based SVM to predict the switch from MCI to AD, while the other premonitory categories of AD were removed. They were able to achieve a 90.5 percent cross-validation accuracy in the AD versus NC comparison. They were also 72.3 percent accurate in predicting the progression of MCI to AD. Regarding the extraction of attributes, two methods were utilized:

“FreeSurfer” is an application for cerebral localization with cortex-associated information.
“SPM5 (Statistical Parametric Mapping)” is a software tool for the mapping of statistical parameters.

Researchers further found that characteristics ranging from 24 to 26 are the most accurate predictors of MCI advancing to AD. They also discovered that the width of the bilateral neocortex may be the most important indicator, followed by right hippocampus thickness and APOE ε4 status. Costafreda et al. [36]^[32] employed hippocampus size to identify MCI patients who were inclined to progress to AD. A total of 103 MCI patients from “AddNeuroMed” were used in their research. They employed “FreeSurfer” for data pre-processing and an SVM with a semi-Stochastic radial basis kernel for classification. Following model training on the entire AD and NC datasets, researchers put it into practice. In less than a year, they were able to achieve an accuracy of 85 percent for AD and 80 percent for NC. They concluded that hippocampus alterations could enhance predictive efficacy by consolidating forebrain degeneration.
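
Cross-validated accuracies such as those quoted for [35]^[31] are obtained by repeatedly holding out part of the data for testing and training on the rest. The text does not say which scheme was used, so the sketch below assumes plain k-fold cross-validation. The one-dimensional “volume” feature, the labels, and the midpoint-threshold classifier are all invented for illustration and merely stand in for an SVM trained on neuroimaging features.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def fit_threshold(values, labels):
    """Toy classifier: threshold halfway between the two class means.

    Assumes both classes are present in the training fold.
    """
    mean0 = sum(v for v, y in zip(values, labels) if y == 0) / labels.count(0)
    mean1 = sum(v for v, y in zip(values, labels) if y == 1) / labels.count(1)
    return (mean0 + mean1) / 2


def predict_threshold(threshold, value):
    return 1 if value > threshold else 0


def cross_val_accuracy(values, labels, k):
    """Fraction of samples classified correctly when each is held out once."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(values), k):
        model = fit_threshold([values[i] for i in train_idx],
                              [labels[i] for i in train_idx])
        correct += sum(predict_threshold(model, values[i]) == labels[i]
                       for i in test_idx)
    return correct / len(values)


# Invented, well-separated toy data: one feature value per subject.
volumes = [1.0, 5.0, 1.2, 5.2, 0.9, 4.8, 1.1, 5.1, 1.3, 4.9]
labels = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

accuracy = cross_val_accuracy(volumes, labels, k=5)
```

Every subject is used for testing exactly once, so the resulting accuracy reflects performance on data the model did not see during training.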

According to a comprehensive analysis of various SVM-centered studies [37]^[33], SVM is a commonly used technique to differentiate between AD patients and apparently healthy patients, as well as between steady and progressing subtypes of MCI. Regarding diagnoses, advancement projections, and therapy outcomes, functional and structural neuroimaging approaches were applied. Eskildsen et al. [38]^[34] found five important ways to tell the difference between stable MCI and MCI that is becoming worse.

To differentiate and diagnose AD, the researchers in [39]^[35] studied 135+ AD subjects, 220+ CN patients, and 350+ MCI patients. They trained on the neuroimaging utilizing information from ADNI. To differentiate AD patients from CN patients, they employed “neural networks” and “logistic regression”. The metrics were determined to have extensive brain properties. Rather than relying on specific parts of the brain, important properties such as volume and thickness were determined.

Because of their capacity to gradually analyze multiple levels and properties of MRI and PET brain pictures, the authors of [40]^[36] advised using cascading CNNs in 2018. Since no picture segmentation was used in the pre-treatment of the information, no skill was necessary. This trait is widely seen as a benefit of this technique over others. The attributes were extracted and afterwards adapted to the framework in the other techniques. Based on the ADNI dataset, their research included more than 90 NC and AD subjects each, with more than 200 MCI cases. The efficiency rate was greater than 90%.

The work in [41]^[37] suggested a knowledge-picture recovery system that is based on “3D Capsules Networks (CapsNets)”, a “3D CNN”, and pre-treated 3D auto-encoder technologies to identify AD in its early phases. According to the authors, 3D CapsNets are capable of quick scanning.

Unlike deep CNN, however, this strategy could only increase identification. The authors were able to distinguish AD with 98.42% accuracy. The authors of [42]^[38] looked at 407 normal participants, 418 AD patients, 280 progressing MCI patients, and 533 steady MCI instances from an institution. They trained CNNs on 3D T1-weighted images. The repository they used was ADNI. They looked at CNN operations to identify AD, progressing MCI, and stable MCI. Whenever CNNs were utilized to separate the progressing MCI individuals from the steady MCI patients, there was a 75% accuracy rate. The researchers in [43]^[39] developed an algorithm that used MRI scans to determine medical symptoms. The maximum number of cases that researchers could use was 2000 or more, and they chose to work on the ADNI repository.

“DSA-3DCNN” was reported by Hosseini-Asl et al. [44]^[40] to be quite accurate compared to alternative contemporary classifiers in diagnosing AD from MRI scans. The authors demonstrated that distinguishing between AD, MCI, and NC situations can improve the retrieval of characteristics in 3D-CNN. With respect to analysis, the brain extraction technique (BET) used seven parameters. The FMRIB application package was utilized. This collection offers tools for handling MRI, fMRI, and DTI neuroimaging information, in addition to outlining the method of processing the information. By eliminating non-cerebral tissues from head MRIs, BET was utilized to categorize them into cerebral and non-cerebral images (a vital aspect of any assessment). In BET, no prior treatment was required, and the procedure was quick.

2. Diagnosis and Prognosis of AD Using DL Methods

DL is a subfield of ML [45]^[41] that discovers characteristics across a layered training process [46]^[42]. DL approaches for prediction and classification are being used in a variety of disciplines, such as object recognition [47,48,49]^[43][44][45] and computational linguistics [50,51]^[46][47], which together show significant improvements over past methods [52,53,54]^[48][49][50]. Since DL approaches have been widely examined in the past few years [55,56,57]^[51][52][53], this section concentrates on the fundamental ideas of “Artificial Neural Networks (ANNs)”, which underpin DL [58]^[54]. The DL architectural schemes used for AD classification and prognosis assessment are also discussed. An NN is a network of connected processing elements that have been modeled and established using the “Perceptron”, the “Group Method of Data Handling” (GMDH), and the “Neocognitron” concepts. Because the single-layer perceptron could only generate linearly separable patterns, these significant works investigated effective error functions and gradient computational algorithms. Furthermore, the back-propagation approach, which utilizes gradient descent to minimize the error function, was implemented [59]^[55].
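
The limitation of the single-layer perceptron noted above can be made concrete with a small experiment. The sketch below trains a perceptron with the classic error-correction rule on two Boolean functions: AND, which is linearly separable, and XOR, which is not. The learning rate of 1, the zero initialization, and the epoch count are assumptions chosen to keep the arithmetic in integers.

```python
def predict(w, b, x):
    """Threshold unit: output 1 when the weighted sum is positive."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def train_perceptron(data, epochs=20):
    """Perceptron learning rule with learning rate 1 and zero initial weights."""
    w, b = [0, 0], 0
    for _ in range(epochs):
        for x, y in data:
            err = y - predict(w, b, x)  # -1, 0, or +1
            w = [w[0] + err * x[0], w[1] + err * x[1]]
            b += err
    return w, b


def count_correct(w, b, data):
    """Number of samples the trained unit classifies correctly."""
    return sum(predict(w, b, x) == y for x, y in data)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_correct = count_correct(*train_perceptron(AND), AND)  # all four samples
xor_correct = count_correct(*train_perceptron(XOR), XOR)  # never all four
```

No single line can separate the XOR classes, so at least one sample is always misclassified regardless of the weights. Adding hidden layers and training them with back-propagation removes this restriction.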

After detection, a person with AD can expect to live for an average of 3 to 11 years. Certain individuals, nevertheless, may survive for 20 years or more after receiving a diagnosis. The prognosis typically relies on the patient’s age and how far the illness has advanced prior to detection. The sixth most frequent cause of mortality in the US is AD. Other ailments brought on by the problems of AD can be fatal. For instance, if a person with AD has trouble swallowing, they may suffer from dehydration, malnourishment, or respiratory infections if foods or fluids enter their lungs. The individuals responsible for the patient’s care are also directly and significantly impacted by AD in addition to the patients themselves. Caregiver stress condition refers to a deterioration in the psychological and/or physical well-being of the individual caring for the Alzheimer’s sufferer and is another persistent complication of AD in this regard.

Rapid progress in neuroimaging techniques has rendered the integration of massively high-dimensional, heterogeneous neuroimaging data essential. Consequently, there has been great interest in computer-aided ML techniques for the integrative analysis of neuroimaging data. The use of popular ML methods such as the Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), and Decision Trees (DT), among others, promises early recognition and progressive forecasting of AD. Nevertheless, proper pre-processing processes are required prior to employing these methods. In addition, for classification and prediction, these steps involve attribute mining, attribute selection, dimensionality reduction, and feature-based classification. These methods require specialized knowledge as well as multiple time-consuming optimization phases [5]^[1]. Deep learning (DL), an emerging branch of machine learning research that uses raw neuroimaging data to build features through “on-the-fly” learning, is gaining significant interest in the field of large-scale, high-dimensional neuroimaging analysis as a means of overcoming these obstacles [59]^[55].

References

Al-Shoukry, S.; Rassem, T.H.; Makbol, N.M. Alzheimer’s Diseases Detection by Using Deep Learning Algorithms: A Mini-Review. IEEE Access 2020, 8, 77131–77141.
Haller, S.; Nguyen, D.; Rodriguez, C.; Emch, J.; Gold, G.; Bartsch, A.; Lovblad, K.O.; Giannakopoulos, P. Individual prediction of cognitive decline in mild cognitive impairment using support vector machine-based analysis of diffusion tensor imaging data. J. Alzheimer’s Dis. 2010, 22, 315–327.
Gamarra, M.; Mitre-Ortiz, A.; Escalante, H. Automatic cell image segmentation using genetic algorithms. In Proceedings of the 2019 XXII Symposium on Image, Signal Processing and Artificial Vision (STSIVA), Bucaramanga, Colombia, 24–26 April 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–5.
Fogel, I.; Sagi, D. Gabor filters as texture discriminator. Biol. Cybern. 1989, 61, 103–113.
Haralick, R.M.; Shanmugam, K.; Dinstein, I.H. Textural features for image classification. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621.
Deng, L.; Yu, D. Deep Learning: Methods and Applications. Found. Trends Signal Processing 2014, 7, 197–387.
Nanni, L.; Brahnam, S.; Ghidoni, S.; Menegatti, E.; Barrier, T. A comparison of methods for extracting information from the co-occurrence matrix for subcellular classification. Expert Syst. Appl. 2013, 40, 7457–7467.
Xu, Y.; Zhu, J.Y.; Eric, I.; Chang, C.; Lai, M.; Tu, Z. Weakly supervised histopathology cancer image segmentation and classification. Med. Image Anal. 2014, 18, 591–604.
Barker, J.; Hoogi, A.; Depeursinge, A.; Rubin, D.L. Automated classification of brain tumor type in whole-slide digital pathology images using local representative tiles. Med. Image Anal. 2015, 30, 60–71.
Cristianini, N.; Shawe-Taylor, J. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods; Cambridge University Press: Cambridge, UK, 2000.
Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117.
Feng, C.; Elazab, A.; Yang, P.; Wang, T.; Zhou, F.; Hu, H.; Xiao, X.; Lei, B. Deep learning framework for alzheimer’s disease diagnosis via 3d-cnn and fsbi-lstm. IEEE Access 2019, 7, 63605–63618.
LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
Yosinski, J.; Clune, J.; Bengio, Y.; Lipson, H. How transferable are features in deep neural networks? In Proceedings of the 27th International Conference on Neural Information Processing Systems, Ser. NIPS’14, Montreal, QC, Canada, 8–13 December 2014; MIT Press: Cambridge, MA, USA, 2014; Volume 2, pp. 3320–3328.
Sarraf, S.; Tofighi, G. Classification of alzheimer’s disease using fmri data and deep learning convolutional neural networks. arXiv 2016, arXiv:1603.08631.
Li, Y.; Huang, C.; Ding, L.; Li, Z.; Pan, Y.; Gao, X. Deep learning in bioinformatics: Introduction, application, and perspective in the big data era. Methods 2019, 166, 4–21.
Mussap, M.; Noto, A.; Cibecchini, F.; Fanos, V. The importance of biomarkers in neonatology. Semin. Fetal Neonatal Med. 2013, 18, 56–64.
Cedazo-Minguez, A.; Winblad, B. Biomarkers for Alzheimer’s disease and other forms of dementia: Clinical needs, limitations and future aspects. Exp. Gerontol. 2010, 45, 5–14.
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
Shijie, J.; Ping, W.; Peiyi, J.; Siping, H. Research on data augmentation for image classification based on convolution neural networks. In Proceedings of the 2017 Chinese Automation Congress (CAC), Jinan, China, 20–22 October 2017; pp. 4165–4170.
Frid-Adar, M.; Diamant, I.; Klang, E.; Amitai, M.; Goldberger, J.; Greenspan, H. Gan-based synthetic medical image augmentation for increased cnn performance in liver lesion classification. Neurocomputing 2018, 321, 321–331.
Zhao, D.; Zhu, D.; Lu, J.; Luo, Y.; Zhang, G. Synthetic medical images using f&bgan for improved lung nodules classification by multi-scale vgg16. Symmetry 2018, 10, 519.
Long, M.; Cao, Y.; Cao, Z.; Wang, J.; Jordan, M.I. Transferable Representation Learning with Deep Adaptation Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 3071–3085.
Pichler, B.J.; Kolb, A.; Nägele, T.; Schlemmer, H.-P. PET/MRI: Paving the Way for the Next Generation of Clinical Multimodality Imaging Applications. J. Nucl. Med. 2010, 51, 333–336.
Ding, J.; Chen, B.; Liu, H.; Huang, M. Convolutional neural network with data augmentation for sar target recognition. IEEE Geosci. Remote Sens. Lett. 2016, 13, 364–368.
Castro, E.; Cardoso, J.S.; Pereira, J.C. Elastic deformations for data augmentation in breast cancer mass detection. In Proceedings of the 2018 IEEE EMBS International Conference on Biomedical Health Informatics (BHI), Las Vegas, NV, USA, 4–7 March 2018; pp. 230–234.
Nordberg, A.; Rinne, J.O.; Kadir, A.; Långström, B. The use of PET in Alzheimer disease. Nat. Rev. Neurol. 2010, 6, 78–87.
Shen, T.; Jiang, J.; Li, Y.; Wu, P.; Zuo, C.; Yan, Z. Decision supporting model for one-year conversion probability from mci to ad using cnn and svm. In Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA, 18–21 July 2018; pp. 738–741.
Shmulev, Y.; Belyaev, M. Predicting conversion of mild cognitive impairments to alzheimer’s disease and exploring impact of neuroimaging. In Graphs in Biomedical Image Analysis and Integrating Medical Imaging and Non-Imaging Modalities; Stoyanov, D., Taylor, Z., Ferrante, E., Dalca, A.V., Martel, A., Maier-Hein, L., Parisot, S., Sotiras, A., Papiez, B., Sabuncu, M.R., et al., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 83–91.
Suk, H.-I.; Lee, S.W.; Shen, D. Hierarchical feature representation and multimodal fusion with deep learning for ad/mci diagnosis. NeuroImage 2014, 101, 569–582.
Nho, K.; Shen, L.; Kim, S.; Risacher, S.L.; West, J.D.; Foroud, T.; Jack, C.R., Jr.; Weiner, M.W.; Saykin, A.J. Automatic prediction of conversion from mild cognitive impairment to probable alzheimer’s disease using structural magnetic resonance imaging. In Annual Symposium Proceedings/AMIA Symposium; American Medical Informatics Association: Bethesda, MD, USA, 2010; Volume 2010, pp. 542–546.
Costafreda, S.G.; Dinov, I.D.; Tu, Z.; Shi, Y.; Liu, C.Y.; Kloszewska, I.; Mecocci, P.; Soininen, H.; Tsolaki, M.; Vellas, B.; et al. Automated hippocampal shape analysis predicts the onset of dementia in mild cognitive impairment. NeuroImage 2011, 56, 212–219.
Saraiva, C.; Praça, C.; Ferreira, R.; Santos, T.; Ferreira, L.; Bernardino, L. Nanoparticle-mediated brain drug delivery: Overcoming blood–brain barrier to treat neurodegenerative diseases. J. Control. Release 2016, 235, 34–47.
Coupé, P.; Eskildsen, S.F.; Manjón, J.V.; Fonov, V.S.; Collins, D.L. Simultaneous segmentation and grading of anatomical structures for patient’s classification: Application to Alzheimer’s disease. Neuroimage 2011, 59, 3736–3747.
Wolz, R.; Julkunen, V.; Koikkalainen, J.; Niskanen, E.; Zhang, D.P.; Rueckert, D.; Soininen, H.; Lötjönen, J.; The Alzheimer’s Disease Neuroimaging Initiative. Multi-method analysis of MRI images in early diagnostics of Alzheimer’s disease. PLoS ONE 2011, 6, e25446.
Liu, M.; Cheng, D.; Wang, K.; Wang, Y.; The Alzheimer’s Disease Neuroimaging Initiative. Multi-modality cascaded convolutional neural networks for Alzheimer’s disease diagnosis. Neuroinformatics 2018, 16, 295–308.
Kruthika, K.R.; Maheshappa, H.D. Multistage classifier-based approach for Alzheimer’s disease prediction and retrieval. Inform. Med. Unlocked 2019, 14, 34–42.
Basaia, S.; Agosta, F.; Wagner, L.; Canu, E.; Magnani, G.; Santangelo, R.; Filippi, M. Automated classification of Alzheimer’s disease and mild cognitive impairment using a single MRI and deep neural networks. NeuroImage Clin. 2018, 21, 101645.
Payan, A.; Montana, G. Predicting Alzheimer’s disease: A neuroimaging study with 3d convolutional neural networks. In Proceedings of the ICPRAM 2015 4th International Conference on Pattern Recognition Applications and Methods, Lisbon, Portugal, 10–12 January 2015; Volume 2.
Asl, E.H.; Ghazal, M.; Mahmoud, A.; Aslantas, A.; Shalaby, A.; Casanova, M.; Barnes, G.; Gimel’farb, G.; Keynton, R.; El Baz, A. Alzheimer’s disease diagnostics by a 3d deeply supervised adaptable convolutional network. Front. Biosci. 2018, 23, 584–596.
Feldman, M.D. Positron Emission Tomography (PET) for the Evaluation of Alzheimer’s Disease/Dementia. In Proceedings of the California Technology Assessment Forum, New York, NY, USA, June 2010.
Bengio, Y. Learning deep architectures for AI. Found. Trends Mach. Learn. 2009, 2, 1–127.
Ciregan, D.; Meier, U.; Schmidhuber, J. Multi-column deep neural networks for image classification. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3642–3649.
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25; Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q., Eds.; ACM: Stateline, NV, USA, 2012; pp. 1097–1105.
Farabet, C.; Couprie, C.; Najman, L.; Lecun, Y. Learning hierarchical features for scene labeling. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1915–1929.
Hinton, G.; Deng, L.; Yu, D.; Dahl, G.; Mohamed, A.-R.; Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.; Sainath, T.N.; et al. Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Process. Mag. 2012, 29, 82–97.
Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.S.; Dean, J. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26. Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q., Eds.; ACM: Stateline, NV, USA, 2013; pp. 3111–3119.
Jo, T.; Nho, K.; Saykin, A.J. Deep Learning in Alzheimer’s Disease: Diagnostic Classification and Prognostic Prediction Using Neuroimaging Data. Front. Aging Neurosci. 2019, 11, 220.
Boureau, Y.-L.; Ponce, J.; Lecun, Y. A theoretical analysis of feature pooling in visual recognition. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel, 21–24 June 2010; pp. 111–118.
Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. Imagenet large scale visual recognition challenge. Int. J. Comp. Vision 2015, 115, 211–252.
Bengio, Y. Deep learning of representations: Looking forward. In Proceedings of the International Conference on Statistical Language and Speech Processing, First International Conference, SLSP 2013, Tarragona, Spain, 29–31 July 2013; Springer: Berlin/Heidelberg, Germany, 2013; pp. 1–37.
Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1798–1828.
Yi, H.; Sun, S.; Duan, X.; Chen, Z. A study on Deep Neural Networks framework. In Proceedings of the 2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), Xi’an, China, 3–5 October 2016.
Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507.
Werbos, P.J. Backwards differentiation in AD and neural nets: Past links and new opportunities. In Automatic Differentiation: Applications, Theory, and Implementations; Bücker, H.M., Corliss, G., Hovland, P., Naumann, U., Norris, B., Eds.; Springer: New York, NY, USA, 2006; pp. 15–34.