In 2020, it was estimated that 308,102 people were diagnosed with a primary brain or spinal cord tumor in the world
[1]. Brain tumors are the 10th leading cause of death worldwide
[2]. A brain tumor is caused by abnormal tissue that develops within the brain or the central spine and disrupts the proper operation of the brain. The causes of brain tumors are unknown; nevertheless, the risk can be increased by exposure to radiation and by a family history of the disease
[3]. Consequently, detecting and identifying brain tumors at an early stage is key to successful treatment and improves the chance of survival. Several medical imaging techniques are used to acquire information about tumors, such as Computed Tomography (CT) scans and Magnetic Resonance Imaging (MRI) scans, which can distinguish between normal and abnormal cells that grow in the brain
[4]. In the past few years, medical science has seen striking progress in the accurate classification of brain tumors thanks to AI and deep learning. Convolutional neural networks (CNNs) are used in image processing to segment, identify, and classify MRI images, including the detection and classification of brain tumors. These image processing techniques can be based on the image content analysis described in
[5][6][7], which plays a dynamic role in various computer vision applications. Recent advances in AI, and in particular in machine learning and deep learning, have contributed to the development of autonomous objects, such as robots, drones, and cars. This has made AI the most important driving force of innovation in technology and industry. The last few years have been marked by growing interest in the healthcare sector and in disease detection to enhance the implementation of E-Health services. Deep learning has recently become an active field that attracts researchers, particularly in the medical sciences. It has significantly impacted the study of diseases in numerous ways, including their detection, prediction, and diagnosis. In
[8][9], the authors proposed solutions and new techniques to improve image reconstruction and recognition performance. Computer scientists have developed many deep learning algorithms to detect and diagnose diseases such as cancer, lung diseases, diabetes, heart diseases, Alzheimer’s disease, hepatitis, and liver disease, among others. Much of the interest in deep learning centers on convolutional neural networks (CNNs), a powerful way to learn useful representations, mainly of images and other structured data. CNNs are deep artificial neural networks mainly used in image classification, image segmentation, and object detection. CNNs have shown significant advantages in image recognition
[10][11]. CNNs are currently attracting interest in a variety of domains and have driven major advances in various fields. Recently, new technologies have also been applied in other medical fields, such as neurosurgery. In
[12][13], authors showed that Augmented Reality (AR) and mobile devices could help in the operating room. In
[14], the authors developed a new approach based on deep learning techniques to classify white blood cells for disease diagnosis. Experimental results showed that the classification of the modified images is more significant than the classification of the original ones. Authors, in
[15][16], proposed to identify and classify liver diseases by using a deep supervised learning method based on CNN architecture. A classification framework was proposed in
[15]; it consists of improved image processing and segmentation of the liver lesions. In
[16], the authors developed a two-step classification approach. The first step is the collection of a sufficient number of isolated training samples. The second step is to train two CNNs with the same architecture but with different optimization algorithms. The architectures described in
[15][16] have reached a classification accuracy of 95%. Recently, with the COVID-19 pandemic, the world is facing a virus with unknown behavior. Therefore, several studies have been initiated to detect people infected by this virus
[17]. In
[18], the author introduced a study to identify the presence or absence of malaria parasites in human blood smears by using a deep learning algorithm. The convolutional neural network achieved an accuracy rate of 96%. Ghulam
[19] proposed a deep learning study to develop an accurate model for classifying breast cancer into eight subtypes. In
[20], the authors presented a survey of deep learning for detecting lung disease.
2. Related Works
In
[21][22], the authors provided an overview of some potential clinical use cases of deep learning techniques and defined the steps for undertaking a deep learning project in radiology. The main idea of these two papers is to discuss the opportunities and challenges of incorporating deep learning into the radiology practice of the future. The effectiveness of existing applications in radiology is not yet sufficient to conclude that deep learning (DL) can replace a radiologist in all diagnostic work. However, radiologists and DL can complement each other to give better results. Several works have addressed the classification and segmentation of the brain using MRI images. El Abbadi et al. proposed a new method using singular value decomposition (SVD) as a classifier to classify brain tumors. At the first level, the algorithm was trained with normal brain MR images. Then, at the second level, it became capable of classifying brain images into healthy and non-healthy images. The accuracy of this method reached up to 97%. In
[23], Sheikh Basheera et al. focused on brain tumor classification in MRI images using a classifier based on Convolutional Neural Networks (CNN). The proposed approach is based on two steps. The first one is the tumor region segmentation using an Independent Component Analysis (ICA) mixture mode model. The second step is the extraction of deep features. In
[24], Muhammad Sajjad et al. proposed a novel convolutional neural network (CNN) based multi-grade brain tumor classification system. The first step consists of segmenting the tumor regions from an MR image using a deep learning technique. After that, they employed extensive data augmentation to train the system effectively. Finally, a pre-trained VGG-19 CNN model is fine-tuned using augmented data for brain tumor grade classification. Sunanda Das et al.
[25] trained a CNN model with an image processing technique to identify various brain tumor types and achieved 94.39% accuracy with an average precision of 93.33%. In
[26], Muhammed Talo et al. used deep transfer learning to classify normal and abnormal brain MR images automatically. The proposed model, based on ResNet34, achieved a 5-fold classification accuracy of 100% on 613 MR images. Ahmet Inner et al., in
[27], used the pre-trained ResNet50 model, removed its last 5 layers, and added 8 new layers. They then compared its accuracy with other pre-trained models such as GoogleNet, AlexNet, and ResNet50. The modified ResNet50 model showed effective results by achieving 97.2% accuracy. He obtained a 90% accuracy in the classified images as normal and abnormal in his proposed machine learning method. The authors in
[28] proposed a modified AlexNet for the detection and classification of brain tumor images and obtained an average classification accuracy of 91.6%. Another approach based on a modified ResNet50 model for brain tumor detection was developed in
[29]. The proposed architecture is based on the ResNet50 model with a modified layer model including five convolutional layers and three fully connected layers. In
[30], researchers proposed a brain tumor detection and classification approach. The main idea of their approach is to use a biologically inspired orthogonal wavelet transform and deep learning techniques. Graph theory techniques were used
[31] to detect abnormalities in brains. A VGG16 architecture was the main model to classify brain images in
[32].
Limited datasets are a particularly common challenge in medical image analysis. Most computer vision tasks benefit from more data, and data augmentation is one of the techniques often used to enhance the performance of computer vision systems. To overcome this limitation, many approaches based on deep learning have been proposed and detailed in the literature. One of the first applications of data augmentation was proposed in LeNet-5
[33] to classify handwritten digits. In 2012, Krizhevsky et al.
[34] improved image classification by applying data augmentation techniques to the ImageNet dataset. The goal of the proposed approach is to increase the dataset size. In their experiments, the authors randomly cropped patches from the original images, flipped them horizontally, and changed the pixel intensity. Experimental results showed that data augmentation reduced the error rate by over 1%. Following the many research works that use different data augmentation techniques, these techniques can be grouped into two main categories
[35]. (1) Traditional transformations, which combine affine image transformations and color modification. (2) Generative Adversarial Networks (GANs), which generate new images in an unsupervised manner using a min-max strategy
[36]. GANs were introduced in 2014 in
[37] and generate a new dataset that is intended to be indistinguishable from the original one. In
[38], the authors combined data augmentation with min-max normalization to increase the contrast of tumor cells.
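The traditional transformations mentioned above (random cropping, horizontal flipping, and intensity changes) and min-max normalization can be illustrated with a minimal sketch. It is not the implementation of any cited work: images are assumed to be grayscale and stored as nested lists, the intensity change is simplified to a uniform offset, and all function names and parameter values are hypothetical.

```python
import random


def random_crop(image, crop_h, crop_w, rng):
    """Return a crop_h x crop_w patch taken at a random position."""
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - crop_h)      # randint is inclusive on both ends
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def shift_intensity(image, delta, lo=0, hi=255):
    """Add a constant offset to every pixel and clip to [lo, hi].

    Assumption: a uniform offset stands in for the intensity change;
    the cited works may use a different scheme.
    """
    return [[min(hi, max(lo, p + delta)) for p in row] for row in image]


def min_max_normalize(image):
    """Rescale pixel values linearly to the range [0, 1]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                          # constant image: avoid division by zero
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / (hi - lo) for p in row] for row in image]


def augment(image, crop_h, crop_w, rng):
    """Produce one augmented, normalized sample from a source image."""
    patch = random_crop(image, crop_h, crop_w, rng)
    if rng.random() < 0.5:                # flip half of the samples on average
        patch = horizontal_flip(patch)
    patch = shift_intensity(patch, rng.randint(-20, 20))
    return min_max_normalize(patch)


rng = random.Random(0)                    # fixed seed for reproducibility
image = [[r * 8 + c for c in range(8)] for r in range(8)]  # toy 8 x 8 image
augmented = [augment(image, 6, 6, rng) for _ in range(4)]
```

Each call to `augment` yields a different 6 × 6 sample from the same source image, which is how augmentation enlarges a small dataset without collecting new scans.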
3. A Taxonomy of Deep Convolutional Neural Networks
3.1. LeNet
The LeNet model is a classic CNN model proposed by Yann LeCun et al.
[39]. It has a wide range of applications in image classification
[40][41][42]. LeNet-5 usually uses the ReLU function or the sigmoid function as its activation function. It consists of an input layer, two convolutional layers, two pooling layers, two fully connected layers, and an output layer.
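The layer sequence described above can be traced with a short sketch that computes the feature-map size after each layer. The layer sizes used here (a 32 × 32 single-channel input, 5 × 5 kernels, 6 and 16 feature maps, 2 × 2 pooling, and fully connected layers of 120, 84, and 10 units) are the values commonly quoted for LeNet-5 and are assumptions, since the text above does not state them.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kind, output channels, kernel size, stride); assumed LeNet-5 values
LAYERS = [
    ("conv1", "conv", 6, 5, 1),
    ("pool1", "pool", None, 2, 2),
    ("conv2", "conv", 16, 5, 1),
    ("pool2", "pool", None, 2, 2),
]


def trace_feature_maps(input_size=32, in_channels=1):
    """Return (name, channels, height, width) after each layer."""
    channels, size = in_channels, input_size
    shapes = [("input", channels, size, size)]
    for name, kind, out_channels, kernel, stride in LAYERS:
        size = conv_out(size, kernel, stride)
        if kind == "conv":                # pooling keeps the channel count
            channels = out_channels
        shapes.append((name, channels, size, size))
    return shapes


shapes = trace_feature_maps()
_, c, h, w = shapes[-1]
flattened = c * h * w                     # input width of the first FC layer
fc_sizes = [flattened, 120, 84, 10]       # two FC layers, then the output layer

for shape in shapes:
    print(shape)
print("fully connected sizes:", fc_sizes)
```

Under these assumptions the spatial size shrinks from 32 to 28, 14, 10, and finally 5, so the first fully connected layer receives 16 × 5 × 5 = 400 values.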
3.2. AlexNet
This architecture was developed by Alex Krizhevsky, Ilya Sutskever, and Geoff Hinton, and it is considered the first work to popularize convolutional networks in the field of computer vision
[34]. The AlexNet architecture consists of five convolutional layers (conv) and three pooling layers (Pool), followed by three fully connected layers (FC). Compared to LeNet, this network is much bigger and deeper.
3.3. GoogleNet
In 2015, Google released GoogleNet, a convolutional neural network that is 22 layers deep. Parallel processing paths were introduced in this architecture: it is characterized by an inception block that applies 1 × 1, 3 × 3, and 5 × 5 convolution filters in parallel with a 3 × 3 max-pooling layer
[43].
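The parallel structure of an inception block can be sketched by checking that all branches keep the same spatial size, so that their outputs can be concatenated along the channel axis. The branch widths below (64, 128, 32, and 32 channels on a 28 × 28 input), the use of size-preserving padding, and the 1 × 1 projection after the pooling branch are illustrative assumptions, not values given in the text above.

```python
def same_padding(kernel):
    """Padding that preserves spatial size for an odd kernel at stride 1."""
    return (kernel - 1) // 2


def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (branch description, kernel size, output channels); assumed widths
BRANCHES = [
    ("1x1 convolution", 1, 64),
    ("3x3 convolution", 3, 128),
    ("5x5 convolution", 5, 32),
    ("3x3 max-pooling + 1x1 projection", 3, 32),
]


def inception_output(size):
    """Return (channels, spatial size) of the concatenated block output."""
    sizes = {out_size(size, k, padding=same_padding(k)) for _, k, _ in BRANCHES}
    if len(sizes) != 1:
        raise ValueError("branches disagree on spatial size; cannot concatenate")
    channels = sum(c for _, _, c in BRANCHES)   # concatenation adds channels
    return channels, sizes.pop()


channels, size = inception_output(28)
print(channels, size)
```

Because every branch preserves the 28 × 28 resolution, the block output simply stacks the four branch outputs, giving 64 + 128 + 32 + 32 = 256 channels in this example.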
3.4. ResNet
He et al. introduced ResNet models, which rely on deep architectures that have demonstrated convincing accuracy and good convergence behavior. ResNet is built from numerous stacked residual units and has been developed with different numbers of layers: 18, 34, 50, 101, 152, and 1202. The main disadvantage of this network is that it is very expensive to evaluate due to its large number of parameters
[44].
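A residual unit adds its input back to the output of a small stack of layers, so the stack only has to learn a correction to the identity mapping. The sketch below is a deliberately simplified illustration: it assumes fully connected layers instead of convolutions, omits normalization, and uses hypothetical function names.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in vector]


def linear(vector, weights):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def residual_unit(x, w1, w2):
    """Compute relu(F(x) + x) with F(x) = W2 * relu(W1 * x).

    The '+ x' term is the skip connection around the two layers.
    """
    fx = linear(relu(linear(x, w1)), w2)
    return relu([f + xi for f, xi in zip(fx, x)])


x = [1.0, 2.0]
zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
half = [[0.5, 0.0], [0.0, 0.5]]

passthrough = residual_unit(x, zeros, zeros)     # F(x) = 0, so output equals x
corrected = residual_unit(x, identity, half)     # F(x) = 0.5 * x is added to x
print(passthrough, corrected)
```

With all-zero weights the unit passes a non-negative input through unchanged, which is one intuition for why stacking many such units does not make the network harder to optimize than a shallower one.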
3.5. VGGNet
VGGNet is named after the Visual Geometry Group; it is a convolutional neural network architecture proposed by Karen Simonyan and Andrew Zisserman of the University of Oxford in 2014
[45]. Its main contribution was to show that the depth of the network is a critical component to achieve better recognition or classification accuracy in CNNs.
3.6. DenseNet
In 2017, Huang et al. developed DenseNet
[46]. DenseNet uses dense connections between layers via dense blocks
[47][48][49][50]. Within a dense block, DenseNet connects every layer to every other layer: the input of a layer is the concatenation of the feature maps from all previous layers. By connecting layers in this way, DenseNet requires fewer parameters than an equivalent traditional CNN, as there is no need to learn redundant feature maps.
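The effect of concatenating feature maps can be illustrated by counting the channels that each layer of a dense block receives. The sketch assumes that every layer adds a fixed number of new feature maps (often called the growth rate); the values 64, 32, and 6 are illustrative assumptions, not figures from the text above.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Return the input channel count of each layer and the block output.

    Each layer sees the block input plus the feature maps produced by
    all earlier layers, then contributes growth_rate new feature maps.
    """
    layer_inputs = []
    channels = in_channels
    for _ in range(num_layers):
        layer_inputs.append(channels)
        channels += growth_rate           # concatenate this layer's output
    return layer_inputs, channels


layer_inputs, out_channels = dense_block_channels(
    in_channels=64, growth_rate=32, num_layers=6
)
print(layer_inputs, out_channels)
```

Each layer only produces a small number of new feature maps and reuses all earlier ones, which is the source of the parameter savings described above.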
3.7. SqueezeNet
SqueezeNet was designed as a more compact replacement for AlexNet. It is a smaller network that has almost 50 times fewer parameters than AlexNet and runs 3 times faster
[51]. To reduce the size of the model, SqueezeNet was designed with three strategies:
- Reducing the filter size by using 1 × 1 filters instead of 3 × 3 filters.
- Reducing the number of input channels to the 3 × 3 filters.
- Downsampling late in the network so that the convolutional layers have large activation maps.
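The first two strategies can be made concrete by counting weights. The sketch below compares a plain 3 × 3 convolution with a block that first squeezes the input with 1 × 1 filters and then expands it with a mix of 1 × 1 and 3 × 3 filters. The block layout and the channel counts (96 input channels, 16 squeeze filters, 64 + 64 expand filters) are illustrative assumptions, and biases are ignored.

```python
def conv_params(in_channels, out_channels, kernel):
    """Number of weights in a convolution layer (biases ignored)."""
    return in_channels * out_channels * kernel * kernel


def squeeze_expand_params(in_channels, squeeze, expand1x1, expand3x3):
    """Weights of a 1x1 squeeze layer followed by 1x1 and 3x3 expand layers."""
    return (
        conv_params(in_channels, squeeze, 1)      # fewer channels reach the 3x3 filters
        + conv_params(squeeze, expand1x1, 1)      # cheap 1x1 filters
        + conv_params(squeeze, expand3x3, 3)      # remaining 3x3 filters
    )


plain = conv_params(96, 128, 3)
compact = squeeze_expand_params(96, squeeze=16, expand1x1=64, expand3x3=64)
print(plain, compact, round(plain / compact, 2))
```

Both layers map 96 input channels to 128 output channels, yet the squeezed variant needs roughly nine times fewer weights in this example.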
3.8. MobileNet
MobileNet is a CNN architecture that is efficient for mobile and embedded vision systems
[52]. The model is designed to be used in mobile applications, and it is the first mobile computer vision model for TensorFlow. In MobileNet, the standard convolution is replaced by a “Depthwise Separable Convolution”, which is carried out in two stages:
- The depthwise convolution applies a single filter to each input channel separately, unlike conventional convolution, in which every filter spans all input channels.
- The pointwise convolution, also called 1 × 1 convolution, combines the outputs of the depthwise convolution across channels.
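The saving from the two-stage convolution can be checked by counting weights. The sketch below compares a standard convolution with a depthwise stage followed by a pointwise stage; the kernel size and channel counts are illustrative assumptions, and biases are ignored.

```python
def standard_conv_params(kernel, in_channels, out_channels):
    """Every output filter spans all input channels."""
    return kernel * kernel * in_channels * out_channels


def separable_conv_params(kernel, in_channels, out_channels):
    """Depthwise stage (one k x k filter per input channel)
    plus pointwise stage (1 x 1 filters that mix the channels)."""
    depthwise = kernel * kernel * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


k, c_in, c_out = 3, 32, 64
standard = standard_conv_params(k, c_in, c_out)
separable = separable_conv_params(k, c_in, c_out)
ratio = separable / standard              # equals 1 / c_out + 1 / k**2
print(standard, separable, round(ratio, 4))
```

The ratio of the two counts reduces to 1/c_out + 1/k², so with 3 × 3 kernels the separable form needs roughly eight to nine times fewer weights when the number of output channels is large.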