Alshamlan, H., Omar, S., Aljurayyad, R., & Alabduljabbar, R. (2023, May 31). Alzheimer’s Disease, Machine Learning and Feature Selection Methods. In Encyclopedia. https://encyclopedia.pub/entry/45065

Alzheimer’s Disease, Machine Learning and Feature Selection Methods


This entry is adapted from the peer-reviewed paper 10.3390/diagnostics13101771

Alzheimer’s disease (AD) is a prevalent form of dementia that accounts for up to 80% of all dementia cases. The use of machine learning and feature selection methods in predicting AD based on gene expression data is a rapidly evolving area of research.

data mining; genetic disease prediction; Alzheimer disease; gene expression; feature selection; classification algorithm

1. Alzheimer’s Disease

Alzheimer’s disease (AD) is a progressive brain disorder that was first described by Dr. Alois Alzheimer in 1906. Dr. Alzheimer observed symptoms in his patient, such as memory loss, paranoia, and psychological changes, and upon autopsy, he noticed shrinkage of the patient’s brain ^[1]. AD is the most common cause of dementia, which is a condition that slowly destroys memory and cognitive functioning, ultimately impacting the ability to carry out daily activities ^[2].

Currently, AD is ranked as the sixth leading cause of death worldwide, and the symptoms typically appear in individuals over the age of 60 ^[2]. In Saudi Arabia, experts estimate that 3.23% of the population, mostly aged 65 or older, may have dementia caused by AD ^[3]. Despite extensive research efforts, there is currently no cure or definitive treatment for AD. Current approaches to managing the disease focus on helping individuals maintain cognitive function, manage behavioral symptoms, and slow the progression of memory loss ^[2]. However, researchers are actively pursuing therapies that target specific genetic, molecular, and cellular mechanisms in the hope of stopping or preventing the underlying cause of the disease ^[2]. The complex nature of AD makes it a challenging condition to treat and manage. Nevertheless, continued research and innovative approaches may lead to more effective treatments and improved outcomes for individuals with AD and their families.

2. Supervised Machine Learning

Supervised Machine Learning (SML) trains an algorithm on labeled data so that it learns patterns that can then be used to make predictions on new data ^[4]. It draws on programming, IT, and mathematics, and it is applied across government, marketing, medicine, and any business that collects data and wants to base decisions on them. It is also employed in modeling consumer choices, weather forecasting, and website calculations. Researchers concentrate on various types of SML ^[4]. Common SML classifiers include Support Vector Machine, Logistic Regression, Linear Discriminant Analysis, K-nearest neighbor, Decision Tree, and Naïve Bayes; only the Support Vector Machine is described here.

Support Vector Machine (SVM)

SVM is a discriminative classifier formally defined by a separating hyperplane. The algorithm outputs an optimal hyperplane that categorizes new examples. In two-dimensional space, this hyperplane is a line dividing the plane into two parts, with each class lying on one side ^[4]. Generally, the farther our data points lie from the hyperplane, the more confident we are that they have been correctly classified. Hence, when new testing data are added, the side of the hyperplane on which they land decides the class that we assign to them ^[4].

Given the solutions $\hat{\beta}_{0}$ and $\hat{\beta}$, the decision function can be written as

$$\hat{G}(x) = \operatorname{sign}[\hat{f}(x)] = \operatorname{sign}[x^{T} \hat{\beta} + \hat{\beta}_{0}]$$
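As a minimal sketch of the decision function above, the following Python snippet evaluates the sign of the linear score for a single sample. The coefficient values are hypothetical, chosen only for illustration, and treating a score of exactly zero as the positive class is an assumed convention.

```python
def decision_function(x, beta, beta0):
    """Return the linear score f(x) = x^T beta + beta0 for one sample x."""
    return sum(xi * bi for xi, bi in zip(x, beta)) + beta0


def predict(x, beta, beta0):
    """Return the class label sign(f(x)) as +1 or -1 (0 is mapped to +1 by assumption)."""
    return 1 if decision_function(x, beta, beta0) >= 0 else -1


# Hypothetical fitted coefficients for a two-dimensional problem.
beta_hat = [1.0, -1.0]
beta0_hat = 0.5

print(predict([2.0, 1.0], beta_hat, beta0_hat))  # score = 2 - 1 + 0.5 = 1.5
print(predict([0.0, 3.0], beta_hat, beta0_hat))  # score = 0 - 3 + 0.5 = -2.5
```

The magnitude of the score reflects the distance from the hyperplane, so a larger absolute score corresponds to a more confident assignment.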

A strength of SVM is its accuracy. SVM works well on smaller, cleaner datasets, and it can be more efficient because it uses only a subset of the training points. Its drawback is that training time can be high, so it is not well suited to larger datasets ^[4].

3. Feature Selection

Feature selection is the process of removing irrelevant features from a dataset: the chosen algorithm automatically selects the features that contribute most to the prediction variable or output of interest. Applying feature selection before fitting the classifier can improve accuracy, reduce overfitting, and shorten training time.

3.1. mRMR

mRMR stands for minimum Redundancy Maximum Relevance. It aims to select genes that have low correlation with the other selected genes (Minimum Redundancy) but still have high correlation with the classification variable (Maximum Relevance). For classes $c = (c_1, \dots, c_k)$, the Maximum Relevance condition is to maximize the total relevance of all features in the subset $S$:

$$\max_{S \subset \Omega} \frac{1}{|S|} \sum_{i \in S} I(c, f_{i})$$

The Minimum Redundancy condition is

$$\min_{S \subset \Omega} \frac{1}{|S|^{2}} \sum_{i, j \in S} I(f_{i}, f_{j})$$

where $f_{i}$ and $f_{j}$ refer to the expression levels of genes $i$ and $j$, respectively, $I(\cdot, \cdot)$ denotes mutual information, and $\Omega$ is the full set of genes.
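The two conditions can be illustrated with a short sketch, assuming the expression levels have already been discretized so that mutual information can be estimated from counts. The greedy search and the "relevance minus redundancy" score are assumptions made for illustration (one common way to combine the two criteria), and the toy data are hypothetical.

```python
from collections import Counter
from math import fsum, log2


def mutual_information(a, b):
    """Estimate I(a, b) in bits from two equally long discrete sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return fsum(
        (n_ab / n) * log2(n_ab * n / (count_a[x] * count_b[y]))
        for (x, y), n_ab in count_ab.items()
    )


def relevance(subset, features, labels):
    """Average mutual information between the class and each feature in the subset."""
    return sum(mutual_information(labels, features[i]) for i in subset) / len(subset)


def redundancy(subset, features):
    """Average mutual information over all feature pairs (i, j) in the subset."""
    total = sum(mutual_information(features[i], features[j])
                for i in subset for j in subset)
    return total / len(subset) ** 2


def mrmr_greedy(features, labels, n_select):
    """Greedy selection: add the feature with the best relevance minus redundancy."""
    selected, remaining = [], list(range(len(features)))
    while remaining and len(selected) < n_select:
        def score(i):
            rel = mutual_information(labels, features[i])
            if not selected:
                return rel
            red = sum(mutual_information(features[i], features[j])
                      for j in selected) / len(selected)
            return rel - red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical discretized data: gene 1 duplicates gene 0, gene 2 is independent of gene 0.
genes = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
labels = [0, 0, 1, 1, 1, 1, 1, 1]  # label is 1 when gene 0 or gene 2 is 1

selected = mrmr_greedy(genes, labels, n_select=2)
print(sorted(selected))
```

In this toy example all three genes are equally relevant to the class, but the duplicate gene adds only redundancy, so the greedy search prefers the pair of non-redundant genes.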

3.2. CFS

CFS stands for Correlation-based Feature Selection. CFS selects attributes using a heuristic that measures the usefulness of individual genes for predicting the class label along with the level of inter-correlation among them. Irrelevant features and features that are highly correlated with one another are avoided. The method calculates the merit of a subset $S_{k}$ of $k$ features as:

$$\mathrm{Merit}_{S_{k}} = \frac{k \, \bar{r}_{cf}}{\sqrt{k + k (k - 1) \, \bar{r}_{ff}}}$$

Here, $\bar{r}_{cf}$ is the average value of all feature–classification correlations, and $\bar{r}_{ff}$ is the average value of all feature–feature correlations. The CFS criterion is defined as follows:

$$\mathrm{CFS} = \max_{S_{k}} \left[ \frac{r_{c f_{1}} + r_{c f_{2}} + \dots + r_{c f_{k}}}{\sqrt{k + 2 (r_{f_{1} f_{2}} + \dots + r_{f_{i} f_{j}} + \dots + r_{f_{k} f_{1}})}} \right]$$
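To make the merit formula concrete, the sketch below computes it from correlations that are assumed to be already available. The function name and the correlation values are hypothetical, and how the correlations themselves are estimated is left open.

```python
from math import sqrt


def cfs_merit(r_cf, r_ff):
    """Merit of a subset of k features.

    r_cf: the k feature-class correlations.
    r_ff: the k(k-1)/2 pairwise feature-feature correlations (empty when k == 1).
    """
    k = len(r_cf)
    mean_r_cf = sum(r_cf) / k
    mean_r_ff = sum(r_ff) / len(r_ff) if r_ff else 0.0
    return k * mean_r_cf / sqrt(k + k * (k - 1) * mean_r_ff)


# Two hypothetical subsets with the same relevance but different inter-correlation.
merit_redundant = cfs_merit(r_cf=[0.6, 0.6], r_ff=[0.9])
merit_diverse = cfs_merit(r_cf=[0.6, 0.6], r_ff=[0.1])

print(merit_redundant, merit_diverse)
```

With equal feature-class correlations, the subset whose features are less correlated with one another receives the higher merit, which is the behavior the heuristic is designed to reward.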

3.3. Chi-Square Test

The Chi-Square Test is a statistical test used in feature selection for classification to check whether two variables are independent. In the following equation, a high $\chi^{2}$ score indicates that the null hypothesis (H0) of independence should be rejected, and thus that the feature and the class are dependent:

$$\chi^{2} = \sum \frac{(\text{observed} - \text{expected})^{2}}{\text{expected}}$$
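As a worked illustration of the statistic above, the sketch below computes it for a contingency table of observed counts, with the expected counts derived from the row and column totals under independence. The tables are hypothetical, and the code assumes no row or column total is zero.

```python
def chi_square(table):
    """Chi-square statistic for a contingency table given as rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Rows: feature value (low, high); columns: class (control, AD). Counts are made up.
dependent = [[30, 10], [10, 30]]
independent = [[20, 20], [20, 20]]

print(chi_square(dependent))    # 20.0
print(chi_square(independent))  # 0.0
```

Features can then be ranked by this score, keeping those with the highest values.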

3.4. F-Score

F-score is a simple statistical algorithm for feature selection. F-score can be used to measure the discrimination of two sets of real numbers.
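The entry does not state a formula for the F-score. The sketch below assumes one commonly used definition for a single feature: the squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances. The function name and data are hypothetical, and each class is assumed to have at least two samples and non-zero variance.

```python
def f_score(positive, negative):
    """F-score of one feature, given its values in the positive and negative class."""
    n_pos, n_neg = len(positive), len(negative)
    mean_pos = sum(positive) / n_pos
    mean_neg = sum(negative) / n_neg
    mean_all = (sum(positive) + sum(negative)) / (n_pos + n_neg)

    # Numerator: how far each class mean lies from the overall mean.
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2

    # Denominator: sample variance within each class.
    var_pos = sum((x - mean_pos) ** 2 for x in positive) / (n_pos - 1)
    var_neg = sum((x - mean_neg) ** 2 for x in negative) / (n_neg - 1)
    return between / (var_pos + var_neg)


# Hypothetical expression values of two genes in each class.
well_separated = f_score([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
overlapping = f_score([1.0, 5.0, 9.0], [2.0, 4.0, 6.0])

print(well_separated)  # 4.0
print(overlapping)
```

A larger value suggests that the feature discriminates better between the two sets, so features can be ranked by their score.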

3.5. GA

Genetic Algorithm (GA) is one of the common wrapper gene selection methods. It is usually applied to discrete optimization problems. The main goal of GA is to find the best solution within a group of potential solutions. The method mirrors the process of natural selection, in which the fittest individuals are selected for reproduction to produce the offspring of the next generation. Each set of solutions is called a population. Populations consist of vectors, i.e., chromosomes or individuals. Every item in the vector is referred to as a gene ^[5].
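The population, chromosome, and gene terminology can be sketched as follows. Each chromosome is a list of 0/1 genes marking which features are kept. The specific operators (tournament selection, one-point crossover, bit-flip mutation, elitism), the parameter values, and the toy fitness function are all assumptions made for illustration; in a real wrapper method the fitness would typically be a classifier's validation accuracy on the selected features.

```python
import random


def genetic_algorithm(fitness, n_genes, pop_size=20, generations=30,
                      mutation_rate=0.05, seed=0):
    """Search for a 0/1 chromosome of length n_genes (>= 2) that maximizes fitness."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_population = [best[:]]  # elitism: the best individual always survives
        while len(next_population) < pop_size:
            # Tournament selection: the fittest of three random individuals reproduces.
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n_genes - 1)
            child = parent_a[:cut] + parent_b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best


def toy_fitness(chromosome):
    """Hypothetical fitness: reward informative features 0 and 3, penalize subset size."""
    informative = chromosome[0] + chromosome[3]
    return informative - 0.1 * sum(chromosome)


best_subset = genetic_algorithm(toy_fitness, n_genes=8)
print(best_subset, toy_fitness(best_subset))
```

Because the best individual is copied unchanged into each new generation, the best fitness found never decreases from one generation to the next.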

References

1. History of Alzheimer’s: Major Milestones. Available online: https://www.alzheimers.net/history-of-alzheimers (accessed on 3 October 2019).
2. Seniors’ Health—Overview of Alzheimer’s. Available online: https://www.moh.gov.sa/en/HealthAwareness/EducationalContent/Health-of-Older-Persons/Pages/Overview-of-Alzheimer.aspx (accessed on 3 October 2019).
3. Saudi Alzheimer’s Disease Association. Available online: http://alz.org.sa/ (accessed on 3 October 2019).
4. Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.; Springer: New York, NY, USA, 2009.
5. Babaoglu, I.; Findik, O.; Ülker, E. Comparison of Feature Selection Models Utilizing Binary Particle Swarm Optimization and Genetic Algorithm in Determining Coronary Artery Disease Using Support Vector Machine. Expert Syst. Appl. 2010, 37, 3177–3183.

© Text is available under the terms and conditions of the Creative Commons Attribution (CC BY) license; additional terms may apply. By using this site, you agree to the Terms and Conditions and Privacy Policy.

Subjects: Computer Science, Artificial Intelligence

Contributors: Hala Alshamlan, Samar Omar, Rehab Aljurayyad, Reham Alabduljabbar

Update Date: 02 Jun 2023
