Alzheimer’s disease (AD) is one of the most devastating brain diseases in the world, especially among older age groups. It is a progressive neurological disease that causes irreversible loss of neurons, particularly in the cortex and hippocampus, leading to characteristic memory loss and behavioral changes. Communicative difficulties (speech and language) are among the symptoms that most often accompany dementia and should therefore be recognized as a central instrument of study. Recognizing them supports earlier diagnosis and, in turn, greater effectiveness in delaying disease progression. Speech analysis, in general, is an important source of information spanning the phonetic, phonological, lexical-semantic, morphosyntactic, and pragmatic levels of language organization [72]. The first signs of cognitive decline are already present in the discourse of patients with neurodegenerative disease, so speech analysis is a viable and effective diagnostic method that may even lead to earlier and more accurate diagnosis.
1. Speech and Language Impairments in Alzheimer’s Disease
Alzheimer’s disease (AD) is one of the most devastating brain diseases in the world, especially in the more advanced age groups [8]. It is a progressive neurological disease that results in irreversible loss of neurons, particularly in the cortex and hippocampus, which leads to characteristic memory loss and behavioral changes in humans [9].
Although the cause of AD is unknown and is likely to be multifactorial, its onset has been observed to be insidious and to occur in adulthood, leading, in advanced stages, to cognitive and behavioral disability [10].
As the disease progresses, the quality of life of patients is deeply affected in different ways. As they lose cognitive abilities and functional skills, individuals with this dementia become unable to perform many of the activities that were usually part of their daily lives. Behavior and social skills may also deteriorate, precipitating interpersonal conflicts that leave the individual with AD socially isolated. This, in turn, has an impact on their emotional state [11]. In these syndromes, amnesic symptoms may not be the first evidence; other initial signs, such as language problems, visual dysfunction, or difficulties with praxis, may be more prominent [12].
Mild cognitive impairment (MCI) is known to be one of the first detectable indicators of cognitive decline. It is a heterogeneous syndrome of great clinical importance for the early detection of AD [13]. At this stage, symptoms related to the ability to think begin to be noticed by the individuals themselves and by those closest to them, but there are no functional changes in their daily life. Not all patients diagnosed with MCI develop AD; only 10 to 15% do so per year. There are two types of MCI, the amnesic and the non-amnesic. Patients with the first type are thought to have a greater tendency to develop AD. In cases where they do, MCI is considered the second phase of AD [14]. In general, MCI captures the point in the spectrum of cognitive function between non-dementia aging and dementia, with its main characteristics described for the amnesic type [15].
The diagnosis of neurodegenerative diseases is usually compromised because the symptoms that prompt it typically reflect an advanced stage of the disease, so the diagnosis comes late. Therefore, the assessment of dementia should be based on four key issues: (1) whether there is a subjective disability detected by the individual or observed by someone close to them; (2) whether there is objective evidence of cognitive disability in the tests performed; (3) whether there is a functional decline; (4) whether there are symptoms caused by something inherent in dementia (e.g., delirium, substances or other medical, neurological or psychiatric disorders). To answer these questions, a medical history is taken, and appropriate physical examinations and laboratory studies are performed, along with cognitive screenings and neuroimaging techniques [15]. Among cognitive tests, the Mini-Mental State Exam (MMSE), the Clock-drawing test, and the Alzheimer’s Disease Assessment Scale stand out [12,16,17]. The main imaging exams are Computed Axial Tomography (CT), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), and Single-Photon Emission Computed Tomography (SPECT) [15]. Although a wide range of diagnostic methods is currently applied to AD, there remains a need for new methods that respond more promptly to dementia while being simple and cost-effective.
Alzheimer’s disease is characterized by a progressive worsening of deficits in several cognitive fields, including language. Aphasia and dysarthria are common symptoms, and language impairment in AD occurs mainly due to a decline in the semantic and pragmatic levels of language processing [18]. From a physiological perspective, superior parietal, posterior temporal, and occipital cortical areas are interconnected by the posterior corpus callosum. The superior longitudinal fasciculus surrounds the putamen, connecting all four cerebral lobes, areas that are known to be affected in MCI and AD and that have a central role in language processing [19,20]. Language difficulties are a major problem for most patients with dementia, especially as the disease progresses. The first signs that communication is being affected are difficulties in finding words, especially when naming familiar people or objects. Words are replaced by wrong or meaningless ones, and pauses during speech become more frequent [21]. In the early stages of AD, language impairment involves problems of lexical retrieval, loss of verbal fluency, and a breakdown in higher-order written and spoken language comprehension. In the moderate and severe phases of AD, the loss of verbal fluency is profound, with loss of understanding and prominent literal and semantic paraphasias. In the very severe phases of AD, speech is often restricted to echolalia and verbal stereotypes. Table 1 shows the association of these speech impairments with the stage of the disease [18,22]. Communicative difficulties (speech and language) constitute one of the groups of symptoms that most accompany dementia and, therefore, should be recognized as a central study instrument. This recognition aims to provide earlier diagnosis, resulting in greater effectiveness in delaying the disease evolution.
Table 1. Language changes in AD (adapted from Ferris and Farlow [18] and Greta et al. [23]).
Function | Early Stages | Moderate to Severe Stages
Spontaneous Speech | Fluent, grammatical | Non-fluent, echolalic
Paraphrastic errors | Semantics | Semantic and phonetic
Repetition | Intact | Very affected
Naming objects | Slightly affected | Very affected
Understanding the words | Intact | Very affected
Syntactical understanding | Intact | Very affected
Reading | Intact | Very affected
Writing | ± Intact | Very affected
Semantic knowledge of words and objects | Difficulties with less used words and objects. | Very affected
2. Speech- and Language-Based Classification of Alzheimer’s Disease
2.1. Machine Learning Pipeline
Speech analysis is potentially a useful, non-invasive, and simple method for the early diagnosis of AD. Automating this process allows fast, accurate, and economical follow-up over time. Initially, speech-based tests for AD detection were performed by linguists. These tests were designed to extract linguistic characteristics from speech or writing samples. More recent studies, however, seek to optimize this task by automating speech recognition from audio recordings [29]. The process can be described in four key steps, in sequence:
- Data Preparation: In this step, features are extracted, optimized, and normalized. This consists of selecting the most significant features (by removing the non-dominant ones) and transforming their ranges to similar limits, which reduces training time and the complexity of the classification models. Metadata are “the data of the data”: structured and organized information on a given object (in this case, voice recordings) that allows certain characteristics of it to be known. These metadata, together with the results of pre-processing the recordings, make up the final database. Incorrect or poor-quality data (e.g., outliers, wrong labels, noise, …), if not properly handled, will lead to under-optimized models and unsatisfactory results. If the data are insufficient, for example when deep learning algorithms are used, data augmentation techniques can be useful.
- Training and Validation: The supporting database is divided into subsets, usually 70–90% for training and 10–30% for testing. The subsets can be randomly generated several times and the results averaged for additional confidence, a procedure known as cross-validation. The data model is trained, i.e., its parameters are adjusted by one or more optimizers, and its performance is calculated on the test subset. This step allows categorizing and organizing the data to promote better analysis [30]. When the data are insufficient, transfer learning approaches can be used.
- Optimization: After model evaluation, it is possible to identify which parameters need to be improved and to select the most relevant features more effectively, so that a new extraction, and consequently a new iteration of Training and Validation, can be performed.
- Run-Time: Once the previous steps are complete, the system is ready to be deployed and to classify new, unseen inputs. More specifically, given a recording of a patient’s voice, it classifies the speaker as a possibly healthy individual or a possible Alzheimer’s patient.
Figure 3 shows the described methodology in detail.
Figure 3. Flowchart of a general machine learning pipeline to process acoustic/prosodic correlates of disease. Adapted from Braga et al. [31].
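The four steps above can be illustrated with a minimal sketch in Python that uses only the standard library. Everything in it is an assumption made for illustration and is not taken from the cited studies: the two features (a pause ratio and a speech rate), the synthetic values produced by `make_toy_data`, the nearest-centroid classifier standing in for the trained model, and all function names. Feature selection and data augmentation are omitted. The evaluation follows the description in the text: the data are split at random several times and the test accuracies are averaged.

```python
import random
from statistics import mean


def make_toy_data(n_per_class=20, seed=1):
    """Synthetic stand-in for a feature database; NOT real patient data."""
    rng = random.Random(seed)
    rows, labels = [], []
    # Hypothetical features per recording: [pause ratio, speech rate].
    for label, pause_mean, rate_mean in (("HC", 0.2, 4.5), ("AD", 0.5, 3.0)):
        for _ in range(n_per_class):
            rows.append([rng.gauss(pause_mean, 0.05), rng.gauss(rate_mean, 0.3)])
            labels.append(label)
    return rows, labels


def fit_scaler(rows):
    """Data preparation: learn per-feature minimum and range."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # guard against zero range
    return lows, spans


def apply_scaler(rows, scaler):
    """Map features to similar limits ([0, 1] on the data the scaler was fitted on)."""
    lows, spans = scaler
    return [[(v - lo) / sp for v, lo, sp in zip(r, lows, spans)] for r in rows]


def fit_centroids(rows, labels):
    """Training: one mean feature vector (centroid) per class."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [mean(c) for c in zip(*members)]
    return centroids


def predict(centroids, row):
    """Run-time: return the class whose centroid is closest to the sample."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def repeated_holdout(rows, labels, train_fraction=0.8, repeats=20, seed=0):
    """Validation: average test accuracy over several random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    n_train = int(train_fraction * len(rows))
    scores = []
    for _ in range(repeats):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        # Fit the scaler on the training subset only, then reuse it on the test subset.
        scaler = fit_scaler([rows[i] for i in train])
        model = fit_centroids(apply_scaler([rows[i] for i in train], scaler),
                              [labels[i] for i in train])
        test_rows = apply_scaler([rows[i] for i in test], scaler)
        hits = sum(predict(model, r) == labels[i] for r, i in zip(test_rows, test))
        scores.append(hits / len(test))
    return mean(scores)


if __name__ == "__main__":
    rows, labels = make_toy_data()
    print("mean test accuracy:", repeated_holdout(rows, labels))
    # Deployment: refit on all available data, then classify an unseen sample.
    scaler = fit_scaler(rows)
    model = fit_centroids(apply_scaler(rows, scaler), labels)
    new_sample = [0.48, 3.1]  # hypothetical features from a new recording
    print("predicted class:", predict(model, apply_scaler([new_sample], scaler)[0]))
```

Fitting the scaler on the training subset alone keeps information from the test subset out of training, which would otherwise make the averaged accuracy look better than it is. In practice, the Optimization step would wrap this loop, changing the feature set or the model parameters and re-running the validation.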
2.2. Speech and Language Resources
Table 2 presents the main databases referred to in the scientific literature, with a summary of their characteristics. These resources are crucial for supporting the development of new systems, in particular when deep learning approaches are used. The use of the same databases in different studies, by different researchers, also provides a common ground for evaluation and performance comparison.
Table 2. List of databases, with related specifications, with Alzheimer’s patients’ speech recordings. (Table contents are sorted by language, first column, and database name, second column).
Language | Database Name | Task | Population: HC M/F | Population: MCI M/F | Population: AD M/F | Availability | Refs.
English | DementiaBank (TalkBank) | DF | 99 | - | 169 | Upon request | [32]
English | Pitt Corpus | PD | 75/142 | 27/16 | 87/170 | Upon request | [33]
English | WRAP | PD | 59/141 | 28/36 | - | Upon request | [34]
English | - | PD | 112 | - | 98 | Undefined | [35]
French | - | Mixed | 6/9 | 11/12 | 13/13 | Undefined | [36]
French | - | VF, PD, SS Counting | - | 19/25 | 12/15 | Undefined | [37]
French | - | VF, Semantics | 5/19 | 23/24 | 8/16 | Undefined | [38]
French | - | Reading | 16 | 16 | 16 | Undefined | [39]
Greek | - | PD | 16/14 | - | 13/17 | Undefined | [40]
Hungarian | BEA | SS | 13/23 | 16/32 | - | Upon request | [13] [41]
 | | | 25 | 25 | 25 | |
Italian | - | Mixture | 48 | 48 | - | Undefined | [42]
Mandarin | Lu Corpus | PD/SS | 4/6 | - | 6/4 | Upon request | [43]
Mandarin | - | PD/SS | 24 | 20 | 20 | Undefined | [44]
Portuguese | Cinderella | SS | 20 | 20 | 20 | Undefined | [45]
Spanish | AZTITXIKI (AZTIAHO) | SS | 5 | - | 5 | Undefined | [46]
Spanish | AZTIAHORE (AZTIAHO) | SS | 11/9 | - | 8/12 | Undefined | [47,48]
Spanish | PGA-OREKA | VF | 26/36 | 17/21 | - | Upon request | [47]
 | Mini-PGA | PD | 4/8 | - | 1/5 | |
Spanish | - | Reading | 30/68 | - | 14/33 | Undefined | [49]
Swedish | Gothenburg | PD | 13/23 | 15/16 | - | Undefined | [50]
Swedish | - | Mixed | 12/14 | 8/21 | - | Upon request | [51]
Swedish | - | Reading | 11/19 | 12/13 | - | Undefined | [52]
Turkish | - | SS/Interview | 31/20 | - | 18/10 | Undefined | [53]
Turkish | - | SS/Interview | 12/15 | | 17/10 | Undefined | [54]
Turkish | - | SS | 12/15 | - | 17/10 | Undefined | [55]
This entry is adapted from the peer-reviewed paper 10.3390/bioengineering9010027