According to the World Health Organization, Parkinson’s Disease (PD) is a degenerative condition of the brain associated with motor symptoms (e.g., tremor, imbalance, slow movement, and freezing of gait (FoG)) as well as non-motor symptoms (e.g., insomnia, cognitive impairment, pain, and sensory disturbances)
[1]. The symptoms usually emerge slowly, and as the disease worsens, non-motor symptoms become more apparent
[2]. Early symptoms are tremors, rigidity, slowness of movement, and walking difficulties
[3]. As the disease progresses, further issues arise concerning cognition, mood, sleep (sleep disorders can also appear as prodromal signs), and deficits in various sensory systems
[4]. FoG is a distinctive higher-level gait disorder in advanced PD patients, defined as a “brief episodic absence or marked reduction of forward progression of the feet despite the intention to walk”
[5]. An episode lasts a couple of seconds or more, and the symptom poses many difficulties for clinicians in understanding its exact mechanisms and finding a proper treatment
[6][7]. About half of the people with PD exhibit FoG episodes, whose most common initial manifestations are trembling in place with no forward motion, shuffling or hastening, and total akinesia. FoG typically occurs during gait initiation and turning, but it also manifests under constraints such as walking through a narrow path or a doorway or dual tasking, and these triggers differ for each individual.
FoG is recognized as one of the most debilitating motor symptoms of advanced PD and occurs more frequently in older people. As noted above, its episodes are unpredictable in time and depend on the individual, the occasion, and the circumstances under which they manifest. These inherent difficulties, together with the need for an experienced clinician to be present during data acquisition to verify and annotate the data (i.e., the time point of each occurrence), limit the collection of large volumes of FoG-related data. Data augmentation through synthetic data generation is therefore critical for developing robust and generalizable prediction and classification models. Given recent technological advances, such tools can be provided by computer science, in particular by Artificial Intelligence (AI) methods. The main obstacle to making these methods sufficiently robust and efficient is the limited availability of data.
The availability of data is crucial in the healthcare domain, yet data are often difficult to gather for reasons such as privacy legislation, and the data that do exist are frequently scarce, unstructured, or of low quality. In PD specifically, a further limitation is that patients cannot systematically provide unbiased and exact data on a daily or periodic basis, owing to motor impairment and other symptoms as well as to how often they visit and report to their doctors. In addition, the wide spectrum of data that can be collected from PD patients makes it harder for computer scientists to develop tools that detect PD at an early stage. The heterogeneity and scarcity of PD data are a major concern for state-of-the-art technologies, as they can hinder such “smart” solutions through inefficient training. This creates the need for alternative approaches to data augmentation that engage state-of-the-art technology such as Generative Adversarial Networks (GANs).
2. GAN-Based Applications in PD Diagnosis and Treatment
There are many examples in the current scientific literature regarding the use of GANs for the generation of synthetic data (e.g., tabular data, image data, and audio data) related to the health domain. In
[11], Hargreaves and Heng examined the potential of using GANs to generate synthetic tabular diabetes data. The authors built two models for the classification of patients as having diabetes or not, where the first utilized real data only, while the second made use of a combination of real and GAN-based synthetic data. The second model achieved a classification accuracy of 87.0%, yielding an increase of 8.3% as compared to the first model. The synthetic data were very similar to the original dataset and were found capable of replacing real data in research applications. Choi et al.
[12] combined GANs and autoencoders to create the medical GAN (medGAN) model, which was capable of producing realistic synthetic patient records. The model could handle both binary variables and high-dimensional discrete variables. To increase learning efficiency and avoid mode collapse, the authors also proposed a minibatch averaging method. Experimental evaluation showed that results obtained with the synthetic data were comparable to those of models using real data only, while the synthetic data also helped to protect privacy, presenting a very small risk of identity/attribute disclosure. Aiming to generate more accurate synthetic data with regard to both discrete and continuous variables, Baowaly et al.
[13] proposed two variations of medGAN. The first model was called medical Wasserstein GAN (medWGAN) and integrated the Wasserstein GAN with the Gradient Penalty
[14] model, while the second was called the medical boundary-seeking GAN (medBGAN) and integrated the Boundary-seeking GAN
[15] model. In both models, the generator and discriminator consisted of feed-forward neural networks. The proposed variants were found to outperform medGAN in all test scenarios on the MIMIC-III
[16] and Taiwanese National Health Insurance Research Database
[17] datasets.
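The minibatch averaging idea attributed to medGAN above can be illustrated with a small sketch. It assumes, as we understand the method, that the discriminator receives each record concatenated with the feature-wise mean of the minibatch it belongs to, so that a generator which collapses to near-identical records becomes easy to spot. The function names and the toy records are hypothetical and are not taken from the cited work.

```python
from typing import List


def minibatch_average(batch: List[List[float]]) -> List[float]:
    """Feature-wise mean over all records in the minibatch."""
    n = len(batch)
    return [sum(column) / n for column in zip(*batch)]


def augment_with_minibatch_average(batch: List[List[float]]) -> List[List[float]]:
    """Append the minibatch mean to every record before it reaches the discriminator."""
    mean = minibatch_average(batch)
    return [record + mean for record in batch]


# Toy minibatch of three binary patient records (hypothetical values).
batch = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
augmented = augment_with_minibatch_average(batch)

# Each record now carries its own 3 features plus the 3 minibatch averages.
print(augmented[0])
```

The discriminator input width doubles, which is the only architectural change this trick requires.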
Yang et al.
[18] proposed the so-called Grouped Correlational GAN (GcGAN) model for generating realistic synthetic Electronic Health Records (EHRs). The model took into consideration the meaning of the diverse variables it contained as well as the correlations among them. The authors applied spectral normalization to the discriminator and batch normalization to the generator. During experimental evaluation, the percentage of qualified synthetic data reached 95.21%, outperforming other state-of-the-art approaches such as medGAN, WGAN
[19], ehrGAN
[20], and CorrGAN
[21]. Yoon et al.
[22] proposed the Anonymization through Data Synthesis GAN (ADS-GAN) model for generating synthetic EHRs by closely approximating the joint distribution of the variables used. Setting the patient’s privacy as a priority, the authors highlighted that the model minimized the possibility of identifying a patient based on the data present in the original dataset. The model also reproduced the joint distribution reliably and consistently outperformed other contemporary approaches such as PATE-GAN
[23], DP-GAN
[24], and medGAN. Wang et al.
[25] proposed the so-called Sequentially Coupled Generative Adversarial Network (SC-GAN) for generating synthetic data relevant to both the patient state and the medication dosage. The model made use of two coupled generators (LSTM with two layers), the first about the patient state and the second about the medication dosage. The authors underlined that the patient state and medication dosage were strongly interrelated. SC-GAN was tested experimentally, outperforming other models (e.g., SeqGAN
[26], and C-RNN-GAN
[27]) in the medication dosage recommendation task with regard to the precision and AUROC metrics. In
[28], Beaulieu-Jones et al. applied the Auxiliary Classifier GAN (AC-GAN), with the treatment class of patients specified as standard or intensive. In this model, the generator received the class label in addition to the noise vector and therefore knew which treatment class it needed to generate. Differential privacy was also applied during the generation of synthetic patient data, thus helping to reduce the chance of identifying a patient based on the original data. The model was tested experimentally, showing that it can support hypothesis-generating analyses when only limited original trial data are available.
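The class conditioning described for AC-GAN can be sketched as follows. The example assumes the common construction in which the generator input is a random noise vector concatenated with a one-hot encoding of the requested class. The class names come from the text, whereas the function names, the noise dimension, and the seed are hypothetical.

```python
import random
from typing import List

TREATMENT_CLASSES = ["standard", "intensive"]  # class names taken from the text


def one_hot(label: str, classes: List[str]) -> List[float]:
    """Encode a class label as a one-hot vector."""
    if label not in classes:
        raise ValueError(f"unknown class: {label}")
    return [1.0 if c == label else 0.0 for c in classes]


def generator_input(label: str, noise_dim: int, rng: random.Random) -> List[float]:
    """Concatenate Gaussian noise with the one-hot label that conditions the generator."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + one_hot(label, TREATMENT_CLASSES)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
z = generator_input("intensive", noise_dim=4, rng=rng)

# 4 noise values followed by the 2-element class code.
print(len(z), z[-2:])
```

In an AC-GAN the discriminator is additionally trained to predict this class from the generated sample, which is what pushes the generator to respect the label it was given.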
GANs are also particularly useful for augmenting time series data as well as health-related signals. Esteban et al.
[29] proposed the so-called Recurrent Conditional GAN (RCGAN), which aimed at generating real-valued, high-dimensional time series with a focus on medical data. Both the generator and the discriminator were Recurrent Neural Networks (RNNs) conditioned on auxiliary information. The synthetic data produced included time series and their associated labels. The results obtained with synthetic data related to Intensive Care Unit (ICU) patients were found to be comparable to those obtained with real data only, reaching an AUROC of 0.96 as compared to 0.9908.
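Since AUROC is the metric on which many of the models in this section are compared, a minimal sketch of how it can be computed may be helpful. It uses the rank-based definition (the probability that a randomly chosen positive case is scored above a randomly chosen negative one, with ties counted as one half). The labels and scores are hypothetical and do not reproduce any result reported above.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical test labels and classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))  # 3 of the 4 positive/negative pairs are ordered correctly: 0.75
```

To compare real and synthetic training data in this way, the same function would be applied to the scores of two classifiers evaluated on one shared, real test set.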
Kiyasseh et al.
[30] proposed the so-called PlethAugment model encompassing three Conditional GAN
[31] models with an adapted diversity term. Aiming to improve classification performance, PlethAugment focused on producing pathological photoplethysmogram (PPG) signals. With regard to the AUROC metric, the use of the generated synthetic dataset yielded a 29% increase as compared to the original class-balanced datasets. Brophy et al.
[32] proposed a GAN-based model called Multivariate GAN (MV-GAN) for generating realistic multichannel electrocardiogram (ECG) signals. By utilizing minibatch discrimination (MBD) in the GAN architecture, the authors avoided the mode collapse problem and could generate multivariate time series. Experimental testing indicated that the synthetic datasets generated were structurally similar to the original datasets with satisfactory diversity among the different samples, while also ensuring protection of the patients’ privacy. Hazra and Byun
[33] proposed a GAN-based model whose main goals were to automate and improve medical diagnosis and to enrich the training of medical students with realistic data. The so-called SynSigGAN model used Bidirectional Long Short-Term Memory (BiLSTM) networks for the generator and Convolutional Neural Networks (CNNs) for the discriminator. SynSigGAN was used to generate ECG, electroencephalogram (EEG), electromyography (EMG), and PPG signals. Experimental testing of the proposed model indicated its potential in producing realistic results with high correlation between the original and the synthetic signals. The model also outperformed other contemporary models (e.g., LSTM-AE
[34], BiLSTM-MLP
[35], and RNN-AE GAN
[36]) in terms of Root Mean Square Error (RMSE) and Mean Absolute Error (MAE), achieving the best results (0.25 and 0.36, respectively).
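The two error metrics named above have simple definitions, sketched below for a pair of equally long signals. The sample values are hypothetical and are not related to the figures reported for SynSigGAN.

```python
import math
from typing import Sequence


def _check(original: Sequence[float], synthetic: Sequence[float]) -> None:
    if len(original) != len(synthetic) or len(original) == 0:
        raise ValueError("signals must be non-empty and of equal length")


def rmse(original: Sequence[float], synthetic: Sequence[float]) -> float:
    """Root Mean Square Error: square root of the mean squared sample difference."""
    _check(original, synthetic)
    squared = sum((o - s) ** 2 for o, s in zip(original, synthetic))
    return math.sqrt(squared / len(original))


def mae(original: Sequence[float], synthetic: Sequence[float]) -> float:
    """Mean Absolute Error: mean of the absolute sample differences."""
    _check(original, synthetic)
    return sum(abs(o - s) for o, s in zip(original, synthetic)) / len(original)


# Hypothetical original and synthetic signal snippets.
original = [0.0, 1.0, 0.0, -1.0]
synthetic = [0.0, 0.5, 0.0, -0.5]
print(rmse(original, synthetic), mae(original, synthetic))
```

Lower values of either metric indicate that the synthetic signal follows the original more closely sample by sample.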
Privacy is a major concern in applications encompassing data augmentation with GANs. Torfi et al.
[37] proposed a framework for synthetic data generation that makes use of Rényi differential privacy. The authors utilized convolutional autoencoders and Convolutional GANs (CGANs) and were able to capture feature correlations and temporal information in the original datasets. Experimental testing of the framework highlighted its capability of generating realistic synthetic data. The framework was also found to outperform other state-of-the-art approaches (e.g., medGAN, TableGAN
[38], GAN, and PATE-GAN) in terms of the Area Under the Precision-Recall Curve (AUPRC) metric, reaching 0.93. Chin-Cheong et al.
[39] proposed two WGAN variants for the generation of realistic heterogeneous EHRs. The first variant demonstrated satisfactory results regarding data fidelity and data utility. More specifically, for data utility, the AUROC and AUPRC metrics reached 0.7536 and 0.7747, respectively, which is comparable to the 0.8003 and 0.8245 obtained with the real data. The second variant additionally applied differential privacy, which ensured better privacy but led to worse results (0.6427 AUROC and 0.6776 AUPRC) that were, however, still usable for machine learning (ML) tasks.
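The text above does not specify how differential privacy was enforced in the cited models. One widely used recipe for training a network privately is to clip each example's gradient to a maximum L2 norm and then add Gaussian noise to the summed gradients. The sketch below assumes that recipe purely for illustration, and all names and values in it are hypothetical.

```python
import math
import random
from typing import List


def clip_l2(grad: List[float], max_norm: float) -> List[float]:
    """Scale a gradient down so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(
    per_example_grads: List[List[float]],
    max_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_l2(g, max_norm) for g in per_example_grads]
    summed = [sum(column) for column in zip(*clipped)]
    sigma = noise_multiplier * max_norm  # noise scale tied to the clipping bound
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_example_grads) for v in noisy]


# Two hypothetical per-example gradients; the first one exceeds the clipping bound.
grads = [[3.0, 4.0], [0.3, 0.4]]
update = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1,
                               rng=random.Random(0))
print(update)
```

A larger noise multiplier gives a stronger privacy guarantee at the cost of noisier updates, which is consistent with the drop in AUROC and AUPRC reported for the privacy-preserving variant.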
There are many GAN-based applications used specifically in the domain of PD diagnosis and treatment. Kaur et al.
[40] proposed a model which combined GANs and Deep Neural Networks (DNNs) for the early detection of PD through voice analysis (e.g., analysis of voice strength, articulation rate, long pauses, and pitch rate) and classification. Initially, GANs were used to expand the original training dataset by generating synthetic data. These data were then used for PD classification. Experimental testing of the model highlighted that data augmentation through GANs improved the model’s accuracy and specificity as compared to the use of the original data alone. More specifically, an accuracy of 88% and a specificity of 87.14% were achieved, as compared to 84.67% and 83.76%, respectively, when no data augmentation was used.
Voiceprints used for distinguishing PD patients and healthy individuals were utilized by Xu et al.
[41]. More specifically, the authors proposed a GAN-based data augmentation model that was combined with different classification models, helping to train them even when very limited data were available. The so-called Spectrogram Deep Convolutional GAN (S-DCGAN) was capable of producing high-resolution spectrograms by increasing the number of network layers and by implementing spectral normalization and feature matching. Experimental testing, using a ResNet50 model
[42], achieved a high accuracy of 91.25% in voiceprint classification and a specificity of 92.5%. The authors also highlighted that data augmentation played a significant role in these results, as applying ResNet without GAN-based data augmentation gave much worse results (75.5% accuracy and 72.5% specificity).
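Spectral normalization, mentioned above for S-DCGAN and earlier for GcGAN, rescales a weight matrix so that its largest singular value is 1, which keeps the discriminator from changing too abruptly. The sketch below estimates that singular value with power iteration on a toy matrix. It is a generic illustration with hypothetical names, not the implementation used in the cited works, and it assumes the start vector is not orthogonal to the leading singular direction.

```python
import math
from typing import List

Matrix = List[List[float]]


def matvec(w: Matrix, v: List[float]) -> List[float]:
    return [sum(a * b for a, b in zip(row, v)) for row in w]


def transpose(w: Matrix) -> Matrix:
    return [list(column) for column in zip(*w)]


def normalize(v: List[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def spectral_norm(w: Matrix, iterations: int = 50) -> float:
    """Estimate the largest singular value of w by power iteration."""
    wt = transpose(w)
    u = normalize([1.0] * len(w))          # fixed start vector for reproducibility
    for _ in range(iterations):
        v = normalize(matvec(wt, u))       # right singular vector estimate
        u = normalize(matvec(w, v))        # left singular vector estimate
    return sum(ui * x for ui, x in zip(u, matvec(w, v)))


def spectrally_normalize(w: Matrix, iterations: int = 50) -> Matrix:
    """Divide every weight by the estimated largest singular value."""
    sigma = spectral_norm(w, iterations)
    return [[x / sigma for x in row] for row in w]


weights = [[3.0, 0.0], [0.0, 1.0]]  # toy weight matrix with singular values 3 and 1
print(spectral_norm(weights))
print(spectrally_normalize(weights))
```

Deep learning frameworks typically run only one power-iteration step per training update and carry the vector `u` over between updates, which makes the normalization cheap.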
Contrary to most methodologies for classifying healthy individuals and patients with dementia, which focus on one type of dementia only, Noella and Priyadarshini
[43] proposed a system which can help in the diagnosis of different types of dementia. More specifically, brain Fluorodeoxyglucose Positron Emission Tomography (FDG-PET) scans were utilized to diagnose PD, Alzheimer’s disease, and Frontotemporal Dementia. GANs were used to generate synthetic Neuroimaging in Frontotemporal Dementia (NIFD) samples to solve uniform distribution problems in the images used for training the Deep Convolutional Neural Network (DCNN) classification model. The proposed system was tested experimentally, yielding 97.7% accuracy, 97% specificity, and 97% sensitivity.
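Accuracy, sensitivity, and specificity, which are reported for most of the PD classifiers in this section, all derive from the four cells of a binary confusion matrix. The sketch below shows the computation on hypothetical labels, where 1 denotes a patient and 0 a healthy individual.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (true positive rate) and specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # patients correctly detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy flagged as patients
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # patients missed
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical ground truth and predictions for eight subjects.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting specificity alongside accuracy matters in this setting because small clinical datasets are often imbalanced, and accuracy alone can hide a classifier that flags too many healthy individuals.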
In
[44], Zanini and Colombini proposed two methodologies for augmenting the EMG signals of PD patients. The first methodology was based on Deep Convolutional GANs (DCGANs). In this case, the generator simulated the EMG tremor pattern of each patient. The discriminator of this methodology was also used in the second methodology, which augmented EMG signals by means of neural style transfer. Experimental testing of the methodologies indicated their capability of adapting to the different tremor frequencies and amplitudes of patients. The methodologies could also help extend tremor patterns to diverse movement protocols and scenarios.
Kaur et al.
[45] demonstrated an approach for classifying Magnetic Resonance (MR) images as belonging to PD patients or healthy people. The approach was based on DCNNs for the classification task, while a GAN model was used for data augmentation, addressing the issue of the limited size of the available training dataset. The authors preprocessed the MR images and applied transfer learning to the pre-trained AlexNet architecture. The last layers of the model were replaced to match the new image categories required for PD classification. Experimental testing of the authors’ approach yielded a classification accuracy of 89.23%. The analysis of digital drawing tests could help in the diagnosis of PD as well as in the investigation of graphomotor impairment in PD patients
[46]. Towards this direction, Dzotsenidze et al.
[47] proposed a framework for conducting PD diagnostics based on digital drawings, utilizing CNNs for classification purposes combined with GANs for data augmentation. More specifically, four different GAN architectures (i.e., ProjectedGAN
[48], StyleGAN3
[49], StyleGAN2-ADA
[50], and StyleGAN2-ADA + LeCam
[51]) were used and evaluated for generating synthetic digital drawing tests. Regarding the sensitivity metric, ProjectedGAN reached 96.6% in some test scenarios, and the authors highlighted that the use of GANs could help address the scarcity of digital drawing test data and contribute to better decision making for doctors.
GANs can also be used in applications relevant to the FoG symptom. Ramesh and Bilal
[52] presented a model utilizing GANs and CNNs for predicting the Postural Instability and Gait Disorder (PIGD) score of PD patients wearing a single inertial sensor. This score is calculated from several sub-scores related to the posture and the gait of the patient (e.g., FoG, posture, and gait). The model was also able to classify the ON/OFF states of a PD patient, where the ON state refers to a patient who has been treated with a dopamine precursor drug and the OFF state refers to the same patient once the drug has started to wear off, which is followed by a worsening of motor symptoms
[53]. The authors used data from different clinics for the training and testing of their model. The experimental results indicated that the CNN model using GANs outperformed the CNN model where no GANs were utilized, yielding an accuracy improvement of up to 22% in determining the ON/OFF states, and even outperformed clinicians in determining the ON/OFF states on the basis of the PIGD scores. Yu et al.
[54] demonstrated an approach utilizing GANs and the Hidden Markov Model (HMM)
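To make the HMM component mentioned above more concrete, the sketch below implements the standard forward algorithm, which computes the likelihood of an observation sequence under a hidden Markov model. It is not the HMM-GAN model itself. The two hidden states, the two sensor observations, and all probabilities are hypothetical and were chosen only to resemble a FoG-like setting.

```python
from typing import Dict, List

Probabilities = Dict[str, Dict[str, float]]


def forward_likelihood(
    observations: List[str],
    start_p: Dict[str, float],
    trans_p: Probabilities,
    emit_p: Probabilities,
) -> float:
    """Likelihood of an observation sequence, summed over all hidden state paths."""
    if not observations:
        raise ValueError("at least one observation is required")
    states = list(start_p)
    # alpha[s] = probability of the observations so far, ending in hidden state s
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical model: hidden gait state versus a coarse inertial-sensor reading.
start_p = {"walking": 0.8, "freezing": 0.2}
trans_p = {
    "walking": {"walking": 0.9, "freezing": 0.1},
    "freezing": {"walking": 0.3, "freezing": 0.7},
}
emit_p = {
    "walking": {"moving": 0.9, "still": 0.1},
    "freezing": {"moving": 0.2, "still": 0.8},
}

print(forward_likelihood(["moving", "still", "still"], start_p, trans_p, emit_p))
```

Because the forward pass sums over every possible state path, it captures the temporal dependencies between consecutive sensor readings that per-sample classifiers ignore.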
[55] for classifying whether or not to activate devices which protect patients with chronic diseases from falling. The authors’ approach alleviated many problems present in models for chronic conditions such as relying on manual feature engineering and omitting temporal dependencies. The so-called HMM-GAN model was capable of capturing independent and sequential data from sensors which followed diverse distributions. Experimental testing of the model under both supervised and semi-supervised settings showcased increased accuracy in predicting if the protective equipment should be triggered or not. Specifically, the model’s accuracy reached 93.07% in the supervised mode and 94.82% in the semi-supervised mode. Their approach can also be used for recognizing FoG symptoms.