Graph Neural Networks for Parkinson’s Disease

Graph neural networks (GNNs) have been increasingly employed in the field of Parkinson’s disease (PD) research. The use of GNNs provides a promising approach to address the complex relationship between various clinical and non-clinical factors that contribute to the progression of PD. 

  • knowledge graphs
  • graph neural networks
  • Parkinson’s disease

1. Introduction

PD is a progressive neurodegenerative disease that affects millions of people worldwide. Despite extensive research, its precise etiology remains largely unknown, and currently available therapies have limited effectiveness in slowing or halting its progression. Recent years have seen a growing emphasis on data-driven methods, specifically machine learning (ML) and deep learning (DL), as tools for probing the mechanisms underlying PD and improving the efficacy of therapeutic interventions.
Neural networks (NNs), inspired by the human brain, learn patterns from data and are widely used in healthcare for tasks like disease diagnosis, drug discovery, and personalized treatment planning [1]. In PD research, they constitute a promising technology in predicting progression, categorizing stages, and detecting early signs [2]. However, NNs have limitations, including their ‘black box’ nature, big data demands, risks of overfitting or underfitting, and computational intensity, which can hinder interpretation, explainability, and generalization.
In contrast to traditional NNs, GNNs are particularly well-suited for graph-structured data, characterized by varying sizes and complexity of structure. GNNs excel at capturing intricate relationships within graphs, making them highly effective in tasks such as node classification, link prediction, and graph classification [3]. As a specialized form of DL, GNNs are designed explicitly for graph-structured data, commonly found in domains like social networks and molecular biology [4]. Their strength lies in their ability to incorporate both local and global structural information for more accurate predictions, enabling them to uncover intricate relationships among entities.
Whereas conventional NNs are designed for vector or sequence data, GNNs are preferred for graph-structured data because they analyze complex graphs by leveraging node and edge relationships. They are particularly useful in domains with inherently graph-based data, such as traffic analysis, social networks, and recommendation systems [5]. In medical applications, including the PD domain, GNNs play a pivotal role in tasks like gene expression analysis, disease diagnosis, drug discovery, and brain imaging data analysis. Their graph-centric approach serves as a robust tool for the analysis of complex medical data, yielding novel insights into disease diagnosis and treatment.
In the context of neurodegenerative disorders like PD, a representative recent study introduced a GNN-based method to forecast PD progression by analyzing brain connectivity networks from MRI data [6]. Notably, this GNN model can identify changes in brain connectivity patterns that are predictive of disease progression, even in the early stage of PD, offering a promising approach for early diagnosis and improved treatment strategies [7].
Additionally, another related work [8] described PD as a common neurodegenerative disorder influenced by a blend of genetic and environmental factors, which contribute to the formation of abnormal protein aggregations in specific neuron groups and ultimately result in cellular dysfunction and degeneration. The clinical diagnosis of PD often relies on careful evaluation to distinguish it from other parkinsonism-related disorders, which requires a heightened level of clinical suspicion. A range of treatment modalities, including pharmaceutical agents and surgical interventions, is now available to address both early- and late-stage complications associated with PD.
Statistical reasoning in healthcare, especially in PD research, involves data-driven inferences, aiding in risk identification, intervention assessment, and uncovering progression patterns. Symbolic reasoning in PD healthcare research uses symbols for structured problem-solving, contributing to the understanding of complex relationships among symptoms, biomarkers, and treatments, enabling personalized treatment planning. It also extracts information from patient records to cover disease progression and treatment response.
Hybrid AI combines statistical and symbolic approaches, blending data-driven and knowledge-driven methods [9]. This blending addresses complex problems by leveraging the strengths of both approaches [10]. For example, probabilistic soft logic (PSL) unifies statistical and symbolic reasoning in social network analysis and natural language processing, and inductive logic programming (ILP) extracts symbolic rules from data for application in bioinformatics and expert systems [11].
The application of GNNs in PD research is evolving rapidly, and recent years have seen growing interest in using them to explore PD’s underlying mechanisms. A key strength of GNNs in this setting is their ability to handle large datasets from varied sources, including patient-reported outcomes, imaging data, and electronic health records (EHRs). By integrating these datasets into a knowledge graph, GNNs can reveal intricate patterns and relationships in the data, providing insights into PD’s fundamental mechanisms. Advances have also emerged in using GNNs for PD diagnosis, including the use of DL models to analyze imaging data and predict disease progression, with promising results in PD detection and prognosis. These approaches have the potential to improve patient well-being and reduce the healthcare burden of PD. In summary, the field continues to evolve, with new approaches frequently reported, and it represents a promising area for future research.

2. Graph Convolutional Networks (GCNs)

Graph convolutional networks (GCNs) represent a specific category of GNNs tailored for the analysis and manipulation of graph-structured data. In the context of PD research, characterized by intricate relationships between diverse data sources like patient-reported outcomes, imaging data, and EHRs, GCNs emerge as a fitting choice. Within GCNs, the graph’s nodes serve as data points, while edges denote the connections between these points. Through the application of convolutional filters on these graph structures, GCNs efficiently process and scrutinize data, accounting for the interdependencies among different data points. This unique ability empowers GCNs to unveil intricate data patterns and relationships, thereby facilitating a deeper understanding of the fundamental mechanisms underlying PD.
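The convolution over nodes and edges described above can be illustrated with a minimal sketch of one graph-convolution step, assuming the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 X W). The toy graph, the feature values, and the fixed weight are hypothetical and chosen only for illustration; they do not come from any of the cited PD studies.

```python
import math

def gcn_layer(adj, feats, weight):
    """One simplified graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so every node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization keeps high-degree nodes from dominating.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    n_in, n_out = len(weight), len(weight[0])
    # Aggregate features over each node's neighborhood.
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n)) for f in range(n_in)]
           for i in range(n)]
    # Apply the (normally learned) linear filter, then ReLU.
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(n_in)))
             for o in range(n_out)] for i in range(n)]

# Hypothetical toy graph: three nodes in a path 0 - 1 - 2, one feature per node.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0], [0.0], [0.0]]
weight = [[1.0]]  # fixed for illustration; in practice learned by gradient descent
out = gcn_layer(adj, feats, weight)
```

After one step, the feature of node 0 has spread to its neighbor node 1 but not to node 2, which is two hops away; stacking layers widens the neighborhood each node can see.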
A notable advantage of GCNs in PD research lies in their capacity to handle substantial volumes of data from varied sources, encompassing patient-reported outcomes, imaging data, and EHRs. This integration of data into a knowledge graph enables GCNs to unveil complex data patterns and relationships, offering valuable insights into the underlying aspects of PD. Recent advancements in applying GCNs to PD diagnosis have emerged, notably in deploying DL models to scrutinize imaging data and forecast disease progression. These approaches exhibit considerable promise, characterized by robust accuracy rates in detecting PD and predicting its progression. Such developments signify the potential to enhance patient outcomes and alleviate the strain that PD places on healthcare systems.
One study introduced a multi-view graph convolutional network (MV-GCN) for predictive tasks related to PD [40]. MV-GCN takes multiple brain graphs, each offering a different view, as input to improve prediction accuracy. The method was validated on real-world data from the Parkinson’s Progression Markers Initiative (PPMI), which tracks disease progression in patients. Its effectiveness was assessed on predicting pairwise matching relationships in the context of PD, and the results indicate promising performance on this task.
Within the realm of skeleton-based action recognition, various studies have embraced diverse strategies to enhance the performance of recognition models. Some have incorporated attention mechanisms to emphasize discriminative joints within each frame, while others have employed a spatial–temporal graph convolutional network (ST-GCN) as the foundational framework [41]. An ST-GCN effectively captures both spatial and temporal information by introducing graph convolution for spatial features and conventional convolution for dynamic temporal information [42]. To further elevate performance, certain studies have proposed enhancements to the ST-GCN, such as cross-domain spatial residual layers and dense connection blocks. These innovations effectively handle spatial–temporal information and enhance feature robustness, respectively [43,44]. Additionally, another study introduced variable temporal dense blocks with varying kernel sizes to extract temporal features across different ranges [45].
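The division of labor in an ST-GCN, graph convolution across joints followed by ordinary convolution across frames, can be sketched as follows. This is a deliberately simplified illustration: it assumes one scalar feature per joint, an unweighted neighborhood mean in place of a learned graph convolution, and a fixed temporal kernel. The skeleton and values are hypothetical.

```python
def spatial_step(adj, frame):
    """Average each joint's feature with those of its connected joints."""
    n = len(adj)
    out = []
    for i in range(n):
        nbrs = [i] + [j for j in range(n) if adj[i][j]]
        out.append(sum(frame[j] for j in nbrs) / len(nbrs))
    return out

def temporal_conv(seq, kernel):
    """Slide a 1D kernel along the time axis, independently for each joint."""
    k, n = len(kernel), len(seq[0])
    return [[sum(kernel[d] * seq[t + d][i] for d in range(k)) for i in range(n)]
            for t in range(len(seq) - k + 1)]

def st_block(adj, seq, kernel):
    """Spatial aggregation per frame, then temporal convolution across frames."""
    return temporal_conv([spatial_step(adj, frame) for frame in seq], kernel)

# Hypothetical skeleton of three joints in a chain, observed over four frames.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
seq = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
out = st_block(adj, seq, kernel=[0.5, 0.5])
```

With a kernel of length 2 and no padding, the four input frames yield three output frames.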
In [46], the authors present a crow-search-algorithm-based decision tree (CSADT) model for the early diagnosis of PD. The method was tested on four PD datasets: meander, spiral, voice, and speech-Sakar. Its main components are data normalization, generation of new candidate locations through the crow search algorithm, sub-feature selection using the sigmoid function, and decision tree analysis. The CSADT model reportedly achieved close to 100% accuracy with swift diagnosis, and it outperformed other machine learning algorithms in terms of accuracy, precision, recall, and F1 score, making it a promising tool for early PD detection.
In [47], ML and DL techniques were employed to identify blood-based biomarkers for Alzheimer’s and Parkinson’s diseases, with a specific focus on the promising performance of convolutional neural networks (CNNs) for biomarker identification and disease detection, which holds potential for early diagnosis and clinical trial screening.
In recent years, the field of neuroimage analysis has witnessed the development of numerous data mining techniques, with a particular surge in the popularity of DL models, attributed to their accomplishments in diverse computer vision applications [48]. An illustrative example is the work by Ktena et al. [49], which introduced a metric learning approach aimed at distinguishing between cases and controls in autism research. This innovative method involves the construction of a graph representing patients’ brain networks in regions of interest (ROIs) utilizing GCN. It leverages this graph to extract features from patients’ neuroimages.

3. Graph Attention Networks (GATs)

GATs, a subset of GNNs, integrate attention mechanisms to evaluate the significance of connections within a graph. They are well suited to analyzing graph-structured data in PD research, where intricate relationships among data sources, such as patient-reported outcomes, imaging data, and EHRs, can be represented as a graph. In a GAT, each node assigns an attention weight to each of its neighbors based on their features. These weights determine how much each neighbor contributes when information flows between nodes. Consequently, GATs can dynamically assess the importance of different connections within the graph and focus on critical information during data processing.
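To make the attention weighting concrete, the following minimal sketch computes attention coefficients in the style of the standard GAT formulation: a score LeakyReLU(a_src·h_i + a_dst·h_j) for each connected pair, normalized with a softmax over each node's neighborhood. The learned feature projection is omitted for brevity, and the vectors `a_src` and `a_dst`, like the toy graph, are hypothetical values rather than trained parameters.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_weights(adj, feats, a_src, a_dst):
    """Softmax-normalized attention of each node over itself and its neighbors."""
    n = len(adj)
    weights = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j] or i == j]
        scores = [leaky_relu(dot(a_src, feats[i]) + dot(a_dst, feats[j])) for j in nbrs]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [0.0] * n
        for j, e in zip(nbrs, exps):
            row[j] = e / total
        weights.append(row)
    return weights

def gat_layer(adj, feats, a_src, a_dst):
    """Attention-weighted sum of neighbor features for every node."""
    alpha = attention_weights(adj, feats, a_src, a_dst)
    n, d = len(feats), len(feats[0])
    return [[sum(alpha[i][j] * feats[j][f] for j in range(n)) for f in range(d)]
            for i in range(n)]

# Hypothetical three-node path graph with two features per node.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alpha = attention_weights(adj, feats, a_src=[0.5, -0.5], a_dst=[1.0, 0.0])
out = gat_layer(adj, feats, a_src=[0.5, -0.5], a_dst=[1.0, 0.0])
```

Each row of `alpha` sums to 1, and pairs of nodes that are not connected receive zero weight, so information only flows along existing edges.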
A notable strength of GATs in PD research lies in their capacity to manage extensive and intricate graphs featuring numerous nodes and connections. Leveraging attention mechanisms to adaptively weigh connections, GATs effectively filter out extraneous or irrelevant data and prioritize essential information during data processing. Recent advancements in the application of GATs for PD diagnosis have emerged, notably utilizing DL models for the analysis of imaging data and prognosis of disease progression. These approaches have exhibited promising outcomes, with high reported accuracy in PD detection and prognostication, which could significantly enhance patient outcomes and alleviate the burden of PD on healthcare systems.
Additionally, researchers have explored the utilization of GCNs in the concurrent analysis of structural and functional MRI data to classify autism. In one study (Arya et al. [50]), relational data between nodes were extracted from T1w structural metrics, while functional brain summaries were derived from fMRI data, and subsequently employed within a GCN model. Another study (Dsouza et al. [51]) introduced a multimodal GCN (M-GCN) framework for predicting phenotypic measures, amalgamating inputs from functional connectivity (FC) and subject-specific structural connectomes. Moreover, the GAT model has been explored for its potential interpretability in predicting phenotypic measures within a bipolar dataset (Yang et al. [52]), using the FC matrix as the graph and an anatomical and statistical FC feature set. These endeavors showcase the ongoing exploration of graph-based NN models to advance our understanding of neurological conditions like PD and autism while potentially enhancing patient care.
Another study proposes a deep multi-modal fusion model (DMFM) that uses GAT to capture spatial dependencies: the input graphs are modeled with GAT, which incorporates these dependencies directly. To model the spatiotemporal correlation, convolutional long short-term memory (ConvLSTM) and an attention mechanism are integrated into a temporal attention mechanism (TAM), which combines global context information with importance weights for adjacent time steps [53]. Finally, a prediction module produces the final prediction.

4. Graph Recurrent Networks (GRNs)

Graph recurrent networks (GRNs) represent a subtype of GNNs that incorporate recurrent connections, facilitating the modeling of dynamic changes in graph-structured data. Within the context of PD research, where alterations in patient symptoms, imaging data, and EHRs unfold over time and can be represented as a dynamic graph, GRNs prove to be well-suited for processing such time-series data.
In GRNs, every node within a graph maintains connections with itself across multiple time steps, enabling the network to effectively capture the temporal evolution of the graph. This characteristic is particularly advantageous when dealing with time-series data in the PD domain, where shifts in patient symptoms and clinical indicators can be aptly characterized as dynamic graph structures [54].
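The idea of a node being connected to itself across time steps can be sketched with a simple recurrent update in which each node's hidden state depends on its own previous state, its neighbors' previous states, and its current input. This is a minimal illustration under stated assumptions: scalar states, fixed hypothetical weights `w_self`, `w_nbr`, and `w_in`, and a tanh nonlinearity; practical GRNs typically use gated units with learned parameters.

```python
import math

def grn_step(adj, hidden, inputs, w_self=0.5, w_nbr=0.3, w_in=1.0):
    """Update every node from its own past state, its neighbors, and new input."""
    n = len(adj)
    new_hidden = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]]
        nbr_mean = sum(hidden[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        new_hidden.append(math.tanh(w_self * hidden[i] + w_nbr * nbr_mean + w_in * inputs[i]))
    return new_hidden

def run_grn(snapshots):
    """Process a dynamic graph given as a list of (adjacency, node inputs) pairs."""
    hidden = [0.0] * len(snapshots[0][1])
    for adj, inputs in snapshots:
        hidden = grn_step(adj, hidden, inputs)
    return hidden

# Hypothetical dynamic graph: the single edge present at step 1 disappears at step 2.
snapshots = [
    ([[0, 1], [1, 0]], [1.0, 0.0]),
    ([[0, 0], [0, 0]], [0.0, 0.0]),
]
final = run_grn(snapshots)
```

Because the adjacency matrix is supplied per time step, the same loop handles graphs whose structure changes over time as well as graphs where only the node measurements change.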
A notable advantage of GRNs in the realm of PD research lies in their capacity to capture temporal dependencies among various data sources, encompassing patient-reported outcomes, imaging data, and EHRs. By effectively modeling the evolving nature of a graph over time, GRNs contribute to a more comprehensive understanding of disease progression, thereby assisting healthcare providers in gaining deeper insights into the underlying mechanisms of PD. Recent advancements in the deployment of GRNs for PD diagnosis involve the application of DL models to analyze time-series imaging data and predict disease progression. These endeavors have yielded promising outcomes, characterized by high accuracy rates in PD detection and prognosis, suggesting the potential to enhance patient outcomes and alleviate the burden of PD on healthcare systems. In summation, GRNs represent a robust tool for processing and analyzing time-series data within the scope of PD research. Their utility holds substantial promise for advancing our comprehension of this intricate condition and, consequently, for improving patient outcomes [37].

5. Graph Transformer Networks (GTNs)

Graph transformer networks (GTNs) represent a subtype of GNNs that have been infused with the transformer architecture, initially conceived for sequential data like natural language. In the realm of PD research, GTNs have found utility via the adaptation of this architecture to process graph-structured data, manifesting their applicability in diverse domains. Within GTNs, the preservation of the graph structure is achieved through the employment of graph attention mechanisms. These mechanisms empower the network to assign varying degrees of importance to different nodes within the graph, a crucial attribute when dealing with the intricacies of PD research. In this context, where distinct clinical markers and patient-reported outcomes may hold differing degrees of relevance in predicting disease progression, the ability of GTNs to weigh such importance proves especially valuable.
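One common way to adapt transformer-style attention to graphs while preserving their structure is to restrict scaled dot-product self-attention to connected node pairs. The sketch below illustrates that generic idea only; it is not the architecture of any specific published GTN. It assumes that queries, keys, and values are the raw node features (learned projections are omitted), and the toy graph is hypothetical.

```python
import math

def graph_self_attention(adj, feats):
    """Scaled dot-product self-attention, masked so nodes attend only along edges."""
    n, d = len(feats), len(feats[0])
    mixed, weights = [], []
    for i in range(n):
        allowed = [j for j in range(n) if adj[i][j] or i == j]
        scores = [sum(q * k for q, k in zip(feats[i], feats[j])) / math.sqrt(d)
                  for j in allowed]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [0.0] * n
        for j, e in zip(allowed, exps):
            row[j] = e / total
        weights.append(row)
        # Each node's output is the attention-weighted mix of the allowed nodes.
        mixed.append([sum(row[j] * feats[j][f] for j in range(n)) for f in range(d)])
    return mixed, weights

# Hypothetical three-node path graph with two features per node.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed, attn = graph_self_attention(adj, feats)
```

The mask is what keeps the graph structure intact: without it, every node would attend to every other node, as in a transformer over a plain sequence.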
An inherent strength of GTNs in the domain of PD research lies in their proficiency in managing substantial and intricate graph structures. This capability is particularly pertinent when dealing with complex data sources like EHRs and imaging data. By doing so, GTNs effectively capture the intricate interconnections between various data elements, thus furnishing a more comprehensive perspective on disease progression and patient outcomes [54]. Recent strides made in applying GTNs to the diagnosis and treatment of PD have been noteworthy. These advances encompass the utilization of DL models for scrutinizing imaging data and forecasting disease progression. Impressively, these approaches have exhibited considerable potential, yielding commendable accuracy rates in the realms of PD detection and progression prediction. Such advances hold substantial promise in the realm of healthcare systems by potentially ameliorating patient outcomes and mitigating the healthcare burden induced by PD.
It is worth noting that these approaches encompass distinct optimization techniques, with the first relying on the Adam optimizer and the second opting for the L-BFGS optimizer [55]. However, due to the limitations of the baseline implementation, the use of the Adam optimizer becomes a pragmatic choice, albeit necessitating additional hyperparameter fine-tuning to yield optimal results. Despite this requirement for meticulous tuning, the outcomes achieved with the Adam optimizer surpass those achieved through fast neural style transfer [55]. This enhancement can be attributed to the intricate nature of the task involving the creation of a versatile transformer network proficient in accommodating both the style and content signals.

6. Graph Autoencoders (GAEs)

Graph autoencoders (GAEs) are GNNs built on the autoencoder architecture and are adept at processing graph-structured data, which suits the multifaceted nature of PD studies. Autoencoders are NNs trained to reconstruct their input: they encode the data into a lower-dimensional representation and then decode it to recover the original form. In a GAE, the input is a graph and the lower-dimensional representation is a graph embedding. In PD research, GAEs are valuable for deriving concise representations from intricate graph structures built from sources such as EHRs and imaging data. This enables them to capture the underlying relationships connecting diverse data sources, giving a more comprehensive understanding of disease progression and patient outcomes. GAEs also serve as tools for dimensionality reduction, an important asset in PD research, where handling high-dimensional datasets, notably imaging data, poses distinct challenges. By condensing data into lower dimensions, GAEs make it easier to identify latent patterns and relationships that might have gone undetected in the original data representation.
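The encode-then-decode cycle can be illustrated with a minimal sketch that assumes a one-step neighborhood-averaging encoder and an inner-product decoder, a common GAE design in which the probability of an edge between nodes i and j is sigmoid(z_i · z_j). The projection weights and toy data are hypothetical and untrained; a real GAE would learn the weights by minimizing the reconstruction loss.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def encode(adj, feats, weight):
    """Average features over each node's neighborhood, then project to fewer dimensions."""
    n, n_in, n_out = len(adj), len(weight), len(weight[0])
    emb = []
    for i in range(n):
        nbrs = [i] + [j for j in range(n) if adj[i][j]]
        mean = [sum(feats[j][f] for j in nbrs) / len(nbrs) for f in range(n_in)]
        emb.append([sum(mean[f] * weight[f][o] for f in range(n_in)) for o in range(n_out)])
    return emb

def decode(emb):
    """Reconstruct edge probabilities from inner products of node embeddings."""
    n = len(emb)
    return [[sigmoid(sum(a * b for a, b in zip(emb[i], emb[j]))) for j in range(n)]
            for i in range(n)]

def reconstruction_loss(adj, recon, eps=1e-12):
    """Mean binary cross-entropy between the true and reconstructed adjacency."""
    n, total = len(adj), 0.0
    for i in range(n):
        for j in range(n):
            p = min(max(recon[i][j], eps), 1.0 - eps)
            total -= math.log(p) if adj[i][j] else math.log(1.0 - p)
    return total / (n * n)

# Hypothetical graph: three nodes, three input features, two embedding dimensions.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 1.0, 0.0]]
weight = [[0.4, -0.2], [0.1, 0.3], [-0.5, 0.2]]
emb = encode(adj, feats, weight)
recon = decode(emb)
loss = reconstruction_loss(adj, recon)
```

Here `emb` is the low-dimensional graph embedding, and `loss` measures how well the adjacency structure can be recovered from it.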
Noteworthy strides have been made in employing GAEs for PD diagnosis and therapeutic interventions. These advancements encompass the utilization of DL models for dissecting imaging data and forecasting disease progression. Encouragingly, these approaches have yielded promising outcomes, harboring substantial potential for enhancing patient well-being and alleviating the healthcare burden associated with PD. In conclusion, GAEs emerge as a vital and adaptable tool within the PD research landscape. Their competence in managing graph-structured data, coupled with their ability to streamline complex datasets through dimensionality reduction, positions them as valuable assets in unraveling the complexities of PD.
Other authors have introduced DD-GCN, a method for predicting human synthetic lethality (SL) using GCNs. DD-GCN employs a dual-dropout strategy, combining coarse-grained node dropout and fine-grained edge dropout, to improve gene embeddings for SL prediction [56]. Importantly, DD-GCN trains exclusively on established SL pairs without additional data from external gene sources. Furthermore, the authors present the SLMGAE model, designed for SL prediction by considering multiple views, including protein–protein interactions (PPIs), within a gene graph. Using an autoencoder architecture, SLMGAE obtains low-dimensional graph representations for predictive purposes. Experiments confirm SLMGAE’s superior predictive performance compared to other GNN methods and matrix factorization techniques [57,58].
In [59], the authors introduced two Hessian-based semi-supervised feature selection (SSFS) frameworks, denoted as Hessian–Laplacian-based SSFS frameworks using the generalized uncorrelated constraint (HLSFSGU). These frameworks employ mixed-norm (0 < p < 1) regularization for joint sparse feature selection. The HLSFSGU framework selects features by combining Hessian and Laplacian matrices, while GAEs rely on graph-based autoencoding. In summary, GAEs are valuable for analyzing graph-structured data in PD research, offering potential benefits in understanding the disease and improving patient outcomes.

7. Graph Generative Networks (GGNs)

Graph generative networks (GGNs) represent a category of graph-based DL models designed to produce new graph instances resembling a given graph or a collection of graphs. In the context of PD, GGNs can be employed to create realistic depictions of PD symptom progression over time, leveraging available PD patient data. These generated graphs serve as a foundation for predicting future PD symptom developments or gaining insights into the fundamental biological processes driving the disease. By integrating graph-based representations and generative models, GGNs furnish a potent instrument for unraveling the intricate interplay between PD symptoms and the underlying disease pathology, as well as for crafting more efficacious and personalized PD treatments [60].
In medical image analysis, where labeled images are often scarce, data augmentation plays a pivotal role, and GGN-based techniques can provide it. In this setting, GGNs constitute a generative framework comprising two networks: a generator network responsible for producing synthetic data and a discriminator network tasked with distinguishing real data from synthetic counterparts.
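The generator and discriminator described above form the adversarial arrangement popularized by generative adversarial networks. As a minimal sketch of how the two networks are pitted against each other, the functions below compute the standard adversarial losses from discriminator scores, where each score is the discriminator's estimated probability that a sample is real. The non-saturating generator loss is one common variant, and the example scores are hypothetical; the networks themselves are omitted.

```python
import math

def discriminator_loss(real_scores, fake_scores):
    """Low when real samples score near 1 and synthetic samples score near 0."""
    real_term = sum(math.log(s) for s in real_scores) / len(real_scores)
    fake_term = sum(math.log(1.0 - s) for s in fake_scores) / len(fake_scores)
    return -(real_term + fake_term)

def generator_loss(fake_scores):
    """Low when the discriminator mistakes synthetic samples for real ones."""
    return -sum(math.log(s) for s in fake_scores) / len(fake_scores)

# Hypothetical discriminator outputs, strictly between 0 and 1.
confident = discriminator_loss(real_scores=[0.9, 0.8], fake_scores=[0.1, 0.2])
undecided = discriminator_loss(real_scores=[0.5, 0.5], fake_scores=[0.5, 0.5])
fooled = generator_loss(fake_scores=[0.9])
caught = generator_loss(fake_scores=[0.1])
```

Training alternates between lowering the discriminator loss and lowering the generator loss, so each network improves against the other; the synthetic samples from a trained generator can then augment a small labeled dataset.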

8. Graph Reinforcement Learning Networks (GRLNs)

Graph reinforcement learning networks (GRLNs) are graph-based deep learning models that utilize reinforcement learning algorithms to make decisions based on graph-structured inputs. In the context of PD, GRLNs are valuable for making personalized treatment decisions, including optimizing medication dosages, selecting appropriate physical therapy exercises, and predicting symptom progression. By combining graph-based representations with reinforcement learning, GRLNs improve treatment choices for PD patients, leading to better outcomes. They can also be trained on large datasets to capture complex data patterns and relationships, resulting in more accurate predictions and enhanced treatment strategies.
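As an illustration of decision-making over a graph-structured input, the sketch below applies the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', a') − Q(s, a)], to an agent whose states are graph nodes and whose actions are moves along edges. The graph, reward, learning rate `alpha`, and discount `gamma` are hypothetical; GRLNs as described here would replace the lookup table with a graph-based neural network.

```python
def q_update(q, state, action, reward, next_state, next_actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Value of the best action available from the next state (0.0 if unseen).
    best_next = max((q.get((next_state, a), 0.0) for a in next_actions), default=0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

# Hypothetical graph: each node lists the nodes reachable in one move.
graph = {"s0": ["s0", "s1"], "s1": ["s0", "s1"]}
q = {("s1", "s0"): 2.0}  # one previously learned action value
new_value = q_update(q, state="s0", action="s1", reward=1.0,
                     next_state="s1", next_actions=graph["s1"])
```

The update moves the estimate for the chosen move toward the observed reward plus the discounted value of the best follow-up move, which is how rewards for good long-term decisions propagate back through the graph.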
In [61], a deep reinforcement learning network was developed to predict brain tumor locations. Using 70 post-contrast T1-weighted 2D image slices from the BraTS brain tumor imaging database, a deep Q-network (DQN) was trained to demonstrate reinforcement learning’s practical application in radiology AI. In another instance [57], researchers devised a deep-reinforcement-learning-based method for medical image semantic segmentation. The goal was to reduce human involvement in extracting medical image masks, introducing an advanced version of the deep Q-learning architecture. Although this technique showed promise in selecting optimal masks during image segmentation, the authors noted potential room for improvement in the mask extraction stage in future research. Lastly, in [60], a reinforcement-learning-based recommendation system for antihypertensive medications was proposed for patients with hypertension and type 2 diabetes. This system aimed to enhance precision medicine by integrating electronic health data and machine learning, resulting in the development of a Q-learning model.
Table 1 provides an overview of the employment of various GNN techniques across the cited related works. The table offers valuable insights into the adoption of GNNs in different studies, showcasing the diversity of techniques employed for various research objectives.
Table 1. Utilization of GNN techniques in the related work.
Table 2 characterizes the 12 algorithms and tools employed within the domain of GNNs against a set of criteria. The “Availability” criterion indicates whether the algorithm or tool is publicly accessible and can be used free of charge. The “Data Volume” criterion gauges its capability to manage large data volumes, that is, its aptitude for handling big data. The “Data Variety” criterion captures its versatility in accommodating various data types and integrating heterogeneous data sources. The “Data Velocity” criterion indicates whether it is tailored to streaming data or designed for static data processing. In the healthcare context, “data veracity” refers to the accuracy, completeness, and reliability of healthcare data, including data originating from medical imaging, electronic health records, wearable devices, and other sources. Ensuring veracity is critical because it bears on informed decision-making in patient care, treatment strategies, and healthcare policy formulation; inaccurate, incomplete, or unreliable data can lead to erroneous diagnoses, ineffective therapeutic interventions, and suboptimal patient outcomes. Lastly, the “Monitoring/Alerting” criterion assesses whether the algorithm or tool can monitor data and promptly notify patients and healthcare practitioners of any pertinent issues.
Table 2. Evaluation of related works with specific criteria.
Table 2 includes various algorithms, assessing factors like availability, scalability, data handling (volume, variety, velocity), and monitoring/alerting capabilities. For example, MV-GCN is open-source, scalable, and handles homogeneous static data, but lacks monitoring and has low velocity. ST-GCN is similar but with medium velocity. ROI-GCN is a proof-of-concept, scalable, with low velocity and no monitoring. MRI-GCN is open-source and scalable, with low velocity and no monitoring. Other algorithms share similar characteristics. Some excel with heterogeneous data (e.g., multimodal GCN), while others suit homogeneous data (e.g., GRNs for PD diagnosis). The “DMFM based on GATs” is scalable, handles heterogeneous data, and has medium velocity and veracity, but lacks monitoring/alerting. None of the listed algorithms offer monitoring/alerting functionality.

This entry is adapted from the peer-reviewed paper 10.3390/s23218936
