Classification of Sleep Stages Using Telemetry Polysomnography: Comparison
Please note this is a comparison between Version 1 by Utkarsh Lal and Version 3 by Sirius Huang.

Accurate sleep stage detection is crucial for diagnosing sleep disorders and tailoring treatment plans. Polysomnography (PSG) is considered the gold standard for sleep assessment as it captures diverse physiological signals. Recent advancements have shown that simpler machine learning models, when coupled with sophisticated feature extraction techniques, can yield accurate and reliable results comparable to those achieved by complex deep learning methods. These simpler models not only reduce the computational burden but also offer greater interpretability, a feature that is highly valued in clinical settings for both diagnostic and treatment purposes. Therefore, integrating simpler machine learning algorithms with advanced feature extraction can serve as an effective and efficient approach for sleep stage classification in research and clinical applications.

  • polysomnography
  • electroencephalography
  • electromyography
  • electrooculography
  • Explainable AI
  • Machine Learning
  • Feature Extraction
  • Fractal Dimensions
  • Entropy
  • Sleep Stage Classification
  • Sleep

1. Introduction

On average, people spend a third of their 24 h day sleeping, making it an essential physiological process that significantly impacts an individual’s overall health [1]. Understanding sleep stages and their characteristics is essential for diagnosing and treating sleep disorders, affecting millions worldwide [2,3]. Sleep can be broadly classified into two primary stages: non-rapid eye movement (NREM) and rapid eye movement (REM) sleep. NREM is further divided into three distinct stages, characterised by different levels of brain activity and physiological responses. The five distinct stages of sleep identified herein are as follows [4]:
  • Wake (W): Marked by low-amplitude, mixed-frequency brain waves, with normal muscle tone and high mentation.
  • Stage 1 (N1): First NREM stage. EEG shows low-voltage and mixed-frequency activity while eye movement and muscle activity begin to decrease.
  • Stage 2 (N2): Marked by sleep spindles and K-complexes in the EEG signal. Muscle tone, heart rate, and eye movement further slow down, and body temperature drops.
  • Stages 3 and 4 (N3/N4): Deepest stage of NREM. Crucial for physical restoration and memory consolidation. This stage is also known as the slow-wave sleep stage. EEG shows high-amplitude and low-frequency delta waves. There is minimal eye movement, and muscle tone is at its lowest during NREM sleep.
  • REM (R): Marked by rapid eye movement, dreaming, and temporary muscle paralysis to prevent physical stimulation from dreams. EEG shows low-amplitude and mixed-frequency activity, which is similar to the N1 stage.
Sleep stages N1 and N2 signify Light Sleep, while N3 and N4 signify Deep Sleep.
The accurate detection of sleep stages plays a pivotal role in both clinical and research settings. Clinically, the comprehensive characterisation of a patient’s sleep architecture, including the distribution and duration of the different sleep stages through the course of the night, is fundamental for diagnosing a diverse set of sleep disorders. For instance, disruptions in sleep stage patterns in a patient are integral diagnostic criteria for insomnia [5,6,7], sleep apnoea [8,9], and narcolepsy [10,11]. Moreover, sleep stage classification is not only essential for diagnosing these disorders but also for monitoring the efficacy of various treatments. Sleep stage detection enables clinicians to accurately assess the impact of treatments such as Continuous Positive Airway Pressure (CPAP) for sleep apnoea [12] or Cognitive Behavioural Therapy for insomnia (CBT-I) [13].
In the research setting, accurate sleep stage classification is pivotal for examining sleep’s impact on cognitive functions like learning and memory consolidation. Notably, slow-wave sleep (SWS) is crucial for memory consolidation and synaptic plasticity [14]. Clinically, sleep stage detection can aid in studying sleep’s correlations with ageing and neurodegenerative disorders like Alzheimer’s [15]. Hence, advancing sleep staging can enhance diagnostics and treatments for sleep and neurodegenerative issues.
However, the task of sleep stage detection has traditionally been performed manually by trained specialists analysing polysomnography (PSG) data. Recording these PSG data is a tedious task, wherein the subject has to stay overnight in the lab. Due to the presence of multiple modalities and long recording durations, accurately detecting sleep stages from PSG is a challenging task. On the other hand, modalities like actigraphy [16] present an easy alternative to record data that can be used to detect sleep stages using wearable devices, like wristbands or watches, that measure movement and activity levels over extended periods of time. However, the ease of recording sleep data using actigraphy presents an inherent tradeoff. Actigraphy can lack the degree of precision and reliability of the information captured, making it often less accurate than PSG [17]. Additionally, ECG and heart rate variability (HRV) can also be recorded to identify sleep stages. However, this approach does not provide any information regarding brain activity, which is a crucial aspect of sleep stage analysis [18]. Since the data’s accuracy and consistency are paramount to creating a model that can effectively differentiate between sleep stages, PSG data have been chosen as the modality in this study. With its multimodal nature, PSG captures fine physiological details, like eye and muscle movements and brain wave changes, with high temporal resolution. Furthermore, sleep stage classification with PSG adheres to standardised criteria from systems like the American Academy of Sleep Medicine (AASM) and Rechtschaffen and Kales (R&K), ensuring consistency across different studies and clinical settings.
Due to the extensive nature of polysomnography (PSG) data, which includes hours-long recordings of multiple signals such as electroencephalography (EEG), electromyography (EMG), and electrooculography (EOG), processing this vast amount of data can be challenging. The complexity arises from the numerous sources of voltage fluctuations captured within the recordings, making direct data processing quite demanding. Therefore, feature extraction methods must be employed to accurately capture the subtleties and minute variations in all of these signals in order to uniquely identify different sleep stages. 
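As a minimal sketch of the first step described above, the following Python example cuts a long single-channel recording into fixed-length epochs and reduces each epoch to a few statistical features. The 30 s epoch length, the 100 Hz sampling rate, the synthetic signal, and the function names `epoch_signal` and `basic_features` are assumptions made for illustration; they are not taken from the studies discussed here.

```python
import numpy as np

def epoch_signal(signal, fs, epoch_sec=30.0):
    """Split a 1-D signal into non-overlapping epochs; drop the incomplete tail."""
    signal = np.asarray(signal, dtype=float)
    samples_per_epoch = int(round(fs * epoch_sec))
    n_epochs = signal.size // samples_per_epoch
    return signal[: n_epochs * samples_per_epoch].reshape(n_epochs, samples_per_epoch)

def basic_features(epochs):
    """Return per-epoch mean, standard deviation, and zero-crossing rate."""
    mean = epochs.mean(axis=1)
    std = epochs.std(axis=1)
    centred = epochs - mean[:, None]
    # A zero crossing is a sign change between consecutive (mean-centred) samples.
    sign_changes = np.diff(np.signbit(centred).astype(int), axis=1) != 0
    zcr = sign_changes.mean(axis=1)
    return np.column_stack([mean, std, zcr])

# Synthetic stand-in for one PSG channel: 5 minutes sampled at 100 Hz (assumed values).
fs = 100
rng = np.random.default_rng(0)
signal = rng.standard_normal(5 * 60 * fs)

epochs = epoch_signal(signal, fs)      # one row per 30 s epoch
features = basic_features(epochs)      # one row per epoch, one column per feature
print(epochs.shape, features.shape)
```

Each epoch is reduced from thousands of samples to a handful of numbers, which is what makes the downstream classifier small and quick to train. Real pipelines would add richer features per epoch and per modality.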
Although extracting relevant features is a critical step, having a mechanism that can learn from these features and accurately distinguish between various sleep stages is equally crucial. In recent studies, deep neural networks have been widely employed for feature extraction and classification [19]. However, these neural networks can get highly complex and require longer training times than simpler statistical and machine-learning models. Additionally, neural networks are intrinsically black-box models, which makes it challenging to interpret the results yielded by the model.

2. Methods for Detecting Sleep Stages

Multiple studies have been conducted that employ different methods for detecting sleep stages, ranging from state-of-the-art machine-learning models that distinguish between sleep stages based on features extracted from PSG data to complex deep-learning models [20] designed to work independently without the need for any explicit feature extraction or dimensionality reduction techniques. In one such study, a one-dimensional convolutional neural network (CNN) was developed to detect sleep stages directly from the raw data with high accuracy [21]. Another study [22] implemented a 1-D CNN model to detect cyclic alternating patterns (CAPs) in the EEG data, achieving an accuracy of 90.46% for classifying three-class sleep stages. Studies like [23] proposed a light and efficient deep neural network model based on fractional Fourier transform (FRFT) features derived from the EEG signal, which achieved an accuracy of 81.6% in sleep stage classification. Other studies like [24] developed a novel non-contact sleep structure prediction system (NSSPS) using radio-frequency signals and a convolutional recurrent neural network. This study achieved accuracies ranging from 66 to 83% for classifying various sleep stages. One of the notable contributions to the body of research was the SleepEEGNet [25] presented in 2019. This study utilised the EEG signal and implemented a combination of a Recurrent Neural Network (RNN) sequence-to-sequence model with a CNN for classifying sleep stages. This study used the PhysioNet Sleep-EDF dataset and achieved an overall accuracy of 84.26%. The PhysioNet Sleep-EDF dataset is one of the most widely utilised datasets for sleep analysis. Multiple studies have analysed this dataset for sleep stage classification using neural networks. One such study [26] developed an automated system for sleep stage classification using a deep convolutional long short-term neural network (CNN-LSTM). 
Another similar study [27] on the same dataset employed a CNN-LSTM model for sleep stage classification using a single Fpz-Cz EEG channel and achieved an accuracy of 84.19%. Single-channel EEG is one of the most popular choices of modalities for sleep stage analysis, and multiple studies, like [28], have used it for sleep stage classification. However, the utilisation of EMG and EOG data along with EEG using a diverse spectrum of feature extraction measures, like detrended fluctuation analysis (DFA), entropy, fractal dimensions, power spectral density (PSD), and statistical measures, has not been deeply explored yet. Deep-learning models achieve high accuracy; however, such models often take a long time to train and test due to the sheer volume of PSG data, which are often recorded over long durations for each subject. In order to reduce the complexity and increase the efficiency of a predictive model, there is a need to extract the salient features from PSG data that capture the nuances and principal characteristics that mark the differences between the various sleep stages with minimal loss of information. Numerous methods have been implemented in previous research to efficiently extract features by using measures such as fractal dimensions [29], entropy measures [30], wavelet transforms [31], and power spectral density [32]. Although PSG data are one of the most widely accepted modalities for sleep analysis, recent technological developments have opened up easier avenues for the same task. With the surge of sleep-tracking devices, such as smart watches, smart rings, and other actigraphy-based devices, a new phase of sleep stage detection research is underway. Many studies have been conducted to compare the performance of the popular consumer-grade product Oura Ring with PSG data when recorded simultaneously. In one such study, sleep-onset latency (SOL), total sleep time (TST), and wake after sleep onset (WASO) were computed and compared from the recordings of both PSG and Oura Ring [33].
Multiple discrepancies were observed between the PSG and Oura Ring measurements, indicating a need for further enhancement of such consumer products in order to increase their overall accuracy. In a similar study, physiological data gathered through both the Oura Ring and polysomnography (PSG) were employed in an examination of various sleep-related factors. Specifically, the study focused on exploring the influence of peripheral signals mediated by the autonomic nervous system (ANS), circadian characteristics, and accelerometer data on sleep stage detection [34]. The Oura Ring includes a triaxial accelerometer, a negative temperature coefficient (NTC) thermistor as a temperature sensor, and an infrared photodetector that measures heart rate variability (HRV). The research indicated that combining the small size of wearable ring technology, multidimensional biological data streams, and effective artificial intelligence algorithms can result in notable precision in discerning sleep stages. Apart from wearable rings, wristwatch-type sensing devices have also been employed for sleep stage detection in previous studies. In one such study, a combination of a reflective photoelectric volume pulse sensor and a triaxial accelerometer was utilised for sleep quality assessment and compared with simultaneously recorded PSG data [35]. Pulse-to-pulse interval (PPI) and body movement indexes derived from the physiological signals recorded by the wristwatch sensor were used to develop an automated sleep stage classification system. In other studies, novel approaches like non-contact radar technology have also been implemented to accurately distinguish between sleep stages [36]. While many studies utilise multiple modalities of data and multidimensional biometric streams, as discussed previously, some studies focus on heart rate variability (HRV) for the classification of sleep stages and even other disorders.
In one such study, detrended fluctuation analysis (DFA) and spectral analysis were employed to quantify heart-related cyclical variation in order to investigate the effect of sleep stages and sleep apnoea on HRV [37]. Similarly, another study utilised the Firstbeat sleep analysis method, which is based on measurements derived from HRV and accelerometer data [38].
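Detrended fluctuation analysis appears above both as an HRV measure and as a candidate PSG feature, so a short illustration may help. The Python sketch below estimates the DFA scaling exponent with first-order (linear) detrending. The window sizes, the function name `dfa_alpha`, and the synthetic test series are assumptions for illustration only and do not reproduce the settings of any cited study.

```python
import numpy as np

def dfa_alpha(x, scales=(16, 32, 64, 128, 256)):
    """Estimate the DFA scaling exponent of a 1-D series (linear detrending)."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())            # integrated, mean-centred series
    fluctuations = []
    for n in scales:
        n_windows = profile.size // n
        windows = profile[: n_windows * n].reshape(n_windows, n)
        t = np.arange(n)
        # Fit a straight line to every window at once (one column per window).
        coeffs = np.polyfit(t, windows.T, deg=1)           # shape (2, n_windows)
        trend = np.outer(t, coeffs[0]) + coeffs[1]         # shape (n, n_windows)
        residual = windows.T - trend
        # Root-mean-square fluctuation around the local trends at this scale.
        fluctuations.append(np.sqrt(np.mean(residual ** 2)))
    # The slope of log F(n) against log n is the scaling exponent.
    alpha, _ = np.polyfit(np.log(scales), np.log(fluctuations), deg=1)
    return alpha

rng = np.random.default_rng(0)
white_noise = rng.standard_normal(4096)
random_walk = np.cumsum(white_noise)
print(dfa_alpha(white_noise), dfa_alpha(random_walk))
```

In theory, uncorrelated noise gives an exponent near 0.5 and a random walk gives one near 1.5, so the two synthetic series should fall at clearly different values. Computed per epoch or per recording segment, the exponent becomes one compact feature describing long-range correlation in the signal.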

3. The Significance of Simpler Machine Learning Models with Sophisticated Feature Extraction in Sleep Stage Classification

Pairing simpler machine learning models with advanced feature extraction methods has become an effective strategy in sleep stage classification. While deep learning algorithms like CNNs and LSTMs achieve high accuracies, they often suffer from extensive computational overhead and can be opaque in terms of interpretability. Simpler machine learning models, such as Support Vector Machines (SVMs) or Random Forests, when coupled with sophisticated feature extraction techniques, present a compelling alternative. They can yield comparable performance while providing increased interpretability, which is pivotal for clinical neuroscience applications.

Balanced Efficiency and Accuracy

One of the most notable benefits is the balance of efficiency and accuracy that these models offer. Traditional polysomnography (PSG) data are complex and abundant, requiring significant computational resources for data analysis. In a clinical setting, where time is often of the essence, the quick and accurate categorization of sleep stages can be crucial for immediate intervention or diagnosis. Feature extraction methods like wavelet transform, fractal dimensions, or entropy measures streamline the essential characteristics of the data, facilitating faster training and prediction times with simpler machine learning models.
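To illustrate how feature extraction condenses an epoch, the sketch below computes relative EEG band powers and a normalised spectral entropy from a simple windowed periodogram. Spectral entropy is used here as one example of an entropy measure. The band edges, the 0.5 to 30 Hz analysis range, and the function name `spectral_features` are assumptions, since band definitions vary between studies.

```python
import numpy as np

# Conventional EEG band edges in Hz; exact limits differ between studies (assumption).
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}

def spectral_features(epoch, fs):
    """Relative band powers and normalised spectral entropy for one epoch."""
    epoch = np.asarray(epoch, dtype=float)
    epoch = epoch - epoch.mean()
    window = np.hanning(epoch.size)
    power = np.abs(np.fft.rfft(epoch * window)) ** 2      # simple periodogram
    freqs = np.fft.rfftfreq(epoch.size, d=1.0 / fs)
    in_range = (freqs >= 0.5) & (freqs < 30.0)
    total = power[in_range].sum()
    feats = {}
    for name, (lo, hi) in BANDS.items():
        band = (freqs >= lo) & (freqs < hi)
        feats[name] = float(power[band].sum() / total)    # share of 0.5-30 Hz power
    # Shannon entropy of the normalised spectrum, scaled to lie between 0 and 1.
    p = power[in_range] / total
    p = p[p > 0]
    feats["spectral_entropy"] = float(-(p * np.log(p)).sum() / np.log(in_range.sum()))
    return feats

# A 2 Hz sine stands in for slow-wave activity; white noise for a broadband epoch.
fs = 100
t = np.arange(30 * fs) / fs
slow_wave_like = spectral_features(np.sin(2 * np.pi * 2.0 * t), fs)
broadband = spectral_features(np.random.default_rng(0).standard_normal(t.size), fs)
print(slow_wave_like)
print(broadband)
```

The slow oscillation concentrates its power in the delta band and has low spectral entropy, while the broadband epoch spreads power across bands and has high entropy. Five numbers per epoch replace several thousand raw samples, which is the source of the faster training and prediction times noted above.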

Interpretability and Transparency

The "black-box" nature of deep learning models has been a point of concern in medical applications, where understanding the reasoning behind a classification can be as important as the classification itself. In contrast, simpler models are often easier to interpret. This transparency is invaluable in clinical settings for both validating the model and for clinicians to trust the machine's output when making diagnostic or treatment decisions. Moreover, understanding which features are most indicative of specific sleep stages can also contribute to our broader scientific understanding of sleep architecture and its physiological correlates.
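One way to see which features drive a simple model is permutation importance: shuffle one feature column at a time and measure how much accuracy drops. The sketch below is illustrative only. A nearest-centroid classifier stands in for an SVM or Random Forest to keep the example dependency-light, and the synthetic two-class data, in which only the first feature is informative, is invented for demonstration.

```python
import numpy as np

def fit_centroids(X, y):
    """Nearest-centroid 'training': store the mean feature vector of each class."""
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(model, X):
    """Assign each row to the class whose centroid is closest (Euclidean distance)."""
    classes, centroids = model
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def permutation_importance(model, X, y, rng, n_repeats=10):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    baseline = np.mean(predict(model, X) == y)
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j] += baseline - np.mean(predict(model, X_perm) == y)
    return drops / n_repeats

# Synthetic data: two classes, three features, only feature 0 separates the classes.
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n)
X = rng.standard_normal((n, 3))
X[:, 0] += 3.0 * y

model = fit_centroids(X[:200], y[:200])                      # train on the first half
importance = permutation_importance(model, X[200:], y[200:], rng)  # score on the rest
print(importance)
```

The informative feature should show a large accuracy drop when shuffled, while the noise features stay near zero. With real sleep features, the same ranking would indicate which physiological measures the model relies on for each decision.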

Flexibility in Data Integration

Advanced feature extraction techniques allow for the integration of data from multiple sources or modalities. For example, while EEG is commonly used, incorporating additional metrics like electromyography (EMG) or electrooculography (EOG) can provide a more comprehensive understanding of sleep stages. Sophisticated feature extraction can harmonize these disparate data types, making it easier to incorporate them into simpler machine learning models. This results in more robust and accurate classification systems that can be tailored to meet the specific needs of different clinical scenarios.
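A common, simple way to combine modalities is to standardise each modality's per-epoch features and concatenate them into one feature vector per epoch. The sketch below assumes that features have already been extracted for the same epochs from EEG, EMG, and EOG. The function names `zscore` and `fuse_modalities`, the feature counts, and the synthetic values are hypothetical.

```python
import numpy as np

def zscore(features):
    """Standardise each feature column to zero mean and unit variance."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0            # avoid division by zero for constant columns
    return (features - mean) / std

def fuse_modalities(per_modality):
    """Concatenate standardised per-epoch feature matrices from several modalities."""
    n_epochs = {m.shape[0] for m in per_modality.values()}
    if len(n_epochs) != 1:
        raise ValueError("all modalities must cover the same epochs")
    names = sorted(per_modality)   # fixed column order regardless of dict order
    return names, np.hstack([zscore(per_modality[k]) for k in names])

# Synthetic feature matrices on very different scales (20 epochs each).
rng = np.random.default_rng(0)
per_modality = {
    "eeg": rng.normal(50.0, 20.0, size=(20, 4)),
    "emg": rng.normal(0.1, 0.01, size=(20, 2)),
    "eog": rng.normal(5.0, 3.0, size=(20, 3)),
}
names, fused = fuse_modalities(per_modality)
print(names, fused.shape)
```

Standardising first keeps a modality with large numeric values from dominating distance-based or margin-based models. In practice, the means and standard deviations would be estimated on training data only and then reused for unseen recordings.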

Resource-Efficiency

Another critical advantage is resource-efficiency. Deep learning algorithms often require high-end computational hardware, which may not be accessible in many healthcare settings, particularly in low-resource environments. In contrast, simpler machine learning algorithms, coupled with feature extraction techniques, can run on less powerful machines without compromising the quality of sleep stage classification.

Applications in Neurodegenerative Disorders and Aging

Advanced feature extraction can also help in investigating the relationship between sleep patterns and neurodegenerative diseases like Alzheimer’s and Parkinson’s. By identifying specific EEG markers or other physiological indicators through feature extraction, simpler models can be used for early detection and monitoring of these disorders. This is of paramount importance in clinical neuroscience, opening the door for preventive strategies and new avenues for treatment.

4. Closing Remarks

In summary, the combination of simpler machine learning models with advanced feature extraction methods offers a balanced, interpretable, and resource-efficient approach for sleep stage classification. These methodologies are not only vital for immediate clinical applications but also offer broader insights that could be seminal in the study of sleep’s impact on various cognitive functions, aging, and neurodegenerative diseases. As our understanding of sleep continues to grow, these techniques are poised to play a crucial role in the evolution of both clinical neuroscience and sleep medicine.
