2. Electrooculography (EOG)
Eye-movement detection is the process of measuring where and how the eyes move. By analyzing saccades, fixations, and smooth pursuit, eye movements provide important information regarding the attention, perception, and cognitive processes of participants. Eye-movement detection can be applied to the study of visual and auditory processing, learning and memory, and other behavioral aspects of human performance. It can be achieved using several techniques, including electrooculography (EOG), infrared (IR) eye tracking, video-based eye tracking, and dual-purpose tracking.
EOG is a non-invasive method that measures the electrical potential difference between the cornea and retina to detect eye movements. This difference arises because the cornea is positively charged with respect to the retina, and the overall potential difference changes when the eye moves. The electrical signal generated by eye movements can be used to detect the direction and magnitude of eye movements and can also be applied to study the underlying physiological processes of eye-movement control. EOG is widely applied in psychology, neuroscience, and ophthalmology to study visual perception, attention, and cognitive processes. Moreover, EOG can be applied to human–computer interaction, where eye movements control the computer cursor or interact with graphical interfaces.
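As an illustrative sketch (not taken from the cited studies), the roughly linear relationship between horizontal EOG amplitude and gaze angle can be used to convert a recorded signal into gaze angles and to flag saccades with a simple velocity threshold. The sensitivity constant below is an assumed, typical order of magnitude; real systems calibrate it per participant:

```python
# Hypothetical sketch: convert horizontal EOG samples (microvolts) to gaze
# angles and detect saccades with a velocity threshold. The sensitivity of
# ~20 uV per degree is an assumed, typical order of magnitude, not a
# calibrated constant.

SENSITIVITY_UV_PER_DEG = 20.0  # assumed calibration constant

def eog_to_angles(samples_uv, baseline_uv=0.0):
    """Map EOG voltages to horizontal gaze angles in degrees."""
    return [(v - baseline_uv) / SENSITIVITY_UV_PER_DEG for v in samples_uv]

def detect_saccades(angles_deg, velocity_thresh_deg=5.0):
    """Return sample indices where the angle jumps faster than the threshold."""
    return [i for i in range(1, len(angles_deg))
            if abs(angles_deg[i] - angles_deg[i - 1]) > velocity_thresh_deg]

signal_uv = [0.0, 0.0, 200.0, 400.0, 400.0]   # fixation, saccade, fixation
angles = eog_to_angles(signal_uv)              # [0.0, 0.0, 10.0, 20.0, 20.0]
saccade_idx = detect_saccades(angles)          # [2, 3]
```

In practice, the baseline and sensitivity drift over a recording session, which is one of the EOG limitations mentioned above; periodic recalibration is therefore common.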
A few studies [12,13,14] have focused on the EOG technique and on eye-movement-related research. In [12], an overview of the history, basic principles, and applications of EOG is provided, together with some limitations of the EOG technique for eye-movement measurement. In [13], the development of a new EOG recording system and its application to the study of eye movements are described. In [14], a new low-noise amplifier for EOG signals is presented, and its performance is compared with that of other amplifier designs commonly used for EOG signals in terms of noise, bandwidth, and distortion. The experimental results indicated that the proposed amplifier outperformed existing designs, with lower noise, a wider bandwidth, and lower distortion. The origin of EOG as a technique for measuring eye movements can be traced back to several pioneering studies. In 1950, Harold Hoffman first developed the EOG system and used it to study eye movements and visual perception. Steiner extended Hoffman’s work and developed a more sophisticated EOG system capable of measuring either horizontal or vertical movements. In the 1960s, Bowen et al. developed a mathematical model of EOG that enabled a more accurate and detailed description of the electrical signals generated by eye movements. Currently, EOG is widely used in several fields and plays a crucial role in advancing our knowledge of visual perception, attention, and cognitive processes.
3. Infrared (IR) Eye Tracking
IR eye-tracking techniques use an infrared light source to illuminate the eyes and a camera to capture the reflection of that light in the eyes. IR eye tracking is widely used in fields such as psychology, HCI, market research, and gaming because it is non-intrusive and can be used with a wide range of subjects, including those wearing glasses or contact lenses [15].
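The pupil-center/corneal-reflection (PCCR) principle underlying many IR trackers can be sketched as follows: the vector from the IR glint to the pupil center varies roughly linearly with gaze direction, so a per-axis linear map calibrated from two known screen points suffices for a minimal demonstration. All coordinate values below are made up for illustration:

```python
# Minimal PCCR-style sketch (illustrative values, not a real tracker):
# the glint-to-pupil vector is mapped to screen coordinates using a
# per-axis linear fit calibrated from two known screen points.

def calibrate_axis(v0, s0, v1, s1):
    """Fit screen = a*v + b from two calibration samples."""
    a = (s1 - s0) / (v1 - v0)
    b = s0 - a * v0
    return a, b

def gaze_point(pupil, glint, cal_x, cal_y):
    """Map the pupil-minus-glint vector to screen coordinates."""
    vx, vy = pupil[0] - glint[0], pupil[1] - glint[1]
    ax, bx = cal_x
    ay, by = cal_y
    return ax * vx + bx, ay * vy + by

# Calibration: the user looks at screen x = 0 and x = 1920 (and y = 0, 1080)
cal_x = calibrate_axis(-10.0, 0.0, 10.0, 1920.0)
cal_y = calibrate_axis(-6.0, 0.0, 6.0, 1080.0)

# A pupil/glint pair halfway between the calibration extremes maps to the
# screen centre
x, y = gaze_point(pupil=(0.0, 0.0), glint=(0.0, 0.0), cal_x=cal_x, cal_y=cal_y)
# -> (960.0, 540.0)
```

Real trackers use more calibration points and a higher-order mapping to absorb head movement, but the glint-vector idea is the same.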
4. Video-Based Eye Tracking
In computer-vision-based eye tracking, eye detection and localization from an input image are the main tasks. Both are challenging owing to issues such as the degree of eye openness and the variability of eye sizes across target subjects. Computer-vision-based methods use video cameras to capture images of the eyes and analyze them to determine eye movements; the cameras can track eye movements in real time or in laboratory settings. The basic idea is to use image-processing techniques to detect and track the position of the pupils in a video stream and then use this information to infer the gaze direction. Generally, there are two types of video-based eye-tracking systems: remote and wearable. In remote tracking, a camera placed at a certain distance from the participant records their eye movements. In wearable tracking, a camera attached to a headset or glasses worn by the participant records the eye movements from much closer, which yields a more complete picture of gaze behavior. In recent years, several researchers [14,15,16,17,18] have discussed video-based eye tracking by analyzing eye recordings. In
[16], a method for real-time video-based eye tracking was proposed that used deep learning, combining CNNs and RNNs to accurately track eye positions in real time. The authors evaluated their system on a publicly available dataset; the experimental results showed that the proposed method outperformed traditional approaches in terms of accuracy, speed, and robustness. Eye-tracking pattern recognition has achieved remarkable results. The authors of
[19] proposed a principal component analysis (PCA) model to identify six principal components, which were used to reduce the dimensionality of the image pixels. The authors used an ANN to classify pupil positions; calibration required the participant to observe five different points, each representing a different pupil position. Lui et al.
[20] presented both eye-detection and eye-tracking methods. They applied the Viola–Jones face detector with Haar features to locate the face in an input image and used template matching (TM) for eye detection. TM is widely used in computer vision because it can be applied to object recognition, image registration, and motion analysis. For eye detection, a similarity measure was computed based on metrics such as cross-correlation, mean-squared error, and normalized correlation. The authors of
[20] also used other methods for eye detection and tracking, such as Zernike moments (ZMs) to extract rotation-invariant eye characteristics and an SVM for eye/non-eye classification.
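As a toy illustration of the template-matching step (not the implementation of the cited work), a normalized cross-correlation score can be computed between an eye template and every candidate window of a grayscale image, taking the best-scoring location as the detection:

```python
# Toy template matching by normalized cross-correlation (NCC) on a tiny
# grayscale image, illustrating the similarity-measure idea; real systems
# use optimized library routines (e.g., OpenCV's matchTemplate).

import math

def ncc(window, template):
    """Normalized cross-correlation between two equal-sized flattened patches."""
    n = len(window)
    mw = sum(window) / n
    mt = sum(template) / n
    num = sum((w - mw) * (t - mt) for w, t in zip(window, template))
    den = math.sqrt(sum((w - mw) ** 2 for w in window) *
                    sum((t - mt) ** 2 for t in template))
    return num / den if den else 0.0

def match(image, template, th, tw):
    """Slide the template over the image; return the best (row, col)."""
    ih, iw = len(image), len(image[0])
    best, best_pos = -2.0, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            score = ncc(window, template)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

# A dark "pupil" blob at row 1, col 2 of a bright background
image = [[200, 200, 200, 200, 200],
         [200, 200,  40,  50, 200],
         [200, 200,  45,  35, 200],
         [200, 200, 200, 200, 200]]
template = [40, 50, 45, 35]  # flattened 2x2 dark patch
print(match(image, template, 2, 2))  # -> (1, 2)
```

The exhaustive sliding-window search is O(image size × template size); production implementations accelerate it with FFT-based correlation or image pyramids.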
5. Dual-Purpose Tracking
This method combines infrared and video-based tracking techniques to improve the accuracy of eye-movement detection. Infrared tracking provides highly accurate eye positions, whereas video-based tracking provides additional information regarding eye appearance and motion. Huang et al.
[21] presented an algorithm to detect eye pupils based on eye intensity, size, and shape. In this study, the intensity of the eye pupil was used as the main feature for pupil detection, and an SVM identified the location of the eye. For pupil fitting, corneal-reflection and energy-controlled iterative curve-fitting methods are efficient approaches, as reported by Li and Wee
[22]. For pupil boundary detection, an ellipse-fitting algorithm controlled by an energy function can be used. The ellipse-fitting task is to find the best fit for the given data points by minimizing the sum of squared distances between the input points and the fitted curve.
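The least-squares fitting idea can be illustrated with the circular special case of the ellipse: the Kåsa method linearizes the problem so that the squared-distance objective reduces to a 3×3 linear system. This is a simplified stand-in for the energy-controlled ellipse fit described above, not the cited algorithm itself:

```python
# Simplified sketch of least-squares boundary fitting: the Kasa circle fit
# (a circle being a special case of the ellipse) solves a 3x3 linear system
# for x^2 + y^2 + a*x + b*y + c = 0, from which center and radius follow.

import math

def solve3(m, v):
    """Gaussian elimination with partial pivoting for a 3x3 system."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(i + 1, 3):
            f = a[r][i] / a[i][i]
            for col in range(i, 4):
                a[r][col] -= f * a[i][col]
    x = [0.0] * 3
    for i in (2, 1, 0):
        x[i] = (a[i][3] - sum(a[i][j] * x[j] for j in range(i + 1, 3))) / a[i][i]
    return x

def fit_circle(points):
    """Least-squares circle through (x, y) points via the Kasa method."""
    # Normal equations for minimizing sum((x^2 + y^2 + a*x + b*y + c)^2)
    sxx = sum(x * x for x, _ in points); syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sx = sum(x for x, _ in points); sy = sum(y for _, y in points)
    n = len(points)
    z = [x * x + y * y for x, y in points]
    szx = sum(zi * x for zi, (x, _) in zip(z, points))
    szy = sum(zi * y for zi, (_, y) in zip(z, points))
    a, b, c = solve3([[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]],
                     [-szx, -szy, -sum(z)])
    cx, cy = -a / 2.0, -b / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - c)

# Points sampled exactly on a circle centered at (2, 1) with radius 3
pts = [(5.0, 1.0), (2.0, 4.0), (-1.0, 1.0), (2.0, -2.0)]
cx, cy, r = fit_circle(pts)  # -> approximately (2.0, 1.0, 3.0)
```

A full ellipse fit adds the cross and quadratic terms to the design matrix (five unknowns instead of three), but the minimize-squared-residuals structure is identical.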
6. Yawning-Based Drowsiness Detection
Yawning is a physical reflex, lasting 4–7 s, consisting of gradual mouth gaping followed by a rapid expiratory phase with muscle relaxation [23]. As a natural physiological response to tiredness, it is widely used in research to identify drowsy drivers. The authors of
[24] proposed a method based on tracking the condition of the driver’s mouth and recognizing the yawning state. They implemented a cascade of boosted classifiers over Haar wavelet features at several different scales, with feature positions measured using Canny integral images. The AdaBoost algorithm was used for feature selection and localization. To determine the yawning condition of the driver, an SVM was applied to predict on the test data instances. The SVM was trained on mouth- and yawning-related images; the data were transformed and scaled, and a radial basis function kernel was applied. In
[25], the researchers proposed a yawning-detection approach that included measuring the eye-closure duration. They calculated the eye state and yawning measurements based on the mouth condition. For mouth detection, [25] used a spatial fuzzy c-means (s-FCM) clustering method.
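To illustrate the clustering step, a minimal standard fuzzy c-means can be written in a few lines; the spatial variant (s-FCM) used in [25] additionally weights each pixel's membership by those of its neighbors, which is omitted here. The 1-D data below are made-up intensity values:

```python
# Minimal standard fuzzy c-means (FCM) sketch on 1-D "intensity" values.
# The spatial variant (s-FCM) used for mouth detection adds a neighborhood
# term to the membership update; that term is omitted in this sketch.

def fcm(data, centers, m=2.0, iters=50):
    """Alternate membership/center updates; return the final centers."""
    for _ in range(iters):
        # Membership of point x in cluster i: inverse-distance weighting
        u = []
        for x in data:
            d = [abs(x - c) or 1e-12 for c in centers]
            row = [1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0))
                             for j in range(len(centers)))
                   for i in range(len(centers))]
            u.append(row)
        # Center update: mean of the data weighted by membership^m
        centers = [sum(u[k][i] ** m * data[k] for k in range(len(data))) /
                   sum(u[k][i] ** m for k in range(len(data)))
                   for i in range(len(centers))]
    return centers

# Two well-separated 1-D clusters; initialize centers at the data extremes
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
c0, c1 = fcm(data, centers=[min(data), max(data)])
# -> c0 near 0.1 and c1 near 5.1
```

In the mouth-detection setting, the data points are pixel intensities and the two clusters separate lip/mouth-cavity pixels from skin pixels; the spatial term in s-FCM suppresses isolated misclassified pixels.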
The authors also applied an SVM for drowsiness detection; the inputs to the SVM were the width-to-height ratios of the eyes and mouth. Calculations were based on the state of the driver’s eyes and mouth, such as whether the eyes were half-open or closed and whether the driver was yawning. The final classification results were used to determine whether the driver was in a dangerous condition. Ying et al.
[26] determined driver fatigue or drowsiness by monitoring variations in eye and mouth positions. This method relied on skin-color extraction. To determine the state of the moving object, a back-propagation (BP) neural network was required to recognize the object’s position [27]. A similar approach by Wang et al.
[28] located the mouth region via multi-threshold binarization in intensity space, using a Gaussian model over the RGB color space. By applying integral projection of the mouth region in the vertical direction to the lip corners, the lower and upper lip boundaries were identified to measure the openness of the mouth. The yawning state was then determined by the degree of mouth opening, expressed as the aspect ratio of the mouth bounding rectangle. When this ratio indicated a widely open mouth above a predefined threshold for a continuous number of frames, the driver was classified as drowsy.
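The final decision rule described above can be sketched as a simple counter over frames; the threshold and frame count below are arbitrary illustrative values, not taken from the cited work:

```python
# Sketch of the frame-counting decision rule: flag drowsiness when the
# mouth-opening ratio (height/width of the mouth bounding box) exceeds a
# threshold for enough consecutive frames. Both constants are arbitrary
# illustrative values, not from the cited work.

OPEN_RATIO_THRESH = 0.6   # assumed "wide open mouth" aspect ratio
CONSECUTIVE_FRAMES = 3    # assumed minimum yawn duration in frames

def is_drowsy(mouth_ratios):
    """Return True if the ratio stays above the threshold long enough."""
    run = 0
    for ratio in mouth_ratios:
        run = run + 1 if ratio > OPEN_RATIO_THRESH else 0
        if run >= CONSECUTIVE_FRAMES:
            return True
    return False

print(is_drowsy([0.2, 0.3, 0.7, 0.8, 0.9, 0.4]))  # -> True  (3-frame run)
print(is_drowsy([0.2, 0.7, 0.3, 0.8, 0.2, 0.9]))  # -> False (no 3-frame run)
```

Requiring a consecutive run, rather than a single frame, suppresses false alarms from speech or brief mouth movements.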