1. Introduction
Eye tracking is a technique for detecting and measuring eye movements and characteristics [1]. An eye tracker can sense a person’s gaze locations at a certain frequency. Finding gaze position allows identification of fixations and saccades. Fixations, which typically last between 100 and 600 ms [2][3], are time periods during which the eyes are almost still, with the gaze being focused on a specific element of the scene. On the other hand, saccades, which normally last less than 100 ms [3], are very fast eye movements occurring between consecutive pairs of fixations, with the purpose of relocating the gaze on a different element in the visual scene.
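In software, fixations and saccades are usually extracted from the raw gaze samples by an event-detection step. The following is a minimal sketch of a dispersion-threshold approach (in the style of the common I-DT algorithm); the sample rate, dispersion threshold and minimum duration are illustrative assumptions, not values from the text:

```python
def dispersion(window):
    """Spread of a set of gaze points: (max_x - min_x) + (max_y - min_y)."""
    xs = [p[0] for p in window]
    ys = [p[1] for p in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def detect_fixations(samples, sample_rate_hz=60,
                     max_dispersion=0.05, min_duration_ms=100):
    """samples: list of (x, y) gaze points in normalized screen coordinates.
    Returns a list of (start_index, end_index, centroid) fixations; the
    samples between two fixations are treated as saccades."""
    min_len = int(min_duration_ms / 1000 * sample_rate_hz)
    fixations = []
    i = 0
    while i + min_len <= len(samples):
        if dispersion(samples[i:i + min_len]) <= max_dispersion:
            # Grow the window while the gaze stays on (almost) the same spot.
            j = i + min_len
            while j < len(samples) and dispersion(samples[i:j + 1]) <= max_dispersion:
                j += 1
            xs = [p[0] for p in samples[i:j]]
            ys = [p[1] for p in samples[i:j]]
            fixations.append((i, j - 1, (sum(xs) / len(xs), sum(ys) / len(ys))))
            i = j
        else:
            i += 1  # still inside a saccade; slide the window forward
    return fixations
```

For example, a gaze trace that rests on one screen location and then jumps to another would yield two fixations separated by one saccade.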
2. Eye Tracking Technology
Electro-oculography, scleral contact lens/search coil, photo-oculography, video-oculography and pupil center-corneal reflection are some of the eye-tracking technologies that have been developed over time [1].
Electro-oculography (EOG), one of the oldest methods to record eye movements, measures the skin’s electrical potential differences through small electrodes placed around the eyes [4]. This solution allows recording of eye movements even when the eyes are closed, but is generally more invasive and less accurate and precise than other approaches.
Scleral contact lens/search coil is another old method, consisting of small coils of wire inserted in special contact lenses. The user’s head is then placed inside a magnetic field to generate an electrical potential that allows estimation of eye position [5]. While this technique has a very high spatial and temporal resolution, it is also extremely invasive and uncomfortable, and is used practically only in physiological studies.
Photo- and video-oculography (POG and VOG) are generally video-based methods in which small cameras, incorporated in head-mounted devices, measure eye features such as pupil size, iris–sclera boundaries and possible corneal reflections. The assessment of these characteristics can occur both automatically and manually. However, these systems tend to be inaccurate and are mainly used for medical purposes [1].
Pupil center-corneal reflection (PCCR) is currently the most widely used eye tracking technique. Its basic principle consists of using infrared (or near-infrared) light sources to illuminate the eyes and detect reflections on their surface (Figure 1); this allows determination of the gaze direction [1]. Infrared light is employed because it is invisible and also produces a better contrast between pupil and iris. The prices of these eye trackers range from a few hundred to tens of thousands of euros, depending on their accuracy and gaze sampling frequency. All the works analyzed in the present review employ this technology.
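The PCCR principle can be illustrated with a toy calibration step: the vector from the corneal reflection (glint) to the pupil center is mapped to screen coordinates by a function fitted while the user looks at known calibration points. This is only a sketch of the general idea; real systems extract these vectors from camera images, and the second-order polynomial model and the sample values below are illustrative assumptions:

```python
import numpy as np

def fit_pccr_mapping(vectors, screen_points):
    """Fit a second-order polynomial mapping (vx, vy) -> (sx, sy),
    where (vx, vy) is the glint-to-pupil-center vector measured during
    calibration and (sx, sy) the known on-screen target position."""
    V = np.asarray(vectors, dtype=float)
    S = np.asarray(screen_points, dtype=float)
    # Design matrix with the polynomial terms [1, vx, vy, vx*vy, vx^2, vy^2].
    A = np.column_stack([np.ones(len(V)), V[:, 0], V[:, 1],
                         V[:, 0] * V[:, 1], V[:, 0] ** 2, V[:, 1] ** 2])
    coeffs, *_ = np.linalg.lstsq(A, S, rcond=None)
    return coeffs

def estimate_gaze(coeffs, vx, vy):
    """Map a new glint-to-pupil vector to an estimated screen position."""
    a = np.array([1.0, vx, vy, vx * vy, vx ** 2, vy ** 2])
    return a @ coeffs
```

After calibration on a small grid of points, `estimate_gaze` returns the estimated on-screen gaze position for each new eye-feature measurement.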
Figure 1. Example of eye detection with the Gazepoint GP3 HD eye tracker: above, the eyes detected within the face; below, pupil/corneal reflections.
There are two main kinds of eye trackers, namely, remote and wearable. Remote eye trackers (Figure 2, left) are normally non-intrusive devices (often little “bars”) that are positioned at the bottom of standard displays. They are currently the most prevalent kind of eye trackers. Wearable eye trackers (Figure 2, right), on the other hand, are frequently used to study viewing behavior in real-world settings. Recent wearable eye trackers look more and more like glasses, making them much more comfortable than in the past.
Figure 2. Examples of remote and wearable eye trackers. On the left, highlighted in red, a Tobii 4c (by Tobii) remote device; on the right, a PupilCore (by Pupil Labs) wearable tool.
Psychology [6], neuroscience [7], marketing [8], education [9], usability [10][11] and biometrics [12][13] are all fields in which eye tracking technology has been applied, for instance, to determine the user’s gaze path while looking at something (e.g., an image or a web page) or to obtain information about the screen regions that are most frequently inspected. When using an eye tracker as an input tool (i.e., for interactive purposes, in an explicit way), gaze data must be evaluated in real time, so that the computer can respond to specific gaze behaviors [14]. Gaze input is also extremely beneficial as an assistive technology for people who are unable to use their hands. Several assistive solutions have been devised to date, including those for writing [15][16][17], surfing the Web [18][19] and playing music [20][21].
Two common ways to provide gaze input are through dwell time and gaze gestures. Dwell time, which is the most used approach, consists of fixating a target element (e.g., a button) for a certain time (the dwell time), after which an action connected to that element is triggered. The duration of the dwell time can vary depending on the application, but it should be chosen so as to avoid the so-called “Midas touch problem” [22], i.e., involuntary selections occurring when simply looking at the elements of an interface.
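The dwell-time mechanism can be sketched as a small state machine: an action fires only when the gaze has remained on the same target for the whole dwell threshold, and only once per visit, which is what mitigates the Midas touch problem. The 800 ms threshold and the string-based target identifiers below are illustrative choices, not values from the text:

```python
class DwellSelector:
    """Fires a selection after the gaze dwells on one target long enough."""

    def __init__(self, dwell_ms=800):
        self.dwell_ms = dwell_ms   # illustrative threshold
        self.current = None        # target currently under the gaze
        self.entered_at = 0        # timestamp when the gaze entered it
        self.fired = False         # ensures one selection per visit

    def update(self, target, timestamp_ms):
        """Feed the target under the gaze (or None) at each gaze sample.
        Returns the target when its dwell time elapses, else None."""
        if target != self.current:
            # Gaze moved to a new target (or away): restart the timer.
            self.current = target
            self.entered_at = timestamp_ms
            self.fired = False
            return None
        if (target is not None and not self.fired
                and timestamp_ms - self.entered_at >= self.dwell_ms):
            self.fired = True
            return target
        return None
```

Merely glancing at a button produces no selection, because the gaze leaves the target before the threshold elapses and the timer is reset.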
Gaze gestures consist of gaze paths performed by the user to trigger specific actions. This approach can be fast and is immune to the Midas touch problem, but it is also generally less intuitive than dwell time (since the user needs to memorize a set of gaze gestures, it may have a steep learning curve). For this reason, gaze gestures are recommended only for applications meant to be used multiple times, such as writing systems (e.g., [23][24]).
Hybrid approaches that mix dwell time and gaze gestures (e.g., for interacting with video games [25]) have also been proposed, while other gaze input methods (such as those based on blinks [26] or smooth pursuit [17]) are currently less common.