ICPR2020 Paper Browser

Paper download is intended for registered attendees only, and is subject to the IEEE Copyright Policy. Any other use is strictly forbidden.

Detection and Correspondence Matching of Corneal Reflections for Eye Tracking Using Deep Learning

Soumil Chugh, Braiden Brousseau, Jonathan Rose, Moshe Eizenman

Auto-TLDR; A Fully Convolutional Neural Network for Corneal Reflection Detection and Matching in Extended Reality Eye Tracking Systems

Eye tracking systems that estimate the point-of-gaze are essential in extended reality (XR) systems as they enable new interaction paradigms and technological improvements. It is important for these systems to maintain accuracy when the headset moves relative to the head (known as device slippage) due to head movements or user adjustment. One of the most accurate eye tracking techniques, which is also insensitive to shifts of the system relative to the head, uses two or more infrared (IR) light-emitting diodes (LEDs) to illuminate the eye and an IR camera to capture images of the eye. An essential step in estimating the point-of-gaze in these systems is the precise determination of the location of two or more corneal reflections (virtual images of the IR LEDs that illuminate the eye) in images of the eye. Eye trackers tend to have multiple light sources to ensure at least one pair of reflections for each gaze position. The use of multiple light sources introduces a difficult problem: the need to match the corneal reflections with the corresponding light sources over the range of expected eye movements. Corneal reflection detection and matching often fail in XR systems due to the proximity of the camera and the steep illumination angles of the light sources with respect to the eye. The failures are caused by corneal reflections that vary in shape and intensity or disappear as the eye rotates, or by the presence of spurious reflections. We have developed a fully convolutional neural network, based on the U-Net architecture, that solves the detection and matching problem in the presence of spurious and missing reflections. Eye images of 25 people were collected in a virtual reality headset using a binocular eye tracking module consisting of five infrared light sources per eye. A set of 4,000 eye images was manually labelled for each of the corneal reflections, and data augmentation was used to generate a dataset of 40,000 images.
The network is able to correctly identify and match 91% of corneal reflections present in the test set. This is comparable to a state-of-the-art deep learning system, but our approach requires 33 times less memory and executes 10 times faster. The proposed algorithm, when used in an eye tracker in a VR system, achieved an average mean absolute gaze error of 1°. This is a significant improvement over the state-of-the-art learning-based XR eye tracking systems that have reported gaze errors of 2-3°.
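
As a rough illustration of how a fully convolutional network can handle detection and matching together, the sketch below assumes the network outputs one score map per light source. That layout is an assumption made for illustration; the abstract does not describe the output format. The function name `decode_reflections` and the 0.5 threshold are likewise hypothetical. Under this assumption, the channel index identifies the light source, so finding the peak in a channel both detects the reflection and matches it, and a weak peak is treated as a missing reflection.

```python
from typing import List, Optional, Tuple

# One H x W grid of per-pixel scores in [0, 1] per light source (assumed layout).
Heatmap = List[List[float]]


def decode_reflections(
    heatmaps: List[Heatmap], threshold: float = 0.5
) -> List[Optional[Tuple[int, int]]]:
    """Return, for each light-source channel, the (row, col) of its peak score.

    A channel whose peak is below `threshold` yields None, meaning the
    reflection for that light source is treated as missing in this image.
    """
    matches: List[Optional[Tuple[int, int]]] = []
    for heatmap in heatmaps:
        best_score = -1.0
        best_pos: Optional[Tuple[int, int]] = None
        for r, row in enumerate(heatmap):
            for c, score in enumerate(row):
                if score > best_score:
                    best_score, best_pos = score, (r, c)
        matches.append(best_pos if best_score >= threshold else None)
    return matches


# Toy example with two light sources on a 2 x 2 image:
# channel 0 has a strong peak, channel 1 has no confident response.
toy = [
    [[0.0, 0.9], [0.0, 0.0]],
    [[0.1, 0.2], [0.0, 0.0]],
]
print(decode_reflections(toy))
```

With five light sources per eye, as in the module described above, `heatmaps` would hold five such grids and the result would list up to five matched reflection positions.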

Similar papers

User-Independent Gaze Estimation by Extracting Pupil Parameter and Its Mapping to the Gaze Angle

Sang Yoon Han, Nam Ik Cho

Auto-TLDR; Gaze Point Estimation using Pupil Shape for Generalization

Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction

Adaptive Feature Fusion Network for Gaze Tracking in Mobile Tablets

Estimating Gaze Points from Facial Landmarks by a Remote Spherical Camera

Classifying Eye-Tracking Data Using Saliency Maps

Ghost Target Detection in 3D Radar Data Using Point Cloud Based Deep Neural Network

Collaborative Human Machine Attention Module for Character Recognition

A General End-To-End Method for Characterizing Neuropsychiatric Disorders Using Free-Viewing Visual Scanning Tasks

Tracking Fast Moving Objects by Segmentation Network

Transfer Learning through Weighted Loss Function and Group Normalization for Vessel Segmentation from Retinal Images

Early Wildfire Smoke Detection in Videos

Explainable Online Validation of Machine Learning Models for Practical Applications

DE-Net: Dilated Encoder Network for Automated Tongue Segmentation

Automatic Semantic Segmentation of Structural Elements related to the Spinal Cord in the Lumbar Region by Using Convolutional Neural Networks

Point In: Counting Trees with Weakly Supervised Segmentation Network

Holistic Grid Fusion Based Stop Line Estimation

GazeMAE: General Representations of Eye Movements Using a Micro-Macro Autoencoder

RescueNet: Joint Building Segmentation and Damage Assessment from Satellite Imagery

Weight Estimation from an RGB-D Camera in Top-View Configuration

Uncertainty Guided Recognition of Tiny Craters on the Moon

Street-Map Based Validation of Semantic Segmentation in Autonomous Driving

A Comparison of Neural Network Approaches for Melanoma Classification

FastSal: A Computationally Efficient Network for Visual Saliency Prediction

Planar 3D Transfer Learning for End to End Unimodal MRI Unbalanced Data Segmentation

A Lumen Segmentation Method in Ureteroscopy Images Based on a Deep Residual U-Net Architecture

A Versatile Crack Inspection Portable System Based on Classifier Ensemble and Controlled Illumination

Thermal Image Enhancement Using Generative Adversarial Network for Pedestrian Detection

Learning to Segment Clustered Amoeboid Cells from Brightfield Microscopy Via Multi-Task Learning with Adaptive Weight Selection

RISEdb: A Novel Indoor Localization Dataset

Triplet-Path Dilated Network for Detection and Segmentation of General Pathological Images

Light3DPose: Real-Time Multi-Person 3D Pose Estimation from Multiple Views

Inner Eye Canthus Localization for Human Body Temperature Screening

Radar Image Reconstruction from Raw ADC Data Using Parametric Variational Autoencoder with Domain Adaptation

RefiNet: 3D Human Pose Refinement with Depth Maps

Motion U-Net: Multi-Cue Encoder-Decoder Network for Motion Segmentation

Exposing Deepfake Videos by Tracking Eye Movements

FC-DCNN: A Densely Connected Neural Network for Stereo Estimation

Real-Time Monocular Depth Estimation with Extremely Light-Weight Neural Network

SynDHN: Multi-Object Fish Tracker Trained on Synthetic Underwater Videos

Aerial Road Segmentation in the Presence of Topological Label Noise

IPT: A Dataset for Identity Preserved Tracking in Closed Domains

Real Time Fencing Move Classification and Detection at Touch Time During a Fencing Match

Hybrid Approach for 3D Head Reconstruction: Using Neural Networks and Visual Geometry

FOANet: A Focus of Attention Network with Application to Myocardium Segmentation

A Fine-Grained Dataset and Its Efficient Semantic Segmentation for Unstructured Driving Scenarios

Facial Expression Recognition Using Residual Masking Network

Extending Single Beam Lidar to Full Resolution by Fusing with Single Image Depth Estimation

Detecting Manipulated Facial Videos: A Time Series Solution