Appearance-Based Gaze Estimation in the Wild
Appearance-based gaze estimation is believed to work well in real-world
settings, but existing datasets have been collected under controlled laboratory
conditions and methods have not been evaluated across multiple datasets. In
this work we study appearance-based gaze estimation in the wild. We present the
MPIIGaze dataset that contains 213,659 images we collected from 15 participants
during natural everyday laptop use over more than three months. Our dataset is
significantly more variable than existing ones with respect to appearance and
illumination. We also present a method for in-the-wild appearance-based gaze
estimation using multimodal convolutional neural networks that significantly
outperforms state-of-the-art methods in the most challenging cross-dataset
evaluation. We present an extensive evaluation of several state-of-the-art
image-based gaze estimation algorithms on three current datasets, including our
own. This evaluation provides clear insights and allows us to identify key
research challenges of gaze estimation in the wild.
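Cross-dataset evaluations of gaze estimators are commonly reported as the mean angular error between predicted and ground-truth 3D gaze vectors. As a minimal illustration of that metric (not code from the paper), the angle between two gaze vectors can be computed as:

```python
import math

def angular_error_deg(pred, gt):
    """Angle in degrees between two 3D gaze direction vectors."""
    dot = sum(p * g for p, g in zip(pred, gt))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_g = math.sqrt(sum(g * g for g in gt))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_g)))
    return math.degrees(math.acos(cos_angle))

# Identical vectors give 0 degrees; orthogonal vectors give 90 degrees.
print(angular_error_deg((0.0, 0.0, -1.0), (0.0, 0.0, -1.0)))  # 0.0
print(angular_error_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

Reported per-dataset errors are then averages of this quantity over all test samples.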
Learning to Find Eye Region Landmarks for Remote Gaze Estimation in Unconstrained Settings
Conventional feature-based and model-based gaze estimation methods have
proven to perform well in settings with controlled illumination and specialized
cameras. In unconstrained real-world settings, however, such methods are
surpassed by recent appearance-based methods due to difficulties in modeling
factors such as illumination changes and other visual artifacts. We present a
novel learning-based method for eye region landmark localization that enables
conventional methods to be competitive with the latest appearance-based methods.
Despite having been trained exclusively on synthetic data, our method exceeds
the state of the art for iris localization and eye shape registration on
real-world imagery. We then use the detected landmarks as input to iterative
model-fitting and lightweight learning-based gaze estimation methods. Our
approach outperforms existing model-fitting and appearance-based methods in the
context of person-independent and personalized gaze estimation.
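Feature-based methods built on detected eye landmarks typically derive a compact gaze feature such as the iris-center offset normalized by eye width. The following sketch is purely illustrative (the function name and normalization are assumptions, not the paper's formulation):

```python
def iris_offset_feature(inner_corner, outer_corner, iris_center):
    """Iris-center offset relative to the eye center, normalized by
    eye width -- a simple landmark-derived gaze feature (illustrative)."""
    eye_width = outer_corner[0] - inner_corner[0]
    eye_center = ((inner_corner[0] + outer_corner[0]) / 2,
                  (inner_corner[1] + outer_corner[1]) / 2)
    dx = (iris_center[0] - eye_center[0]) / eye_width
    dy = (iris_center[1] - eye_center[1]) / eye_width
    return dx, dy

# An iris centered between the eye corners yields a zero offset.
print(iris_offset_feature((0.0, 0.0), (30.0, 0.0), (15.0, 0.0)))  # (0.0, 0.0)
```

A lightweight regressor can then map such features to gaze angles, which is the kind of pipeline the abstract describes feeding with learned landmarks.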
Multistream Gaze Estimation with Anatomical Eye Region Isolation by Synthetic to Real Transfer Learning
We propose a novel neural pipeline, MSGazeNet, that learns gaze
representations by taking advantage of the eye anatomy information through a
multistream framework. Our proposed solution comprises two components, first a
network for isolating anatomical eye regions, and a second network for
multistream gaze estimation. The eye region isolation is performed with a U-Net
style network which we train using a synthetic dataset that contains eye region
masks for the visible eyeball and the iris region. The synthetic dataset used
in this stage is procured using the UnityEyes simulator, and consists of 80,000
eye images. After training, the eye region isolation network is
transferred to the real domain to generate masks for real-world eye
images. To make this transfer succeed, we exploit domain
randomization during training, which exposes the synthetic images to
greater variance through augmentations that resemble real-world
artifacts. The generated eye region masks along with the raw eye images are
then used together as a multistream input to our gaze estimation network, which
consists of wide residual blocks. The output embeddings from these encoders are
fused in the channel dimension before feeding into the gaze regression layers.
We evaluate our framework on three gaze estimation datasets and achieve strong
performance. Our method surpasses the state of the art by 7.57% and 1.85% on
two datasets, and obtains competitive results on the third. We also study the
robustness of our method with respect to the noise in the data and demonstrate
that our model is less sensitive to noisy data. Lastly, we perform a variety of
experiments including ablation studies to evaluate the contribution of
different components and design choices in our solution.Comment: 15 pages, 7 figures, 14 tables. This work has been accepted to the
IEEE Transactions on Artificial Intelligence 2024 IEEE. Personal
use of this material is permitted. Permission from IEEE must be obtained for
all other use
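Domain randomization, as used above to bridge the synthetic-to-real gap, amounts to sampling augmentation parameters per training image so the network never overfits to clean synthetic renders. A hedged pure-Python sketch follows; the parameter names and ranges are assumptions for illustration, not the paper's actual values:

```python
import random

def sample_augmentation(rng):
    """Sample per-image augmentation parameters for domain randomization
    of synthetic eye images (illustrative ranges, not the paper's)."""
    return {
        "gaussian_noise_std": rng.uniform(0.0, 0.05),  # sensor noise level
        "blur_sigma": rng.uniform(0.0, 2.0),           # defocus blur strength
        "brightness_shift": rng.uniform(-0.2, 0.2),    # illumination change
        "apply_occlusion": rng.random() < 0.3,         # e.g. hair or glasses
    }

rng = random.Random(0)  # fixed seed for reproducibility
print(sample_augmentation(rng))
```

Each synthetic image would be transformed with a fresh sample of these parameters, so the mask network sees a wide spread of artifact-like variation before being applied to real images.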
Unobtrusive and pervasive video-based eye-gaze tracking
Eye-gaze tracking has long been considered a desktop technology that finds its use inside the traditional office setting, where the operating conditions may be controlled. Nonetheless, recent advancements in mobile technology and a growing interest in capturing natural human behaviour have motivated an emerging interest in tracking eye movements within unconstrained real-life conditions, referred to as pervasive eye-gaze tracking. This critical review focuses on emerging passive and unobtrusive video-based eye-gaze tracking methods in recent literature, with the aim of identifying different research avenues that are being followed in response to the challenges of pervasive eye-gaze tracking. Different eye-gaze tracking approaches are discussed in order to bring out their strengths and weaknesses, and to identify any limitations, within the context of pervasive eye-gaze tracking, that have yet to be considered by the computer vision community. Peer-reviewed.
Building a Self-Learning Eye Gaze Model from User Interaction Data
Presented at the 2014 ACM Conference on Multimedia (MM 2014), 3-7 November 2014.
Most eye gaze estimation systems rely on explicit calibration, which is inconvenient to the user, limits the amount of possible training data, and consequently limits performance. Since there is likely a strong correlation between gaze and interaction cues, such as cursor and caret locations, a supervised learning algorithm can learn the complex mapping between gaze features and the gaze point by training on incremental data collected implicitly from normal computer interactions. We develop a set of robust geometric gaze features and a corresponding data validation mechanism that identifies good training data from noisy interaction-informed data collected in real-use scenarios. Based on a study of gaze movement patterns, we apply behavior-informed validation to extract gaze features that correspond with the interaction cue, and data-driven validation provides another level of cross-checking using previous good data. Experimental evaluation shows that the proposed method achieves an average error of 4.06°, demonstrating the effectiveness of the proposed gaze estimation method and its validation mechanism. Refereed conference paper.
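The two-stage validation described above (a behavior-informed check against the interaction cue, then a data-driven cross-check against previously accepted samples) might look like the following minimal sketch. The thresholds, distance test, and function names are illustrative assumptions, not the paper's actual mechanism:

```python
import math

def validate_sample(gaze_point, cue_location, good_history,
                    max_cue_dist=50.0, max_history_dist=80.0):
    """Accept an implicitly collected (gaze point, interaction cue) pair
    only if it passes two checks (illustrative thresholds, in pixels):
    1. behavior-informed: the estimated gaze point lies near the
       interaction cue (cursor/caret location);
    2. data-driven: it is consistent with previously accepted samples
       recorded near the same cue location."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Behavior-informed validation against the interaction cue.
    if dist(gaze_point, cue_location) > max_cue_dist:
        return False
    # Data-driven cross-check against earlier good data.
    for prev_gaze, prev_cue in good_history:
        if dist(cue_location, prev_cue) < max_cue_dist:
            if dist(gaze_point, prev_gaze) > max_history_dist:
                return False
    return True

history = [((100.0, 100.0), (110.0, 105.0))]  # one previously accepted pair
print(validate_sample((105.0, 98.0), (110.0, 105.0), history))   # True
print(validate_sample((400.0, 400.0), (110.0, 105.0), history))  # False
```

Samples that pass both checks would be appended to the good-data pool and used to incrementally retrain the gaze mapping, which is the self-learning loop the abstract describes.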
Gaze estimation and interaction in real-world environments
Human eye gaze has been widely used in human-computer interaction, as it is a promising modality for natural, fast, pervasive, and non-verbal interaction between humans and computers. As the foundation of gaze-related interactions, gaze estimation has been a hot research topic in recent decades. In this thesis, we focus on developing appearance-based gaze estimation methods and corresponding attentive user interfaces with a single webcam for challenging real-world environments. First, we collect a large-scale gaze estimation dataset, MPIIGaze, the first of its kind, outside of controlled laboratory conditions. Second, we propose an appearance-based method that, in stark contrast to a long-standing tradition in gaze estimation, only takes the full face image as input. Third, we study data normalisation for the first time in a principled way, and propose a modification that yields significant performance improvements. Fourth, we contribute an unsupervised detector for human-human and human-object eye contact. Finally, we study personal gaze estimation with multiple personal devices, such as mobile phones, tablets, and laptops.