1,982 research outputs found

    Appearance-Based Gaze Estimation in the Wild

    Appearance-based gaze estimation is believed to work well in real-world settings, but existing datasets have been collected under controlled laboratory conditions and methods have not been evaluated across multiple datasets. In this work we study appearance-based gaze estimation in the wild. We present the MPIIGaze dataset, which contains 213,659 images collected from 15 participants during natural everyday laptop use over more than three months. Our dataset is significantly more variable than existing ones with respect to appearance and illumination. We also present a method for in-the-wild appearance-based gaze estimation using multimodal convolutional neural networks that significantly outperforms state-of-the-art methods in the most challenging cross-dataset evaluation. We present an extensive evaluation of several state-of-the-art image-based gaze estimation algorithms on three current datasets, including our own. This evaluation provides clear insights and allows us to identify key research challenges of gaze estimation in the wild.
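    The multimodal idea described above (combining eye appearance with head pose) can be sketched as a toy forward pass. This is only an illustrative numpy mock-up with random weights and hypothetical names, not the paper's actual network; the 36x60 patch size is the eye-image resolution used in MPIIGaze:

```python
import numpy as np

rng = np.random.default_rng(0)

def multimodal_gaze_forward(eye_image, head_pose, weights):
    """Toy forward pass: a single 3x3 convolution layer on the eye image,
    ReLU + global average pooling, concatenation with the 2-D head-pose
    angles, then a linear head that outputs (yaw, pitch) gaze angles."""
    k = weights["conv"]                                 # (n_filters, 3, 3)
    h, w = eye_image.shape
    feats = []
    for f in k:
        acc = np.zeros((h - 2, w - 2))
        for i in range(3):                              # valid convolution
            for j in range(3):
                acc += f[i, j] * eye_image[i:i + h - 2, j:j + w - 2]
        feats.append(np.maximum(acc, 0.0).mean())       # ReLU + pooling
    x = np.concatenate([np.array(feats), head_pose])    # multimodal fusion
    return weights["fc"] @ x + weights["bias"]          # (yaw, pitch)

weights = {
    "conv": rng.standard_normal((8, 3, 3)),
    "fc":   rng.standard_normal((2, 8 + 2)),   # 8 image feats + 2 pose dims
    "bias": np.zeros(2),
}
eye_patch = rng.standard_normal((36, 60))      # MPIIGaze-sized eye image
gaze = multimodal_gaze_forward(eye_patch, np.array([0.1, -0.05]), weights)
print(gaze.shape)  # (2,)
```

The key design point is the concatenation step: image features and head-pose angles enter the regression head jointly, so the network can condition its gaze prediction on head orientation.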

    I Spy With My Little Eyes: A Convolutional Deep Learning Approach to Web Eye Tracking

    Eye tracking is the study of eye movements, blinks, and fixations, and aims to give insight into visual attention mechanisms. While well-established methods exist for lab-based eye tracking, which is common in marketing, usability research, and cognitive science, web eye tracking relies on webcams of much lower quality. Web eye tracking can provide valuable information about users’ engagement with digital content from the comfort of their own home. This gives designers, developers, and researchers the chance to inform their decisions from data and optimize, for example, user experience, while connecting to large and demographically diverse samples without the need for lab-grade equipment. For web eye tracking, only limited tools exist, and they come with uncertainties that need to be addressed before they can be used for scientific research. Improving the quality of data collected via such channels is also part of this goal. The project aims to develop a reliable deep learning solution, such as a convolutional neural network, capable of predicting gaze x/y screen coordinates from users’ webcam video. The predictions of the proposed method are compared to baseline models that use webcam data and to predictions made by the lab eye tracker.
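    The comparison against the lab eye tracker described above amounts to measuring how far the predicted screen coordinates fall from the reference points. A minimal sketch of such a metric, with made-up numbers rather than the project's actual data:

```python
import numpy as np

def mean_euclidean_error(pred_xy, ref_xy):
    """Mean Euclidean distance (e.g. in screen pixels) between predicted
    gaze points and reference points from a lab eye tracker."""
    return float(np.linalg.norm(pred_xy - ref_xy, axis=1).mean())

# hypothetical webcam-model predictions vs. lab-tracker reference points
pred = np.array([[100.0, 200.0], [400.0, 300.0]])
ref  = np.array([[103.0, 204.0], [400.0, 300.0]])
print(mean_euclidean_error(pred, ref))  # 2.5
```

The same function can score both the deep model and the webcam-only baselines against the lab tracker, making the comparison in the abstract directly quantifiable.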

    Robust Eye Tracking Based on Adaptive Fusion of Multiple Cameras

    Eye and gaze movements play an essential role in identifying individuals' emotional states, cognitive activities, interests, and attention, among other behavioral traits. Besides, they are natural, fast, and implicitly reflect the targets of interest, which makes them a highly valuable input modality in human-computer interfaces. Therefore, tracking gaze movements, in other words, eye tracking, is of great interest to a large number of disciplines, including human behaviour research, neuroscience, medicine, and human-computer interaction. Tracking gaze movements accurately is a challenging task, especially under unconstrained conditions. Over the last two decades, significant advances have been made in improving gaze estimation accuracy. However, these improvements have been achieved mostly under controlled settings. Meanwhile, several concerns have arisen, such as the complexity, inflexibility, and cost of the setups, increased user effort, and high sensitivity to varying real-world conditions. Despite various attempts and promising enhancements, existing eye tracking systems are still inadequate to overcome most of these concerns, which prevents them from being widely used. In this thesis, we revisit these concerns and introduce a novel multi-camera eye tracking framework. The proposed framework achieves high estimation accuracy while requiring minimal user effort and a non-intrusive, flexible setup. In addition, it provides improved robustness to large head movements, illumination changes, use of eyewear, and eye type variations across users. We develop a novel real-time gaze estimation framework based on adaptive fusion of multiple single-camera systems, in which the gaze estimation relies on projective geometry. Besides, to ease the user calibration procedure, we investigate several methods to model the subject-specific estimation bias and, consequently, propose a novel approach based on weighted regularized least squares regression.
    The proposed method provides better calibration modeling than state-of-the-art methods, particularly when using low-resolution and limited calibration data. Being able to operate with low-resolution data also enables the use of a large field-of-view setup, so that large head movements are allowed. To address the aforementioned robustness concerns, we propose to leverage multiple eye appearances simultaneously acquired from various views. In comparison with the conventional single-view approach, the main benefit of our approach is to more reliably detect gaze features under challenging conditions, especially when they are obstructed by large head poses or movements, or by eyeglasses. We further propose an adaptive fusion mechanism to effectively combine the gaze outputs obtained from multi-view appearances. To this end, our mechanism first determines the estimation reliability of each gaze output and then performs a reliability-based weighted fusion to compute the overall point of regard. In addition, to address illumination and eye type robustness, the setup is built upon active illumination, and robust feature detection methods are developed. The proposed framework and methods are validated through extensive simulations and user experiments featuring 20 subjects. The results demonstrate that our framework provides not only a significant improvement in gaze estimation accuracy but also notable robustness to real-world conditions, making it suitable for a large spectrum of applications.
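    The two core mechanisms in this abstract, weighted regularized least squares calibration and reliability-based weighted fusion, can be sketched numerically. This is a minimal illustration with hypothetical numbers and function names, not the thesis's actual implementation:

```python
import numpy as np

def wrls_bias_model(X, y, w, lam=1.0):
    """Weighted regularized least squares: fit a linear mapping that
    models the subject-specific estimation bias from calibration data.
    Solves (X^T W X + lam*I) beta = X^T W y."""
    W = np.diag(w)
    A = X.T @ W @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ W @ y)

def fuse_gaze(estimates, reliabilities):
    """Reliability-based weighted fusion of per-camera points of regard:
    normalize the reliability scores and take the weighted average."""
    w = np.asarray(reliabilities, dtype=float)
    w = w / w.sum()
    return w @ np.asarray(estimates)

# three hypothetical single-camera estimates of the same point of regard
est = [[10.0, 20.0], [12.0, 22.0], [11.0, 21.0]]
rel = [0.5, 0.2, 0.3]   # e.g. derived from feature-detection confidence
print(fuse_gaze(est, rel))  # [10.7 20.7]
```

The regularization term lam keeps the bias model well-conditioned when only limited calibration points are available, which matches the abstract's claim of better calibration from limited data; the fusion step down-weights cameras whose gaze features are unreliable, e.g. occluded by head pose or glasses.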