
    Benefits of temporal information for appearance-based gaze estimation

    State-of-the-art appearance-based gaze estimation methods, usually based on deep learning, rely mainly on static features. However, the temporal trace of eye gaze contains useful information for estimating a given gaze point. For example, approaches that leverage sequential eye gaze information in remote or low-resolution image scenarios with off-the-shelf cameras are showing promising results. The magnitude of the contribution of the temporal gaze trace is as yet unclear for higher-resolution, higher-frame-rate imaging systems, in which more detailed information about the eye is captured. In this paper, we investigate whether temporal sequences of eye images, captured with a high-resolution, high-frame-rate head-mounted virtual reality system, can be leveraged to enhance the accuracy of an end-to-end appearance-based deep learning model for gaze estimation. Performance is compared against a static-only version of the model. Results demonstrate statistically significant benefits of temporal information, particularly for the vertical component of gaze.
    Comment: In ACM Symposium on Eye Tracking Research & Applications (ETRA), 202
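    As a rough illustration of the static-versus-temporal comparison described in this abstract, the sketch below (PyTorch; not the paper's actual architecture, and all layer sizes and the `use_temporal` switch are assumptions) shows how a per-frame CNN gaze regressor can be extended with a recurrent layer over a sequence of eye images:

```python
# Hedged sketch: a static CNN gaze regressor optionally extended with a GRU
# over an eye-image sequence, to contrast static-only vs. temporal estimation.
# Not the paper's model; layer sizes and structure are illustrative guesses.
import torch
import torch.nn as nn

class TemporalGazeNet(nn.Module):
    def __init__(self, feat_dim=128, hidden_dim=128, use_temporal=True):
        super().__init__()
        self.use_temporal = use_temporal
        # Small CNN encoder for a single grayscale eye image.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.gru = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        # Regress a 2D gaze direction (e.g., yaw and pitch angles).
        self.head = nn.Linear(hidden_dim if use_temporal else feat_dim, 2)

    def forward(self, x):  # x: (batch, seq_len, 1, H, W)
        b, t = x.shape[:2]
        feats = self.encoder(x.flatten(0, 1)).view(b, t, -1)
        if self.use_temporal:
            feats, _ = self.gru(feats)  # accumulate the temporal trace
        return self.head(feats[:, -1])  # gaze for the last frame

model = TemporalGazeNet(use_temporal=True)
pred = model(torch.randn(4, 10, 1, 64, 96))  # 10-frame eye-image clips
```

    Setting `use_temporal=False` yields the static-only baseline the paper compares against, with everything else held fixed.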

    Generative Video Face Reenactment by AUs and Gaze Regularization

    In this work, we propose an encoder-decoder-like architecture to perform face reenactment in image sequences. Our goal is to transfer the training subject's identity to a given test subject, regularizing the generation with Action Units (AUs) and gaze vectors to produce more lifelike results.
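    The regularization idea in this abstract can be summarized as a composite training loss. The following is a hedged sketch (not the authors' code): `au_net` and `gaze_net` stand for assumed pretrained, frozen AU and gaze estimators, and the loss weights are illustrative:

```python
# Hedged sketch of AU- and gaze-regularized reenactment training: combine a
# pixel reconstruction loss with penalties keeping the Action Units (AUs) and
# gaze vectors of the generated frame close to those of the driving frame.
# `au_net` and `gaze_net` are assumed pretrained, frozen estimators.
import torch.nn.functional as F

def reenactment_loss(generated, target, driving, au_net, gaze_net,
                     lambda_au=0.1, lambda_gaze=0.1):
    recon = F.l1_loss(generated, target)                           # pixel reconstruction
    au_reg = F.mse_loss(au_net(generated), au_net(driving))        # match AU activations
    gaze_reg = F.mse_loss(gaze_net(generated), gaze_net(driving))  # match gaze vectors
    return recon + lambda_au * au_reg + lambda_gaze * gaze_reg
```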

    Towards End-to-end Video-based Eye-Tracking

    Estimating eye gaze from images alone is a challenging task, in large part due to unobservable person-specific factors. Achieving high accuracy typically requires labeled data from test users, which may not be attainable in real applications. We observe that there exists a strong relationship between what users are looking at and the appearance of the users' eyes. In response to this understanding, we propose a novel dataset and accompanying method which aim to explicitly learn these semantic and temporal relationships. Our video dataset consists of time-synchronized screen recordings, user-facing camera views, and eye gaze data, which allows for new benchmarks in temporal gaze tracking as well as label-free refinement of gaze. Importantly, we demonstrate that fusing information from the visual stimuli and the eye images can yield performance similar to literature-reported figures obtained through supervised personalization. Our final method yields significant performance improvements on our proposed EVE dataset, with up to a 28 percent improvement in point-of-gaze estimates (resulting in 2.49 degrees of angular error), paving the path towards high-accuracy screen-based eye tracking purely from webcam sensors. The dataset and reference source code are available at https://ait.ethz.ch/projects/2020/EVE
    Comment: Accepted at ECCV 202
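    For context on the 2.49-degree figure quoted above, gaze accuracy is conventionally reported as the angle between predicted and ground-truth 3D gaze direction vectors. A minimal implementation of that standard metric (not taken from the EVE codebase):

```python
# Standard angular-error metric for gaze estimation: the angle, in degrees,
# between predicted and ground-truth 3D gaze direction vectors.
import numpy as np

def angular_error_deg(pred, gt):
    """pred, gt: (N, 3) arrays of gaze direction vectors."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=1, keepdims=True)
    cos = np.clip(np.sum(pred * gt, axis=1), -1.0, 1.0)  # clip for acos safety
    return np.degrees(np.arccos(cos))
```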

    Dialogue Management and Language Generation for a Robust Conversational Virtual Coach: Validation and User Study

    Designing human–machine interactive systems requires cooperation between different disciplines. In this work, we present a Dialogue Manager and a Language Generator that are the core modules of a voice-based Spoken Dialogue System (SDS) capable of carrying out challenging, long, and complex coaching conversations. We also develop an efficient integration procedure for the whole system, which acts as an intelligent and robust Virtual Coach. The coaching task differs significantly from the classical applications of SDSs, resulting in a much higher degree of complexity and difficulty. The Virtual Coach has been successfully tested and validated in a user study with independent elderly participants in three countries with different languages and cultures: Spain, France, and Norway.
    The research presented in this paper was conducted as part of the EMPATHIC project, which received funding from the European Union's Horizon 2020 research and innovation programme under Grant No. 769872. Additionally, this work was partially funded by the BEWORD and AMIC-PC projects of the Ministry of Science and Technology, under Grant Nos. PID2021-126061OB-C42 and PDC2021-120846-C43, respectively. Vázquez and López Zorrilla received PhD scholarships from the Basque Government, with Grant Nos. PRE 2020 1 0274 and PRE 2017 1 0357, respectively.
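    To make the division of labor between the two core modules concrete, here is a deliberately toy sketch (not the EMPATHIC system's code; the acts, templates, and policy are invented for illustration) of an SDS turn loop in which a Dialogue Manager selects the next system act from the dialogue state and a Language Generator realizes it as text:

```python
# Toy sketch of a Spoken Dialogue System turn loop: the Dialogue Manager (DM)
# picks a dialogue act from the current state; the Language Generator (NLG)
# turns that act into an utterance. Acts and templates are invented examples.
class DialogueManager:
    def next_act(self, state: dict) -> str:
        # Toy policy: greet first, then ask a coaching question.
        return "greet" if not state.get("greeted") else "ask_goal"

class LanguageGenerator:
    TEMPLATES = {
        "greet": "Hello! I'm your virtual coach.",
        "ask_goal": "What would you like to work on today?",
    }
    def realize(self, act: str) -> str:
        return self.TEMPLATES[act]

state, dm, nlg = {}, DialogueManager(), LanguageGenerator()
for _ in range(2):              # two system turns
    act = dm.next_act(state)
    print(nlg.realize(act))     # speech synthesis would consume this text
    state["greeted"] = True
```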