17 research outputs found

    A Differential Approach for Gaze Estimation

    Non-invasive gaze estimation methods usually regress gaze directions directly from a single face or eye image. However, due to significant variability in eye shape and inner eye structure among individuals, universal models achieve limited accuracy, and their outputs usually exhibit high variance as well as subject-dependent biases. Accuracy is therefore usually increased through calibration, allowing gaze predictions for a subject to be mapped to their actual gaze. In this paper, we introduce a novel image-differential method for gaze estimation. We propose to directly train a differential convolutional neural network to predict the gaze difference between two eye input images of the same subject. Then, given a set of subject-specific calibration images, we can use the inferred differences to predict the gaze direction of a novel eye sample. The assumption is that comparing two eye images greatly reduces the nuisance factors (misalignment, eyelid closure, illumination perturbations) which usually plague single-image prediction methods, allowing better prediction altogether. Experiments on three public datasets validate our approach, which consistently outperforms state-of-the-art methods even when using only one calibration sample or when the latter methods are followed by subject-specific gaze adaptation. Comment: Extension of our paper "A differential approach for gaze estimation with calibration" (BMVC 2018). Submitted to PAMI on Aug. 7th, 2018; accepted as a PAMI short in Dec. 2019, in IEEE Transactions on Pattern Analysis and Machine Intelligence.
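The calibration-based inference step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `diff_net`, `predict_gaze`, and the toy data are our own names and assumptions; the real system uses a trained differential CNN on eye images.

```python
import numpy as np

def predict_gaze(diff_net, query_eye, calib_eyes, calib_gazes):
    """Differential gaze inference sketch (illustrative names).

    diff_net(a, b) is assumed to return the predicted gaze difference
    g(a) - g(b) for two eye images of the same subject. Each calibration
    sample yields one estimate (its known gaze plus the inferred
    difference); the estimates are averaged.
    """
    estimates = [g_c + diff_net(query_eye, x_c)
                 for x_c, g_c in zip(calib_eyes, calib_gazes)]
    return np.mean(estimates, axis=0)

# Toy stand-in: "images" are 2-vectors whose value *is* the gaze,
# so subtraction recovers the true gaze difference exactly.
toy_diff = lambda a, b: a - b
calib_x = [np.array([0.1, 0.0]), np.array([-0.2, 0.3])]
calib_g = [np.array([0.1, 0.0]), np.array([-0.2, 0.3])]
query = np.array([0.05, 0.15])
print(predict_gaze(toy_diff, query, calib_x, calib_g))  # → [0.05 0.15]
```

With a perfect difference predictor, every calibration sample yields the same estimate, which is why even one calibration sample can suffice.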

    Automated Deception Detection from Videos: Using End-to-End Learning Based High-Level Features and Classification Approaches

    Deception detection is an interdisciplinary field attracting researchers from psychology, criminology, computer science, and economics. We propose a multimodal approach combining deep learning and discriminative models for automated deception detection. Using video modalities, we employ convolutional end-to-end learning to analyze gaze, head pose, and facial expressions, achieving promising results compared to state-of-the-art methods. Due to limited training data, we also utilize discriminative models for deception detection. Although sequence-to-class approaches are explored, discriminative models outperform them because of data scarcity. Our approach is evaluated on five datasets, including a new Rolling-Dice Experiment motivated by economic factors. Results indicate that facial expressions outperform gaze and head pose, and that combining modalities with feature selection enhances detection performance. Differences in expressed features across datasets emphasize the importance of scenario-specific training data and the influence of context on deceptive behavior. Cross-dataset experiments reinforce these findings. Despite the challenges posed by low-stakes datasets, including the Rolling-Dice Experiment, deception detection performance exceeds chance levels. Our proposed multimodal approach and comprehensive evaluation shed light on the potential of automating deception detection from video modalities, opening avenues for future research. Comment: 29 pages, 17 figures (19 if counting subfigures).
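One plausible reading of "combining modalities with feature selection" is feature-level fusion of the per-modality features followed by a selection criterion before the discriminative classifier. The sketch below is an assumption-laden stand-in: the function names are ours, and variance-based top-k selection substitutes for whatever criterion the paper actually uses.

```python
import numpy as np

def fuse_and_select(modalities, k):
    """Late feature fusion plus simple variance-based selection (a sketch;
    the paper's actual selection criterion is not specified here).

    modalities: list of (n_samples, d_i) feature arrays, e.g. for gaze,
    head pose, and facial expressions. Returns the k fused features with
    the highest variance, preserving their original column order.
    """
    fused = np.concatenate(modalities, axis=1)        # (n, sum of d_i)
    order = np.argsort(fused.var(axis=0))[::-1][:k]   # top-k by variance
    return fused[:, np.sort(order)]

rng = np.random.default_rng(0)
gaze = rng.normal(size=(6, 4))
head = rng.normal(size=(6, 3))
face = rng.normal(scale=3.0, size=(6, 5))  # expression features dominate
sel = fuse_and_select([gaze, head, face], k=5)
print(sel.shape)  # (6, 5)
```

The selected matrix would then be fed to a discriminative classifier, matching the paper's observation that such models cope better with scarce data than sequence-to-class approaches.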

    Federated learning in gaze recognition (FLIGR)

    The efficiency and generalizability of a deep learning model depend on the amount and diversity of its training data. Although huge amounts of data are being collected, these data are not stored on centralized servers for further processing. It is often infeasible to collect and share data on centralized servers due to various medical data regulations. This need for diversely distributed data, combined with infeasible centralized storage, calls for Federated Learning (FL). FL is a clever way of utilizing privately stored data for model building without the need for data sharing. The idea is to train several different models locally with the same architecture, share the model weights among the collaborators, aggregate the model weights, and use the resulting global weights to continue model building. FL is an iterative algorithm which repeats the above steps over a defined number of rounds. By doing so, we negate the need for centralized data sharing and avoid the several regulations tied to it. In this work, federated learning is applied to gaze recognition, the task of identifying where a doctor is gazing. A global model is built by repeatedly aggregating local models trained on data from 8 local institutions using the FL algorithm for 4 federated rounds. The results show an increase in the performance of the global model over the federated rounds. The study also shows that, at the end of FL, the global model can be trained once more locally at each institution to fine-tune it to the local data.
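The train-locally / share / aggregate loop described above can be sketched as a minimal FedAvg-style procedure. Everything here is illustrative: `federated_rounds`, `toy_step`, and the two-institution data are our own stand-ins for the study's 8 institutions and real local training.

```python
import numpy as np

def federated_rounds(local_data, init_w, local_step, n_rounds=4):
    """Minimal federated averaging loop (illustrative names).

    Each round: every institution updates the current global weights on
    its private data, then the server averages the local weights into
    new global weights. No raw data ever leaves an institution.
    """
    w = init_w
    for _ in range(n_rounds):
        local_ws = [local_step(w, d) for d in local_data]  # local training
        w = np.mean(local_ws, axis=0)                      # aggregation
    return w

# Toy local step: move the weights halfway toward each institution's
# target value (a stand-in for a local training epoch).
def toy_step(w, target, lr=0.5):
    return w + lr * (target - w)

data = [np.array([1.0]), np.array([3.0])]  # two "institutions"
w = federated_rounds(data, np.array([0.0]), toy_step, n_rounds=4)
print(w)  # → [1.875], converging toward the cross-institution mean 2.0
```

The final local fine-tuning step mentioned in the abstract would simply apply `local_step` once more to the returned global weights at each institution.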

    A Coarse-to-Fine Adaptive Network for Appearance-Based Gaze Estimation

    Human gaze is essential for various appealing applications. Aiming at more accurate gaze estimation, a series of recent works propose to utilize face and eye images simultaneously. Nevertheless, face and eye images only serve as independent or parallel feature sources in those works, and the intrinsic correlation between their features is overlooked. In this paper we make the following contributions: 1) We propose a coarse-to-fine strategy which estimates a basic gaze direction from the face image and refines it with a corresponding residual predicted from the eye images. 2) Guided by the proposed strategy, we design a framework which introduces a bi-gram model to bridge the gaze residual and the basic gaze direction, and an attention component to adaptively acquire suitable fine-grained features. 3) Integrating the above innovations, we construct a coarse-to-fine adaptive network named CA-Net and achieve state-of-the-art performance on MPIIGaze and EyeDiap. Comment: 9 pages, 7 figures, AAAI-20.
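The coarse-to-fine strategy in contribution 1) can be sketched as a basic estimate plus a conditioned residual. The function names and toy stand-ins below are ours; in CA-Net the coarse and refinement steps are learned networks over face and eye images.

```python
import numpy as np

def coarse_to_fine_gaze(face_feat, eye_feat, coarse_net, refine_net):
    """Coarse-to-fine sketch (illustrative stand-in functions).

    A basic gaze direction is estimated from the face, then refined by
    a residual predicted from the eyes, conditioned on that basic
    estimate (the role the bi-gram model plays in the paper).
    """
    g_basic = coarse_net(face_feat)            # coarse gaze from face
    residual = refine_net(eye_feat, g_basic)   # eye-based correction
    return g_basic + residual

# Toy stand-ins: the "coarse net" reads the first two features; the
# "refine net" nudges the estimate toward the eye features.
coarse = lambda f: f[:2]
refine = lambda e, g: 0.1 * (e[:2] - g)
g = coarse_to_fine_gaze(np.array([0.2, -0.1, 1.0]),
                        np.array([0.4, 0.1]), coarse, refine)
print(g)  # → [0.22 -0.08]: basic [0.2, -0.1] plus residual [0.02, 0.02]
```

Structuring the output as basic-plus-residual lets the eye branch specialize on small corrections rather than re-estimating the full gaze direction.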