1,096 research outputs found
Gaze Estimation From Multimodal Kinect Data
This paper addresses the problem of free gaze estimation under unrestricted head motion. More precisely, unlike previous approaches that mainly focus on estimating gaze towards a small planar screen, we propose a method to estimate the gaze direction in the 3D space. In this context the paper makes the following contributions: (i) leveraging on Kinect device, we propose a multimodal method that rely on depth sensing to obtain robust and accurate head pose tracking even under large head pose, and on the visual data to obtain the remaining eye-in-head gaze directional information from the eye image; (ii) a rectification scheme of the image that exploits the 3D mesh tracking, allowing to conduct a head pose free eye-in-head gaze directional estimation; (iii) a simple way of collecting ground truth data thanks to the Kinect device. Results on three users demonstrate the great potential of our approach
Multimodal Observation and Interpretation of Subjects Engaged in Problem Solving
In this paper we present the first results of a pilot experiment in the
capture and interpretation of multimodal signals of human experts engaged in
solving challenging chess problems. Our goal is to investigate the extent to
which observations of eye-gaze, posture, emotion and other physiological
signals can be used to model the cognitive state of subjects, and to explore
the integration of multiple sensor modalities to improve the reliability of
detection of human displays of awareness and emotion. We observed chess players
engaged in problems of increasing difficulty while recording their behavior.
Such recordings can be used to estimate a participant's awareness of the
current situation and to predict ability to respond effectively to challenging
situations. Results show that a multimodal approach is more accurate than a
unimodal one. By combining body posture, visual attention and emotion, the
multimodal approach can reach up to 93% of accuracy when determining player's
chess expertise while unimodal approach reaches 86%. Finally this experiment
validates the use of our equipment as a general and reproducible tool for the
study of participants engaged in screen-based interaction and/or problem
solving
Comparative analysis of Kinect-based and Oculus-based gaze region estimation methods in a driving simulator
ProducciĂłn CientĂficaDriver’s gaze information can be crucial in driving research because of its relation to driver attention. Particularly, the inclusion of gaze data in driving simulators broadens the scope of research studies as they can relate drivers’ gaze patterns to their features and performance. In this paper, we present two gaze region estimation modules integrated in a driving simulator. One uses the 3D Kinect device and another uses the virtual reality Oculus Rift device. The modules are able to detect the region, out of seven in which the driving scene was divided, where a driver is gazing at in every route processed frame. Four methods were implemented and compared for gaze estimation, which learn the relation between gaze displacement and head movement. Two are simpler and based on points that try to capture this relation and two are based on classifiers such as MLP and SVM. Experiments were carried out with 12 users that drove on the same scenario twice, each one with a different visualization display, first with a big screen and later with Oculus Rift. On the whole, Oculus Rift outperformed Kinect as the best hardware for gaze estimation. The Oculus-based gaze region estimation method with the highest performance achieved an accuracy of 97.94%. The information provided by the Oculus Rift module enriches the driving simulator data and makes it possible a multimodal driving performance analysis apart from the immersion and realism obtained with the virtual reality experience provided by Oculus.DirecciĂłn General de Tráfico y Ministerio del Interior - (Proyecto SPIP2015-01801
Appearance-Based Gaze Estimation in the Wild
Appearance-based gaze estimation is believed to work well in real-world
settings, but existing datasets have been collected under controlled laboratory
conditions and methods have been not evaluated across multiple datasets. In
this work we study appearance-based gaze estimation in the wild. We present the
MPIIGaze dataset that contains 213,659 images we collected from 15 participants
during natural everyday laptop use over more than three months. Our dataset is
significantly more variable than existing ones with respect to appearance and
illumination. We also present a method for in-the-wild appearance-based gaze
estimation using multimodal convolutional neural networks that significantly
outperforms state-of-the art methods in the most challenging cross-dataset
evaluation. We present an extensive evaluation of several state-of-the-art
image-based gaze estimation algorithms on three current datasets, including our
own. This evaluation provides clear insights and allows us to identify key
research challenges of gaze estimation in the wild
RGBD Datasets: Past, Present and Future
Since the launch of the Microsoft Kinect, scores of RGBD datasets have been
released. These have propelled advances in areas from reconstruction to gesture
recognition. In this paper we explore the field, reviewing datasets across
eight categories: semantics, object pose estimation, camera tracking, scene
reconstruction, object tracking, human actions, faces and identification. By
extracting relevant information in each category we help researchers to find
appropriate data for their needs, and we consider which datasets have succeeded
in driving computer vision forward and why.
Finally, we examine the future of RGBD datasets. We identify key areas which
are currently underexplored, and suggest that future directions may include
synthetic data and dense reconstructions of static and dynamic scenes.Comment: 8 pages excluding references (CVPR style
- …