A low-cost head and eye tracking system for realistic eye movements in virtual avatars
A virtual avatar or autonomous agent is a digital representation of a human being that can be controlled by either a human or an artificially intelligent computer system. Increasingly, avatars are becoming realistic virtual human characters that exhibit human behavioral traits, body language, and eye and head movements. Because the interpretation of eye and head movements is an important part of nonverbal human communication, these movements must be reproduced accurately in virtual avatars to avoid falling into the well-known "uncanny valley". In this paper we present a cheap hybrid real-time head and eye tracking system based on existing open source software and commonly available hardware. Our evaluation indicates that the head and eye tracking system is stable and accurate and allows a human user to robustly puppet a virtual avatar, potentially allowing us to train an A.I. system to learn realistic human head and eye movements.
EyeSite: A Framework for Browser-Based Eye Tracking Studies
The growing use of the web browser in HCI and data visualization presents an opportunity for advancement in eye tracking experiment software. Interactive experiments with features such as dynamic areas of interest and scrolling are difficult and time-consuming to analyze with existing tools. EyeSite builds on open-source eye tracking software by communicating in real time with the web browser. This communication is used to transform screen-space gaze coordinates into coordinates on the web page. Point-to-element mapping is performed using DOM elements. EyeSite supports a wide variety of eye tracking hardware and software, remote experimental trials, and easy integration with common research workflows.
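The screen-to-page transformation described in this abstract can be illustrated with a minimal sketch. It is not EyeSite's actual implementation: the `BrowserState` fields, the `screen_to_page` and `hit_test` functions, and the rectangle-based element list are all hypothetical names chosen for illustration, assuming the browser reports its viewport origin, scroll offsets, and device pixel ratio.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class BrowserState:
    # Hypothetical fields: values a browser page could report to the tracker.
    viewport_left: float   # screen x of the viewport's top-left corner (CSS px)
    viewport_top: float    # screen y of the viewport's top-left corner (CSS px)
    scroll_x: float        # horizontal scroll offset of the page (CSS px)
    scroll_y: float        # vertical scroll offset of the page (CSS px)
    device_pixel_ratio: float = 1.0  # physical pixels per CSS pixel


def screen_to_page(gaze_x: float, gaze_y: float,
                   state: BrowserState) -> Tuple[float, float]:
    """Map a gaze sample in physical screen pixels to page coordinates."""
    # Convert physical pixels to CSS pixels first.
    css_x = gaze_x / state.device_pixel_ratio
    css_y = gaze_y / state.device_pixel_ratio
    # Shift into the viewport, then add the scroll offset to reach the page.
    return (css_x - state.viewport_left + state.scroll_x,
            css_y - state.viewport_top + state.scroll_y)


# Each element is (name, left, top, width, height) in page coordinates.
Element = Tuple[str, float, float, float, float]


def hit_test(page_x: float, page_y: float,
             elements: List[Element]) -> Optional[str]:
    """Return the name of the last listed element containing the point."""
    hit = None
    for name, left, top, width, height in elements:
        if left <= page_x < left + width and top <= page_y < top + height:
            hit = name  # later entries are assumed to be drawn on top
    return hit
```

For example, with the viewport starting 80 px below the top of the screen and the page scrolled down by 500 px, a gaze sample at screen position (300, 280) maps to page position (300, 700). In a real browser, the element lookup would more likely rely on the DOM itself than on a precomputed rectangle list.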
Real-time gaze estimation using a Kinect and a HD webcam
In human-computer interaction, gaze orientation is an important and promising source of information about the attention and focus of users. Gaze detection can also be an extremely useful metric for analysing human mood and affect. Furthermore, gaze can be used as an input method for human-computer interaction. However, accurate real-time gaze estimation remains an open problem. In this paper, we propose a simple and novel estimation model of the real-time gaze direction of a user on a computer screen. This method utilises cheap capturing devices: an HD webcam and a Microsoft Kinect. We consider that the gaze motion of a user facing forwards is composed of the local gaze motion shifted by eye motion and the global gaze motion driven by face motion. We validate our proposed model of gaze estimation and provide an experimental evaluation of the reliability and precision of the method.
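The decomposition into a global (face-driven) and a local (eye-driven) component can be sketched as follows. This is a deliberately simplified illustration, not the model proposed in the paper: it assumes the two components can be added as yaw and pitch angles and that the screen is a flat plane at a known distance, directly in front of the user. The function name and all parameters are hypothetical.

```python
import math
from typing import Tuple


def gaze_point_on_screen(head_yaw_deg: float, head_pitch_deg: float,
                         eye_yaw_deg: float, eye_pitch_deg: float,
                         distance_mm: float) -> Tuple[float, float]:
    """Estimate where the gaze ray meets a screen plane, in millimetres.

    The origin is the point on the screen straight ahead of the user.
    Assumption: head and eye rotations are small enough to add as angles.
    """
    # Global component (head pose) plus local component (eye-in-head rotation).
    yaw = math.radians(head_yaw_deg + eye_yaw_deg)
    pitch = math.radians(head_pitch_deg + eye_pitch_deg)
    # Project the combined direction onto the screen plane.
    x = distance_mm * math.tan(yaw)
    y = distance_mm * math.tan(pitch)
    return x, y
```

With the head turned 30 degrees and the eyes a further 15 degrees in the same direction, a user sitting 600 mm from the screen would be looking roughly 600 mm to the side of the straight-ahead point. A full model would compose 3D rotations and account for the eye position relative to the cameras.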
Multimodal Observation and Interpretation of Subjects Engaged in Problem Solving
In this paper we present the first results of a pilot experiment in the
capture and interpretation of multimodal signals of human experts engaged in
solving challenging chess problems. Our goal is to investigate the extent to
which observations of eye-gaze, posture, emotion and other physiological
signals can be used to model the cognitive state of subjects, and to explore
the integration of multiple sensor modalities to improve the reliability of
detection of human displays of awareness and emotion. We observed chess players
engaged in problems of increasing difficulty while recording their behavior.
Such recordings can be used to estimate a participant's awareness of the
current situation and to predict ability to respond effectively to challenging
situations. Results show that a multimodal approach is more accurate than a
unimodal one. By combining body posture, visual attention and emotion, the
multimodal approach reaches up to 93% accuracy in determining a player's
chess expertise, while the unimodal approach reaches 86%. Finally, this experiment
validates the use of our equipment as a general and reproducible tool for the
study of participants engaged in screen-based interaction and/or problem
solving.
Foveated Video Streaming for Cloud Gaming
Good user experience with interactive cloud-based multimedia applications,
such as cloud gaming and cloud-based VR, requires low end-to-end latency and
large amounts of downstream network bandwidth at the same time. In this paper,
we present a foveated video streaming system for cloud gaming. The system
adapts video stream quality by adjusting the encoding parameters on the fly to
match the player's gaze position. We conduct measurements with a prototype that
we developed for a cloud gaming system in conjunction with eye tracker
hardware. Evaluation results suggest that such foveated streaming can reduce
bandwidth requirements by more than 50%, depending on the parametrization of
the foveated video coding, and that it is feasible from the latency perspective.
Comment: Submitted to: IEEE 19th International Workshop on Multimedia Signal Processin
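One common way to adapt encoding quality to gaze position is to raise the quantization parameter (QP) of blocks that lie far from the gaze point. The sketch below illustrates that idea only. The abstract does not state which encoding parameters the system adjusts, so the linear falloff, the function name `qp_offset_map`, and the default values for `fovea_radius`, `slope`, and `max_offset` are all assumptions.

```python
import math
from typing import List


def qp_offset_map(width_blocks: int, height_blocks: int,
                  gaze_bx: float, gaze_by: float,
                  fovea_radius: float = 4.0,
                  slope: float = 1.5,
                  max_offset: int = 12) -> List[List[int]]:
    """Build a per-block QP offset grid centred on the gaze position.

    Blocks within `fovea_radius` of the gaze block keep offset 0 (full
    quality). Beyond that, the offset grows linearly with distance and is
    capped at `max_offset`. All distances are measured in blocks.
    """
    rows = []
    for by in range(height_blocks):
        row = []
        for bx in range(width_blocks):
            dist = math.hypot(bx - gaze_bx, by - gaze_by)
            offset = int(max(0.0, dist - fovea_radius) * slope)
            row.append(min(max_offset, offset))
        rows.append(row)
    return rows
```

A grid like this would be recomputed whenever a new gaze sample arrives and handed to an encoder that accepts per-block quality hints. The achievable bandwidth saving depends on how aggressively the offsets grow, which matches the abstract's remark that results depend on parametrization.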
Towards End-to-end Video-based Eye-Tracking
Estimating eye-gaze from images alone is a challenging task, in large part
due to unobservable person-specific factors. Achieving high accuracy typically
requires labeled data from test users which may not be attainable in real
applications. We observe that there exists a strong relationship between what
users are looking at and the appearance of the user's eyes. In response to this
understanding, we propose a novel dataset and accompanying method which aims to
explicitly learn these semantic and temporal relationships. Our video dataset
consists of time-synchronized screen recordings, user-facing camera views, and
eye gaze data, which allows for new benchmarks in temporal gaze tracking as
well as label-free refinement of gaze. Importantly, we demonstrate that the
fusion of information from visual stimuli as well as eye images can lead
towards achieving performance similar to literature-reported figures acquired
through supervised personalization. Our final method yields significant
performance improvements on our proposed EVE dataset, with up to a 28 percent
improvement in Point-of-Gaze estimates (resulting in 2.49 degrees in angular
error), paving the path towards high-accuracy screen-based eye tracking purely
from webcam sensors. The dataset and reference source code are available at
https://ait.ethz.ch/projects/2020/EVE
Comment: Accepted at ECCV 202
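Gaze accuracy figures such as the angular error quoted above are commonly computed as the angle between a predicted and a ground-truth gaze direction. The sketch below shows that common definition. It is an assumption that the paper's metric is computed this way, and the function name `angular_error_deg` is hypothetical.

```python
import math
from typing import Sequence


def angular_error_deg(predicted: Sequence[float],
                      target: Sequence[float]) -> float:
    """Angle in degrees between two 3D gaze direction vectors."""
    dot = sum(p * t for p, t in zip(predicted, target))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_t = math.sqrt(sum(t * t for t in target))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_t)))
    return math.degrees(math.acos(cos_angle))
```

Identical directions give an error of 0 degrees and perpendicular directions give 90 degrees. A dataset-level score would typically average this value over all evaluated frames.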
An Immersive Telepresence System using RGB-D Sensors and Head Mounted Display
We present a tele-immersive system that enables people to interact with each
other in a virtual world using body gestures in addition to verbal
communication. Beyond the obvious applications, including general online
conversations and gaming, we hypothesize that our proposed system would be
particularly beneficial to education by offering rich visual content and
interactivity. One distinct feature is the integration of egocentric pose
recognition that allows participants to use their gestures to demonstrate and
manipulate virtual objects simultaneously. This functionality enables the
instructor to effectively and efficiently explain and illustrate complex
concepts or sophisticated problems in an intuitive manner. The highly
interactive and flexible environment can capture and sustain more student
attention than the traditional classroom setting and, thus, delivers a
compelling experience to the students. Our main focus here is to investigate
possible solutions for the system design and implementation and devise
strategies for fast, efficient computation suitable for visual data processing
and network transmission. We describe the technique and experiments in detail
and provide quantitative performance results, demonstrating that our system can
run comfortably and reliably in different application scenarios. Our
preliminary results are promising and demonstrate the potential for more
compelling directions in cyberlearning.
Comment: IEEE International Symposium on Multimedia 201