UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture
We present UnrealEgo, a new large-scale naturalistic dataset for
egocentric 3D human pose estimation. UnrealEgo is based on an advanced concept
of eyeglasses equipped with two fisheye cameras that can be used in
unconstrained environments. We design their virtual prototype and attach them
to 3D human models for stereo view capture. We next generate a large corpus of
human motions. As a consequence, UnrealEgo is the first dataset to provide
in-the-wild stereo images with the largest variety of motions among existing
egocentric datasets. Furthermore, we propose a new benchmark method with a
simple but effective idea of devising a 2D keypoint estimation module for
stereo inputs to improve 3D human pose estimation. The extensive experiments
show that our approach outperforms the previous state-of-the-art methods
qualitatively and quantitatively. UnrealEgo and our source code are available
on our project web page.
Comment: 21 pages, 10 figures, 10 tables; project page:
https://4dqv.mpi-inf.mpg.de/UnrealEgo
VisionaryVR: An Optical Simulation Tool for Evaluating and Optimizing Vision Correction Solutions in Virtual Reality
Developing and evaluating vision science methods require robust and efficient
tools for assessing their performance in various real-world scenarios. This
study presents a novel virtual reality (VR) simulation tool that simulates
real-world optical methods while giving high experimental control to the
experimenter. The tool incorporates an experiment controller, to smoothly and
easily handle multiple conditions, a generic eye-tracking controller, that
works with most common VR eye-trackers, a configurable defocus simulator, and a
generic VR questionnaire loader to assess participants' behavior in virtual
reality. This VR-based simulation tool bridges the gap between theoretical and
applied research on new optical methods, corrections, and therapies. It enables
vision scientists to extend their research tools with a robust, realistic,
and fast research environment.
Computational See-Through Near-Eye Displays
See-through near-eye displays with the form factor and field of view of eyeglasses are a natural choice for augmented reality systems: the non-encumbering size enables casual and extended use and large field of view enables general-purpose spatially registered applications. However, designing displays with these attributes is currently an open problem. Support for enhanced realism through mutual occlusion and the focal depth cues is also not found in eyeglasses-like displays. This dissertation provides a new strategy for eyeglasses-like displays that follows the principles of computational displays, devices that rely on software as a fundamental part of image formation. Such devices allow more hardware simplicity and flexibility, showing greater promise of meeting form factor and field of view goals while enhancing realism. This computational approach is realized in two novel and complementary see-through near-eye display designs. The first subtractive approach filters omnidirectional light through a set of optimized patterns displayed on a stack of spatial light modulators, reproducing a light field corresponding to in-focus imagery. The design is thin and scales to wide fields of view; see-through is achieved with transparent components placed directly in front of the eye. Preliminary support for focal cues and environment occlusion is also demonstrated. The second additive approach uses structured point light illumination to form an image with a minimal set of rays. Each of an array of defocused point light sources is modulated by a region of a spatial light modulator, essentially encoding an image in the focal blur. See-through is also achieved with transparent components and thin form factors and wide fields of view (>= 100 degrees) are demonstrated. The designs are examined in theoretical terms, in simulation, and through prototype hardware with public demonstrations. 
This analysis shows that the proposed computational near-eye display designs offer a significantly different set of trade-offs than conventional optical designs. Several challenges remain to make the designs practical, most notably addressing diffraction limits.
Doctor of Philosophy
Learning to Find Eye Region Landmarks for Remote Gaze Estimation in Unconstrained Settings
Conventional feature-based and model-based gaze estimation methods have
proven to perform well in settings with controlled illumination and specialized
cameras. In unconstrained real-world settings, however, such methods are
surpassed by recent appearance-based methods due to difficulties in modeling
factors such as illumination changes and other visual artifacts. We present a
novel learning-based method for eye region landmark localization that enables
conventional methods to be competitive with the latest appearance-based methods.
Despite having been trained exclusively on synthetic data, our method exceeds
the state of the art for iris localization and eye shape registration on
real-world imagery. We then use the detected landmarks as input to iterative
model-fitting and lightweight learning-based gaze estimation methods. Our
approach outperforms existing model-fitting and appearance-based methods in the
context of person-independent and personalized gaze estimation.
Live Video and Image Recolouring for Colour Vision Deficient Patients
Colour Vision Deficiency (CVD) is an important issue for a significant population across the globe. There are several types of CVD, such as monochromacy, dichromacy, trichromacy, and anomalous trichromacy. Each of these categories contains specific subtypes. The aim of this research is to devise a scheme that addresses CVD by using variations in the pixel plotting of colours to capture colour disparities and perform colour compensation. The proposed scheme recolours video and images for CVD patients by varying the contrast of each colour and, depending on the type of deficiency, is able to provide live results. Different types of CVD can be identified and compensated for by changing the particular colour related to each; based on the type of deficiency, the scheme performs an RGB (Red, Green, and Blue) to LMS (Long, Medium, and Short) transformation. This helps in colour identification and in the adjustment of colour contrasts. The processing and rendering of recoloured video and images allows patients affected by CVD to see perfect shades in the recoloured frames of video, images, and other file types. In this thesis, we propose an efficient recolouring algorithm with a strong focus on real-time applications that is capable of providing different recoloured outputs based on specific types of CVD.
CLERA: A Unified Model for Joint Cognitive Load and Eye Region Analysis in the Wild
Non-intrusive, real-time analysis of the dynamics of the eye region allows us
to monitor humans' visual attention allocation and estimate their mental state
during the performance of real-world tasks, which can potentially benefit a
wide range of human-computer interaction (HCI) applications. While commercial
eye-tracking devices have been frequently employed, the difficulty of
customizing these devices places unnecessary constraints on the exploration of
more efficient, end-to-end models of eye dynamics. In this work, we propose
CLERA, a unified model for Cognitive Load and Eye Region Analysis, which
achieves precise keypoint detection and spatiotemporal tracking in a
joint-learning framework. Our method demonstrates significant efficiency and
outperforms prior work on tasks including cognitive load estimation, eye
landmark detection, and blink estimation. We also introduce a large-scale
dataset of 30k human faces with joint pupil, eye-openness, and landmark
annotation, which aims to support future HCI research on human factors and
eye-related analysis.
Comment: ACM Transactions on Computer-Human Interaction
Multimodality with Eye tracking and Haptics: A New Horizon for Serious Games?
The goal of this review is to illustrate the emerging use of multimodal virtual reality that can benefit learning-based games. The review begins with an introduction to multimodal virtual reality in serious games, followed by a brief discussion of why cognitive processes involved in learning and training are enhanced in immersive virtual environments. We first outline studies that have used eye tracking and haptic feedback independently in serious games, and then review some innovative applications that have already combined eye tracking and haptic devices to provide applicable multimodal frameworks for learning-based games. Finally, some general conclusions are identified and clarified to advance current understanding of multimodal serious game production and to explore possible areas for new applications.
Generalized Anthropomorphic Functional Grasping with Minimal Demonstrations
This article investigates the challenge of achieving functional tool-use
grasping with high-DoF anthropomorphic hands, with the aim of enabling
anthropomorphic hands to perform tasks that require human-like manipulation and
tool-use. However, accomplishing human-like grasping in real robots presents
many challenges, including obtaining diverse functional grasps for a wide
variety of objects, handling generalization ability for kinematically diverse
robot hands and precisely completing object shapes from a single-view
perception. To tackle these challenges, we propose a six-step grasp synthesis
algorithm based on fine-grained contact modeling that generates physically
plausible and human-like functional grasps for category-level objects with
minimal human demonstrations. With the contact-based optimization and learned
dense shape correspondence, the proposed algorithm is adaptable to various
objects in the same category and a broad range of robot hand models. To further
demonstrate the robustness of the framework, over 10K functional grasps are
synthesized to train our neural network, named DexFG-Net, which generates
diverse sets of human-like functional grasps based on the reconstructed object
model produced by a shape completion module. The proposed framework is
extensively validated in simulation and on a real robot platform. Simulation
experiments demonstrate that our method outperforms baseline methods by a large
margin in terms of grasp functionality and success rate. Real robot experiments
show that our method achieved an overall success rate of 79% and 68% for
tool-use grasp on 3-D printed and real test objects, respectively, using a
5-Finger Schunk Hand. The experimental results indicate a step towards
human-like grasping with anthropomorphic hands.
Comment: 20 pages, 23 figures and 7 tables