On the Calibration of Active Binocular and RGBD Vision Systems for Dual-Arm Robots
This paper describes a camera and hand-eye calibration methodology for integrating an active binocular robot head within a dual-arm robot. For this purpose, we derive the forward kinematic model of our active robot head and describe our methodology for calibrating and integrating the head. This rigid calibration provides a closed-form hand-to-eye solution. We then present an approach for dynamically updating the cameras' extrinsic parameters for optimal 3D reconstruction, which is the foundation for robotic tasks such as grasping and manipulating rigid and deformable objects. Experimental results show that our robot head achieves an overall sub-millimetre accuracy of less than 0.3 millimetres while recovering the 3D structure of a scene. In addition, we report a comparative study between current RGBD cameras and our active stereo head within two dual-arm robotic testbeds that demonstrates the accuracy and portability of our proposed methodology.
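The 3D reconstruction step that the calibration above enables can be illustrated with a minimal sketch: once a stereo pair's projection matrices are known, a 3D point is recovered from two image observations by linear (DLT) triangulation. The rig parameters and point below are invented for illustration; this is not the paper's actual calibration or head geometry.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen by two calibrated cameras.

    P1, P2: 3x4 projection matrices; x1, x2: 2D pixel observations.
    Returns the 3D point (world frame) as the null vector of the DLT system.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with projection matrix P, returning pixel coords."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical stereo rig: shared intrinsics, second camera shifted 65 mm
# along x (a plausible baseline for a binocular head, chosen for illustration).
K = np.diag([800.0, 800.0, 1.0])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.065], [0.0], [0.0]])])

X_true = np.array([0.1, -0.05, 1.2])   # a point 1.2 m in front of the rig
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
```

With noise-free observations the DLT solution recovers the point exactly; in practice the quality of this reconstruction is what makes accurate extrinsic updates worthwhile.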
AFFECT-PRESERVING VISUAL PRIVACY PROTECTION
The prevalence of wireless networks and the convenience of mobile cameras enable many new video applications beyond security and entertainment. From behavioral diagnosis to wellness monitoring, cameras are increasingly used for observation in various educational and medical settings. Videos collected for such applications are considered protected health information under privacy laws in many countries. Visual privacy protection techniques, such as blurring or object removal, can be used to mitigate privacy concerns, but they also obliterate important visual cues of affect and social behaviors that are crucial for the target applications. In this dissertation, we propose to balance privacy protection and the utility of the data by preserving privacy-insensitive information, such as pose and expression, which is useful in many applications involving visual understanding.
The Intellectual Merits of the dissertation include a novel framework for visual privacy protection by manipulating the facial image and body shape of individuals, which: (1) conceals the identity of individuals; (2) preserves the utility of the data, such as expression and pose information; and (3) balances the utility of the data against the strength of the privacy protection.
The Broader Impacts of the dissertation focus on the significance of privacy protection for visual data, and the inadequacy of current privacy-enhancing technologies in preserving affect and behavioral attributes of the visual content, which are highly useful for behavior observation in educational and medical settings. The work in this dissertation represents one of the first attempts to achieve both goals simultaneously.
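As a toy illustration of the baseline privacy filters the dissertation contrasts itself with (not its actual identity-manipulation method), the following sketch pixelates a face region: fine identity cues are destroyed while coarse structure, such as head orientation, survives. The image and box coordinates are invented.

```python
import numpy as np

def pixelate_region(img, top, left, h, w, block=8):
    """Coarsely pixelate a rectangular region (e.g. a detected face box).

    Each block x block tile is replaced by its per-channel mean colour,
    concealing fine detail while keeping coarse spatial structure.
    """
    out = img.copy()
    region = out[top:top + h, left:left + w].astype(float)  # working copy
    for y in range(0, h, block):
        for x in range(0, w, block):
            tile = region[y:y + block, x:x + block]
            tile[...] = tile.mean(axis=(0, 1))              # flatten the tile
    out[top:top + h, left:left + w] = region.astype(img.dtype)
    return out

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)  # stand-in frame
anon = pixelate_region(frame, 16, 16, 32, 32, block=8)          # hypothetical face box
```

The dissertation's point is precisely that such filters are too blunt: they also erase expression, which its proposed framework preserves.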
Learning to Find Eye Region Landmarks for Remote Gaze Estimation in Unconstrained Settings
Conventional feature-based and model-based gaze estimation methods have proven to perform well in settings with controlled illumination and specialized cameras. In unconstrained real-world settings, however, such methods are surpassed by recent appearance-based methods due to difficulties in modeling factors such as illumination changes and other visual artifacts. We present a novel learning-based method for eye region landmark localization that enables conventional methods to be competitive with the latest appearance-based methods. Despite having been trained exclusively on synthetic data, our method exceeds the state of the art for iris localization and eye shape registration on real-world imagery. We then use the detected landmarks as input to iterative model-fitting and lightweight learning-based gaze estimation methods. Our approach outperforms existing model-fitting and appearance-based methods in the context of person-independent and personalized gaze estimation.
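A minimal sketch of the feature-based idea: once eye region landmarks are localized, one simple gaze-related feature is the iris centre's normalized offset within the eye opening. The landmark coordinates below are invented, and this is deliberately simpler than the paper's iterative model-fitting pipeline.

```python
import numpy as np

def iris_offset(inner_corner, outer_corner, iris_center):
    """Normalised 2D iris position relative to the eye-corner axis.

    Expresses the iris centre in a frame anchored at the eye corners, so the
    feature is invariant to eye size and in-plane translation; (0, 0) means
    the iris sits exactly between the corners.
    """
    inner = np.asarray(inner_corner, float)
    outer = np.asarray(outer_corner, float)
    iris = np.asarray(iris_center, float)
    axis = outer - inner
    width = np.linalg.norm(axis)
    u = axis / width                      # unit vector along the eye opening
    v = np.array([-u[1], u[0]])           # perpendicular direction
    rel = iris - (inner + outer) / 2.0    # offset from the eye-opening centre
    return np.array([rel @ u, rel @ v]) / width

# Iris exactly between the (hypothetical) corners -> zero offset
centered = iris_offset([100, 50], [140, 50], [120, 50])
```

Features like this, computed from detected landmarks, are what a lightweight downstream regressor can map to gaze direction.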
Unobtrusive and pervasive video-based eye-gaze tracking
Eye-gaze tracking has long been considered a desktop technology that finds its use inside the traditional office setting, where the operating conditions may be controlled. Nonetheless, recent advancements in mobile technology and a growing interest in capturing natural human behaviour have motivated an emerging interest in tracking eye movements within unconstrained real-life conditions, referred to as pervasive eye-gaze tracking. This critical review focuses on emerging passive and unobtrusive video-based eye-gaze tracking methods in recent literature, with the aim of identifying the different research avenues being followed in response to the challenges of pervasive eye-gaze tracking. Different eye-gaze tracking approaches are discussed in order to bring out their strengths and weaknesses, and to identify any limitations, within the context of pervasive eye-gaze tracking, that have yet to be considered by the computer vision community.
Subjective Annotations for Vision-Based Attention Level Estimation
Attention level estimation systems have high potential in many use cases, such as human-robot interaction, driver modeling and smart home systems, since being able to measure a person's attention level opens the possibility of natural interaction between humans and computers. The topic of estimating a human's visual focus of attention has recently been actively addressed in the field of HCI. However, most previous works do not consider attention as a subjective, cognitive attentive state. New research within the field also faces the problem of a lack of datasets annotated with attention level in a given context. The novelty of our work is two-fold: first, we introduce a new annotation framework that tackles the subjective nature of attention level and use it to annotate more than 100,000 images with three attention levels; second, we introduce a novel method to estimate attention levels, relying purely on geometric features extracted from RGB and depth images, and evaluate it with a deep learning fusion framework. The system achieves an overall accuracy of 80.02%. Our framework and attention level annotations are made publicly available. (14th International Conference on Computer Vision Theory and Applications)
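To make the three-level setup concrete, here is a deliberately crude sketch of how geometric cues could be mapped to coarse attention levels. The features, thresholds and labels are illustrative placeholders only; the paper's actual system is a learned deep fusion model over RGB and depth features, not hand-set rules.

```python
def attention_level(yaw, pitch, gaze_angle):
    """Map head/gaze geometry (degrees) to one of three coarse levels.

    Hypothetical rule: the larger the deviation of head pose or gaze from
    the target, the lower the assumed attention. Thresholds are invented.
    """
    deviation = max(abs(yaw), abs(pitch), abs(gaze_angle))
    if deviation < 15:
        return "high"
    if deviation < 40:
        return "medium"
    return "low"

print(attention_level(5, -3, 8))    # -> high
print(attention_level(25, 0, 10))   # -> medium
print(attention_level(70, 10, 50))  # -> low
```

In the paper, the interesting part is that such geometric inputs alone, fused by a learned model, reach 80.02% accuracy against subjective annotations.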
High-Accuracy Facial Depth Models derived from 3D Synthetic Data
In this paper, we explore how synthetically generated 3D face models can be used to construct high-accuracy ground truth for depth. This allows us to train Convolutional Neural Networks (CNNs) to solve facial depth estimation problems. These models provide sophisticated control over image variations including pose, illumination, facial expression and camera position. 2D training samples can be rendered from these models, typically in RGB format, together with depth information. Using synthetic facial animations, dynamic facial expression or facial action data can be rendered for a sequence of image frames together with ground-truth depth and additional metadata such as head pose, light direction, etc. The synthetic data is used to train a CNN-based facial depth estimation system which is validated on both synthetic and real images. Potential fields of application include 3D reconstruction, driver monitoring systems, robotic vision systems, and advanced scene understanding.
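The appeal of synthetic ground truth is that per-pixel depth is exact by construction. A minimal sketch of the idea, using an analytic sphere in place of a rendered 3D face model (all geometry and units below are invented for illustration):

```python
import numpy as np

def sphere_depth_map(h, w, cx, cy, r, z0, background=0.0):
    """Exact per-pixel depth for a sphere centred at (cx, cy) at depth z0.

    Under an orthographic approximation, each on-sphere pixel stores the
    depth of the front surface along the viewing ray; all other pixels get
    a background value. A stand-in for the exact depth a synthetic face
    renderer provides alongside each RGB frame.
    """
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2       # squared distance to centre
    inside = d2 <= r ** 2
    depth = np.full((h, w), background, dtype=float)
    depth[inside] = z0 - np.sqrt(r ** 2 - d2[inside])  # front surface bulges toward camera
    return depth

# One hypothetical training target: 64x64 depth map, sphere of radius 20 px
# with its centre plane at depth 100 (arbitrary units).
gt = sphere_depth_map(64, 64, 32, 32, r=20, z0=100.0)
```

Paired with a rendered RGB image, arrays like this play the role of the training labels for the CNN-based depth estimator, with pose and lighting metadata recorded at render time.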
- …