1,272 research outputs found
When Computer Vision Gazes at Cognition
Joint attention is a core, early-developing form of social interaction. It is
based on our ability to discriminate the third party objects that other people
are looking at. While it has been shown that people can accurately determine
whether another person is looking directly at them versus away, little is known
about human ability to discriminate a third person's gaze directed towards
objects that are further away, especially in unconstrained cases where the
looker can move her head and eyes freely. In this paper we address this
question by jointly exploring human psychophysics and a cognitively motivated
computer vision model, which can detect the 3D direction of gaze from 2D face
images. The synthesis of behavioral study and computer vision yields several
interesting discoveries. (1) Human accuracy of discriminating targets
8°-10° of visual angle apart is around 40% in a free-looking gaze
task; (2) The ability to interpret the gaze of different lookers varies dramatically;
(3) This variance can be captured by the computational model; (4) Humans
significantly outperform the current model. These results collectively show
that the acuity of human joint attention is indeed highly impressive, given the
computational challenge of the natural looking task. Moreover, the gap between
human and model performance, as well as the variability of gaze interpretation
across different lookers, calls for further understanding of the underlying
mechanisms utilized by humans for this challenging task. Comment: Tao Gao and Daniel Harari contributed equally to this work.
On the Calibration of Active Binocular and RGBD Vision Systems for Dual-Arm Robots
This paper describes a camera and hand-eye
calibration methodology for integrating an active binocular
robot head within a dual-arm robot. For this purpose, we
derive the forward kinematic model of our active robot head
and describe our methodology for calibrating and integrating
our robot head. This rigid calibration provides a closed-form
hand-to-eye solution. We then present an approach for
dynamically updating the cameras' external parameters for optimal
3D reconstruction, which is the foundation for robotic tasks such
as grasping and manipulating rigid and deformable objects. We
show from experimental results that our robot head achieves
an overall sub-millimetre accuracy of less than 0.3 millimetres
while recovering the 3D structure of a scene. In addition, we
report a comparative study between current RGBD cameras
and our active stereo head within two dual-arm robotic testbeds
that demonstrates the accuracy and portability of our proposed
methodology.
Refining personal and social presence in virtual meetings
Virtual worlds show promise for conducting meetings and conferences without the need for physical travel. Current experience suggests that the major limitation to the more widespread adoption and acceptance of virtual conferences is the failure of existing environments to provide a sense of immersion and engagement, or of ‘being there’. These limitations are largely related to the appearance and control of avatars, and to the absence of means to convey non-verbal cues of facial expression and body language. This paper reports on a study involving the use of a mass-market motion sensor (Kinect™) and the mapping of participant action in the real world to avatar behaviour in the virtual world. This is coupled with full-motion video representation of participants' faces on their avatars to resolve both identity and facial expression issues. The outcomes of a small-group trial meeting based on this technology show a very positive reaction from participants, and the potential for further exploration of these concepts.