
    A Novel Approach to 3-D Gaze Tracking Using Stereo Cameras


    Enabling Depth-driven Visual Attention on the iCub Humanoid Robot: Instructions for Use and New Perspectives

    The importance of depth perception in the interactions that humans have within their nearby space is a well-established fact. It is also well known that exploiting good stereo information would ease, and in many cases enable, a large variety of attentional and interactive behaviors on humanoid robotic platforms. However, the difficulty of computing real-time and robust binocular disparity maps from moving stereo cameras often prevents relying on this kind of cue to visually guide robots' attention and actions in real-world scenarios. The contribution of this paper is two-fold: first, we show that the Efficient Large-scale Stereo Matching algorithm (ELAS) by A. Geiger et al. (2010) for computation of the disparity map is well suited to be used on a humanoid robotic platform such as the iCub robot; second, we show how, provided with a fast and reliable stereo system, implementing relatively challenging visual behaviors in natural settings can require much less effort. As a case study, we consider the common situation where the robot is asked to focus its attention on one close object in the scene, showing how a simple but effective disparity-based segmentation solves the problem in this case. Indeed, this example paves the way to a variety of other similar applications.
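
The disparity-based segmentation mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `segment_nearest`, the `band` and `robust_pct` parameters, and the use of a high percentile as a robust estimate of the nearest disparity are all assumptions made for illustration. It only relies on the fact that, in a rectified stereo pair, larger disparity means a closer surface.

```python
import numpy as np

def segment_nearest(disparity, band=5.0, robust_pct=99.0):
    """Return a boolean mask of pixels on the closest surface (hypothetical helper).

    disparity  : 2-D array of disparities in pixels; values <= 0 or non-finite
                 are treated as invalid (no stereo match).
    band       : how far below the nearest disparity a pixel may fall and
                 still count as part of the closest object (assumed value).
    robust_pct : percentile used instead of the raw maximum so that a few
                 spurious matches do not define "nearest" (assumed value).
    """
    disparity = np.asarray(disparity, dtype=float)
    valid = np.isfinite(disparity) & (disparity > 0)
    if not valid.any():
        return np.zeros(disparity.shape, dtype=bool)
    nearest = np.percentile(disparity[valid], robust_pct)
    return valid & (disparity >= nearest - band)

# Synthetic scene: background at disparity 10, a close object at disparity 40,
# and one invalid pixel in the corner.
disp = np.full((6, 8), 10.0)
disp[2:5, 3:6] = 40.0
disp[0, 0] = 0.0
mask = segment_nearest(disp)
```

In practice the disparity map would come from a stereo matcher such as ELAS, and the mask would typically be cleaned up (for example by keeping the largest connected component) before being used to direct attention.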

    Towards binocular active vision in a robot head system

    This paper presents the first results of an investigation and pilot study into an active, binocular vision system that combines binocular vergence, object recognition and attention control in a unified framework. The prototype developed is capable of identifying, targeting, verging on and recognizing objects in a highly cluttered scene without the need for calibration or other knowledge of the camera geometry. This is achieved by implementing all image analysis in a symbolic space without creating explicit pixel-space maps. The system structure is based on the ‘searchlight metaphor’ of biological systems. We present results of a first pilot investigation that yielded a maximum vergence error of 6.4 pixels, while seven of nine known objects were recognized in a highly cluttered environment. Finally, a “stepping stone” visual search strategy was demonstrated, taking a total of 40 saccades to find two known objects in the workspace, neither of which appeared simultaneously within the Field of View resulting from any individual saccade.

    Appearance-Based Gaze Estimation in the Wild

    Appearance-based gaze estimation is believed to work well in real-world settings, but existing datasets have been collected under controlled laboratory conditions and methods have not been evaluated across multiple datasets. In this work we study appearance-based gaze estimation in the wild. We present the MPIIGaze dataset that contains 213,659 images we collected from 15 participants during natural everyday laptop use over more than three months. Our dataset is significantly more variable than existing ones with respect to appearance and illumination. We also present a method for in-the-wild appearance-based gaze estimation using multimodal convolutional neural networks that significantly outperforms state-of-the-art methods in the most challenging cross-dataset evaluation. We present an extensive evaluation of several state-of-the-art image-based gaze estimation algorithms on three current datasets, including our own. This evaluation provides clear insights and allows us to identify key research challenges of gaze estimation in the wild.

    Intention recognition for gaze controlled robotic minimally invasive laser ablation

    Eye tracking technology has shown promising results for allowing hands-free control of robotically mounted cameras and tools. However, existing systems present only limited capabilities in allowing the full range of camera motions in a safe, intuitive manner. This paper introduces a framework for the recognition of surgeon intention, allowing activation and control of the camera through natural gaze behaviour. The system is resistant to noise such as blinking, while allowing the surgeon to look away safely at any time. Furthermore, this paper presents a novel approach to control the translation of the camera along its optical axis using a combination of eye tracking and stereo reconstruction. Combining eye tracking and stereo reconstruction allows the system to determine which point in 3D space the user is fixating, enabling a translation of the camera to achieve the optimal viewing distance. In addition, the eye tracking information is used to perform automatic laser targeting for laser ablation. The desired target point of the laser, mounted on a separate robotic arm, is determined from the eye tracking data, thus removing the need to manually adjust the laser's target point before starting each new ablation. The calibration methodology used to obtain millimetre precision for the laser targeting without the aid of visual servoing is described. Finally, a user study validating the system is presented, showing clear improvement, with median task times under half of those of a manually controlled robotic system.
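
The idea of combining a 2-D gaze point with stereo reconstruction to recover the fixated 3-D point can be sketched as follows. This is a generic rectified pinhole stereo model, not the calibration or reconstruction pipeline of the paper; the function names, the intrinsics (`f`, `cx`, `cy`), the baseline and the target viewing distance are all hypothetical values chosen for illustration.

```python
import numpy as np

def fixation_point_3d(u, v, disparity, f, baseline, cx, cy):
    """Back-project a gaze pixel into 3-D camera coordinates (hypothetical helper).

    u, v      : gaze position in the left rectified image, in pixels
    disparity : stereo disparity at (u, v), in pixels; must be positive
    f         : focal length in pixels (assumed equal in x and y)
    baseline  : distance between the two camera centres, in metres
    cx, cy    : principal point of the left camera, in pixels
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive to triangulate depth")
    z = f * baseline / disparity      # depth along the optical axis
    x = (u - cx) * z / f              # lateral offset
    y = (v - cy) * z / f              # vertical offset
    return np.array([x, y, z])

def axial_translation(point, target_distance):
    """Signed move along the optical axis that brings the fixated point to
    target_distance from the camera (positive means move forward)."""
    return point[2] - target_distance

# Example with made-up numbers: 800 px focal length, 5 mm baseline,
# gaze at pixel (360, 240) where the disparity is 40 px.
p = fixation_point_3d(360, 240, 40.0, f=800.0, baseline=0.005, cx=320.0, cy=240.0)
dz = axial_translation(p, target_distance=0.08)
```

With these illustrative numbers the fixated point lies 0.1 m in front of the camera, so moving the camera 0.02 m forward would bring it to the assumed 0.08 m viewing distance.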