6,180 research outputs found

    A multi-projector CAVE system with commodity hardware and gesture-based interaction

    Get PDF
    Spatially-immersive systems such as CAVEs provide users with surrounding worlds by projecting 3D models on multiple screens around the viewer. Compared to alternative immersive systems such as HMDs, CAVE systems are a powerful tool for collaborative inspection of virtual environments due to better use of peripheral vision, less sensitivity to tracking errors, and higher communication possibilities among users. Unfortunately, traditional CAVE setups require sophisticated equipment including stereo-ready projectors and tracking systems with high acquisition and maintenance costs. In this paper we present the design and construction of a passive-stereo, four-wall CAVE system based on commodity hardware. Our system works with any mix of a wide range of projector models that can be replaced independently at any time, and achieves high resolution and brightness at a minimum cost. The key ingredients of our CAVE are a self-calibration approach that guarantees continuity across the screen, as well as a gesture-based interaction approach based on a clever combination of skeletal data from multiple Kinect sensors.Preprin

    Multi-view passive 3D face acquisition device

    Get PDF
    Approaches to acquisition of 3D facial data include laser scanners, structured light devices and (passive) stereo vision. The laser scanner and structured light methods allow accurate reconstruction of the 3D surface but strong light is projected on the faces of subjects. Passive stereo vision based approaches do not require strong light to be projected, however, it is hard to obtain comparable accuracy and robustness of the surface reconstruction. In this paper a passive multiple view approach using 5 cameras in a ’+’ configuration is proposed that significantly increases robustness and accuracy relative to traditional stereo vision approaches. The normalised cross correlations of all 5 views are combined using direct projection of points instead of the traditionally used rectified images. Also, errors caused by different perspective deformation of the surface in the different views are reduced by using an iterative reconstruction technique where the depth estimation of the previous iteration is used to warp the windows of the normalised cross correlation for the different views

    Vision and Learning for Deliberative Monocular Cluttered Flight

    Full text link
    Cameras provide a rich source of information while being passive, cheap and lightweight for small and medium Unmanned Aerial Vehicles (UAVs). In this work we present the first implementation of receding horizon control, which is widely used in ground vehicles, with monocular vision as the only sensing mode for autonomous UAV flight in dense clutter. We make it feasible on UAVs via a number of contributions: novel coupling of perception and control via relevant and diverse, multiple interpretations of the scene around the robot, leveraging recent advances in machine learning to showcase anytime budgeted cost-sensitive feature selection, and fast non-linear regression for monocular depth prediction. We empirically demonstrate the efficacy of our novel pipeline via real world experiments of more than 2 kms through dense trees with a quadrotor built from off-the-shelf parts. Moreover our pipeline is designed to combine information from other modalities like stereo and lidar as well if available

    Unobtrusive and pervasive video-based eye-gaze tracking

    Get PDF
    Eye-gaze tracking has long been considered a desktop technology that finds its use inside the traditional office setting, where the operating conditions may be controlled. Nonetheless, recent advancements in mobile technology and a growing interest in capturing natural human behaviour have motivated an emerging interest in tracking eye movements within unconstrained real-life conditions, referred to as pervasive eye-gaze tracking. This critical review focuses on emerging passive and unobtrusive video-based eye-gaze tracking methods in recent literature, with the aim to identify different research avenues that are being followed in response to the challenges of pervasive eye-gaze tracking. Different eye-gaze tracking approaches are discussed in order to bring out their strengths and weaknesses, and to identify any limitations, within the context of pervasive eye-gaze tracking, that have yet to be considered by the computer vision community.peer-reviewe

    Appearance-Based Gaze Estimation in the Wild

    Full text link
    Appearance-based gaze estimation is believed to work well in real-world settings, but existing datasets have been collected under controlled laboratory conditions and methods have been not evaluated across multiple datasets. In this work we study appearance-based gaze estimation in the wild. We present the MPIIGaze dataset that contains 213,659 images we collected from 15 participants during natural everyday laptop use over more than three months. Our dataset is significantly more variable than existing ones with respect to appearance and illumination. We also present a method for in-the-wild appearance-based gaze estimation using multimodal convolutional neural networks that significantly outperforms state-of-the art methods in the most challenging cross-dataset evaluation. We present an extensive evaluation of several state-of-the-art image-based gaze estimation algorithms on three current datasets, including our own. This evaluation provides clear insights and allows us to identify key research challenges of gaze estimation in the wild

    Enabling Depth-driven Visual Attention on the iCub Humanoid Robot: Instructions for Use and New Perspectives

    Get PDF
    The importance of depth perception in the interactions that humans have within their nearby space is a well established fact. Consequently, it is also well known that the possibility of exploiting good stereo information would ease and, in many cases, enable, a large variety of attentional and interactive behaviors on humanoid robotic platforms. However, the difficulty of computing real-time and robust binocular disparity maps from moving stereo cameras often prevents from relying on this kind of cue to visually guide robots' attention and actions in real-world scenarios. The contribution of this paper is two-fold: first, we show that the Efficient Large-scale Stereo Matching algorithm (ELAS) by A. Geiger et al. 2010 for computation of the disparity map is well suited to be used on a humanoid robotic platform as the iCub robot; second, we show how, provided with a fast and reliable stereo system, implementing relatively challenging visual behaviors in natural settings can require much less effort. As a case of study we consider the common situation where the robot is asked to focus the attention on one object close in the scene, showing how a simple but effective disparity-based segmentation solves the problem in this case. Indeed this example paves the way to a variety of other similar applications
    corecore