4,765 research outputs found

    Tracking Gaze and Visual Focus of Attention of People Involved in Social Interaction

    Get PDF
    The visual focus of attention (VFOA) has been recognized as a prominent conversational cue. We are interested in estimating and tracking the VFOAs associated with multi-party social interactions. We note that in this type of situations the participants either look at each other or at an object of interest; therefore their eyes are not always visible. Consequently both gaze and VFOA estimation cannot be based on eye detection and tracking. We propose a method that exploits the correlation between eye gaze and head movements. Both VFOA and gaze are modeled as latent variables in a Bayesian switching state-space model. The proposed formulation leads to a tractable learning procedure and to an efficient algorithm that simultaneously tracks gaze and visual focus. The method is tested and benchmarked using two publicly available datasets that contain typical multi-party human-robot and human-human interactions.Comment: 15 pages, 8 figures, 6 table

    EyePACT: eye-based parallax correction on touch-enabled interactive displays

    Get PDF
    The parallax effect describes the displacement between the perceived and detected touch locations on a touch-enabled surface. Parallax is a key usability challenge for interactive displays, particularly for those that require thick layers of glass between the screen and the touch surface to protect them from vandalism. To address this challenge, we present EyePACT, a method that compensates for input error caused by parallax on public displays. Our method uses a display-mounted depth camera to detect the user's 3D eye position in front of the display and the detected touch location to predict the perceived touch location on the surface. We evaluate our method in two user studies in terms of parallax correction performance as well as multi-user support. Our evaluations demonstrate that EyePACT (1) significantly improves accuracy even with varying gap distances between the touch surface and the display, (2) adapts to different levels of parallax by resulting in significantly larger corrections with larger gap distances, and (3) maintains a significantly large distance between two users' fingers when interacting with the same object. These findings are promising for the development of future parallax-free interactive displays

    Appearance-Based Gaze Estimation in the Wild

    Full text link
    Appearance-based gaze estimation is believed to work well in real-world settings, but existing datasets have been collected under controlled laboratory conditions and methods have been not evaluated across multiple datasets. In this work we study appearance-based gaze estimation in the wild. We present the MPIIGaze dataset that contains 213,659 images we collected from 15 participants during natural everyday laptop use over more than three months. Our dataset is significantly more variable than existing ones with respect to appearance and illumination. We also present a method for in-the-wild appearance-based gaze estimation using multimodal convolutional neural networks that significantly outperforms state-of-the art methods in the most challenging cross-dataset evaluation. We present an extensive evaluation of several state-of-the-art image-based gaze estimation algorithms on three current datasets, including our own. This evaluation provides clear insights and allows us to identify key research challenges of gaze estimation in the wild

    Gaze and Gestures in Telepresence: multimodality, embodiment, and roles of collaboration

    Full text link
    This paper proposes a controlled experiment to further investigate the usefulness of gaze awareness and gesture recognition in the support of collaborative work at a distance. We propose to redesign experiments conducted several years ago with more recent technology that would: a) enable to better study of the integration of communication modalities, b) allow users to freely move while collaborating at a distance and c) avoid asymmetries of communication between collaborators.Comment: Position paper, International Workshop New Frontiers in Telepresence 2010, part of CSCW2010, Savannah, GA, USA, 7th of February, 2010. http://research.microsoft.com/en-us/events/nft2010

    The Evolution of First Person Vision Methods: A Survey

    Full text link
    The emergence of new wearable technologies such as action cameras and smart-glasses has increased the interest of computer vision scientists in the First Person perspective. Nowadays, this field is attracting attention and investments of companies aiming to develop commercial devices with First Person Vision recording capabilities. Due to this interest, an increasing demand of methods to process these videos, possibly in real-time, is expected. Current approaches present a particular combinations of different image features and quantitative methods to accomplish specific objectives like object detection, activity recognition, user machine interaction and so on. This paper summarizes the evolution of the state of the art in First Person Vision video analysis between 1997 and 2014, highlighting, among others, most commonly used features, methods, challenges and opportunities within the field.Comment: First Person Vision, Egocentric Vision, Wearable Devices, Smart Glasses, Computer Vision, Video Analytics, Human-machine Interactio
    corecore