5 research outputs found

    Gesture controlled interactive rendering in a panoramic scene

    Get PDF
    The demonstration described hereafter covers technical work carried out in the FascinatE project [1], related to the interactive retrieval and rendering of high-resolution panoramic scenes. The scenes are captured by a special panoramic camera (the OMNICAM) [2], which captures high-resolution video with a wide-angle (180-degree) field of view. Users access the content through a novel device-less, markerless gesture-based system that lets them interact as naturally as possible, controlling the rendering of the scene by zooming, panning or framing through the panorama. Peer reviewed. Postprint (published version).
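
    As a rough illustration of the interaction model (not the FascinatE implementation), gesture commands can be thought of as driving a virtual viewport, a crop window that is panned, zoomed and framed within the panoramic frame. The class and parameter names below are hypothetical; a minimal sketch in Python:

```python
# Illustrative sketch only: a virtual viewport over a panoramic frame, driven
# by abstract pan/zoom gesture events (names are hypothetical, not the
# FascinatE API).
import numpy as np

class PanoramaViewport:
    def __init__(self, pano_w, pano_h, view_w=1920, view_h=1080):
        self.pano_w, self.pano_h = pano_w, pano_h
        self.w, self.h = view_w, view_h
        self.cx, self.cy = pano_w / 2, pano_h / 2   # viewport centre
        self.zoom = 1.0

    def pan(self, dx, dy):
        """Shift the viewport centre by (dx, dy) pixels, clamped to the panorama."""
        half_w, half_h = self.w / (2 * self.zoom), self.h / (2 * self.zoom)
        self.cx = float(np.clip(self.cx + dx, half_w, self.pano_w - half_w))
        self.cy = float(np.clip(self.cy + dy, half_h, self.pano_h - half_h))

    def zoom_by(self, factor):
        """Scale the zoom level; factor > 1 zooms in, factor < 1 zooms out."""
        self.zoom = float(np.clip(self.zoom * factor, 1.0, 8.0))

    def render(self, pano_frame):
        """Crop the current view from a full panoramic frame (H x W x 3 array)."""
        half_w = int(self.w / (2 * self.zoom))
        half_h = int(self.h / (2 * self.zoom))
        x0, y0 = int(self.cx) - half_w, int(self.cy) - half_h
        return pano_frame[y0:y0 + 2 * half_h, x0:x0 + 2 * half_w]
```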

    Can our TV robustly understand human gestures? Real-Time Gesture Localization in Range Data

    No full text
    Google Best Student Paper Award, CVMP 2012. The 'old' remote control falls short of requirements when confronted with digital convergence for living-room displays. Enriched options to watch, manage and interact with content on large displays demand improved means of interaction. Concurrently, gesture recognition is increasingly present in human-computer interaction for gaming applications. In this paper we propose a gesture localization framework for interactive display of audio-visual content. The proposed framework works with range data captured by a single consumer depth camera. We focus on still gestures because they are generally user-friendly (users do not have to make complex and tiring movements) and allow the problem to be formulated in terms of object localization. Our method is based on random forests, which have shown excellent performance on classification and regression tasks. In this work, however, we target a specific class of localization problems involving highly unbalanced data: positive examples occupy only a small fraction of space and time. We study the impact of this natural imbalance on random forest learning and propose a framework to robustly detect gestures in range images in real applications. Our experiments with offline data show the effectiveness of our approach. We also present a real-time application in which users control the TV display with a reduced set of still gestures. Peer reviewed. Award-winning.
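
    As a minimal sketch of the general setup (not the authors' implementation): a random forest is trained on depth-patch features in which gesture windows are a tiny fraction of all samples; class weighting stands in here for the paper's treatment of the imbalance, and the features and labels are placeholders.

```python
# Minimal sketch (not the paper's implementation): a random forest trained on
# depth-patch features where positive (gesture) samples are heavily
# outnumbered by negatives. Patch extraction and labels are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def patch_features(depth_image, top_left, size=32):
    """Toy feature: flattened, mean-normalised depth patch."""
    y, x = top_left
    patch = depth_image[y:y + size, x:x + size].astype(np.float32)
    return (patch - patch.mean()).ravel()

# X: features from many candidate patches; y: 1 for gesture patches, 0 otherwise.
X = np.random.rand(10000, 32 * 32)
y = np.zeros(10000, dtype=int)
y[:50] = 1  # simulate the heavy class imbalance discussed in the paper

# class_weight='balanced' re-weights samples inversely to class frequency,
# one simple way of countering the unbalanced data the paper analyses.
forest = RandomForestClassifier(n_estimators=100, class_weight="balanced")
forest.fit(X, y)

# At test time, every patch of a new depth frame would be scored and the
# highest-probability locations reported as gesture detections.
scores = forest.predict_proba(X[:5])[:, 1]
print(scores)
```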

    Recognition of Objects and Gestures in Image

    Get PDF
    This thesis is focused on gesture recognition in video. Its purpose was to create an algorithm and an application that can recognize a small set of gestures in video obtained from a standard webcam, in order to control an application program such as a video player. The approach exploits methods of feature extraction (image descriptors), tracking of selected regions in the video, and machine learning.
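
    A rough sketch of such a pipeline, assuming OpenCV for capture and descriptors and a pre-trained scikit-learn-style classifier; the model file and the fixed hand region are hypothetical placeholders, not the thesis code:

```python
# Rough sketch of a webcam gesture-recognition loop (not the thesis code):
# grab frames, compute a descriptor over a region of interest, classify it.
# The trained classifier file "gesture_model.pkl" is a hypothetical artefact.
import cv2
import joblib

hog = cv2.HOGDescriptor()                 # image descriptor (default 64x128 window)
model = joblib.load("gesture_model.pkl")  # any sklearn-style classifier

cap = cv2.VideoCapture(0)                 # standard webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    roi = cv2.resize(frame[100:400, 100:400], (64, 128))  # tracked hand region (placeholder)
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    descriptor = hog.compute(gray).reshape(1, -1)
    gesture = model.predict(descriptor)[0]  # e.g. "play", "pause", "stop"
    print("recognized gesture:", gesture)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
```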

    Stochastic optimization and interactive machine learning for human motion analysis

    Get PDF
    The analysis of human motion from visual data is a central issue in the computer vision research community: it enables a wide range of applications, yet remains a challenging problem in unconstrained scenarios and general conditions. Human motion analysis is used in the entertainment industry for movie and videogame production, in medical applications for rehabilitation and biomechanical studies, for human-computer interaction in all kinds of environments and, moreover, for big-data analysis of social networks such as YouTube or Flickr, to mention some of its use cases. In this thesis we study human motion analysis techniques with a focus on smart-room environments, that is, methods that support the analysis of people's behaviour in the room, allow interaction with computers in a natural manner and, in general, bring computers into human activity environments to enable new kinds of services in an unobtrusive way. The thesis is structured in two parts, studying the problem of 3D pose estimation from multiple views and the recognition of gestures using range sensors. First, we propose a generic framework for hierarchically layered particle filtering (HPF), specially suited for motion-capture tasks. Human motion capture generally involves tracking or optimization of high-dimensional state vectors, where multi-modal pdfs also have to be handled; HPF overcomes this by means of multiple passes through subsets of the state-space variables. Based on the HPF framework, we then propose a method to estimate the anthropometry of the subject, which ultimately yields a human body model adjusted to that subject. Moreover, we introduce two further motion-capture methods integrated within the HPF framework: APO, a new weighting-function strategy based on approximate partitioning of the observations, and DD-HPF, which employs body-part detections to improve particle propagation and weight evaluation.
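
    A minimal sketch of the layered idea behind HPF, under assumed state partitions and a placeholder likelihood: each layer perturbs and re-weights only a subset of the pose dimensions (e.g. torso, then arms, then legs), so the full high-dimensional vector is never sampled at once. This is an illustration, not the thesis implementation:

```python
# Illustrative sketch of hierarchically layered particle filtering (HPF):
# each layer propagates and re-weights only a subset of the state dimensions.
# Placeholder likelihood; not the thesis implementation.
import numpy as np

def likelihood(particles):
    """Placeholder: a real HPF would compare a projected body model
    against the multi-view observations."""
    return np.exp(-np.sum(particles ** 2, axis=1))

def hpf_step(particles, layers, noise=0.05):
    """One HPF pass; layers is a list of index arrays partitioning the state."""
    n = len(particles)
    for dims in layers:
        # Propagate only this layer's sub-state.
        particles[:, dims] += noise * np.random.randn(n, len(dims))
        # Weight and resample the particle set using the updated sub-state.
        w = likelihood(particles)
        w /= w.sum()
        particles = particles[np.random.choice(n, size=n, p=w)]
    return particles

# Example: a 30-D pose vector split into torso / arms / legs layers.
layers = [np.arange(0, 10), np.arange(10, 20), np.arange(20, 30)]
particles = np.random.randn(500, 30)
particles = hpf_step(particles, layers)
print(particles.mean(axis=0)[:5])
```
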
    The second part of the thesis centres on the detection of gestures, focusing on the problem of reducing the annotation and training effort required to train a detector for a specific gesture. To this end, we propose a solution based on online random forests that trains in real time as new data arrive in sequence. The main aspect that makes the solution effective is the method we propose for collecting hard negative examples while the forests are being trained: the detector trained up to the current frame is run on that frame, and samples are then collected according to the detector's response so that they are more relevant for training. In this manner, training is more effective in terms of the number of annotated frames required.
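
    A rough sketch of the hard-negative collection loop described above: the detector trained so far scores the windows of the incoming frame, and the highest-scoring background windows are fed back as negatives. An SGD classifier with partial_fit stands in for the online random forest, and the window extraction is a placeholder:

```python
# Sketch of online training with hard-negative mining: the current detector is
# run on each new frame, and the background windows it scores highest are used
# as negatives. SGDClassifier/partial_fit stands in for the online random
# forest; window extraction is a placeholder.
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.exceptions import NotFittedError

def sliding_windows(frame_features):
    """Placeholder: returns an (n_windows, n_features) descriptor array."""
    return frame_features

detector = SGDClassifier(loss="log_loss")
classes = np.array([0, 1])  # 0 = background, 1 = gesture

def update_on_frame(frame_features, gesture_window_idx):
    """Online update with annotated positives and mined hard negatives."""
    windows = sliding_windows(frame_features)
    labels = np.zeros(len(windows), dtype=int)
    labels[gesture_window_idx] = 1

    try:
        scores = detector.decision_function(windows)
    except NotFittedError:  # first frame: no detector yet, take random negatives
        scores = np.random.rand(len(windows))

    # Hard negatives: background windows the current detector scores highest.
    ranked = np.argsort(scores)[::-1]
    neg_idx = [i for i in ranked if labels[i] == 0][:10]

    batch = np.vstack([windows[gesture_window_idx], windows[neg_idx]])
    batch_labels = np.concatenate([np.ones(len(gesture_window_idx), dtype=int),
                                   np.zeros(len(neg_idx), dtype=int)])
    detector.partial_fit(batch, batch_labels, classes=classes)

# Simulated stream of frames, each with 200 candidate windows of 64-D features.
for _ in range(20):
    frame = np.random.rand(200, 64)
    update_on_frame(frame, gesture_window_idx=[0])  # window 0 annotated as the gesture
```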