714 research outputs found

    Capturing Hands in Action using Discriminative Salient Points and Physics Simulation

    Full text link
    Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors. However, even most recent approaches focus on the case of a single isolated hand. In this work, we focus on hands that interact with other hands or objects and present a framework that successfully captures motion in such interaction scenarios for both rigid and articulated objects. Our framework combines a generative model with discriminatively trained salient points to achieve a low tracking error and with collision detection and physics simulation to achieve physically plausible estimates even in case of occlusions and missing visual data. Since all components are unified in a single objective function which is almost everywhere differentiable, it can be optimized with standard optimization techniques. Our approach works for monocular RGB-D sequences as well as setups with multiple synchronized RGB cameras. For a qualitative and quantitative evaluation, we captured 29 sequences with a large variety of interactions and up to 150 degrees of freedom.Comment: Accepted for publication by the International Journal of Computer Vision (IJCV) on 16.02.2016 (submitted on 17.10.14). A combination into a single framework of an ECCV'12 multicamera-RGB and a monocular-RGBD GCPR'14 hand tracking paper with several extensions, additional experiments and detail

    Real-Time Hand Tracking Using a Sum of Anisotropic Gaussians Model

    Full text link
    Real-time marker-less hand tracking is of increasing importance in human-computer interaction. Robust and accurate tracking of arbitrary hand motion is a challenging problem due to the many degrees of freedom, frequent self-occlusions, fast motions, and uniform skin color. In this paper, we propose a new approach that tracks the full skeleton motion of the hand from multiple RGB cameras in real-time. The main contributions include a new generative tracking method which employs an implicit hand shape representation based on Sum of Anisotropic Gaussians (SAG), and a pose fitting energy that is smooth and analytically differentiable making fast gradient based pose optimization possible. This shape representation, together with a full perspective projection model, enables more accurate hand modeling than a related baseline method from literature. Our method achieves better accuracy than previous methods and runs at 25 fps. We show these improvements both qualitatively and quantitatively on publicly available datasets.Comment: 8 pages, Accepted version of paper published at 3DV 201

    GRAB: A Dataset of Whole-Body Human Grasping of Objects

    Full text link
    Training computers to understand, model, and synthesize human grasping requires a rich dataset containing complex 3D object shapes, detailed contact information, hand pose and shape, and the 3D body motion over time. While "grasping" is commonly thought of as a single hand stably lifting an object, we capture the motion of the entire body and adopt the generalized notion of "whole-body grasps". Thus, we collect a new dataset, called GRAB (GRasping Actions with Bodies), of whole-body grasps, containing full 3D shape and pose sequences of 10 subjects interacting with 51 everyday objects of varying shape and size. Given MoCap markers, we fit the full 3D body shape and pose, including the articulated face and hands, as well as the 3D object pose. This gives detailed 3D meshes over time, from which we compute contact between the body and object. This is a unique dataset, that goes well beyond existing ones for modeling and understanding how humans grasp and manipulate objects, how their full body is involved, and how interaction varies with the task. We illustrate the practical value of GRAB with an example application; we train GrabNet, a conditional generative network, to predict 3D hand grasps for unseen 3D object shapes. The dataset and code are available for research purposes at https://grab.is.tue.mpg.de.Comment: ECCV 202

    Capturing Hand-Object Interaction and Reconstruction of Manipulated Objects

    Get PDF
    Hand motion capture with an RGB-D sensor gained recently a lot of research attention, however, even most recent approaches focus on the case of a single isolated hand. We focus instead on hands that interact with other hands or with a rigid or articulated object. Our framework successfully captures motion in such scenarios by combining a generative model with discriminatively trained salient points, collision detection and physics simulation to achieve a low tracking error with physically plausible poses. All components are unified in a single objective function that can be optimized with standard optimization techniques. We initially assume a-priori knowledge of the object’s shape and skeleton. In case of unknown object shape there are existing 3d reconstruction methods that capitalize on distinctive geometric or texture features. These methods though fail for textureless and highly symmetric objects like household articles, mechanical parts or toys. We show that extracting 3d hand motion for in-hand scanning e↵ectively facilitates the reconstruction of such objects and we fuse the rich additional information of hands into a 3d reconstruction pipeline. Finally, although shape reconstruction is enough for rigid objects, there is a lack of tools that build rigged models of articulated objects that deform realistically using RGB-D data. We propose a method that creates a fully rigged model consisting of a watertight mesh, embedded skeleton and skinning weights by employing a combination of deformable mesh tracking, motion segmentation based on spectral clustering and skeletonization based on mean curvature flow

    Towards Full-Body Gesture Analysis and Recognition

    Get PDF
    With computers being embedded in every walk of our life, there is an increasing demand forintuitive devices for human-computer interaction. As human beings use gestures as importantmeans of communication, devices based on gesture recognition systems will be effective for humaninteraction with computers. However, it is very important to keep such a system as non-intrusive aspossible, to reduce the limitations of interactions. Designing such non-intrusive, intuitive, camerabasedreal-time gesture recognition system has been an active area of research research in the fieldof computer vision.Gesture recognition invariably involves tracking body parts. We find many research works intracking body parts like eyes, lips, face etc. However, there is relatively little work being done onfull body tracking. Full-body tracking is difficult because it is expensive to model the full-body aseither 2D or 3D model and to track its movements.In this work, we propose a monocular gesture recognition system that focuses on recognizing a setof arm movements commonly used to direct traffic, guiding aircraft landing and for communicationover long distances. This is an attempt towards implementing gesture recognition systems thatrequire full body tracking, for e.g. an automated recognition semaphore flag signaling system.We have implemented a robust full-body tracking system, which forms the backbone of ourgesture analyzer. The tracker makes use of two dimensional link-joint (LJ) model, which representsthe human body, for tracking. Currently, we track the movements of the arms in a video sequence,however we have future plans to make the system real-time. We use distance transform techniquesto track the movements by fitting the parameters of LJ model in every frames of the video captured.The tracker\u27s output is fed a to state-machine which identifies the gestures made. We haveimplemented this system using four sub-systems. Namely1. Background subtraction sub-system, using Gaussian models and median filters.2. Full-body Tracker, using L-J Model APIs3. Quantizer, that converts tracker\u27s output into defined alphabets4. Gesture analyzer, that reads the alphabets into action performed.Currently, our gesture vocabulary contains gestures involving arms moving up and down which canbe used for detecting semaphore, flag signaling system. Also we can detect gestures like clappingand waving of arms
    • …
    corecore