814 research outputs found

    Robust Hand Motion Capture and Physics-Based Control for Grasping in Real Time

    Get PDF
    Hand motion capture technologies are being explored due to high demands in the fields such as video game, virtual reality, sign language recognition, human-computer interaction, and robotics. However, existing systems suffer a few limitations, e.g. they are high-cost (expensive capture devices), intrusive (additional wear-on sensors or complex configurations), and restrictive (limited motion varieties and restricted capture space). This dissertation mainly focus on exploring algorithms and applications for the hand motion capture system that is low-cost, non-intrusive, low-restriction, high-accuracy, and robust. More specifically, we develop a realtime and fully-automatic hand tracking system using a low-cost depth camera. We first introduce an efficient shape-indexed cascaded pose regressor that directly estimates 3D hand poses from depth images. A unique property of our hand pose regressor is to utilize a low-dimensional parametric hand geometric model to learn 3D shape-indexed features robust to variations in hand shapes, viewpoints and hand poses. We further introduce a hybrid tracking scheme that effectively complements our hand pose regressor with model-based hand tracking. In addition, we develop a rapid 3D hand shape modeling method that uses a small number of depth images to accurately construct a subject-specific skinned mesh model for hand tracking. This step not only automates the whole tracking system but also improves the robustness and accuracy of model-based tracking and hand pose regression. Additionally, we also propose a physically realistic human grasping synthesis method that is capable to grasp a wide variety of objects. Given an object to be grasped, our method is capable to compute required controls (e.g. forces and torques) that advance the simulation to achieve realistic grasping. Our method combines the power of data-driven synthesis and physics-based grasping control. We first introduce a data-driven method to synthesize a realistic grasping motion from large sets of prerecorded grasping motion data. And then we transform the synthesized kinematic motion to a physically realistic one by utilizing our online physics-based motion control method. In addition, we also provide a performance interface which allows the user to act out before a depth camera to control a virtual object

    Robust Hand Motion Capture and Physics-Based Control for Grasping in Real Time

    Get PDF
    Hand motion capture technologies are being explored due to high demands in the fields such as video game, virtual reality, sign language recognition, human-computer interaction, and robotics. However, existing systems suffer a few limitations, e.g. they are high-cost (expensive capture devices), intrusive (additional wear-on sensors or complex configurations), and restrictive (limited motion varieties and restricted capture space). This dissertation mainly focus on exploring algorithms and applications for the hand motion capture system that is low-cost, non-intrusive, low-restriction, high-accuracy, and robust. More specifically, we develop a realtime and fully-automatic hand tracking system using a low-cost depth camera. We first introduce an efficient shape-indexed cascaded pose regressor that directly estimates 3D hand poses from depth images. A unique property of our hand pose regressor is to utilize a low-dimensional parametric hand geometric model to learn 3D shape-indexed features robust to variations in hand shapes, viewpoints and hand poses. We further introduce a hybrid tracking scheme that effectively complements our hand pose regressor with model-based hand tracking. In addition, we develop a rapid 3D hand shape modeling method that uses a small number of depth images to accurately construct a subject-specific skinned mesh model for hand tracking. This step not only automates the whole tracking system but also improves the robustness and accuracy of model-based tracking and hand pose regression. Additionally, we also propose a physically realistic human grasping synthesis method that is capable to grasp a wide variety of objects. Given an object to be grasped, our method is capable to compute required controls (e.g. forces and torques) that advance the simulation to achieve realistic grasping. Our method combines the power of data-driven synthesis and physics-based grasping control. We first introduce a data-driven method to synthesize a realistic grasping motion from large sets of prerecorded grasping motion data. And then we transform the synthesized kinematic motion to a physically realistic one by utilizing our online physics-based motion control method. In addition, we also provide a performance interface which allows the user to act out before a depth camera to control a virtual object

    Realtime State Estimation with Tactile and Visual sensing. Application to Planar Manipulation

    Full text link
    Accurate and robust object state estimation enables successful object manipulation. Visual sensing is widely used to estimate object poses. However, in a cluttered scene or in a tight workspace, the robot's end-effector often occludes the object from the visual sensor. The robot then loses visual feedback and must fall back on open-loop execution. In this paper, we integrate both tactile and visual input using a framework for solving the SLAM problem, incremental smoothing and mapping (iSAM), to provide a fast and flexible solution. Visual sensing provides global pose information but is noisy in general, whereas contact sensing is local, but its measurements are more accurate relative to the end-effector. By combining them, we aim to exploit their advantages and overcome their limitations. We explore the technique in the context of a pusher-slider system. We adapt iSAM's measurement cost and motion cost to the pushing scenario, and use an instrumented setup to evaluate the estimation quality with different object shapes, on different surface materials, and under different contact modes

    Learning to Use Chopsticks in Diverse Gripping Styles

    Full text link
    Learning dexterous manipulation skills is a long-standing challenge in computer graphics and robotics, especially when the task involves complex and delicate interactions between the hands, tools and objects. In this paper, we focus on chopsticks-based object relocation tasks, which are common yet demanding. The key to successful chopsticks skills is steady gripping of the sticks that also supports delicate maneuvers. We automatically discover physically valid chopsticks holding poses by Bayesian Optimization (BO) and Deep Reinforcement Learning (DRL), which works for multiple gripping styles and hand morphologies without the need of example data. Given as input the discovered gripping poses and desired objects to be moved, we build physics-based hand controllers to accomplish relocation tasks in two stages. First, kinematic trajectories are synthesized for the chopsticks and hand in a motion planning stage. The key components of our motion planner include a grasping model to select suitable chopsticks configurations for grasping the object, and a trajectory optimization module to generate collision-free chopsticks trajectories. Then we train physics-based hand controllers through DRL again to track the desired kinematic trajectories produced by the motion planner. We demonstrate the capabilities of our framework by relocating objects of various shapes and sizes, in diverse gripping styles and holding positions for multiple hand morphologies. Our system achieves faster learning speed and better control robustness, when compared to vanilla systems that attempt to learn chopstick-based skills without a gripping pose optimization module and/or without a kinematic motion planner

    Environment capturing with Microsoft Kinect

    Get PDF

    Learning to Fly by Crashing

    Full text link
    How do you learn to navigate an Unmanned Aerial Vehicle (UAV) and avoid obstacles? One approach is to use a small dataset collected by human experts: however, high capacity learning algorithms tend to overfit when trained with little data. An alternative is to use simulation. But the gap between simulation and real world remains large especially for perception problems. The reason most research avoids using large-scale real data is the fear of crashes! In this paper, we propose to bite the bullet and collect a dataset of crashes itself! We build a drone whose sole purpose is to crash into objects: it samples naive trajectories and crashes into random objects. We crash our drone 11,500 times to create one of the biggest UAV crash dataset. This dataset captures the different ways in which a UAV can crash. We use all this negative flying data in conjunction with positive data sampled from the same trajectories to learn a simple yet powerful policy for UAV navigation. We show that this simple self-supervised model is quite effective in navigating the UAV even in extremely cluttered environments with dynamic obstacles including humans. For supplementary video see: https://youtu.be/u151hJaGKU

    Capturing Hands in Action using Discriminative Salient Points and Physics Simulation

    Full text link
    Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors. However, even most recent approaches focus on the case of a single isolated hand. In this work, we focus on hands that interact with other hands or objects and present a framework that successfully captures motion in such interaction scenarios for both rigid and articulated objects. Our framework combines a generative model with discriminatively trained salient points to achieve a low tracking error and with collision detection and physics simulation to achieve physically plausible estimates even in case of occlusions and missing visual data. Since all components are unified in a single objective function which is almost everywhere differentiable, it can be optimized with standard optimization techniques. Our approach works for monocular RGB-D sequences as well as setups with multiple synchronized RGB cameras. For a qualitative and quantitative evaluation, we captured 29 sequences with a large variety of interactions and up to 150 degrees of freedom.Comment: Accepted for publication by the International Journal of Computer Vision (IJCV) on 16.02.2016 (submitted on 17.10.14). A combination into a single framework of an ECCV'12 multicamera-RGB and a monocular-RGBD GCPR'14 hand tracking paper with several extensions, additional experiments and detail
    • …
    corecore