
    Single image 3D human pose estimation from noisy observations

    Markerless 3D human pose detection from a single image is a severely underconstrained problem because different 3D poses can have similar image projections. In order to handle this ambiguity, current approaches rely on prior shape models that can only be correctly adjusted if 2D image features are accurately detected. Unfortunately, although current 2D part detector algorithms have shown promising results, they are not yet accurate enough to guarantee a complete disambiguation of the inferred 3D shape. In this paper, we introduce a novel approach for estimating 3D human pose even when observations are noisy. We propose a stochastic sampling strategy to propagate the noise from the image plane to the shape space. This provides a set of ambiguous 3D shapes, which are virtually indistinguishable from their image projections. Disambiguation is then achieved by imposing kinematic constraints that guarantee the resulting pose resembles a 3D human shape. We validate the method on a variety of situations in which state-of-the-art 2D detectors either yield inaccurate estimations or partly miss some of the body parts. Preprin
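The ambiguity this abstract describes, and the idea of sampling around noisy 2D detections, can be illustrated with a minimal sketch. It assumes an orthographic camera and isotropic Gaussian noise, neither of which is stated in the abstract; the function names and the three-joint arm are hypothetical and are not the authors' method.

```python
import random


def project_orthographic(pose_3d):
    """Drop the depth coordinate: (x, y, z) -> (x, y). Assumed camera model."""
    return [(x, y) for x, y, _z in pose_3d]


def sample_noisy_detections(joints_2d, sigma, n_samples, seed=0):
    """Draw perturbed copies of the 2D joints (isotropic Gaussian noise, assumed)."""
    rng = random.Random(seed)
    return [
        [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)) for x, y in joints_2d]
        for _ in range(n_samples)
    ]


# Two hypothetical shoulder-elbow-wrist poses that differ only in wrist depth.
arm_forward = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.5, 0.0, 0.2)]
arm_backward = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.5, 0.0, -0.2)]

# Both poses produce the same image, so the image alone cannot separate them.
same_projection = project_orthographic(arm_forward) == project_orthographic(arm_backward)

# Each sampled 2D hypothesis would then be lifted to candidate 3D shapes.
samples = sample_noisy_detections(project_orthographic(arm_forward), sigma=0.02, n_samples=5)
```

In this toy case both arm poses also share the same limb lengths, so a limb-length check alone would not separate them; richer kinematic constraints such as joint-angle limits would be needed.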

    3D Shape Estimation from 2D Landmarks: A Convex Relaxation Approach

    We investigate the problem of estimating the 3D shape of an object, given a set of 2D landmarks in a single image. To alleviate the reconstruction ambiguity, a widely used approach is to confine the unknown 3D shape within a shape space built upon existing shapes. While this approach has proven to be successful in various applications, a challenging issue remains: the joint estimation of shape parameters and camera-pose parameters requires solving a nonconvex optimization problem. The existing methods often adopt an alternating minimization scheme to locally update the parameters, and consequently the solution is sensitive to initialization. In this paper, we propose a convex formulation to address this problem and develop an efficient algorithm to solve the proposed convex program. We demonstrate the exact recovery property of the proposed method, its merits compared to alternative methods, and its applicability in human pose and car shape estimation. Comment: In Proceedings of CVPR 201
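The abstract does not give its formulation. As a hedged illustration, a commonly used shape-space model (an assumption here, with all symbols introduced for this sketch) writes the unknown shape as a linear combination of basis shapes and fits it to the observed landmarks:

```latex
% Assumed notation: W \in \mathbb{R}^{2 \times p} holds the p observed 2D landmarks,
% B_i \in \mathbb{R}^{3 \times p} are basis shapes, c_i are shape coefficients,
% and R \in \mathbb{R}^{2 \times 3} is a camera rotation truncated to its first two rows.
\begin{aligned}
S &= \sum_{i=1}^{k} c_i B_i, \\
\min_{c_1,\dots,c_k,\; R} \quad & \frac{1}{2} \Bigl\| W - R \sum_{i=1}^{k} c_i B_i \Bigr\|_F^2
\quad \text{subject to} \quad R R^{\top} = I_2 .
\end{aligned}
```

Under this model the objective is bilinear in $R$ and the coefficients $c_i$, and the orthogonality constraint on $R$ is itself nonconvex, which is consistent with the abstract's remark that alternating minimization is sensitive to initialization.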

    Forecasting Human Dynamics from Static Images

    This paper presents the first study on forecasting human dynamics from static images. The problem is to input a single RGB image and generate a sequence of upcoming human body poses in 3D. To address the problem, we propose the 3D Pose Forecasting Network (3D-PFNet). Our 3D-PFNet integrates recent advances in single-image human pose estimation and sequence prediction, and converts the 2D predictions into 3D space. We train our 3D-PFNet using a three-step training strategy to leverage diverse sources of training data, including image- and video-based human pose datasets and 3D motion capture (MoCap) data. We demonstrate competitive performance of our 3D-PFNet on 2D pose forecasting and 3D pose recovery through quantitative and qualitative results. Comment: Accepted in CVPR 201

    A New Sparse Representation Algorithm for 3D Human Pose Estimation

    This paper addresses the problem of recovering 3D human pose from single 2D images using Sparse Representation (SR). While recent SR-based 3D human pose estimation methods have attained promising results in estimating human poses from single images, their performance depends on the availability of large labeled datasets. However, in many real-world applications, obtaining sufficient labeled data may be expensive and/or time-consuming, whereas it is relatively easy to acquire a large amount of unlabeled data. Moreover, all SR-based 3D pose estimation methods consider only the information of the input feature space and cannot utilize the information of the pose space. In this paper, we propose a new framework based on sparse representation for 3D human pose estimation which uses both labeled and unlabeled data. Furthermore, the proposed method can exploit the information of the pose space to improve the pose estimation accuracy. Experimental results show that the performance of the proposed method is significantly better than that of state-of-the-art 3D human pose estimation methods.

    Estimating 2D Upper Body Poses from Monocular Images

    Automatic estimation and recognition of poses from video enables a whole range of applications. The research described here is an important step towards automatic extraction of 3D poses. We describe our research to extract the 2D joint locations of the people in meeting videos. The key point of the research described here is that we generalize over variations in the appearance of both people and scene. This results in robust detection of 2D joint locations. For the detection of different limbs, we employ a number of limb locators, each of which uses a different set of image features. We evaluate our work on two videos that have been recorded in the meeting context. Our results are promising, yielding an average error of approximately 3-5 cm per joint.

    Recognition-Based Motion Capture and the HumanEva II Test Data

    Quantitative comparison of algorithms for human motion capture has been hindered by the lack of standard benchmarks. The development of the HumanEva I & II test sets provides an opportunity to assess the state of the art by evaluating existing methods on the new standardized test videos. This paper presents a comprehensive evaluation of a monocular recognition-based pose recovery algorithm on the HumanEva II clips. The results show that the method achieves a mean relative error of around 10-12 cm per joint.
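The abstract does not define how its per-joint error is computed. One common convention, shown here only as an assumed sketch with a hypothetical function name, averages the Euclidean distance between each predicted joint and its ground-truth position:

```python
import math


def mean_joint_error(predicted, ground_truth):
    """Average Euclidean distance between corresponding joints.

    Both arguments are equal-length sequences of coordinate tuples in the
    same units (for example centimetres); this metric is an assumption,
    not a definition taken from the abstract.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("need two non-empty joint lists of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Toy example: one joint is exact, the other is off by a 3-4-5 triangle.
error_cm = mean_joint_error(
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
)
print(error_cm)  # 2.5
```

A "relative" error such as the one quoted above is often computed after aligning both poses to a common root joint, but whether that applies here cannot be determined from the abstract.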