
    Keep it SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image

    We describe the first method to automatically estimate the 3D pose of the human body as well as its 3D shape from a single unconstrained image. We estimate a full 3D mesh and show that 2D joints alone carry a surprising amount of information about body shape. The problem is challenging because of the complexity of the human body, articulation, occlusion, clothing, lighting, and the inherent ambiguity in inferring 3D from 2D. To solve this, we first use a recently published CNN-based method, DeepCut, to predict (bottom-up) the 2D body joint locations. We then fit (top-down) a recently published statistical body shape model, called SMPL, to the 2D joints. We do so by minimizing an objective function that penalizes the error between the projected 3D model joints and detected 2D joints. Because SMPL captures correlations in human shape across the population, we are able to robustly fit it to very little data. We further leverage the 3D model to prevent solutions that cause interpenetration. We evaluate our method, SMPLify, on the Leeds Sports, HumanEva, and Human3.6M datasets, showing superior pose accuracy with respect to the state of the art. Comment: To appear in ECCV 2016.
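The core of the fitting step described above is a reprojection objective: 3D model joints are projected into the image and their distance to the detected 2D joints is minimized. The sketch below illustrates that idea on a toy rigid joint set with a weak-perspective camera (scale, in-plane rotation, and 2D translation); the real SMPLify objective optimizes full SMPL pose and shape parameters with additional priors, so every name and parameter here is an illustrative assumption, not the paper's implementation.

```python
# Toy sketch of reprojection-error minimization, the core idea behind
# model-to-2D-joint fitting. NOT the actual SMPL model or SMPLify objective.
import numpy as np
from scipy.optimize import least_squares

# hypothetical 3D "joint" positions of a rigid toy model
model_joints = np.array([[0.0, 0.0, 0.0], [0.3, 0.5, 0.1],
                         [-0.3, 0.5, 0.1], [0.2, -0.6, 0.0],
                         [-0.2, -0.6, 0.0]])

def project(points3d, params):
    # weak-perspective camera: scale s, translation (tx, ty), in-plane rotation rz
    s, tx, ty, rz = params
    c, sn = np.cos(rz), np.sin(rz)
    R = np.array([[c, -sn], [sn, c]])
    return s * points3d[:, :2] @ R.T + np.array([tx, ty])

def residuals(params, joints3d, joints2d):
    # error between projected model joints and detected 2D joints
    return (project(joints3d, params) - joints2d).ravel()

# synthetic "detections" generated from a known camera, then recovered by fitting
true_params = np.array([2.0, 0.5, -0.3, 0.2])
joints2d = project(model_joints, true_params)
fit = least_squares(residuals, x0=np.array([1.0, 0.0, 0.0, 0.0]),
                    args=(model_joints, joints2d))
```

In the paper the same principle is applied with the SMPL body model in place of the rigid toy, plus priors that penalize implausible poses and interpenetration.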

    A Comparison and Evaluation of Three Different Pose Estimation Algorithms In Detecting Low Texture Manufactured Objects

    This thesis examines the problem of pose estimation: determining the pose of an object in some coordinate system, where pose refers to the object's position and orientation. In particular, it examines pose estimation techniques using either monocular or binocular vision systems. Generally, when trying to find the pose of an object, the objective is to generate a set of matching features, which may be points or lines, between a model of the object and the current image of the object. These matches can then be used to determine the pose of the imaged object. The algorithms presented in this thesis all generate possible matches and then use these matches to generate poses. The two monocular pose estimation techniques examined are two versions of SoftPOSIT: the traditional approach using point features, and a more recent approach using line features. The algorithms function in much the same way, differing only in the features they use. Both algorithms start with a random initial guess of the object's pose. Using this pose, a set of possible point matches is generated, and the pose is then refined so that the distances between matched points are reduced. Once the pose is refined, a new set of matches is generated. The process is repeated until convergence, i.e., minimal or no change in the pose. Because the matched features depend on the initial pose, the algorithm's output is dependent upon the initially guessed pose; by starting the algorithm with a variety of different poses, the goal is to determine the correct correspondences and then generate the correct pose. The binocular pose estimation technique presented attempts to match 3-D point data from a model of an object to 3-D point data generated from the current view of the object; in both cases the point data is generated using a stereo camera. This algorithm attempts to match 3-D point triplets in the model to 3-D point triplets from the current view, and then uses these matched triplets to obtain the pose parameters that describe the object's location and orientation in space. The results of attempting to determine the pose of three different low-texture manufactured objects across a sample set of 95 images are presented for each algorithm. The results of the two monocular methods are directly compared and examined. The results of the binocular method are examined as well, and then all three algorithms are compared. Of the three methods, the best performing algorithm, by a significant margin, was found to be the binocular method. The objects searched for all had low feature counts, low surface texture variation, and multiple degrees of symmetry. The results indicate that it is generally hard to robustly determine the pose of these types of objects. Finally, suggestions are made for improvements to the algorithms which may lead to better pose results.
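The match-then-refine loop described above (guess a pose, establish correspondences, re-estimate the pose, repeat until it stops changing) can be sketched in miniature. The toy below uses nearest-neighbour matching and a closed-form least-squares rigid alignment in 2D, which is an ICP-style stand-in for illustration only; the actual SoftPOSIT algorithm uses soft correspondence weighting and a perspective camera model, and all names here are assumptions.

```python
# ICP-style toy of the iterative match/refine loop; NOT the actual SoftPOSIT.
import numpy as np

def best_rigid(src, dst):
    # closed-form least-squares rotation + translation (Kabsch) for matched 2D points
    mu_s, mu_d = src.mean(0), dst.mean(0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:      # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, mu_d - R @ mu_s

def match_refine(model, image_pts, iters=20):
    R, t = np.eye(2), np.zeros(2)  # initial pose guess
    for _ in range(iters):
        proj = model @ R.T + t     # model features under the current pose
        idx = np.argmin(np.linalg.norm(
            proj[:, None] - image_pts[None], axis=2), axis=1)  # match step
        R, t = best_rigid(model, image_pts[idx])               # refine step
    return R, t

# demo: recover a known rigid transform from unlabelled points
model = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]])
theta, t_true = 0.1, np.array([0.2, -0.1])
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
image_pts = model @ R_true.T + t_true
R_est, t_est = match_refine(model, image_pts)
```

As in the thesis, the result depends on the initial guess: a poor starting pose can lock in wrong correspondences, which is why the algorithms are restarted from many random poses.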

    Monocular-Based Pose Determination of Uncooperative Space Objects

    Vision-based methods to determine the relative pose of an uncooperative orbiting object are investigated in applications to spacecraft proximity operations, such as on-orbit servicing, spacecraft formation flying, and small-body exploration. Depending on whether the object is known or unknown, a shape model of the orbiting target object may have to be constructed autonomously in real time using only optical measurements. The Simultaneous Estimation of Pose and Shape (SEPS) algorithm, which does not require a priori knowledge of the pose and shape of the target, is presented. It makes use of a novel measurement equation and filter that can efficiently use optical flow information along with a star tracker to estimate the target's relative rotational and translational velocity as well as its center of gravity. Depending on the mission constraints, SEPS can be augmented by a more accurate offline, on-board 3D reconstruction of the target shape, which allows for the estimation of the pose as a known target. The use of Structure from Motion (SfM) for this purpose is discussed. A model-based approach for pose estimation of known targets is also presented. The architecture and implementation of both proposed approaches are elucidated, and their performance metrics are evaluated through numerical simulations using a dataset of images synthetically generated according to a chaser/target relative motion in Geosynchronous Orbit (GEO).
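The filtering idea underlying such estimators is recursive state estimation: noisy measurements are fused over time to recover quantities (like relative velocity) that are never observed directly. The sketch below is a minimal linear Kalman filter recovering a constant scalar velocity from noisy position measurements; it is a generic stand-in, not the SEPS measurement equation, and the noise values and dimensions are illustrative assumptions.

```python
# Minimal linear Kalman filter: estimate velocity from noisy positions.
# A generic illustration of recursive filtering; NOT the SEPS filter itself.
import numpy as np

dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity state transition
H = np.array([[1.0, 0.0]])              # only position is measured
Q = 1e-4 * np.eye(2)                    # assumed process-noise covariance
R = np.array([[1e-2]])                  # assumed measurement-noise covariance

x = np.zeros(2)                         # state: [position, velocity]
P = np.eye(2)                           # state covariance
rng = np.random.default_rng(1)
true_vel = 0.5
for k in range(200):
    z = true_vel * dt * (k + 1) + rng.normal(0.0, 0.1)  # noisy position
    # predict
    x = F @ x
    P = F @ P @ F.T + Q
    # update
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + (K @ y).ravel()
    P = (np.eye(2) - K @ H) @ P
```

SEPS extends this pattern with optical-flow and star-tracker measurements and a state that includes the target's rotational motion and center of gravity.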

    Development of a calibration pipeline for a monocular-view structured illumination 3D sensor utilizing an array projector

    Commercial off-the-shelf digital projection systems are commonly used in active structured illumination photogrammetry of macro-scale surfaces due to their relatively low cost, accessibility, and ease of use. They can be modelled as inverse pinholes. The calibration pipeline of a 3D sensor utilizing pinhole devices in a projector-camera configuration is already well established. Recently, there have been advances in creating projection systems offering projection speeds greater than that available from conventional off-the-shelf digital projectors. However, these systems cannot be calibrated using the well-established techniques based on the pinhole assumption: they are chip-less and have no projection lens. This work is based on the utilization of unconventional projection systems known as array projectors, which contain not one but multiple projection channels that project a temporal sequence of illumination patterns. None of the channels implements a digital projection chip or a projection lens. To work around the calibration problem, previous realizations of a 3D sensor based on an array projector required a stereo-camera setup, with triangulation taking place between the two pinhole-modelled cameras instead. However, a monocular setup is desired, as a single-camera configuration results in decreased cost, weight, and form factor. This study presents a novel calibration pipeline that realizes a single-camera setup. A generalized intrinsic calibration process without model assumptions was developed that directly samples the illumination frustum of each array projection channel. An extrinsic calibration process was then created that determines the pose of the single camera through a downhill simplex optimization initialized by particle swarm. Lastly, a method to store the intrinsic calibration with the aid of an easily realizable calibration jig was developed for re-use in arbitrary measurement camera positions, so that intrinsic calibration does not have to be repeated.
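The two-stage extrinsic search described above (a global, derivative-free particle-swarm pass whose best candidate seeds a downhill simplex refinement) can be sketched as follows. The cost function here is a toy quadratic standing in for the real reprojection error, and the swarm parameters and 3-DOF pose are illustrative assumptions, not the thesis's actual formulation.

```python
# Two-stage derivative-free pose search: particle swarm seeds Nelder-Mead.
# Toy cost function; the real pipeline would minimize a reprojection error.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
true_pose = np.array([0.4, -1.2, 2.5])   # hypothetical 3-DOF camera pose

def cost(pose):
    return float(np.sum((pose - true_pose) ** 2))  # stand-in for reprojection error

# -- stage 1: minimal particle swarm (coarse global search) --
pos = rng.uniform(-5.0, 5.0, size=(30, 3))
vel = np.zeros_like(pos)
pbest = pos.copy()
gbest = pbest[np.argmin([cost(p) for p in pbest])].copy()
for _ in range(50):
    r1, r2 = rng.random((2, 30, 1))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    better = np.array([cost(p) < cost(b) for p, b in zip(pos, pbest)])
    pbest[better] = pos[better]
    gbest = pbest[np.argmin([cost(p) for p in pbest])].copy()

# -- stage 2: downhill simplex (Nelder-Mead) refinement from the swarm's best --
result = minimize(cost, gbest, method="Nelder-Mead")
```

The swarm handles the multimodal, gradient-free global search; the simplex then polishes the best candidate locally, which mirrors the initialization/refinement split described in the pipeline.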