2,705 research outputs found

    3D Shape Estimation from 2D Landmarks: A Convex Relaxation Approach

    We investigate the problem of estimating the 3D shape of an object, given a set of 2D landmarks in a single image. To alleviate the reconstruction ambiguity, a widely used approach is to confine the unknown 3D shape within a shape space built upon existing shapes. While this approach has proven successful in various applications, a challenging issue remains: the joint estimation of shape parameters and camera-pose parameters requires solving a nonconvex optimization problem. Existing methods often adopt an alternating minimization scheme to locally update the parameters, and consequently the solution is sensitive to initialization. In this paper, we propose a convex formulation to address this problem and develop an efficient algorithm to solve the proposed convex program. We demonstrate the exact recovery property of the proposed method, its merits compared to alternative methods, and its applicability in human pose and car shape estimation. Comment: In Proceedings of CVPR 201

    Physics-Based Modeling of Nonrigid Objects for Vision and Graphics (Dissertation)

    This thesis develops a physics-based framework for 3D shape and nonrigid motion modeling for computer vision and computer graphics. In computer vision it addresses the problems of complex 3D shape representation, shape reconstruction, quantitative model extraction from biomedical data for analysis and visualization, shape estimation, and motion tracking. In computer graphics it demonstrates the generative power of our framework to synthesize constrained shapes, nonrigid object motions and object interactions for the purposes of computer animation. Our framework is based on the use of a new class of dynamically deformable primitives which allow the combination of global and local deformations. It incorporates physical constraints to compose articulated models from deformable primitives and provides force-based techniques for fitting such models to sparse, noise-corrupted 2D and 3D visual data. The framework leads to shape and nonrigid motion estimators that exploit dynamically deformable models to track moving 3D objects from time-varying observations. We develop models with global deformation parameters which represent the salient shape features of natural parts, and local deformation parameters which capture shape details. In the context of computer graphics, these models represent the physics-based marriage of the parameterized and free-form modeling paradigms. An important benefit of their global/local descriptive power in the context of computer vision is that it can potentially satisfy the often conflicting requirements of shape reconstruction and shape recognition. The Lagrange equations of motion that govern our models, augmented by constraints, make them responsive to externally applied forces derived from input data or applied by the user. This system of differential equations is discretized using finite element methods and simulated through time using standard numerical techniques. 
    We employ these equations to formulate a shape and nonrigid motion estimator. The estimator is a continuous extended Kalman filter that recursively transforms the discrepancy between the sensory data and the estimated model state into generalized forces. These adjust the translational, rotational, and deformational degrees of freedom such that the model evolves consistently with the noisy data. We demonstrate the interactive-time performance of our techniques in a series of experiments in computer vision, graphics, and visualization.

    Geometry-Aware Network for Non-Rigid Shape Prediction from a Single View

    We propose a method for predicting the 3D shape of a deformable surface from a single view. In contrast to previous approaches, we do not need a pre-registered template of the surface, and our method is robust to the lack of texture and to partial occlusions. At the core of our approach is a geometry-aware deep architecture that tackles the problem as is usually done in analytic solutions: first perform 2D detection of the mesh and then estimate a 3D shape that is geometrically consistent with the image. We train this architecture in an end-to-end manner using a large dataset of synthetic renderings of shapes under different levels of deformation, material properties, textures and lighting conditions. We evaluate our approach on a test split of this dataset and on available real benchmarks, consistently improving on state-of-the-art solutions with a significantly lower computational time. Comment: Accepted at CVPR 201

    Multi-View Face Recognition From Single RGBD Models of the Faces

    This work takes important steps towards solving the following problem of current interest: assuming that each individual in a population can be modeled by a single frontal RGBD face image, is it possible to carry out face recognition for such a population using multiple 2D images captured from arbitrary viewpoints? Although the general problem as stated above is extremely challenging, it encompasses subproblems that can be addressed today. The subproblems addressed in this work relate to: (1) generating a large set of viewpoint-dependent face images from a single RGBD frontal image for each individual; (2) using hierarchical approaches based on view-partitioned subspaces to represent the training data; and (3) based on these hierarchical approaches, using a weighted voting algorithm to integrate the evidence collected from multiple images of the same face as recorded from different viewpoints. We evaluate our methods on three datasets: a dataset of 10 people that we created and two publicly available datasets which include a total of 48 people. In addition to providing important insights into the nature of this problem, our results show that we are able to successfully recognize faces with accuracies of 95% or higher, outperforming existing state-of-the-art face recognition approaches based on deep convolutional neural networks.
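
    The weighted voting step mentioned in (3) can be illustrated with a minimal sketch. The abstract does not say how the weights are computed, so the per-view weights below are hypothetical confidence scores, and the function name `weighted_vote` is an illustrative assumption rather than the authors' implementation.

```python
from collections import defaultdict


def weighted_vote(predictions):
    """Fuse per-view identity predictions into a single decision.

    predictions: iterable of (identity, weight) pairs, one per image.
    The weights are hypothetical per-view confidences (an assumption).
    Returns the identity with the largest summed weight, or None if empty.
    """
    totals = defaultdict(float)
    for identity, weight in predictions:
        totals[identity] += weight  # accumulate evidence per identity
    if not totals:
        return None
    return max(totals, key=totals.get)


# Four views of the same face, each classified independently.
views = [("alice", 0.9), ("bob", 0.6), ("bob", 0.5), ("alice", 0.3)]
print(weighted_vote(views))
```

    Summing weights rather than counting votes lets one confident view outweigh several uncertain ones, which is one plausible reason to prefer weighted over majority voting when viewpoints differ in quality.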

    Deformable and articulated 3D reconstruction from monocular video sequences

    PhD. This thesis addresses the problem of deformable and articulated structure from motion from monocular uncalibrated video sequences. Structure from motion is defined as the problem of recovering information about the 3D structure of scenes imaged by a camera in a video sequence. Our study aims at the challenging problem of non-rigid shapes (e.g. a beating heart or a smiling face). Non-rigid structures appear constantly in our everyday life: think of a bicep curling, a torso twisting or a smiling face. Our research seeks a general method to perform 3D shape recovery purely from data, without having to rely on a pre-computed model or training data. Open problems in the field are the difficulty of the non-linear estimation, the lack of a real-time system, large amounts of missing data in real-world video sequences, measurement noise and strong deformations. Solving these problems would take us far beyond the current state of the art in non-rigid structure from motion. This dissertation presents our contributions in the field of non-rigid structure from motion, detailing a novel algorithm that enforces the exact metric structure of the problem at each step of the minimisation by projecting the motion matrices onto the correct deformable or articulated metric motion manifolds respectively. An important advantage of this new algorithm is its ability to handle missing data, which becomes crucial when dealing with real video sequences. We present a generic bilinear estimation framework, which improves convergence and makes use of the manifold constraints. Finally, we demonstrate a sequential, frame-by-frame estimation algorithm, which provides a 3D model and camera parameters for each video frame, while simultaneously building a model of object deformation.
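
    The idea of projecting a motion matrix onto a metric constraint set can be sketched for the simplest case only: a rigid orthographic camera, whose 2x3 motion matrix must have orthonormal rows. The deformable and articulated manifolds in the thesis are richer than this, so the code below is a simplified stand-in and not the thesis algorithm; the function name is hypothetical and NumPy is assumed to be available.

```python
import numpy as np


def project_to_orthonormal_rows(M):
    """Closest 2x3 matrix with orthonormal rows to M (Frobenius norm).

    Uses the SVD-based solution (the orthogonal polar factor): replace
    the singular values of M by ones.
    """
    U, _, Vt = np.linalg.svd(M, full_matrices=False)  # U: 2x2, Vt: 2x3
    return U @ Vt


# A noisy estimate of an orthographic camera matrix.
M = np.array([[1.1, 0.1, 0.0],
              [0.0, 0.9, 0.2]])
R = project_to_orthonormal_rows(M)
print(np.allclose(R @ R.T, np.eye(2)))
```

    Applying such a projection after every update keeps the estimate on the constraint set throughout the minimisation, instead of correcting it once at the end.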