    Bilinear Modeling via Augmented Lagrange Multipliers (BALM)

    Deformable and articulated 3D reconstruction from monocular video sequences

    PhD. This thesis addresses the problem of deformable and articulated structure from motion from monocular uncalibrated video sequences. Structure from motion is the problem of recovering information about the 3D structure of scenes imaged by a camera in a video sequence. Our study targets the challenging case of non-rigid shapes (e.g. a beating heart or a smiling face). Non-rigid structures appear constantly in everyday life: think of a bicep curling or a torso twisting. Our research seeks a general method to perform 3D shape recovery purely from data, without relying on a pre-computed model or training data. Open problems in the field are the difficulty of the non-linear estimation, the lack of a real-time system, large amounts of missing data in real-world video sequences, measurement noise and strong deformations. Solving these problems would take us far beyond the current state of the art in non-rigid structure from motion. This dissertation presents our contributions in the field of non-rigid structure from motion, detailing a novel algorithm that enforces the exact metric structure of the problem at each step of the minimisation by projecting the motion matrices onto the correct deformable or articulated metric motion manifold, respectively. An important advantage of this new algorithm is its ability to handle missing data, which becomes crucial when dealing with real video sequences. We present a generic bilinear estimation framework, which improves convergence and makes use of the manifold constraints. Finally, we demonstrate a sequential, frame-by-frame estimation algorithm, which provides a 3D model and camera parameters for each video frame, while simultaneously building a model of object deformation.
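
The manifold projection step mentioned in the abstract can be illustrated in its simplest form. The sketch below is not the thesis's algorithm: it assumes the rigid orthographic case only, where each frame's 2x3 motion block should have orthonormal rows, and it uses the standard SVD-based nearest-orthonormal projection. The function names and the stacked (2F x 3) layout are assumptions made for illustration.

```python
import numpy as np

def project_to_orthonormal_rows(M):
    """Return the matrix with orthonormal rows closest to M in Frobenius norm.

    For M = U diag(s) Vt (thin SVD), the closest such matrix is U @ Vt.
    """
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

def project_motion_matrix(M_stack):
    """Project every 2x3 frame block of a stacked (2F x 3) motion matrix.

    Assumption: rigid orthographic cameras, one 2x3 block per frame.
    """
    n_frames = M_stack.shape[0] // 2
    blocks = [
        project_to_orthonormal_rows(M_stack[2 * f:2 * f + 2])
        for f in range(n_frames)
    ]
    return np.vstack(blocks)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.standard_normal((6, 3))   # three frames of unconstrained motion
    projected = project_motion_matrix(noisy)
    print(projected[:2] @ projected[:2].T)  # close to the 2x2 identity
```

In an alternating bilinear scheme, a projection of this kind would typically be applied to the motion estimate after each update, so that the iterate always satisfies the metric constraint. The deformable and articulated manifolds described in the thesis are more involved than this rigid case.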

    Manifold Constrained Low-Rank Decomposition

    Low-rank decomposition (LRD) is a state-of-the-art method for visual data reconstruction and modelling. However, it becomes very challenging when the image data contain significant occlusion, noise, illumination variation, and misalignment from rotation or viewpoint changes. We leverage the specific structure of the data to improve the performance of LRD when the data are not ideal. To this end, we propose a new framework that embeds manifold priors into LRD. To implement the framework, we design an alternating direction method of multipliers (ADMM) scheme that efficiently integrates the manifold constraints during the optimization process. The proposed approach is successfully used to calculate low-rank models from face images, hand-written digits and planar surface images. The results show a consistent increase in performance compared to the state of the art over a wide range of realistic image misalignments and corruptions.
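
For readers unfamiliar with ADMM-based low-rank decomposition, the following is a minimal sketch of the plain baseline, assuming the standard low-rank plus sparse model (robust PCA) with a fixed penalty parameter. It does not include the manifold priors proposed in the paper; the function names and the default values of `lam` and `mu` are common heuristics chosen here as assumptions.

```python
import numpy as np

def shrink(X, tau):
    """Elementwise soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca_admm(D, lam=None, mu=None, n_iter=1000, tol=1e-7):
    """Split D into low-rank L plus sparse S by ADMM on
    min ||L||_* + lam * ||S||_1  subject to  L + S = D.
    """
    m, n = D.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam    # heuristic default
    mu = m * n / (4.0 * np.abs(D).sum()) if mu is None else mu  # heuristic default
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    Y = np.zeros_like(D)  # Lagrange multiplier for the constraint L + S = D
    for _ in range(n_iter):
        L = svt(D - S + Y / mu, 1.0 / mu)     # low-rank update
        S = shrink(D - L + Y / mu, lam / mu)  # sparse update
        residual = D - L - S
        Y = Y + mu * residual                 # dual ascent step
        if np.linalg.norm(residual) <= tol * np.linalg.norm(D):
            break
    return L, S

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low_rank = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 60))
    L, S = rpca_admm(low_rank)
    print(np.linalg.matrix_rank(L, tol=1e-6))
```

A manifold-constrained variant would add further terms or projection steps inside the loop; how the paper does this is not specified in the abstract, so it is left out here.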

    3D Shape Estimation from 2D Landmarks: A Convex Relaxation Approach

    We investigate the problem of estimating the 3D shape of an object, given a set of 2D landmarks in a single image. To alleviate the reconstruction ambiguity, a widely used approach is to confine the unknown 3D shape within a shape space built upon existing shapes. While this approach has proven successful in various applications, a challenging issue remains: the joint estimation of shape parameters and camera-pose parameters requires solving a nonconvex optimization problem. Existing methods often adopt an alternating minimization scheme to locally update the parameters, and consequently the solution is sensitive to initialization. In this paper, we propose a convex formulation to address this problem and develop an efficient algorithm to solve the proposed convex program. We demonstrate the exact recovery property of the proposed method, its merits compared to alternative methods, and its applicability to human pose and car shape estimation. Comment: In Proceedings of CVPR 201
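
The nonconvex problem described above can be written down in a common form. This is a sketch under stated assumptions, not necessarily the paper's exact notation: it assumes an orthographic camera, centred landmarks (so translation is dropped), and a linear shape space. The symbols $W$, $B_i$, $c_i$ and $R$ are introduced here for illustration.

```latex
% W   : 2 x p matrix of observed 2D landmarks (assumed centred)
% B_i : 3 x p basis shapes spanning the shape space, i = 1..k
% c_i : shape coefficients;  R : 2 x 3 camera matrix with orthonormal rows
\min_{c_1,\dots,c_k,\; R} \;
  \frac{1}{2}\,\Bigl\| W - R \sum_{i=1}^{k} c_i B_i \Bigr\|_F^2
\quad \text{subject to} \quad R R^{\top} = I_2
```

The objective is bilinear in $(c_i, R)$ and the constraint set is not convex, which is why alternating updates of the coefficients and the camera can stall at an initialization-dependent local minimum.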

    A Benchmark and Evaluation of Non-Rigid Structure from Motion

    Non-rigid structure from motion (NRSfM) is a long-standing and central problem in computer vision, allowing us to obtain 3D information from multiple images when the scene is dynamic. A main obstacle to the further development of this important computer vision topic is the lack of high-quality data sets. We address this issue by presenting a data set compiled for this purpose, which is made publicly available and is considerably larger than the previous state of the art. To validate the applicability of this data set, and to provide an investigation into the state of the art of NRSfM, including potential directions forward, we present a benchmark and a scrupulous evaluation using this data set. This benchmark evaluates 16 different methods with available code, which we argue reasonably spans the state of the art in NRSfM. We also hope that the presented public data set and evaluation will provide benchmark tools for further development in this field.

    Unsupervised 3D reconstruction and grouping of rigid and non-rigid categories

    © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. In this paper we present an approach to jointly recover camera pose, 3D shape, and object and deformation type grouping from incomplete 2D annotations in a multi-instance collection of RGB images. Our approach handles both rigid and non-rigid categories indistinctly. This advances existing work, which either addresses the problem for a single object only or assumes the groups to be known a priori when multiple instances are handled. To address this broader version of the problem, we encode object deformation by means of multiple unions of subspaces, which can span from small rigid motion to complex deformations. The model parameters are learned via Augmented Lagrange Multipliers in a completely unsupervised manner that does not require any training data at all. Extensive experimental evaluation is provided in a wide variety of synthetic and real scenarios, including rigid and non-rigid categories with small and large deformations. We obtain state-of-the-art solutions in terms of 3D reconstruction accuracy, while also providing grouping results that allow splitting the input images into object instances and their associated type of deformation. Peer Reviewed. Postprint (author's final draft)