33 research outputs found
Deformable and articulated 3D reconstruction from monocular video sequences
PhD
This thesis addresses the problem of deformable and articulated structure from motion from
monocular uncalibrated video sequences. Structure from motion is defined as the problem of
recovering information about the 3D structure of scenes imaged by a camera in a video sequence.
Our study aims at the challenging problem of non-rigid shapes (e.g. a beating heart or a smiling
face). Non-rigid structures appear constantly in our everyday life: think of a biceps curling, a
torso twisting or a smiling face. Our research seeks a general method to perform 3D shape
recovery purely from data, without having to rely on a pre-computed model or training data.
Open problems in the field are the difficulty of the non-linear estimation, the lack of a real-time
system, large amounts of missing data in real-world video sequences, measurement noise and
strong deformations. Solving these problems would take us far beyond the current state of the
art in non-rigid structure from motion. This dissertation presents our contributions in the field
of non-rigid structure from motion, detailing a novel algorithm that enforces the exact metric
structure of the problem at each step of the minimisation by projecting the motion matrices
onto the correct deformable or articulated metric motion manifolds respectively. An important
advantage of this new algorithm is its ability to handle missing data, which becomes crucial
when dealing with real video sequences. We present a generic bilinear estimation framework,
which improves convergence and makes use of the manifold constraints. Finally, we demonstrate
a sequential, frame-by-frame estimation algorithm, which provides a 3D model and camera
parameters for each video frame, while simultaneously building a model of object deformation.
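The idea of projecting a motion matrix back onto a metric manifold can be illustrated in its simplest form. The sketch below is not the thesis's algorithm: it assumes a rigid orthographic camera, where the 2x3 motion matrix of one frame should have orthonormal rows, and it uses the standard SVD-based projection onto the nearest such matrix in the Frobenius norm. The function name `project_to_orthonormal_rows` is hypothetical.

```python
import numpy as np

def project_to_orthonormal_rows(M):
    """Return the matrix with orthonormal rows closest to M (Frobenius norm).

    Assumption: M is a 2x3 per-frame motion matrix of a rigid orthographic
    camera. The deformable and articulated manifolds in the thesis are more
    involved; this only illustrates the projection step.
    """
    # Thin SVD: M = U @ diag(s) @ Vt; replacing s by ones gives the projection.
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

# Usage: a noisy estimate is snapped back onto the constraint set.
rng = np.random.default_rng(0)
noisy = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]) + 0.1 * rng.standard_normal((2, 3))
Q = project_to_orthonormal_rows(noisy)
# Q @ Q.T is the 2x2 identity up to floating-point error.
```

Applying such a projection after every update is one way to keep the iterates on the constraint set throughout a minimisation, rather than enforcing the metric constraints only once at the end.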
Manifold Constrained Low-Rank Decomposition
Low-rank decomposition (LRD) is a state-of-the-art method for visual data
reconstruction and modelling. However, it is a very challenging problem when
the image data contains significant occlusion, noise, illumination variation,
and misalignment from rotation or viewpoint changes. We leverage the specific
structure of data in order to improve the performance of LRD when the data are
not ideal. To this end, we propose a new framework that embeds manifold priors
into LRD. To implement the framework, we design an alternating direction method
of multipliers (ADMM) scheme that efficiently integrates the manifold
constraints during the optimization process. The proposed approach is
successfully used to calculate low-rank models from face images, hand-written
digits and planar surface images. The results show a consistent increase of
performance when compared to the state-of-the-art over a wide range of
realistic image misalignments and corruptions.
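For orientation, the sketch below shows a plain ADMM-style solver for the classical low-rank plus sparse decomposition (robust PCA). It is an assumption-laden baseline, not the manifold-constrained method of the abstract: no manifold prior is included, and the function names and default parameters (`lam`, `mu`) are illustrative choices taken from common practice.

```python
import numpy as np

def soft_threshold(X, tau):
    """Proximal operator of tau * l1-norm (entrywise shrinkage)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Proximal operator of tau * nuclear norm (singular value thresholding)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def rpca_admm(M, lam=None, mu=None, n_iter=200):
    """Split M into low-rank L and sparse S by alternating proximal updates.

    Assumption: M is a non-zero 2D array. The defaults for lam and mu are
    common heuristics, not values from the paper.
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    mu = m * n / (4.0 * np.abs(M).sum()) if mu is None else mu
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multiplier for the constraint M = L + S
    for _ in range(n_iter):
        L = svt(M - S + Y / mu, 1.0 / mu)            # low-rank update
        S = soft_threshold(M - L + Y / mu, lam / mu)  # sparse update
        Y = Y + mu * (M - L - S)                      # dual ascent step
    return L, S
```

A method that embeds manifold priors would add further terms and variable splits to this loop; the two proximal steps above are only the building blocks that such solvers usually share.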
3D Shape Estimation from 2D Landmarks: A Convex Relaxation Approach
We investigate the problem of estimating the 3D shape of an object, given a
set of 2D landmarks in a single image. To alleviate the reconstruction
ambiguity, a widely-used approach is to confine the unknown 3D shape within a
shape space built upon existing shapes. While this approach has proven to be
successful in various applications, a challenging issue remains: the
joint estimation of shape parameters and camera-pose parameters requires
solving a nonconvex optimization problem. The existing methods often adopt an
alternating minimization scheme to locally update the parameters, and
consequently the solution is sensitive to initialization. In this paper, we
propose a convex formulation to address this problem and develop an efficient
algorithm to solve the proposed convex program. We demonstrate the exact
recovery property of the proposed method, its merits compared to alternative
methods, and its applicability to human pose and car shape estimation.
Comment: In Proceedings of CVPR 201
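To make the shape-space model concrete, the sketch below shows one half of the alternating minimization that the abstract attributes to existing methods: with the camera rotation held fixed, the shape coefficients follow from linear least squares. It does not implement the paper's convex formulation. The model `W ≈ R @ sum_i c[i] * bases[i]` (orthographic camera, no translation) and the name `fit_coefficients` are assumptions made for illustration.

```python
import numpy as np

def fit_coefficients(W, R, bases):
    """Least-squares shape coefficients for W ≈ R @ sum_i c[i] * bases[i].

    W:     2 x p array of 2D landmarks (assumed centred).
    R:     2 x 3 camera matrix with orthonormal rows, held fixed here.
    bases: k x 3 x p array of basis shapes (hypothetical shape space).
    """
    # Each column of A is the flattened 2D projection of one basis shape.
    A = np.stack([(R @ B).ravel() for B in bases], axis=1)
    c, *_ = np.linalg.lstsq(A, W.ravel(), rcond=None)
    return c

# Usage on synthetic data: recover known coefficients under a known camera.
rng = np.random.default_rng(0)
bases = rng.standard_normal((3, 3, 10))
R = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
c_true = np.array([0.5, -1.0, 2.0])
W = R @ np.tensordot(c_true, bases, axes=1)
c_est = fit_coefficients(W, R, bases)
```

In a full alternating scheme this step would be interleaved with a rotation update, and because the joint problem is nonconvex the result depends on where the iteration starts, which is the sensitivity the abstract sets out to avoid.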
A Benchmark and Evaluation of Non-Rigid Structure from Motion
Non-rigid structure from motion (NRSfM) is a long-standing and central
problem in computer vision, allowing us to obtain 3D information from multiple
images when the scene is dynamic. A main issue regarding the further
development of this important computer vision topic is the lack of
high-quality data sets. We here address this issue by presenting a data set
compiled for this purpose, which is made publicly available, and considerably
larger than previous state of the art. To validate the applicability of this
data set, and provide an investigation into the state of the art of NRSfM,
including potential directions forward, we here present a benchmark and a
scrupulous evaluation using this data set. This benchmark evaluates 16
different methods with available code, which we argue reasonably spans the
state of the art in NRSfM. We also hope that the presented and public data set
and evaluation will provide benchmark tools for further development in this
field.
Unsupervised 3D reconstruction and grouping of rigid and non-rigid categories
© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
In this paper we present an approach to jointly recover camera pose, 3D shape, and object and deformation type grouping from incomplete 2D annotations in a multi-instance collection of RGB images. Our approach handles both rigid and non-rigid categories without distinction. This advances existing work, which either addresses the problem for a single object only or, when multiple instances are handled, assumes the groups to be known a priori. To address this broader version of the problem, we encode object deformation by means of multiple unions of subspaces, which can span from small rigid motion to complex deformations. The model parameters are learned via Augmented Lagrange Multipliers in a completely unsupervised manner that does not require any training data. Extensive experimental evaluation is provided in a wide variety of synthetic and real scenarios, including rigid and non-rigid categories with small and large deformations. We obtain state-of-the-art solutions in terms of 3D reconstruction accuracy, while also providing grouping results that allow splitting the input images into object instances and their associated type of deformation.
Peer Reviewed
Postprint (author's final draft)