111 research outputs found
Shape basis interpretation for monocular deformable 3D reconstruction
In this paper, we propose a novel interpretable shape model to encode object non-rigidity. We first use the initial frames of a monocular video to recover a rest shape, which is later used to compute a dissimilarity measure based on a pairwise distance matrix. Spectral analysis is then applied to this matrix to obtain a reduced shape basis that, in contrast to existing approaches, can be physically interpreted. These pre-computed shape bases are in turn used to linearly span the deformation of a wide variety of objects. We introduce the low-rank basis into a sequential approach that recovers both camera motion and non-rigid shape from the monocular video by simply optimizing the weights of the linear combination using bundle adjustment. Since the number of parameters to optimize per frame is relatively small, especially when physical priors are considered, our approach is fast and can potentially run in real time. Validation is performed on a wide variety of real-world objects undergoing both inextensible and extensible deformations. Our approach achieves remarkable robustness to artifacts such as noisy and missing measurements and shows improved performance over competing methods.
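To make the shape-basis idea above concrete, here is a minimal sketch (my illustration, not the authors' exact formulation): it builds a reduced basis by spectral analysis of a pairwise distance matrix computed from a rest shape, and spans deformations as a linear combination whose weights are the quantities a bundle-adjustment step would optimize per frame. The helper names are hypothetical.

```python
# Minimal sketch (my illustration, not the authors' exact formulation): build a
# reduced, physically interpretable shape basis by spectral analysis of a
# pairwise distance matrix computed from a rest shape, then span deformations
# as a linear combination of the basis vectors.
import numpy as np

def spectral_shape_basis(rest_shape, rank):
    """rest_shape: (N, 3) rest-shape points; returns an (N, rank) basis."""
    diff = rest_shape[:, None, :] - rest_shape[None, :, :]
    dissimilarity = np.linalg.norm(diff, axis=-1)      # (N, N) distance matrix
    eigvals, eigvecs = np.linalg.eigh(dissimilarity)   # spectral analysis
    order = np.argsort(np.abs(eigvals))[::-1]          # leading modes first
    return eigvecs[:, order[:rank]]                    # (N, rank) shape basis

def deform(rest_shape, basis, weights):
    """Linear model: rest shape plus basis deflections; weights is (rank, 3)."""
    return rest_shape + basis @ weights

# Toy usage: a flat 10x10 grid as the rest shape recovered from initial frames.
xs, ys = np.meshgrid(np.linspace(0, 1, 10), np.linspace(0, 1, 10))
rest = np.stack([xs.ravel(), ys.ravel(), np.zeros(100)], axis=1)
basis = spectral_shape_basis(rest, rank=5)
weights = 0.01 * np.random.default_rng(0).normal(size=(5, 3))  # BA would optimize these
print(deform(rest, basis, weights).shape)              # (100, 3)
```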
Geometry-Aware Network for Non-Rigid Shape Prediction from a Single View
We propose a method for predicting the 3D shape of a deformable surface from
a single view. By contrast with previous approaches, we do not need a
pre-registered template of the surface, and our method is robust to the lack of
texture and partial occlusions. At the core of our approach is a
geometry-aware deep architecture that tackles the problem as usually done in
analytic solutions: first perform 2D detection of the mesh and then estimate a
3D shape that is geometrically consistent with the image. We train this
architecture in an end-to-end manner using a large dataset of synthetic
renderings of shapes under different levels of deformation, material
properties, textures and lighting conditions. We evaluate our approach on a
test split of this dataset and available real benchmarks, consistently
improving state-of-the-art solutions with a significantly lower computational
time.
Comment: Accepted at CVPR 2018
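As a rough illustration of the two-stage, geometry-aware idea described above (an assumption on my part, not the paper's actual architecture; TwoStageMeshNet is hypothetical), the sketch below first regresses 2D mesh vertices and then lifts them to per-vertex depths conditioned on both the image features and the detected 2D layout.

```python
# Rough sketch (an assumption, not the paper's actual architecture;
# TwoStageMeshNet is hypothetical): a geometry-aware two-stage network that
# first detects 2D mesh vertices, then lifts them to per-vertex depths
# conditioned on both the image features and the 2D layout.
import torch
import torch.nn as nn

class TwoStageMeshNet(nn.Module):
    def __init__(self, num_vertices=81, feat_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(                  # small stand-in CNN
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # Stage 1: 2D detection of mesh vertices (u, v per vertex).
        self.detect_2d = nn.Linear(feat_dim, num_vertices * 2)
        # Stage 2: depth regression from features plus the detected 2D layout.
        self.lift_3d = nn.Linear(feat_dim + num_vertices * 2, num_vertices)

    def forward(self, image):
        f = self.encoder(image)
        uv = self.detect_2d(f)                         # (B, V*2) 2D vertices
        z = self.lift_3d(torch.cat([f, uv], dim=1))    # (B, V) per-vertex depth
        return uv.view(uv.shape[0], -1, 2), z

verts_2d, depth = TwoStageMeshNet()(torch.randn(2, 3, 128, 128))
print(verts_2d.shape, depth.shape)                     # (2, 81, 2) (2, 81)
```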
Incremental Non-Rigid Structure-from-Motion with Unknown Focal Length
The perspective camera and the isometric surface prior have recently gathered
increased attention for Non-Rigid Structure-from-Motion (NRSfM). Despite the
recent progress, several challenges remain, particularly the computational
complexity and the unknown camera focal length. In this paper we present a
method for incremental Non-Rigid Structure-from-Motion (NRSfM) with the
perspective camera model and the isometric surface prior with unknown focal
length. In the template-based case, we provide a method to estimate four
parameters of the camera intrinsics. For the template-less scenario of NRSfM,
we propose a method to upgrade reconstructions obtained for one focal length to
another based on local rigidity and the so-called Maximum Depth Heuristics
(MDH). Building on this, we propose a method to simultaneously recover the focal
length and the non-rigid shapes. We further address the problems of incorporating
a large number of points and adding more views in MDH-based NRSfM, and
efficiently solve them with Second-Order Cone Programming (SOCP). This does not
require any shape initialization and produces results orders of magnitude faster
than many methods. We provide evaluations on standard sequences with
ground-truth and qualitative reconstructions on challenging YouTube videos.
These evaluations show that our method performs better in both speed and
accuracy than the state of the art.
Comment: ECCV 2018
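For intuition about the Maximum Depth Heuristic mentioned above, here is a minimal SOCP sketch (assuming cvxpy; mdh_reconstruct is a hypothetical helper, not the authors' full incremental pipeline): depths along calibrated sightlines are maximized while neighbouring points stay within their rest-shape distances.

```python
# Minimal sketch (assuming cvxpy; mdh_reconstruct is a hypothetical helper, not
# the authors' incremental pipeline): the Maximum Depth Heuristic as a small
# SOCP, maximize depths along calibrated sightlines while neighbouring points
# stay within their rest-shape (inextensible) distances.
import numpy as np
import cvxpy as cp

def mdh_reconstruct(sightlines, edges, rest_lengths):
    """sightlines: (N, 3) unit rays; edges: list of (i, j); rest_lengths: (E,)."""
    d = cp.Variable(sightlines.shape[0], nonneg=True)  # per-point depth
    constraints = [
        cp.norm(d[i] * sightlines[i] - d[j] * sightlines[j]) <= l_ij
        for (i, j), l_ij in zip(edges, rest_lengths)    # inextensibility bounds
    ]
    cp.Problem(cp.Maximize(cp.sum(d)), constraints).solve()
    return d.value[:, None] * sightlines                # (N, 3) reconstruction

# Toy usage: three neighbouring points at unit rest-shape distance.
rays = np.array([[0.0, 0.0, 1.0], [0.1, 0.0, 1.0], [0.2, 0.0, 1.0]])
rays /= np.linalg.norm(rays, axis=1, keepdims=True)
print(mdh_reconstruct(rays, [(0, 1), (1, 2)], [1.0, 1.0]).round(2))
```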
Non-parametric Depth Estimation for Images from a Single Reference Depth
We present a non-parametric method for estimating the depth of a single still image. We start from a single reference image and its corresponding 3D depth and use an unsupervised neural network to transform the reference depth to represent the target image. In doing so, we attempt to mimic the human vision capability of perceiving the depth of a given image. Existing depth recovery methods either work for scenes with perpendicular planar surfaces or assume the availability of a training database of known images and depths. We propose a method that can recover the depth of a target image from a single reference depth. We redesign the self-organizing map (SOM) to learn in an environment with only three input data points, each with a different semantic meaning. We combine the proposed Parallel SOM (PSOM) with Gabor wavelets to handle discrepancies in lighting and orientation between the target and reference images. The proposed method gives promising results on images of faces and of daily objects, even when the reference image and depth are obtained in a poorly lit setting.
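As a hedged sketch of the non-parametric transfer idea, not the paper's Parallel SOM, the following replaces the SOM with a brute-force nearest-neighbour match in Gabor-feature space; it only conveys how a single reference depth can be transferred to a target image with some tolerance to lighting changes. The helper names are hypothetical.

```python
# Hedged sketch (not the paper's Parallel SOM; gabor_features and
# transfer_depth are hypothetical helpers): non-parametric depth transfer from
# a single reference image/depth pair via nearest-neighbour matching in
# Gabor-feature space, which tolerates some lighting differences.
import numpy as np
from scipy.ndimage import convolve

def gabor_kernel(theta, size=9, sigma=2.0, freq=0.25):
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    xr = x * np.cos(theta) + y * np.sin(theta)           # rotated coordinate
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)

def gabor_features(img, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    # Magnitude of oriented filter responses, less sensitive to lighting.
    return np.stack([np.abs(convolve(img, gabor_kernel(t))) for t in thetas], -1)

def transfer_depth(target, reference, ref_depth):
    """Give each target pixel the depth of its nearest reference pixel in
    Gabor-feature space (brute force, fine for small images)."""
    ft = gabor_features(target).reshape(-1, 4)
    fr = gabor_features(reference).reshape(-1, 4)
    nearest = ((ft[:, None, :] - fr[None, :, :]) ** 2).sum(-1).argmin(axis=1)
    return ref_depth.ravel()[nearest].reshape(target.shape)

# Toy usage with random 32x32 images standing in for the reference/target pair.
rng = np.random.default_rng(0)
ref, ref_d, tgt = rng.random((32, 32)), rng.random((32, 32)), rng.random((32, 32))
print(transfer_depth(tgt, ref, ref_d).shape)              # (32, 32)
```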
Blending Learning and Inference in Structured Prediction
In this paper we derive an efficient algorithm to learn the parameters of
structured predictors in general graphical models. This algorithm blends the
learning and inference tasks, which results in a significant speedup over
traditional approaches, such as conditional random fields and structured
support vector machines. For this purpose we utilize the structures of the
predictors to describe a low dimensional structured prediction task which
encourages local consistencies within the different structures while learning
the parameters of the model. Convexity of the learning task provides the means
to enforce the consistencies between the different parts. The
inference-learning blending algorithm that we propose is guaranteed to converge
to the optimum of the low dimensional primal and dual programs. Unlike many of
the existing approaches, the inference-learning blending allows us to learn
efficiently high-order graphical models, over regions of any size, and very
large number of parameters. We demonstrate the effectiveness of our approach,
while presenting state-of-the-art results in stereo estimation, semantic
segmentation, shape reconstruction, and indoor scene understanding.
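To ground the structured-prediction terminology, here is a toy training loop (my illustration, not the authors' blending algorithm): loss-augmented max-product inference on a chain interleaved with perceptron-style (structured-hinge subgradient) parameter updates. On a chain a single sweep is already exact; the paper's contribution is to blend partially-converged inference into the learning updates so that high-order, loopy models can be trained efficiently.

```python
# Toy sketch (my illustration, not the authors' blending algorithm): a chain
# structured predictor trained by interleaving loss-augmented max-product
# inference with perceptron-style (structured-hinge subgradient) updates.
import numpy as np

def one_maxproduct_sweep(theta_u, theta_p, y_true=None):
    """One forward max-product sweep on a chain; optional Hamming augmentation."""
    T, L = theta_u.shape
    aug = theta_u.copy()
    if y_true is not None:                              # loss-augmented scores
        aug += np.arange(L)[None, :] != y_true[:, None]
    msg, back = np.zeros(L), []
    for t in range(1, T):
        scores = msg[:, None] + aug[t - 1][:, None] + theta_p   # (L, L)
        back.append(scores.argmax(axis=0))
        msg = scores.max(axis=0)
    y = [int((msg + aug[-1]).argmax())]
    for b in reversed(back):                            # backtrack best labeling
        y.append(int(b[y[-1]]))
    return np.array(y[::-1])

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))                             # T=8 nodes, 4 features each
y_true = rng.integers(0, 3, size=8)                     # L=3 labels
W, P, lr = np.zeros((4, 3)), np.zeros((3, 3)), 0.1      # unary and pairwise params
for _ in range(50):                                     # blended learning loop
    y_hat = one_maxproduct_sweep(X @ W, P, y_true)      # loss-augmented inference
    for t in range(8):                                  # unary parameter update
        W[:, y_true[t]] += lr * X[t]; W[:, y_hat[t]] -= lr * X[t]
    for t in range(7):                                  # pairwise parameter update
        P[y_true[t], y_true[t + 1]] += lr; P[y_hat[t], y_hat[t + 1]] -= lr
y_pred = one_maxproduct_sweep(X @ W, P)                 # plain (unaugmented) decoding
print("training Hamming error:", (y_pred != y_true).mean())
```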
- …