A Benchmark and Evaluation of Non-Rigid Structure from Motion
Non-rigid structure from motion (NRSfM) is a long-standing and central
problem in computer vision, allowing us to obtain 3D information from multiple
images when the scene is dynamic. A main obstacle to the further development of
this important computer vision topic is the lack of high-quality data sets. We
address this issue by presenting a data set compiled for this purpose, which is
made publicly available and is considerably larger than the previous state of
the art. To validate the applicability of this data set, and to provide an
investigation into the state of the art of NRSfM, including potential
directions forward, we present a benchmark and a scrupulous evaluation using
this data set. The benchmark evaluates 16 different methods with available
code, which we argue reasonably span the state of the art in NRSfM. We also
hope that the presented public data set and evaluation will provide benchmark
tools for further development in this field.
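For context, most of the evaluated methods build on the classic low-rank shape-basis model of NRSfM, in which the stacked 2D tracks factor into a motion term and a small set of 3D shape bases. The sketch below illustrates only this shared factorization idea; the function name and toy dimensions are our own assumptions, not part of the benchmark code.

```python
import numpy as np

# Minimal sketch of the low-rank shape-basis factorization (Bregler et
# al.) that underlies many NRSfM methods. The function name and the toy
# dimensions are illustrative assumptions, not the benchmark's code.

def lowrank_nrsfm_factorization(W, K):
    """Factor the 2F x P measurement matrix W of tracked 2D points into a
    motion factor M (2F x 3K) and stacked shape bases S (3K x P), up to a
    3K x 3K corrective transform, via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    r = 3 * K
    M = U[:, :r] * np.sqrt(s[:r])           # motion-and-coefficients factor
    S = np.sqrt(s[:r])[:, None] * Vt[:r]    # stacked shape bases
    return M, S

# Toy usage: F = 10 frames, P = 40 points, K = 2 shape bases.
W = np.random.randn(20, 6) @ np.random.randn(6, 40)   # rank-6 point tracks
M, S = lowrank_nrsfm_factorization(W, K=2)
print(np.linalg.norm(W - M @ S))                      # ~0: exact recovery
```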
The Drunkard's Odometry: Estimating Camera Motion in Deforming Scenes
Estimating camera motion in deformable scenes poses a complex and open
research challenge. Most existing non-rigid structure from motion techniques
assume that static scene parts are observed alongside the deforming ones, in
order to establish an anchoring reference. However, this assumption does not
hold in certain relevant applications such as endoscopy. Deformable
odometry and SLAM pipelines, which tackle the most challenging scenario of
exploratory trajectories, suffer from a lack of robustness and proper
quantitative evaluation methodologies. To tackle this issue with a common
benchmark, we introduce the Drunkard's Dataset, a challenging collection of
synthetic data targeting visual navigation and reconstruction in deformable
environments. This dataset is the first large set of exploratory camera
trajectories with ground truth inside 3D scenes where every surface exhibits
non-rigid deformations over time. Simulations in realistic 3D buildings let us
obtain a vast amount of data and ground-truth labels, including camera poses,
RGB images and depth, optical flow and normal maps at high resolution and
quality. We further present a novel deformable odometry method, dubbed the
Drunkard's Odometry, which decomposes optical flow estimates into rigid-body
camera motion and non-rigid scene deformations. In order to validate our data,
our work contains an evaluation of several baselines as well as a novel
tracking error metric which does not require ground truth data. Dataset and
code: https://davidrecasens.github.io/TheDrunkard'sOdometry
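The decomposition described above can be pictured as follows: given per-pixel depth and a rigid motion estimate, one can synthesize the flow that camera motion alone would induce, and attribute the residual against the observed flow to scene deformation. The sketch below is a minimal illustration under a standard pinhole model; all names and conventions are our assumptions, not the released Drunkard's Odometry code.

```python
import numpy as np

# Hedged sketch of the rigid/non-rigid flow decomposition: synthesize the
# flow induced by camera motion alone from depth and a pose estimate, and
# treat the residual against the observed flow as non-rigid deformation.
# The pinhole setup and names are illustrative, not the authors' code.

def rigid_flow(depth, K, R, t):
    """Flow field induced by camera rotation R (3, 3) and translation t
    (3,), for a pinhole camera with intrinsics K and per-pixel depth."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T               # back-project to rays
    pts = rays * depth[..., None]                 # 3D points in frame 1
    pts2 = pts @ R.T + t                          # transform into frame 2
    proj = pts2 @ K.T
    proj = proj[..., :2] / proj[..., 2:3]         # perspective projection
    return proj - np.stack([u, v], axis=-1)       # rigid flow (h, w, 2)

def nonrigid_residual(observed_flow, depth, K, R, t):
    """Whatever part of the observed flow the rigid model cannot explain."""
    return observed_flow - rigid_flow(depth, K, R, t)
```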
Scalable Dense Monocular Surface Reconstruction
This paper reports on a novel template-free monocular non-rigid surface
reconstruction approach. Existing techniques using motion and deformation cues
rely on multiple prior assumptions, are often computationally expensive, and do
not perform equally well across a variety of data sets. In contrast, the
proposed Scalable Monocular Surface Reconstruction (SMSR) combines strengths of
several algorithms, i.e., it is scalable with the number of points, can handle
sparse and dense settings as well as different types of motions and
deformations. We estimate camera pose by singular value thresholding and a
proximal gradient method. Our formulation adopts the alternating direction
method of multipliers (ADMM), which converges in linear time for large
point-track matrices. In
the proposed SMSR, trajectory space constraints are integrated by smoothing of
the measurement matrix. In the extensive experiments, SMSR is demonstrated to
consistently achieve state-of-the-art accuracy on a wide variety of data sets.
Comment: International Conference on 3D Vision (3DV), Qingdao, China, October 2017
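Singular value thresholding, mentioned above for camera pose estimation, is the proximal operator of the nuclear norm: it soft-thresholds the singular values of a matrix. The following is a minimal textbook sketch of that operator, not the authors' pipeline.

```python
import numpy as np

# Textbook singular value thresholding (SVT): the proximal operator of
# tau * ||X||_* (nuclear norm). Offered as a generic sketch only.

def svt(X, tau):
    """prox of the nuclear norm: soft-threshold the singular values of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt

# Toy usage: SVT pushes a noisy matrix back toward low rank.
A = np.random.randn(50, 3) @ np.random.randn(3, 40)   # rank-3 signal
A_noisy = A + 0.1 * np.random.randn(50, 40)
A_hat = svt(A_noisy, tau=2.0)
print(np.linalg.matrix_rank(A_hat))                   # typically near 3
```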
Video Interpolation using Optical Flow and Laplacian Smoothness
Non-rigid video interpolation is a common computer vision task. In this paper
we present an optical flow approach which adopts a Laplacian Cotangent Mesh
constraint to enhance local smoothness. Similar to Li et al., our approach
fits a mesh to the image with a resolution of up to one vertex per pixel and
uses angle constraints to ensure sensible local deformations between image
pairs. The Laplacian Mesh constraints are expressed wholly inside the optical
flow optimization, and can be applied in a straightforward manner to a wide
range of image tracking and registration problems. We evaluate our approach by
testing on several benchmark datasets, including the Middlebury and Garg et al.
datasets. In addition, we show an application of our method to constructing 3D
Morphable Facial Models from dynamic 3D data.
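The cotangent weights underlying such a Laplacian mesh constraint are standard: the weight of an edge (i, j) is half the sum of the cotangents of the two angles opposite that edge in its adjacent triangles. Below is a minimal sketch of this construction; the mesh layout and names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Standard cotangent Laplacian for a triangle mesh: for edge (i, j), the
# weight is (cot(alpha) + cot(beta)) / 2 over the angles opposite the
# edge. A generic dense sketch, not the paper's optimization code.

def cotangent_laplacian(verts, faces):
    """Dense cotangent Laplacian L for vertices (n, 3) and faces (m, 3)."""
    n = len(verts)
    L = np.zeros((n, n))
    for tri in faces:
        for k in range(3):
            i, j, o = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
            a, b = verts[i] - verts[o], verts[j] - verts[o]
            # cot of the angle at vertex o, opposite edge (i, j)
            cot = a.dot(b) / np.linalg.norm(np.cross(a, b))
            L[i, j] -= 0.5 * cot
            L[j, i] -= 0.5 * cot
            L[i, i] += 0.5 * cot
            L[j, j] += 0.5 * cot
    return L
```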
Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective
This paper addresses the task of dense non-rigid structure-from-motion
(NRSfM) using multiple images. State-of-the-art methods for this problem are
often hindered by poor scalability, expensive computations, and noisy
measurements. Further, recent NRSfM methods usually either assume a small
number of sparse feature points or ignore local non-linearities of shape
deformations, and thus
cannot reliably model complex non-rigid deformations. To address these issues,
in this paper, we propose a new approach for dense NRSfM by modeling the
problem on a Grassmann manifold. Specifically, we assume the complex non-rigid
deformations lie on a union of local linear subspaces both spatially and
temporally. This naturally allows for a compact representation of the complex
non-rigid deformation over frames. We provide experimental results on several
synthetic and real benchmark datasets. The procured results clearly demonstrate
that our method, apart from being scalable and more accurate than
state-of-the-art methods, is also more robust to noise and generalizes to
highly non-linear deformations.
Comment: 10 pages, 7 figures, 4 tables. Accepted for publication in the
Conference on Computer Vision and Pattern Recognition (CVPR), 2018; typos
fixed and acknowledgement added
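The Grassmannian view can be made concrete as follows: each local linear subspace is a point on a Grassmann manifold, represented by an orthonormal basis, and nearby subspaces are compared via principal angles. The sketch below shows only this standard representation, not the paper's algorithm.

```python
import numpy as np

# Standard Grassmannian machinery: a k-dimensional subspace of R^n is
# represented by an orthonormal basis, and two subspaces are compared via
# the principal angles between them. Illustrative sketch only.

def grassmann_distance(A, B):
    """Geodesic distance between span(A) and span(B), for A, B of shape (n, k)."""
    Qa, _ = np.linalg.qr(A)                      # orthonormal basis of span(A)
    Qb, _ = np.linalg.qr(B)                      # orthonormal basis of span(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))     # principal angles
    return np.linalg.norm(theta)

# Toy usage: two nearby 3-dimensional subspaces of R^20.
A = np.random.randn(20, 3)
B = A + 0.05 * np.random.randn(20, 3)
print(grassmann_distance(A, B))   # small; grows as the subspaces diverge
```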
Joint Optical Flow and Temporally Consistent Semantic Segmentation
The importance and demands of visual scene understanding have been steadily
increasing along with the active development of autonomous systems.
Consequently, there has been a large amount of research dedicated to semantic
segmentation and dense motion estimation. In this paper, we propose a method
for jointly estimating optical flow and temporally consistent semantic
segmentation, closely connecting the two problem domains so that each
leverages the other. Semantic segmentation provides information on plausible physical
motion to its associated pixels, and accurate pixel-level temporal
correspondences enhance the accuracy of semantic segmentation in the temporal
domain. We demonstrate the benefits of our approach on the KITTI benchmark,
where we observe performance gains for flow and segmentation. We achieve
state-of-the-art optical flow results, and outperform all published algorithms
by a large margin on challenging but crucial dynamic objects.
Comment: 14 pages. Accepted for the CVRSUAD workshop at ECCV 2016
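One simple way to picture how segmentation can inform flow, purely as an illustration and not the paper's actual energy, is a smoothness term that is enforced only between neighboring pixels sharing a semantic label, so that flow discontinuities are free to appear at object boundaries.

```python
import numpy as np

# Illustrative sketch (not the paper's formulation): penalize flow
# gradients only within a semantic region, so smoothness is not enforced
# across object boundaries. Names and the quadratic penalty are assumed.

def segmentation_aware_smoothness(flow, labels):
    """Sum of squared flow differences between horizontal and vertical
    neighbors that share a label. flow: (h, w, 2), labels: (h, w) ints."""
    dx = flow[:, 1:] - flow[:, :-1]                      # horizontal diffs
    dy = flow[1:, :] - flow[:-1, :]                      # vertical diffs
    same_x = (labels[:, 1:] == labels[:, :-1])[..., None]
    same_y = (labels[1:, :] == labels[:-1, :])[..., None]
    return np.sum(same_x * dx**2) + np.sum(same_y * dy**2)
```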