Semi-Global Stereo Matching with Surface Orientation Priors
Semi-Global Matching (SGM) is a widely-used efficient stereo matching
technique. It works well for textured scenes, but fails on untextured slanted
surfaces due to its fronto-parallel smoothness assumption. To remedy this
problem, we propose a simple extension, termed SGM-P, to utilize precomputed
surface orientation priors. Such priors favor different surface slants in
different 2D image regions or 3D scene regions and can be derived in various
ways. In this paper we evaluate plane orientation priors derived from stereo
matching at a coarser resolution and show that such priors can yield
significant performance gains for difficult weakly-textured scenes. We also
explore surface normal priors derived from Manhattan-world assumptions, and we
analyze the potential performance gains using oracle priors derived from
ground-truth data. SGM-P only adds a minor computational overhead to SGM and is
an attractive alternative to more complex methods employing higher-order
smoothness terms.
Comment: extended draft of 3DV 2017 (spotlight) paper.
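The fronto-parallel bias mentioned above comes from SGM's cost aggregation, whose P1/P2 penalties favor constant disparity along each path. A minimal numpy sketch of the standard 1D aggregation recursion (plain SGM along a single scanline, not the proposed SGM-P; the function name and scanline-only scope are our illustration):

```python
import numpy as np

def sgm_aggregate_1d(cost, P1=1.0, P2=8.0):
    """Aggregate a matching-cost volume along one scanline (left to right),
    as in Semi-Global Matching.

    cost: (W, D) array of per-pixel matching costs for one image row.
    Returns the aggregated cost L of the same shape. P1 penalizes disparity
    changes of +-1, P2 penalizes larger jumps; this is what favors
    fronto-parallel (constant-disparity) surfaces.
    """
    W, D = cost.shape
    L = np.empty_like(cost, dtype=float)
    L[0] = cost[0]
    for x in range(1, W):
        prev = L[x - 1]
        m = prev.min()
        same = prev                          # keep the same disparity
        plus = np.roll(prev, -1) + P1        # disparity d+1 -> d
        minus = np.roll(prev, 1) + P1        # disparity d-1 -> d
        plus[-1] = np.inf                    # undo np.roll wrap-around
        minus[0] = np.inf
        jump = np.full(D, m + P2)            # any larger disparity jump
        L[x] = cost[x] + np.minimum.reduce([same, plus, minus, jump]) - m
    return L
```

Subtracting `m` in the last line is the standard normalization that keeps the aggregated costs bounded without changing the minimizing disparity.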
Identifying the lights position in photometric stereo under unknown lighting
Reconstructing the 3D shape of an object from a set of images is a classical
problem in Computer Vision. Photometric stereo is one of the possible
approaches. It stands on the assumption that the object is observed from a
fixed point of view under different lighting conditions. The traditional
approach requires that the position of the light sources is accurately known.
It has been proved that the light positions can be estimated directly from the
data when at least 6 images of the observed object are available. In this
paper, we give a Matlab implementation of the algorithm for solving the
photometric stereo problem under unknown lighting, and propose a simple
shooting technique to solve the bas-relief ambiguity.
Comment: new version.
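Under the Lambertian model, the stacked image matrix factors as albedo-scaled normals times lights, which is the starting point of the unknown-lighting setting. A numpy sketch of this rank-3 factorization (our illustration, not the paper's Matlab implementation; the split of singular values between the factors is one conventional choice):

```python
import numpy as np

def rank3_factorize(M):
    """Factor an image matrix M (n_pixels x n_images) as M ~ S @ L,
    with S the albedo-scaled pseudo-normals (n_pixels x 3) and L the
    pseudo-lights (3 x n_images).

    The factors are only defined up to an invertible 3x3 transform A
    (S @ inv(A), A @ L), and this residual ambiguity contains the
    bas-relief ambiguity mentioned in the abstract.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    root = np.sqrt(s[:3])
    return U[:, :3] * root, root[:, None] * Vt[:3]
```

For noise-free Lambertian data without shadows, the image matrix is exactly rank 3, so the factorization reproduces it; resolving the 3x3 ambiguity is where extra assumptions (or the paper's shooting technique) come in.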
Photometric Stereo by Hemispherical Metric Embedding
Photometric Stereo methods seek to reconstruct the 3D shape of an object from
motionless images obtained with varying illumination. Most existing methods
solve a restricted problem where the physical reflectance model, such as
Lambertian reflectance, is known in advance. In contrast, we do not restrict
ourselves to a specific reflectance model. Instead, we offer a method that
works on a wide variety of reflectances. Our approach uses a simple yet
uncommonly exploited property of the problem: the sought-after normals are
points on a unit hemisphere. We present a novel embedding method that maps pixels to
normals on the unit hemisphere. Our experiments demonstrate that this approach
outperforms existing manifold learning methods for the task of hemisphere
embedding. We further show successful reconstructions of objects from a wide
variety of reflectances including smooth, rough, diffuse and specular surfaces,
even in the presence of significant attached shadows. Finally, we empirically
show that under these challenging settings we obtain more accurate shape
reconstructions than existing methods.
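The hemisphere property the method exploits is easy to state concretely: a camera-facing normal can always be sign-flipped into the half-space z >= 0. A small numpy helper illustrating that constraint (ours, not part of the paper's embedding algorithm):

```python
import numpy as np

def to_unit_hemisphere(vectors):
    """Normalize raw 3-vectors and flip signs so every result lies on the
    unit hemisphere with non-negative z, the domain of camera-facing
    surface normals."""
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=-1, keepdims=True)
    flip = v[..., 2] < 0
    v[flip] *= -1.0        # a normal and its negation are equivalent here
    return v
```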
The Surfacing of Multiview 3D Drawings via Lofting and Occlusion Reasoning
The three-dimensional reconstruction of scenes from multiple views has made
impressive strides in recent years, chiefly by methods correlating isolated
feature points, intensities, or curvilinear structure. In the general setting,
i.e., without requiring controlled acquisition, a limited number of objects,
abundant patterns on objects, or object curves that follow particular models, the
majority of these methods produce unorganized point clouds, meshes, or voxel
representations of the reconstructed scene, with some exceptions producing 3D
drawings as networks of curves. Many applications, e.g., robotics, urban
planning, industrial design, and hard surface modeling, however, require
structured representations which make explicit 3D curves, surfaces, and their
spatial relationships. Reconstructing surface representations can now be
constrained by the 3D drawing, which acts as a scaffold on which to hang the
computed surfaces, leading to increased robustness and quality of reconstruction.
This paper presents one way of completing such 3D drawings with surface
reconstructions, by exploring occlusion reasoning through lofting algorithms.
Comment: expanded version of the CVPR 2017 paper, with improvements over the
camera-ready; Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), 2017.
Three-Filters-to-Normal: An Accurate and Ultrafast Surface Normal Estimator
Over the past decade, significant efforts have been made to improve the
trade-off between speed and accuracy of surface normal estimators (SNEs). This
paper introduces an accurate and ultrafast SNE for structured range data. The
proposed approach computes surface normals by simply performing three filtering
operations, namely, two image gradient filters (in horizontal and vertical
directions, respectively) and a mean/median filter, on an inverse depth image
or a disparity image. Despite the simplicity of the method, no similar method
already exists in the literature. In our experiments, we created three
large-scale synthetic datasets (easy, medium and hard) using 24 3-dimensional
(3D) mesh models. Each mesh model is used to generate 1800--2500 pairs of
480x640 pixel depth images and the corresponding surface normal ground truth
from different views. The average angular errors with respect to the easy,
medium and hard datasets are 1.6 degrees, 5.6 degrees and 15.3 degrees,
respectively. Our C++ and CUDA implementations achieve a processing speed of
over 260 Hz and 21 kHz, respectively. Our proposed SNE achieves a better
overall performance than all other existing computer vision-based SNEs. Our
datasets and source code are publicly available at: sites.google.com/view/3f2n.
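The three-filter recipe can be sketched directly from the description above: two gradient filters on the inverse depth give nx and ny, and a third averaging filter recovers nz from neighbouring 3D points. In the simplified version below, central differences stand in for the paper's gradient filters and a two-neighbour average for its mean/median filter; all variable names are ours:

```python
import numpy as np

def three_filters_to_normal(z, fx, fy, cx, cy):
    """Simplified 3F2N-style surface normal estimation from a depth image
    z of shape (H, W), with pinhole intrinsics fx, fy, cx, cy.

    Border pixels are unreliable (one-sided differences and roll wrap)."""
    inv = 1.0 / z
    nx = fx * np.gradient(inv, axis=1)   # horizontal gradient filter
    ny = fy * np.gradient(inv, axis=0)   # vertical gradient filter

    H, W = z.shape
    u, v = np.meshgrid(np.arange(W, dtype=float), np.arange(H, dtype=float))
    X = (u - cx) * z / fx                # back-projected 3D coordinates
    Y = (v - cy) * z / fy

    def nz_estimate(du, dv):
        # nz from one neighbour: the normal is orthogonal to (dX, dY, dZ)
        dX = np.roll(X, (-dv, -du), (0, 1)) - X
        dY = np.roll(Y, (-dv, -du), (0, 1)) - Y
        dZ = np.roll(z, (-dv, -du), (0, 1)) - z
        dZ = np.where(np.abs(dZ) < 1e-12, 1e-12, dZ)  # guard division
        return -(nx * dX + ny * dY) / dZ

    nz = 0.5 * (nz_estimate(1, 0) + nz_estimate(0, 1))  # "mean filter"
    n = np.dstack([nx, ny, nz])
    return n / np.linalg.norm(n, axis=2, keepdims=True)
```

On a noise-free slanted plane Z = aX + bY + c, the inverse depth is exactly linear in the pixel coordinates, so the estimator recovers the plane normal (proportional to (-a, -b, 1)) away from the image border.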
Detail-preserving and Content-aware Variational Multi-view Stereo Reconstruction
Accurate recovery of 3D geometrical surfaces from calibrated 2D multi-view
images is a fundamental yet active research area in computer vision. Despite
the steady progress in multi-view stereo reconstruction, most existing methods
are still limited in recovering fine-scale details and sharp features while
suppressing noises, and may fail in reconstructing regions with few textures.
To address these limitations, this paper presents a Detail-preserving and
Content-aware Variational (DCV) multi-view stereo method, which reconstructs
the 3D surface by alternating between reprojection error minimization and mesh
denoising. In reprojection error minimization, we propose a novel inter-image
similarity measure, which is effective to preserve fine-scale details of the
reconstructed surface and builds a connection between guided image filtering
and image registration. In mesh denoising, we propose a content-aware
ℓp-minimization algorithm that adaptively estimates the p value and
regularization parameters based on the current input. It is much more
effective in suppressing noise while preserving sharp features than
conventional isotropic mesh smoothing. Experimental results on benchmark datasets
demonstrate that our DCV method is capable of recovering more surface details,
and obtains cleaner and more accurate reconstructions than state-of-the-art
methods. In particular, our method achieves the best results among all
published methods on the Middlebury dino ring and dino sparse ring datasets in
terms of both completeness and accuracy.
Comment: 14 pages, 16 figures. Submitted to IEEE Transactions on Image Processing.
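To see why ℓp minimization with small p preserves sharp features while flattening noise, consider its 1D analogue. The sketch below solves min ||x - y||² + λ Σ|x[i+1] - x[i]|^p by iteratively reweighted least squares (our illustration only; the paper's algorithm operates on meshes and adapts p and λ per region):

```python
import numpy as np

def lp_denoise_1d(y, p=0.5, lam=0.1, iters=5, eps=1e-3):
    """IRLS for lp-regularized 1D denoising:
        min_x ||x - y||^2 + lam * sum_i |x[i+1] - x[i]|^p
    With p < 1, large jumps are penalized sublinearly (edges survive)
    while small differences get huge weights (flat regions are smoothed)."""
    n = len(y)
    D = np.diff(np.eye(n), axis=0)       # forward-difference operator
    x = np.asarray(y, dtype=float).copy()
    for _ in range(iters):
        # reweighting: |d|^p ~ w * d^2 with w = |d|^(p-2), capped at eps
        w = np.maximum(np.abs(D @ x), eps) ** (p - 2.0)
        A = np.eye(n) + lam * D.T @ (w[:, None] * D)
        x = np.linalg.solve(A, y)
    return x
```

Applied to a clean step signal, the result stays close to the input: the single large jump receives a small weight and is barely shrunk, which is exactly the edge-preserving behavior the abstract contrasts with isotropic smoothing.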
Fast and Robust Fixed-Rank Matrix Recovery
We address the problem of efficient sparse fixed-rank (S-FR) matrix
decomposition, i.e., splitting a corrupted matrix into an uncorrupted
matrix of fixed rank and a sparse matrix of outliers. Fixed-rank
constraints are usually imposed by the physical restrictions of the system
under study. Here we propose a method to perform accurate and very efficient
S-FR decomposition that is more suitable for large-scale problems than existing
approaches. Our method is a graceful combination of geometrical and algebraic
techniques, which avoids the bottleneck caused by the Truncated SVD (TSVD).
Instead, a polar factorization is used to exploit the manifold structure of
fixed-rank problems as the product of two Stiefel and an SPD manifold, leading
to a better convergence and stability. Then, closed-form projectors help to
speed up each iteration of the method. We introduce a novel and fast projector
for this manifold structure, together with a proof of its validity. Further acceleration
is achieved using a Nyström scheme. Extensive experiments with synthetic and
real data in the context of robust photometric stereo and spectral clustering
show that our proposals outperform the state of the art.
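For orientation, the classical alternating baseline for S-FR decomposition looks as follows. Note that every iteration calls a truncated SVD, exactly the bottleneck the paper's polar factorization over Stiefel/SPD manifolds avoids; the threshold τ and iteration count here are illustrative:

```python
import numpy as np

def sfr_decompose(M, r, tau=1.0, iters=20):
    """Naive alternating scheme for sparse + fixed-rank decomposition
    M ~ L + S with rank(L) <= r: project onto the rank-r matrices via
    truncated SVD, then soft-threshold the residual to get the sparse
    outlier term. A reference baseline, not the paper's method."""
    S = np.zeros_like(M)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :r] * s[:r]) @ Vt[:r]                     # rank-r projection
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - tau, 0.0)   # soft threshold
    return L, S
```

On an uncorrupted matrix of exact rank r, the scheme converges immediately: the rank-r projection reproduces the input and the sparse term stays zero.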
ResDepth: Learned Residual Stereo Reconstruction
We propose an embarrassingly simple but very effective scheme for
high-quality dense stereo reconstruction: (i) generate an approximate
reconstruction with your favourite stereo matcher; (ii) rewarp the input images
with that approximate model; (iii) with the initial reconstruction and the
warped images as input, train a deep network to enhance the reconstruction by
regressing a residual correction; and (iv) if desired, iterate the refinement
with the new, improved reconstruction. The strategy to only learn the residual
greatly simplifies the learning problem. A standard U-Net without bells and
whistles is enough to reconstruct even small surface details, like dormers and
roof substructures in satellite images. We also investigate residual
reconstruction with less information and find that even a single image is
enough to greatly improve an approximate reconstruction. Our full model reduces
the mean absolute error of state-of-the-art stereo reconstruction systems by
>50%, both in our target domain of satellite stereo and on stereo pairs from
the ETH3D benchmark.
Comment: updated supplementary material.
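The four-step recipe reduces to a short refinement loop. In the structural sketch below, `warp` and `residual_net` are arbitrary callables standing in for the rewarping step and the trained U-Net (this is our scaffolding, not the paper's code):

```python
def refine_iteratively(d0, images, warp, residual_net, steps=2):
    """ResDepth-style loop: (ii) rewarp the inputs with the current
    reconstruction, (iii) predict a residual correction and add it,
    (iv) optionally iterate with the improved reconstruction."""
    d = d0
    for _ in range(steps):
        warped = [warp(im, d) for im in images]
        d = d + residual_net(d, warped)
    return d
```

Because the network only has to regress a correction to an already-reasonable input, the learning problem is much easier than predicting the reconstruction from scratch, which is the point of the residual strategy.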
cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey
The paper presents the futuristic challenges discussed in the
cvpaper.challenge. In 2015 and 2016, we thoroughly studied 1,600+ papers in
several conferences and journals, such as CVPR, ICCV, ECCV, NIPS, PAMI, and IJCV.
ALIGNet: Partial-Shape Agnostic Alignment via Unsupervised Learning
The process of aligning a pair of shapes is a fundamental operation in
computer graphics. Traditional approaches rely heavily on matching
corresponding points or features to guide the alignment, a paradigm that
falters when significant shape portions are missing. These techniques generally
do not incorporate prior knowledge about expected shape characteristics, which
can help compensate for any misleading cues left by inaccuracies exhibited in
the input shapes. We present an approach based on a deep neural network,
leveraging shape datasets to learn a shape-aware prior for source-to-target
alignment that is robust to shape incompleteness. In the absence of ground
truth alignments for supervision, we train a network on the task of shape
alignment using incomplete shapes generated from full shapes for
self-supervision. Our network, called ALIGNet, is trained to warp complete
source shapes to incomplete targets, as if the target shapes were complete,
thus essentially rendering the alignment partial-shape agnostic. We aim for the
network to develop specialized expertise over the common characteristics of the
shapes in each dataset, thereby achieving a higher-level understanding of the
expected shape space to which a local approach would be oblivious. We constrain
ALIGNet through an anisotropic total variation identity regularization to
promote piecewise smooth deformation fields, facilitating both partial-shape
agnosticism and post-deformation applications. We demonstrate that ALIGNet
learns to align geometrically distinct shapes, and is able to infer plausible
mappings even when the target shape is significantly incomplete. We show that
our network learns the common expected characteristics of shape collections,
without over-fitting or memorization, enabling it to produce plausible
deformations on unseen data during test time.
Comment: To be presented at SIGGRAPH Asia 201
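The anisotropic total-variation term that promotes piecewise-smooth deformation fields can be written directly as summed absolute finite differences along each image axis. A minimal 2D version (our illustration of the regularizer; in ALIGNet it is applied to the deviation of the predicted deformation from the identity):

```python
import numpy as np

def anisotropic_tv(field):
    """Anisotropic total variation of a field of shape (H, W, C): the sum
    of absolute finite differences along each spatial axis. Penalizing it
    favors piecewise-constant (hence piecewise-smooth) deformations."""
    return (np.abs(np.diff(field, axis=1)).sum()
            + np.abs(np.diff(field, axis=0)).sum())
```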