8,169 research outputs found
Analysis and approximation of some Shape-from-Shading models for non-Lambertian surfaces
The reconstruction of a 3D object or a scene is a classical inverse problem
in Computer Vision. In the case of a single image this is called the
Shape-from-Shading (SfS) problem and it is known to be ill-posed even in a
simplified version like the vertical light source case. A huge number of works
deals with the orthographic SfS problem based on the Lambertian reflectance
model, the most common and simplest model which leads to an eikonal type
equation when the light source is on the vertical axis. In this paper we want
to study non-Lambertian models since they are more realistic and suitable
whenever one has to deal with different kind of surfaces, rough or specular. We
will present a unified mathematical formulation of some popular orthographic
non-Lambertian models, considering vertical and oblique light directions as
well as different viewer positions. These models lead to more complex
stationary nonlinear partial differential equations of Hamilton-Jacobi type
which can be regarded as the generalization of the classical eikonal equation
corresponding to the Lambertian case. However, all the equations corresponding
to the models considered here (Oren-Nayar and Phong) have a similar structure
so we can look for weak solutions to this class in the viscosity solution
framework. Via this unified approach, we are able to develop a semi-Lagrangian
approximation scheme for the Oren-Nayar and the Phong model and to prove a
general convergence result. Numerical simulations on synthetic and real images
will illustrate the effectiveness of this approach and the main features of the
scheme, also comparing the results with previous results in the literature.Comment: Accepted version to Journal of Mathematical Imaging and Vision, 57
page
Joint Learning of Intrinsic Images and Semantic Segmentation
Semantic segmentation of outdoor scenes is problematic when there are
variations in imaging conditions. It is known that albedo (reflectance) is
invariant to all kinds of illumination effects. Thus, using reflectance images
for semantic segmentation task can be favorable. Additionally, not only
segmentation may benefit from reflectance, but also segmentation may be useful
for reflectance computation. Therefore, in this paper, the tasks of semantic
segmentation and intrinsic image decomposition are considered as a combined
process by exploring their mutual relationship in a joint fashion. To that end,
we propose a supervised end-to-end CNN architecture to jointly learn intrinsic
image decomposition and semantic segmentation. We analyze the gains of
addressing those two problems jointly. Moreover, new cascade CNN architectures
for intrinsic-for-segmentation and segmentation-for-intrinsic are proposed as
single tasks. Furthermore, a dataset of 35K synthetic images of natural
environments is created with corresponding albedo and shading (intrinsics), as
well as semantic labels (segmentation) assigned to each object/scene. The
experiments show that joint learning of intrinsic image decomposition and
semantic segmentation is beneficial for both tasks for natural scenes. Dataset
and models are available at: https://ivi.fnwi.uva.nl/cv/intrinsegComment: ECCV 201
Learning to Reconstruct Texture-less Deformable Surfaces from a Single View
Recent years have seen the development of mature solutions for reconstructing
deformable surfaces from a single image, provided that they are relatively
well-textured. By contrast, recovering the 3D shape of texture-less surfaces
remains an open problem, and essentially relates to Shape-from-Shading. In this
paper, we introduce a data-driven approach to this problem. We introduce a
general framework that can predict diverse 3D representations, such as meshes,
normals, and depth maps. Our experiments show that meshes are ill-suited to
handle texture-less 3D reconstruction in our context. Furthermore, we
demonstrate that our approach generalizes well to unseen objects, and that it
yields higher-quality reconstructions than a state-of-the-art SfS technique,
particularly in terms of normal estimates. Our reconstructions accurately model
the fine details of the surfaces, such as the creases of a T-Shirt worn by a
person.Comment: Accepted to 3DV 201
Terrain analysis using radar shape-from-shading
This paper develops a maximum a posteriori (MAP) probability estimation framework for shape-from-shading (SFS) from synthetic aperture radar (SAR) images. The aim is to use this method to reconstruct surface topography from a single radar image of relatively complex terrain. Our MAP framework makes explicit how the recovery of local surface orientation depends on the whereabouts of terrain edge features and the available radar reflectance information. To apply the resulting process to real world radar data, we require probabilistic models for the appearance of terrain features and the relationship between the orientation of surface normals and the radar reflectance. We show that the SAR data can be modeled using a Rayleigh-Bessel distribution and use this distribution to develop a maximum likelihood algorithm for detecting and labeling terrain edge features. Moreover, we show how robust statistics can be used to estimate the characteristic parameters of this distribution. We also develop an empirical model for the SAR reflectance function. Using the reflectance model, we perform Lambertian correction so that a conventional SFS algorithm can be applied to the radar data. The initial surface normal direction is constrained to point in the direction of the nearest ridge or ravine feature. Each surface normal must fall within a conical envelope whose axis is in the direction of the radar illuminant. The extent of the envelope depends on the corrected radar reflectance and the variance of the radar signal statistics. We explore various ways of smoothing the field of surface normals using robust statistics. Finally, we show how to reconstruct the terrain surface from the smoothed field of surface normal vectors. The proposed algorithm is applied to various SAR data sets containing relatively complex terrain structure
Live User-guided Intrinsic Video For Static Scenes
We present a novel real-time approach for user-guided intrinsic decomposition of static scenes captured by an RGB-D sensor. In the first step, we acquire a three-dimensional representation of the scene using a dense volumetric reconstruction framework. The obtained reconstruction serves as a proxy to densely fuse reflectance estimates and to store user-provided constraints in three-dimensional space. User constraints, in the form of constant shading and reflectance strokes, can be placed directly on the real-world geometry using an intuitive touch-based interaction metaphor, or using interactive mouse strokes. Fusing the decomposition results and constraints in three-dimensional space allows for robust propagation of this information to novel views by re-projection.We leverage this information to improve on the decomposition quality of existing intrinsic video decomposition techniques by further constraining the ill-posed decomposition problem. In addition to improved decomposition quality, we show a variety of live augmented reality applications such as recoloring of objects, relighting of scenes and editing of material appearance
Recovering facial shape using a statistical model of surface normal direction
In this paper, we show how a statistical model of facial shape can be embedded within a shape-from-shading algorithm. We describe how facial shape can be captured using a statistical model of variations in surface normal direction. To construct this model, we make use of the azimuthal equidistant projection to map the distribution of surface normals from the polar representation on a unit sphere to Cartesian points on a local tangent plane. The distribution of surface normal directions is captured using the covariance matrix for the projected point positions. The eigenvectors of the covariance matrix define the modes of shape-variation in the fields of transformed surface normals. We show how this model can be trained using surface normal data acquired from range images and how to fit the model to intensity images of faces using constraints on the surface normal direction provided by Lambert's law. We demonstrate that the combination of a global statistical constraint and local irradiance constraint yields an efficient and accurate approach to facial shape recovery and is capable of recovering fine local surface details. We assess the accuracy of the technique on a variety of images with ground truth and real-world images
Self-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250 Hz
The reconstruction of dense 3D models of face geometry and appearance from a
single image is highly challenging and ill-posed. To constrain the problem,
many approaches rely on strong priors, such as parametric face models learned
from limited 3D scan data. However, prior models restrict generalization of the
true diversity in facial geometry, skin reflectance and illumination. To
alleviate this problem, we present the first approach that jointly learns 1) a
regressor for face shape, expression, reflectance and illumination on the basis
of 2) a concurrently learned parametric face model. Our multi-level face model
combines the advantage of 3D Morphable Models for regularization with the
out-of-space generalization of a learned corrective space. We train end-to-end
on in-the-wild images without dense annotations by fusing a convolutional
encoder with a differentiable expert-designed renderer and a self-supervised
training loss, both defined at multiple detail levels. Our approach compares
favorably to the state-of-the-art in terms of reconstruction quality, better
generalizes to real world faces, and runs at over 250 Hz.Comment: CVPR 2018 (Oral). Project webpage:
https://gvv.mpi-inf.mpg.de/projects/FML
- …