Coherent Intrinsic Images from Photo Collections
An intrinsic image is a decomposition of a photo into an illumination layer and a reflectance layer, which enables powerful editing such as the alteration of an object's material independently of its illumination. However, decomposing a single photo is highly under-constrained and existing methods require user assistance or handle only simple scenes. In this paper, we compute intrinsic decompositions using several images of the same scene under different viewpoints and lighting conditions. We use multi-view stereo to automatically reconstruct 3D points and normals from which we derive relationships between reflectance values at different locations, across multiple views and consequently different lighting conditions. We use robust estimation to reliably identify reflectance ratios between pairs of points. From these, we infer constraints for our optimization and enforce a coherent solution across multiple views and illuminations. Our results demonstrate that this constrained optimization yields high-quality and coherent intrinsic decompositions of complex scenes. We illustrate how these decompositions can be used for image-based illumination transfer and transitions between views with consistent lighting.
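The reflectance-ratio idea in this abstract can be illustrated with a minimal sketch. It assumes the usual multiplicative model (intensity = reflectance × shading) and assumes that the two points receive similar shading in most images, so that the per-image intensity ratio approximates the reflectance ratio. The median is used here as a stand-in robust estimator; the abstract does not say which estimator the authors use, and the function name and sample values are hypothetical.

```python
from statistics import median


def robust_reflectance_ratio(intensities_p, intensities_q, eps=1e-6):
    """Estimate R_p / R_q from intensities of two points seen in several images.

    Assumption: I = R * S, and both points share similar shading S in most
    images, so I_p / I_q is close to R_p / R_q. The median discards images
    where that assumption fails (for example, one point is in shadow).
    """
    ratios = [ip / iq for ip, iq in zip(intensities_p, intensities_q) if iq > eps]
    if not ratios:
        raise ValueError("no valid observations for this pair of points")
    return median(ratios)


# Hypothetical observations: five lighting conditions, true ratio 0.8 / 0.4 = 2.
R_p, R_q = 0.8, 0.4
shading = [1.0, 0.5, 0.7, 0.9, 0.6]
obs_p = [R_p * s for s in shading]
obs_q = [R_q * s for s in shading]
obs_q[1] = R_q * 0.1  # point q falls into shadow in the second image (outlier)

ratio = robust_reflectance_ratio(obs_p, obs_q)
```

A ratio estimated this way could then serve as one pairwise constraint in a larger optimization; that optimization is not sketched here.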
Unsupervised Deep Single-Image Intrinsic Decomposition using Illumination-Varying Image Sequences
Machine learning based Single Image Intrinsic Decomposition (SIID) methods
decompose a captured scene into its albedo and shading images by using the
knowledge of a large set of known and realistic ground truth decompositions.
Collecting and annotating such a dataset is an approach that cannot scale to
sufficient variety and realism. We free ourselves from this limitation by
training on unannotated images.
Our method leverages the observation that two images of the same scene but
with different lighting provide useful information on their intrinsic
properties: by definition, albedo is invariant to lighting conditions, and
cross-combining the estimated albedo of a first image with the estimated
shading of a second one should lead back to the second one's input image. We
transcribe this relationship into a siamese training scheme for a deep
convolutional neural network that decomposes a single image into albedo and
shading. The siamese setting allows us to introduce a new loss function
including such cross-combinations, and to train solely on (time-lapse) images,
discarding the need for any ground truth annotations.
As a result, our method has the good properties of i) taking advantage of the
time-varying information of image sequences in the (pre-computed) training
step, ii) not requiring ground truth data to train on, and iii) being able to
decompose single images of unseen scenes at runtime. To demonstrate and
evaluate our work, we additionally propose a new rendered dataset containing
illumination-varying scenes and a set of quantitative metrics to evaluate SIID
algorithms. Despite its unsupervised nature, our results compete with
state-of-the-art methods, including supervised and non-data-driven methods. Comment: To appear in Pacific Graphics 201
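The cross-combination relationship described in this abstract can be sketched as a loss over two images of the same scene under different lighting. This is only an illustration under stated assumptions: images are flat lists of pixel values, an image is modeled as the element-wise product of albedo and shading, all terms use mean squared error with equal weights, and the network that would predict the albedo and shading is omitted. The function names are hypothetical and the exact loss in the paper may differ.

```python
def mse(a, b):
    """Mean squared error between two equally sized pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def compose(albedo, shading):
    """Assumed image formation model: image = albedo * shading, per pixel."""
    return [a * s for a, s in zip(albedo, shading)]


def cross_combination_loss(img1, img2, alb1, shd1, alb2, shd2):
    # Each decomposition should reconstruct its own input image.
    recon = mse(compose(alb1, shd1), img1) + mse(compose(alb2, shd2), img2)
    # Albedo of one image combined with shading of the other should give
    # back the image that the shading came from.
    cross = mse(compose(alb1, shd2), img2) + mse(compose(alb2, shd1), img1)
    # Albedo is invariant to lighting, so both estimates should agree.
    consistency = mse(alb1, alb2)
    return recon + cross + consistency


# Hypothetical three-pixel scene under two lighting conditions.
albedo = [0.2, 0.5, 0.9]
shading1 = [1.0, 0.8, 0.6]
shading2 = [0.3, 0.4, 0.5]
img1, img2 = compose(albedo, shading1), compose(albedo, shading2)

# A correct decomposition incurs no loss.
perfect_loss = cross_combination_loss(img1, img2, albedo, shading1, albedo, shading2)

# A degenerate one (albedo = image, shading = 1) reconstructs each input
# but is penalized by the cross-combination and consistency terms.
ones = [1.0, 1.0, 1.0]
degenerate_loss = cross_combination_loss(img1, img2, img1, ones, img2, ones)
```

The degenerate case shows why the cross terms matter: per-image reconstruction alone cannot tell a trivial decomposition from a meaningful one.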
CGIntrinsics: Better Intrinsic Image Decomposition through Physically-Based Rendering
Intrinsic image decomposition is a challenging, long-standing computer vision
problem for which ground truth data is very difficult to acquire. We explore
the use of synthetic data for training CNN-based intrinsic image decomposition
models, then applying these learned models to real-world images. To that end,
we present CGIntrinsics, a new, large-scale dataset of physically-based rendered images
of scenes with full ground truth decompositions. The rendering process we use
is carefully designed to yield high-quality, realistic images, which we find to
be crucial for this problem domain. We also propose a new end-to-end training
method that learns better decompositions by leveraging CGIntrinsics, and optionally IIW
and SAW, two recent datasets of sparse annotations on real-world images.
Surprisingly, we find that a decomposition network trained solely on our
synthetic data outperforms the state-of-the-art on both IIW and SAW, and
performance improves even further when IIW and SAW data is added during
training. Our work demonstrates the surprising effectiveness of
carefully-rendered synthetic data for the intrinsic images task. Comment: Paper for 'CGIntrinsics: Better Intrinsic Image Decomposition through
Physically-Based Rendering' published in ECCV, 201
Live User-guided Intrinsic Video For Static Scenes
We present a novel real-time approach for user-guided intrinsic decomposition of static scenes captured by an RGB-D sensor. In the first step, we acquire a three-dimensional representation of the scene using a dense volumetric reconstruction framework. The obtained reconstruction serves as a proxy to densely fuse reflectance estimates and to store user-provided constraints in three-dimensional space. User constraints, in the form of constant shading and reflectance strokes, can be placed directly on the real-world geometry using an intuitive touch-based interaction metaphor, or using interactive mouse strokes. Fusing the decomposition results and constraints in three-dimensional space allows for robust propagation of this information to novel views by re-projection. We leverage this information to improve on the decomposition quality of existing intrinsic video decomposition techniques by further constraining the ill-posed decomposition problem. In addition to improved decomposition quality, we show a variety of live augmented reality applications such as recoloring of objects, relighting of scenes and editing of material appearance.
Reflectance Adaptive Filtering Improves Intrinsic Image Estimation
Separating an image into reflectance and shading layers poses a challenge for
learning approaches because no large corpus of precise and realistic ground
truth decompositions exists. The Intrinsic Images in the Wild (IIW) dataset
provides a sparse set of relative human reflectance judgments, which serves as
a standard benchmark for intrinsic images. A number of methods use IIW to learn
statistical dependencies between the images and their reflectance layer.
Although learning plays an important role for high performance, we show that a
standard signal processing technique achieves performance on par with current
state-of-the-art. We propose a loss function for CNN learning of dense
reflectance predictions. Our results show a simple pixel-wise decision, without
any context or prior knowledge, is sufficient to provide a strong baseline on
IIW. This sets a competitive baseline which only two other approaches surpass.
We then develop a joint bilateral filtering method that implements strong prior
knowledge about reflectance constancy. This filtering operation can be applied
to any intrinsic image algorithm and we improve several previous results
achieving a new state-of-the-art on IIW. Our findings suggest that the effect
of learning-based approaches may have been over-estimated so far. Explicit
prior knowledge is still at least as important to obtain high performance in
intrinsic image decompositions. Comment: CVPR 201
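As a rough illustration of the joint bilateral filtering mentioned in this abstract, the sketch below smooths a reflectance estimate using weights taken from a separate guide image, so that pixels are averaged only with neighbors that look similar in the guide. This is a generic joint bilateral filter, not the authors' implementation: the choice of guide, the parameter values, and the function name are assumptions, and images are small grayscale nested lists for brevity.

```python
import math


def joint_bilateral_filter(reflectance, guide, radius=1, sigma_s=1.0, sigma_r=0.1):
    """Smooth `reflectance` with weights computed from `guide` (same size).

    The weight of a neighbor is a spatial Gaussian times a range Gaussian on
    the guide difference, which encourages piecewise-constant reflectance
    while keeping edges that are present in the guide.
    """
    h, w = len(reflectance), len(reflectance[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        spatial = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                        diff = guide[yy][xx] - guide[y][x]
                        similarity = math.exp(-(diff * diff) / (2 * sigma_r ** 2))
                        weight = spatial * similarity
                        num += weight * reflectance[yy][xx]
                        den += weight
            out[y][x] = num / den  # den >= 1: the center pixel has weight 1
    return out


# Hypothetical 3x4 example: two flat regions in the guide, noisy reflectance.
guide = [[0.2, 0.2, 0.8, 0.8] for _ in range(3)]
noisy = [
    [0.28, 0.32, 0.72, 0.68],
    [0.32, 0.28, 0.68, 0.72],
    [0.28, 0.32, 0.72, 0.68],
]
smoothed = joint_bilateral_filter(noisy, guide)
```

Because the filter only post-processes a reflectance map, it can in principle be applied to the output of any decomposition method, which matches the abstract's claim that the operation is algorithm-agnostic.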
Improvement of PolSAR Decomposition Scattering Powers Using a Relative Decorrelation Measure
In this letter, a methodology is proposed to improve the scattering powers
obtained from model-based decomposition using Polarimetric Synthetic Aperture
Radar (PolSAR) data. The novelty of this approach lies in utilizing the
intrinsic information in the off-diagonal elements of the 3×3 coherency
matrix, represented in the form of complex correlation
coefficients. Two complex correlation coefficients are computed between
co-polarization and cross-polarization components of the Pauli scattering
vector. The difference between the moduli of the complex correlation coefficients
corresponding to the degree of polarization (DOP) optimized coherency matrix
and the original coherency matrix is
obtained. Then a suitable scaling is performed using fractions obtained
from the diagonal elements of the matrix.
Thereafter, these new quantities are used in modifying the Yamaguchi
4-component scattering powers. To
corroborate the fact that these quantities have physical relevance, a
quantitative analysis of these for the L-band AIRSAR San Francisco and the
L-band Kyoto images is illustrated. Finally, the scattering powers obtained
from the proposed methodology are compared with the corresponding powers
obtained from the Yamaguchi et al. 4-component (Y4O) decomposition and
the Yamaguchi et al. 4-component Rotated (Y4R) decomposition for the
same data sets. The proportion of negative power pixels is also computed. The
results show an improvement on all these attributes by using the proposed
methodology. Comment: Accepted for publication in Remote Sensing Letter
Neural Face Editing with Intrinsic Image Disentangling
Traditional face editing methods often require a number of sophisticated and
task-specific algorithms to be applied one after the other, a process that
is tedious, fragile, and computationally intensive. In this paper, we propose
an end-to-end generative adversarial network that infers a face-specific
disentangled representation of intrinsic face properties, including shape (i.e.
normals), albedo, and lighting, and an alpha matte. We show that this network
can be trained on "in-the-wild" images by incorporating an in-network
physically-based image formation module and appropriate loss functions. Our
disentangling latent representation allows for semantically relevant edits,
where one aspect of facial appearance can be manipulated while keeping
orthogonal properties fixed, and we demonstrate its use for a number of facial
editing applications. Comment: CVPR 2017 ora
Plausible Shading Decomposition For Layered Photo Retouching
Photographers routinely compose multiple manipulated photos of the same scene (layers) into a single image, which is better than any individual photo could be alone. Similarly, 3D artists set up rendering systems to produce layered images that contain only individual aspects of the light transport, which are composed into the final result in post-production. Regrettably, both approaches either take considerable time to capture, or remain limited to synthetic scenes. In this paper, we suggest a system that decomposes a single image into a plausible shading decomposition (PSD) that approximates effects such as shadow, diffuse illumination, albedo, and specular shading. This decomposition can then be manipulated in any off-the-shelf image manipulation software and recomposited. We perform such a decomposition with a convolutional neural network trained on synthetic data. We demonstrate the effectiveness of our decomposition on synthetic (i.e., rendered) and real data (i.e., photographs), and use it for common photo manipulations that are nearly impossible to perform otherwise from single images.