The Visual Centrifuge: Model-Free Layered Video Representations
True video understanding requires making sense of non-Lambertian scenes, where
the color of light arriving at the camera sensor encodes information not just
about the last object it bounced off, but about multiple media -- colored
windows, dirty mirrors, smoke or rain. Layered video representations have the
potential to accurately model such realistic scenes, but have so far required
stringent assumptions on motion, lighting and shape. Here we propose a
learning-based approach for multi-layered video representation: we introduce
novel uncertainty-capturing 3D convolutional architectures and train them to
separate blended videos. We show that these models then generalize to single
videos, where they exhibit interesting abilities: color constancy, factoring
out shadows and separating reflections. We present quantitative and qualitative
results on real-world videos.
Comment: Appears in: 2019 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR 2019). This arXiv version contains the CVPR camera-ready
version of the paper (with larger figures) as well as an appendix detailing
the model architecture.
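
To make the blended-video training signal concrete: since neither output slot
of the separator is tied to a particular source video, the reconstruction loss
has to be invariant to permuting the predictions. Below is a minimal PyTorch
sketch of that idea; the tiny 3D-conv separator, the uniform 50/50 blend, and
the L1 loss are illustrative assumptions, not the paper's architecture or
exact objective.

    import torch
    import torch.nn as nn

    class TinySeparator(nn.Module):
        # Toy 3D-conv net that predicts two video layers from a blended clip.
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.Conv3d(16, 6, kernel_size=3, padding=1),  # 2 layers x 3 channels
            )

        def forward(self, x):
            out = self.net(x)
            return out[:, :3], out[:, 3:]

    def permutation_invariant_loss(pred_a, pred_b, gt_a, gt_b):
        # The model cannot know which output slot should hold which source,
        # so score both assignments and keep the cheaper one.
        l1 = nn.functional.l1_loss
        direct = l1(pred_a, gt_a) + l1(pred_b, gt_b)
        swapped = l1(pred_a, gt_b) + l1(pred_b, gt_a)
        return torch.minimum(direct, swapped)

    # Self-supervised batch: average two random clips, then ask the model
    # to recover both constituents.
    model = TinySeparator()
    clip1 = torch.rand(2, 3, 8, 64, 64)  # (batch, channels, time, H, W)
    clip2 = torch.rand(2, 3, 8, 64, 64)
    blend = 0.5 * (clip1 + clip2)
    pred1, pred2 = model(blend)
    loss = permutation_invariant_loss(pred1, pred2, clip1, clip2)
    loss.backward()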
RPNR: Robust-Perception Neural Reshading
Augmented Reality (AR) applications require methods for inserting objects into
camera-captured scenes in a way that is coherent with the surroundings. Common
AR applications insert predefined 3D objects with known properties and shape.
This simplifies the problem, since it reduces to extracting an illumination
model for the object in that scene by understanding the surrounding light
sources. Often, however, we have no information about the properties of an
object, especially when starting from a single source image. Our method
renders such source fragments coherently with the target surroundings using
only these two images. Our
pipeline uses a Deep Image Prior (DIP) network based on a U-Net architecture as
the main renderer, alongside robust feature-extracting networks that are used
to apply the needed losses. Our method requires neither pair-labeled data nor
extensive training on a dataset. We compare our method, using qualitative
metrics, against baseline methods such as Cut and Paste, Cut-and-Paste Neural
Rendering, and Image Harmonization.
Comment: 7 pages
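
For readers unfamiliar with Deep Image Prior, the core mechanic is per-image
optimization of an untrained network: there is no dataset and no pretraining,
only losses evaluated on the one target composite. A minimal PyTorch sketch
follows; the small conv stack and plain L1 loss are stand-ins for the paper's
U-Net renderer and robust-feature losses.

    import torch
    import torch.nn as nn

    net = nn.Sequential(  # stand-in for the U-Net renderer
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
    )
    target = torch.rand(1, 3, 128, 128)  # composited source-in-target image
    z = torch.randn(1, 32, 128, 128)     # fixed noise input, never updated

    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for step in range(200):  # per-image optimization, no dataset
        opt.zero_grad()
        out = net(z)
        # Swap in perceptual / robust-feature losses here.
        loss = nn.functional.l1_loss(out, target)
        loss.backward()
        opt.step()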
Fast Fourier Intrinsic Network
We address the problem of decomposing an image into albedo and shading. We
propose the Fast Fourier Intrinsic Network (FFI-Net for short), which operates in
the spectral domain, splitting the input into several spectral bands. Weights
in FFI-Net are optimized in the spectral domain, allowing faster convergence to
a lower error. FFI-Net is lightweight and does not need auxiliary networks for
training. The network is trained end-to-end with a novel spectral loss which
measures the global distance between the network prediction and corresponding
ground truth. FFI-Net achieves state-of-the-art performance on MPI-Sintel, MIT
Intrinsic, and IIW datasets.
Comment: WACV 2021, camera-ready version
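
As a rough illustration of a spectral loss of the kind described above,
prediction and ground truth can be compared after a Fourier transform, so that
every frequency coefficient aggregates information from the whole image and
the distance is global rather than pixel-local. The PyTorch sketch below is a
generic example; FFI-Net's actual band splitting and loss weighting are not
reproduced here.

    import torch

    def spectral_loss(pred, gt):
        # L1 distance between the complex 2D FFT coefficients of the
        # prediction and the ground truth.
        fp = torch.fft.rfft2(pred)  # (B, C, H, W//2 + 1), complex-valued
        fg = torch.fft.rfft2(gt)
        return (fp - fg).abs().mean()

    pred = torch.rand(1, 3, 64, 64, requires_grad=True)
    gt = torch.rand(1, 3, 64, 64)
    loss = spectral_loss(pred, gt)
    loss.backward()  # gradients flow back through the FFT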
Estimating Reflectance Layer from A Single Image: Integrating Reflectance Guidance and Shadow/Specular Aware Learning
Estimating the reflectance layer from a single image is a challenging task. It
becomes more challenging when the input image contains shadows or specular
highlights, which often render an inaccurate estimate of the reflectance layer.
To tackle this problem, we propose a two-stage learning method comprising
reflectance guidance and a Shadow/Specular-Aware (S-Aware) network.
In the first stage, an initial reflectance layer free from shadows and
specularities is obtained with the constraint of novel losses that are guided
by prior-based shadow-free and specular-free images. To further enforce the
reflectance layer's independence from shadows and specularities in the
second-stage refinement, we introduce an S-Aware network that distinguishes the
reflectance image from the input image. Our network employs a classifier to
categorize shadow/shadow-free, specular/specular-free classes, enabling the
activation features to function as attention maps that focus on shadow/specular
regions. Our quantitative and qualitative evaluations show that our method
outperforms state-of-the-art methods in estimating a reflectance layer free
from shadows and specularities.
Comment: Accepted to AAAI 2023
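
The classifier-as-attention idea is reminiscent of class activation mapping: a
network trained to label images as shadow vs. shadow-free produces spatial
activations that highlight the regions driving its decision, and those
activations can be reused as an attention map. The PyTorch sketch below
illustrates this under an assumed architecture and class layout; it is not the
paper's S-Aware network.

    import torch
    import torch.nn as nn

    features = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    )
    head = nn.Linear(32, 2)  # shadow / shadow-free logits (assumed layout)

    x = torch.rand(1, 3, 64, 64)
    fmap = features(x)                    # (1, 32, 64, 64)
    logits = head(fmap.mean(dim=(2, 3)))  # global average pool, then classify
    # (during training, `logits` would receive shadow/shadow-free supervision)

    # Class activation map for the "shadow" class (index 0 by assumption):
    # weight the feature maps by the classifier weights for that class.
    w = head.weight[0].view(1, -1, 1, 1)  # (1, 32, 1, 1)
    attention = torch.relu((w * fmap).sum(dim=1, keepdim=True))
    attention = attention / (attention.amax() + 1e-8)  # normalize to [0, 1]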
DIP: Differentiable Interreflection-aware Physics-based Inverse Rendering
We present a physics-based inverse rendering method that learns the
illumination, geometry, and materials of a scene from posed multi-view RGB
images. To model the illumination of a scene, existing inverse rendering works
either completely ignore indirect illumination or model it with coarse
approximations, leading to sub-optimal predictions of the scene's
illumination, geometry, and materials. In this work, we propose a
physics-based illumination model that explicitly traces the incoming indirect
light at each surface point based on interreflection, and then estimates each
identified indirect light with an efficient neural network. Furthermore, we
use the Leibniz integral rule to resolve the non-differentiability in the
proposed illumination model caused by one type of environment light -- the
tangent lights. As a result, the proposed interreflection-aware illumination
model can be learned end-to-end together with geometry and materials
estimation. As a by-product,
our physics-based inverse rendering model also facilitates flexible and
realistic material editing as well as relighting. Extensive experiments on both
synthetic and real-world datasets demonstrate that the proposed method performs
favorably against existing inverse rendering methods on novel view synthesis
and inverse rendering.
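
For reference, the general Leibniz integral rule invoked above is what allows
differentiating an integral whose limits themselves depend on the parameter,
producing the boundary terms that naive differentiation under the integral
sign would miss; how the paper maps tangent lights onto the moving limits is
not reproduced here.

    \frac{d}{d\theta} \int_{a(\theta)}^{b(\theta)} f(x,\theta)\,dx
      = f\bigl(b(\theta),\theta\bigr)\,b'(\theta)
      - f\bigl(a(\theta),\theta\bigr)\,a'(\theta)
      + \int_{a(\theta)}^{b(\theta)} \frac{\partial f}{\partial \theta}(x,\theta)\,dx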