NeRFactor: Neural Factorization of Shape and Reflectance Under an Unknown Illumination
We address the problem of recovering the shape and spatially-varying
reflectance of an object from multi-view images (and their camera poses) of an
object illuminated by one unknown lighting condition. This enables the
rendering of novel views of the object under arbitrary environment lighting and
editing of the object's material properties. The key to our approach, which we
call Neural Radiance Factorization (NeRFactor), is to distill the volumetric
geometry of a Neural Radiance Field (NeRF) [Mildenhall et al. 2020]
representation of the object into a surface representation and then jointly
refine the geometry while solving for the spatially-varying reflectance and
environment lighting. Specifically, NeRFactor recovers 3D neural fields of
surface normals, light visibility, albedo, and Bidirectional Reflectance
Distribution Functions (BRDFs) without any supervision, using only a
re-rendering loss, simple smoothness priors, and a data-driven BRDF prior
learned from real-world BRDF measurements. By explicitly modeling light
visibility, NeRFactor is able to separate shadows from albedo and synthesize
realistic soft or hard shadows under arbitrary lighting conditions. NeRFactor
is able to recover convincing 3D models for free-viewpoint relighting in this
challenging and underconstrained capture setup for both synthetic and real
scenes. Qualitative and quantitative experiments show that NeRFactor
outperforms classic and deep learning-based state of the art across various
tasks. Our videos, code, and data are available at
people.csail.mit.edu/xiuming/projects/nerfactor/.
Comment: Camera-ready version for SIGGRAPH Asia 2021. Project Page:
https://people.csail.mit.edu/xiuming/projects/nerfactor
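The role of light visibility in separating shadows from albedo can be illustrated with a discretized rendering sum at a single surface point. This is a minimal sketch and not the authors' implementation: it assumes a purely Lambertian BRDF (albedo / pi) in place of the learned BRDF the abstract describes, and the function name `shade_point` and all of its parameters are hypothetical.

```python
import numpy as np

def shade_point(normal, albedo, light_dirs, light_rgb, visibility, solid_angles):
    """Shade one surface point under sampled environment lighting.

    Assumption: Lambertian reflectance only (albedo / pi).

    normal:       (3,) unit surface normal
    albedo:       (3,) RGB albedo
    light_dirs:   (N, 3) unit directions toward each light sample
    light_rgb:    (N, 3) radiance of each light sample
    visibility:   (N,) values in [0, 1]; 0 means the sample is occluded
    solid_angles: (N,) solid angle covered by each light sample
    """
    # Cosine foreshortening term, clamped so back-facing lights contribute nothing.
    cosines = np.clip(light_dirs @ normal, 0.0, None)
    # Visibility scales each light sample, so shadowing is kept out of the albedo.
    weights = visibility * cosines * solid_angles
    irradiance = weights @ light_rgb  # (3,) RGB irradiance
    return (albedo / np.pi) * irradiance
```

Setting an entry of `visibility` to zero removes that light's contribution without touching `albedo`, which is the separation of shadows from albedo that the abstract attributes to explicit visibility modeling.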
OutCast: Outdoor Single-image Relighting with Cast Shadows
We propose a relighting method for outdoor images. Our method mainly focuses
on predicting cast shadows in arbitrary novel lighting directions from a single
image while also accounting for shading and global effects such as the sunlight
color and clouds. Previous solutions for this problem rely on reconstructing
occluder geometry, e.g. using multi-view stereo, which requires many images of
the scene. Instead, in this work we make use of a noisy off-the-shelf
single-image depth map estimation as a source of geometry. Whilst this can be a
good guide for some lighting effects, the resulting depth map quality is
insufficient for directly ray-tracing the shadows. Addressing this, we propose
a learned image space ray-marching layer that converts the approximate depth
map into a deep 3D representation that is fused into occlusion queries using a
learned traversal. Our proposed method achieves, for the first time,
state-of-the-art relighting results, with only a single image as input. For
supplementary material visit our project page at:
https://dgriffiths.uk/outcast.
Comment: Eurographics 2022 - Accepted
Photorealistic retrieval of occluded facial information using a performance-driven face model
Facial occlusions can cause both human observers and computer algorithms
to fail in a variety of important tasks such as facial action analysis and
expression classification. This is because the missing information is not
reconstructed accurately enough for the purpose of the task in hand. Most
current computer methods that are used to tackle this problem implement
complex three-dimensional polygonal face models that are generally time-consuming
to produce and unsuitable for photorealistic reconstruction of
missing facial features and behaviour.
In this thesis, an image-based approach is adopted to solve the occlusion
problem. A dynamic computer model of the face is used to retrieve the
occluded facial information from the driver faces. The model consists of a
set of orthogonal basis actions obtained by application of principal
component analysis (PCA) on image changes and motion fields extracted
from a sequence of natural facial motion (Cowe 2003). Examples of
occlusion affected facial behaviour can then be projected onto the model to
compute coefficients of the basis actions and thus produce photorealistic
performance-driven animations.
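The projection step described above can be sketched as follows. This is an illustrative example only, assuming each frame has already been flattened into a single vector (the thesis combines image changes and motion fields; that encoding is not reproduced here). The function names `fit_basis`, `project`, and `reconstruct` are hypothetical.

```python
import numpy as np

def fit_basis(frames, n_components):
    """Compute a mean vector and orthonormal PCA basis from training frames.

    frames: (T, D) array, one flattened frame per row (assumed encoding).
    Returns the mean (D,) and the basis (n_components, D).
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are the orthonormal principal directions (the basis actions).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(frame, mean, basis):
    """Coefficients of a (possibly occluded) driver frame on the basis actions."""
    return basis @ (frame - mean)

def reconstruct(coeffs, mean, basis):
    """Rebuild a full frame from basis-action coefficients."""
    return mean + coeffs @ basis
```

Because the basis rows are orthonormal, projection reduces to a matrix-vector product; this is the property that, as the abstract notes later, a non-orthogonal ICA basis does not offer.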
Visual inspection shows that the PCA face model recovers aspects of
expressions in those areas occluded in the driver sequence, but the expression is generally muted. To further investigate this finding, a database
of test sequences affected by a considerable set of artificial and natural
occlusions is created. A number of suitable metrics are developed to measure
the accuracy of the reconstructions. Regions of the face that are most
important for performance-driven mimicry and that seem to carry the best
information about global facial configurations are revealed using Bubbles,
thus in effect identifying facial areas that are most sensitive to occlusions.
Recovery of occluded facial information is enhanced by applying an
appropriate scaling factor to the respective coefficients of the basis actions
obtained by PCA. This method improves the reconstruction of the facial
actions emanating from the occluded areas of the face. However, because PCA
produces bases that encode composite, correlated actions,
such an enhancement also tends to affect actions in non-occluded areas of
the face. To avoid this, more localised controls for facial actions are
produced using independent component analysis (ICA). Simple projection
of the data onto an ICA model is not viable due to the non-orthogonality of
the extracted bases. Thus occlusion-affected mimicry is first generated using
the PCA model and then enhanced by accordingly manipulating the
independent components that are subsequently extracted from the mimicry.
This combination of methods yields significant improvements and results in
photorealistic reconstructions of occluded facial actions.
Extracting Triangular 3D Models, Materials, and Lighting From Images
We present an efficient method for joint optimization of topology, materials
and lighting from multi-view image observations. Unlike recent multi-view
reconstruction approaches, which typically produce entangled 3D representations
encoded in neural networks, we output triangle meshes with spatially-varying
materials and environment lighting that can be deployed in any traditional
graphics engine unmodified. We leverage recent work in differentiable
rendering, coordinate-based networks to compactly represent volumetric
texturing, alongside differentiable marching tetrahedrons to enable
gradient-based optimization directly on the surface mesh. Finally, we introduce
a differentiable formulation of the split sum approximation of environment
lighting to efficiently recover all-frequency lighting. Experiments show our
extracted models used in advanced scene editing, material decomposition, and
high quality view interpolation, all running at interactive rates in
triangle-based renderers (rasterizers and path tracers). Project website:
https://nvlabs.github.io/nvdiffrec/.
Comment: Project website: https://nvlabs.github.io/nvdiffrec
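For context, the split sum approximation mentioned in the abstract is commonly written as below. This is the standard form from the real-time rendering literature, given here as a sketch; the abstract does not specify the paper's own differentiable formulation. The symbols are assumptions of this note: $L_i$ is incident radiance, $f$ the BRDF, $D$ the normal distribution term of the specular lobe, $n$ the surface normal, and $\Omega$ the hemisphere around $n$.

```latex
L(\omega_o)
  = \int_{\Omega} L_i(\omega_i)\, f(\omega_i, \omega_o)\, (\omega_i \cdot n)\, d\omega_i
  \approx
  \int_{\Omega} f(\omega_i, \omega_o)\, (\omega_i \cdot n)\, d\omega_i
  \;\cdot\;
  \int_{\Omega} L_i(\omega_i)\, D(\omega_i, \omega_o)\, (\omega_i \cdot n)\, d\omega_i
```

The first factor depends only on the material and can be precomputed; the second is a pre-filtered version of the environment lighting.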