Live User-guided Intrinsic Video For Static Scenes
We present a novel real-time approach for user-guided intrinsic decomposition of static scenes captured by an RGB-D sensor. In the first step, we acquire a three-dimensional representation of the scene using a dense volumetric reconstruction framework. The obtained reconstruction serves as a proxy to densely fuse reflectance estimates and to store user-provided constraints in three-dimensional space. User constraints, in the form of constant shading and reflectance strokes, can be placed directly on the real-world geometry using an intuitive touch-based interaction metaphor, or using interactive mouse strokes. Fusing the decomposition results and constraints in three-dimensional space allows for robust propagation of this information to novel views by re-projection. We leverage this information to improve on the decomposition quality of existing intrinsic video decomposition techniques by further constraining the ill-posed decomposition problem. In addition to improved decomposition quality, we show a variety of live augmented reality applications such as recoloring of objects, relighting of scenes and editing of material appearance.
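As a rough illustration of why an intrinsic decomposition enables recoloring, the sketch below assumes the common per-channel model I = R * S (observed color = reflectance times shading). The abstract does not spell out this model, and the function name `recolor` and the sample values are hypothetical.

```python
def recolor(pixel, reflectance, new_reflectance, eps=1e-6):
    """Recolor one RGB pixel under the assumed model I = R * S (per channel).

    pixel, reflectance and new_reflectance are RGB triples of floats.
    """
    # Recover shading by dividing the observed color by the reflectance estimate.
    # eps guards against division by zero for very dark reflectance values.
    shading = [i / max(r, eps) for i, r in zip(pixel, reflectance)]
    # Re-render with the new reflectance while keeping the original shading.
    return [nr * s for nr, s in zip(new_reflectance, shading)]


observed = [0.4, 0.2, 0.1]      # hypothetical observed color
estimated_r = [0.8, 0.4, 0.2]   # hypothetical reflectance estimate
recolored = recolor(observed, estimated_r, [0.2, 0.4, 0.8])
```

Because the shading term is left untouched, the recolored pixel keeps the original illumination, which is the property that recoloring applications rely on.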
NARRATE: A Normal Assisted Free-View Portrait Stylizer
In this work, we propose NARRATE, a novel pipeline that enables
simultaneously editing portrait lighting and perspective in a photorealistic
manner. As a hybrid neural-physical face model, NARRATE leverages complementary
benefits of geometry-aware generative approaches and normal-assisted physical
face models. In a nutshell, NARRATE first inverts the input portrait to a
coarse geometry and employs neural rendering to generate images resembling the
input, as well as producing convincing pose changes. However, the inversion step
introduces mismatch, yielding low-quality images with fewer facial details. As
such, we further estimate portrait normal to enhance the coarse geometry,
creating a high-fidelity physical face model. In particular, we fuse the neural
and physical renderings to compensate for the imperfect inversion, resulting in
both realistic and view-consistent novel perspective images. In the relighting
stage, previous works focus on single-view portrait relighting but ignore
consistency between different perspectives, leading to unstable and
inconsistent lighting effects under view changes. We extend Total Relighting to
fix this problem by unifying its multi-view input normal maps with the physical
face model. NARRATE conducts relighting with consistent normal maps, imposing
cross-view constraints and exhibiting stable and coherent illumination effects.
We experimentally demonstrate that NARRATE achieves more photorealistic and
reliable results than prior works. We further bridge NARRATE with animation and
style transfer tools, supporting pose change, light change, facial animation,
and style transfer, either separately or in combination, all at a photographic
quality. We showcase vivid free-view facial animations as well as 3D-aware
relightable stylization, which help facilitate various AR/VR applications like
virtual cinematography, 3D video conferencing, and post-production.
Comment: 14 pages, 13 figures https://youtu.be/mP4FV3evmy
What Is Around The Camera?
How much does a single image reveal about the environment it was taken in? In
this paper, we investigate how much of that information can be retrieved from a
foreground object, combined with the background (i.e. the visible part of the
environment). Assuming it is not perfectly diffuse, the foreground object acts
as a complexly shaped and far-from-perfect mirror. An additional challenge is
that its appearance confounds the light coming from the environment with the
unknown materials it is made of. We propose a learning-based approach to
predict the environment from multiple reflectance maps that are computed from
approximate surface normals. The proposed method allows us to jointly model the
statistics of environments and material properties. We train our system from
synthesized training data, but demonstrate its applicability to real-world
data. Interestingly, our analysis shows that the information obtained from
objects made out of multiple materials often is complementary and leads to
better performance.
Comment: Accepted to ICCV. Project:
http://homes.esat.kuleuven.be/~sgeorgou/multinatillum
Relighting4D: Neural Relightable Human from Videos
Human relighting is a highly desirable yet challenging task. Existing works
either require expensive one-light-at-a-time (OLAT) data captured with a light
stage or cannot freely change the viewpoints of the rendered body. In this
work, we propose a principled framework, Relighting4D, that enables
free-viewpoint relighting from only human videos under unknown illuminations.
Our key insight is that the space-time varying geometry and reflectance of the
human body can be decomposed as a set of neural fields of normal, occlusion,
diffuse, and specular maps. These neural fields are further integrated into
reflectance-aware physically based rendering, where each vertex in the neural
field absorbs and reflects the light from the environment. The whole framework
can be learned from videos in a self-supervised manner, with physically
informed priors designed for regularization. Extensive experiments on both real
and synthetic datasets demonstrate that our framework is capable of relighting
dynamic human actors with free viewpoints.
Comment: ECCV 2022; Project Page
https://frozenburning.github.io/projects/relighting4d Code is available at
https://github.com/FrozenBurning/Relighting4
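To make the role of the normal, occlusion, diffuse, and specular maps concrete, here is a minimal per-point shading sketch. It swaps in the classical Blinn-Phong model with a single directional light, which is an assumption for illustration and not the rendering equation used by Relighting4D; the function `shade` and its parameters are hypothetical.

```python
import math


def normalize(v):
    """Return v scaled to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(c * c for c in v))
    return list(v) if n == 0.0 else [c / n for c in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def shade(normal, light_dir, view_dir, diffuse, specular, occlusion, shininess=32.0):
    """Blinn-Phong shading of one surface point (illustrative stand-in model).

    diffuse is an RGB triple; specular and occlusion are scalars in [0, 1].
    """
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    # Half vector between the light and view directions.
    h = normalize([a + b for a, b in zip(l, v)])
    n_dot_l = max(dot(n, l), 0.0)
    # No specular highlight when the light is behind the surface.
    spec = max(dot(n, h), 0.0) ** shininess if n_dot_l > 0.0 else 0.0
    # Occlusion attenuates the light that reaches the point.
    return [occlusion * (d * n_dot_l + specular * spec) for d in diffuse]


front_lit = shade([0, 0, 1], [0, 0, 1], [0, 0, 1], [0.5, 0.4, 0.3], 0.2, 1.0)
```

Changing `light_dir` while keeping the per-point maps fixed is what relighting amounts to in this simplified setting.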
Towards Practical Capture of High-Fidelity Relightable Avatars
In this paper, we propose a novel framework, Tracking-free Relightable Avatar
(TRAvatar), for capturing and reconstructing high-fidelity 3D avatars. Compared
to previous methods, TRAvatar works in a more practical and efficient setting.
Specifically, TRAvatar is trained with dynamic image sequences captured in a
Light Stage under varying lighting conditions, enabling realistic relighting
and real-time animation for avatars in diverse scenes. Additionally, TRAvatar
allows for tracking-free avatar capture and obviates the need for accurate
surface tracking under varying illumination conditions. Our contributions are
two-fold: First, we propose a novel network architecture that explicitly builds
on and ensures the satisfaction of the linear nature of lighting. Trained on
simple group light captures, TRAvatar can predict the appearance in real-time
with a single forward pass, achieving high-quality relighting effects under
illuminations of arbitrary environment maps. Second, we jointly optimize the
facial geometry and relightable appearance from scratch based on image
sequences, where the tracking is implicitly learned. This tracking-free
approach brings robustness for establishing temporal correspondences between
frames under different lighting conditions. Extensive qualitative and
quantitative experiments demonstrate that our framework achieves superior
performance for photorealistic avatar animation and relighting.
Comment: Accepted to SIGGRAPH Asia 2023 (Conference); Project page:
https://travatar-paper.github.io
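The "linear nature of lighting" mentioned in the abstract above can be illustrated with classical image-based relighting: an image under a combination of lights equals the weighted sum of the images under each light alone. The sketch below shows only this general property, not TRAvatar's network; `relight` and the toy two-pixel images are hypothetical.

```python
def relight(basis_images, weights):
    """Weighted sum of per-light basis images.

    basis_images: list of images, each a flat list of float intensities
                  captured with exactly one light (or light group) switched on.
    weights:      one intensity per light, e.g. sampled from an environment map.
    """
    if len(basis_images) != len(weights):
        raise ValueError("need exactly one weight per basis image")
    out = [0.0] * len(basis_images[0])
    for image, w in zip(basis_images, weights):
        for k, value in enumerate(image):
            out[k] += w * value
    return out


# Two toy "images" of two pixels each, one per light.
basis = [[1.0, 0.0], [0.0, 2.0]]
combined = relight(basis, [0.5, 0.25])
```

Since the operation is linear, scaling all weights scales the result by the same factor, and relighting under two environments and adding the results matches relighting under their sum.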