485 research outputs found
Relightable Neural Human Assets from Multi-view Gradient Illuminations
Human modeling and relighting are two fundamental problems in computer vision
and graphics, where high-quality datasets can largely facilitate related
research. However, most existing human datasets only provide multi-view human
images captured under the same illumination. Although valuable for modeling
tasks, they are not readily usable for relighting problems. To promote research in
both fields, in this paper, we present UltraStage, a new 3D human dataset that
contains more than 2,000 high-quality human assets captured under both
multi-view and multi-illumination settings. Specifically, for each example, we
provide 32 surrounding views illuminated with one white light and two gradient
illuminations. In addition to regular multi-view images, gradient illuminations
help recover detailed surface normal and spatially-varying material maps,
enabling various relighting applications. Inspired by recent advances in neural
representation, we further interpret each example into a neural human asset
which allows novel view synthesis under arbitrary lighting conditions. We show
our neural human assets can achieve extremely high capture performance and are
capable of representing fine details such as facial wrinkles and cloth folds.
We also validate UltraStage in single image relighting tasks, training neural
networks with virtually relighted data from the neural assets and demonstrating
realistic rendering improvements over prior art. UltraStage will be publicly
available to the community to stimulate significant future developments in
various human modeling and rendering tasks. The dataset is available at
https://miaoing.github.io/RNHA
Comment: Project page: https://miaoing.github.io/RNHA
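Under a Lambertian model, the gradient illuminations described above let each gradient-lit image encode one normal component relative to the full-on white capture, which is how gradient-illumination setups recover detailed surface normals. A minimal sketch of that ratio-based recovery; the function name, image shapes, and encoding convention are illustrative assumptions, not details from the paper:

```python
import numpy as np

def normals_from_gradients(i_grad, i_full, eps=1e-8):
    """Estimate per-pixel normals from gradient-lit images (Lambertian sketch).

    i_grad: (H, W, 3) images lit by linear gradients along x, y, z
    i_full: (H, W) image under uniform full-on white illumination
    Assumes each gradient image encodes one normal component as
    I_axis ~= I_full * (n_axis + 1) / 2.
    """
    ratio = i_grad / (i_full[..., None] + eps)   # (H, W, 3)
    n = 2.0 * ratio - 1.0                        # undo the [0, 1] gradient encoding
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + eps  # unit normals
    return n
```

Given the multi-view capture, such per-view normals can then feed the material estimation the abstract mentions.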
Photo-Realistic Facial Details Synthesis from Single Image
We present a single-image 3D face synthesis technique that can handle
challenging facial expressions while recovering fine geometric details. Our
technique employs expression analysis for proxy face geometry generation and
combines supervised and unsupervised learning for facial detail synthesis. On
proxy generation, we conduct emotion prediction to determine a new
expression-informed proxy. On detail synthesis, we present a Deep Facial Detail
Net (DFDN) based on Conditional Generative Adversarial Net (CGAN) that employs
both geometry and appearance loss functions. For geometry, we capture 366
high-quality 3D scans from 122 different subjects under 3 facial expressions.
For appearance, we use an additional 20K in-the-wild face images and apply
image-based rendering to accommodate lighting variations. Comprehensive
experiments demonstrate that our framework can produce high-quality 3D faces
with realistic details under challenging facial expressions.
Towards Practical Capture of High-Fidelity Relightable Avatars
In this paper, we propose a novel framework, Tracking-free Relightable Avatar
(TRAvatar), for capturing and reconstructing high-fidelity 3D avatars. Compared
to previous methods, TRAvatar works in a more practical and efficient setting.
Specifically, TRAvatar is trained with dynamic image sequences captured in a
Light Stage under varying lighting conditions, enabling realistic relighting
and real-time animation for avatars in diverse scenes. Additionally, TRAvatar
allows for tracking-free avatar capture and obviates the need for accurate
surface tracking under varying illumination conditions. Our contributions are
two-fold: First, we propose a novel network architecture that explicitly
enforces the linear nature of lighting. Trained on
simple group light captures, TRAvatar can predict the appearance in real-time
with a single forward pass, achieving high-quality relighting effects under
illuminations of arbitrary environment maps. Second, we jointly optimize the
facial geometry and relightable appearance from scratch based on image
sequences, where the tracking is implicitly learned. This tracking-free
approach brings robustness for establishing temporal correspondences between
frames under different lighting conditions. Extensive qualitative and
quantitative experiments demonstrate that our framework achieves superior
performance for photorealistic avatar animation and relighting.
Comment: Accepted to SIGGRAPH Asia 2023 (Conference); Project page:
https://travatar-paper.github.io
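The "linear nature of lighting" that TRAvatar's architecture exploits is the superposition principle of light transport: the appearance under any environment map is a weighted sum of basis captures (e.g. per-light-group images). A minimal numpy sketch of this principle; names, shapes, and the weighting scheme are assumptions for illustration, not the paper's network:

```python
import numpy as np

def relight(basis_images, light_weights):
    """Linear-light relighting sketch.

    Because light transport is linear, the image under a target lighting
    is a weighted sum of basis captures.

    basis_images: (L, H, W, 3) one image per light group
    light_weights: (L,) intensity of each group under the target
                   environment (e.g. the env map projected onto the groups)
    """
    # Contract the light axis: sum_i w_i * B_i
    return np.tensordot(light_weights, basis_images, axes=1)  # (H, W, 3)
```

A learned model that respects this structure can generalize from simple group-light captures to arbitrary environment maps, as the abstract claims.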
Differentiable Display Photometric Stereo
Photometric stereo leverages variations in illumination conditions to
reconstruct per-pixel surface normals. The concept of display photometric
stereo, which employs a conventional monitor as an illumination source, has the
potential to overcome limitations often encountered in bulky and
difficult-to-use conventional setups. In this paper, we introduce
Differentiable Display Photometric Stereo (DDPS), a method designed to achieve
high-fidelity normal reconstruction using an off-the-shelf monitor and camera.
DDPS addresses a critical yet often neglected challenge in photometric stereo:
the optimization of display patterns for enhanced normal reconstruction. We
present a differentiable framework that couples basis-illumination image
formation with a photometric-stereo reconstruction method. This facilitates the
learning of display patterns that lead to high-quality normal reconstruction
through automatic differentiation. Addressing the synthetic-real domain gap
inherent in end-to-end optimization, we propose the use of a real-world
photometric-stereo training dataset composed of 3D-printed objects. Moreover,
to reduce the ill-posed nature of photometric stereo, we exploit the linearly
polarized light emitted from the monitor to optically separate diffuse and
specular reflections in the captured images. We demonstrate that DDPS allows
for learning display patterns optimized for a target configuration and is
robust to initialization. We assess DDPS on 3D-printed objects with
ground-truth normals and diverse real-world objects, validating that DDPS
enables effective photometric-stereo reconstruction.
Enhancing the Authenticity of Rendered Portraits with Identity-Consistent Transfer Learning
Despite rapid advances in computer graphics, creating high-quality
photo-realistic virtual portraits is prohibitively expensive. Furthermore, the
well-known ''uncanny valley'' effect in rendered portraits has a significant
impact on the user experience, especially when the depiction closely resembles
a human likeness, where any minor artifacts can evoke feelings of eeriness and
repulsiveness. In this paper, we present a novel photo-realistic portrait
generation framework that can effectively mitigate the ''uncanny valley''
effect and improve the overall authenticity of rendered portraits. Our key idea
is to employ transfer learning to learn an identity-consistent mapping from the
latent space of rendered portraits to that of real portraits. During the
inference stage, the input portrait of an avatar can be directly transferred to
a realistic portrait by changing its appearance style while maintaining the
facial identity. To this end, we collect a new dataset, Daz-Rendered-Faces-HQ
(DRFHQ), that is specifically designed for rendering-style portraits. We
leverage this dataset to fine-tune the StyleGAN2 generator, using our carefully
crafted framework, which helps to preserve the geometric and color features
relevant to facial identity. We evaluate our framework using portraits with
diverse gender, age, and race variations. Qualitative and quantitative
evaluations and ablation studies show the advantages of our method compared to
state-of-the-art approaches.
Comment: 10 pages, 8 figures, 2 tables
Enhancing Mesh Deformation Realism: Dynamic Mesostructure Detailing and Procedural Microstructure Synthesis
We propose a solution for generating dynamic heightmap data to simulate deformations of soft surfaces, with a focus on human skin. The solution incorporates mesostructure-level wrinkles and utilizes procedural textures to add static microstructure details. It offers flexibility beyond human skin, enabling the generation of patterns mimicking deformations in other soft materials, such as leather, during animation.
Existing solutions for simulating wrinkles and deformation cues often rely on specialized hardware, which is costly and not easily accessible. Moreover, relying solely on captured data limits artistic direction and hinders adaptability to changes. In contrast, our proposed solution provides dynamic texture synthesis that adapts to the underlying mesh deformations in a physically plausible manner.
Various methods have been explored to synthesize wrinkles directly in the geometry, but they suffer from limitations such as self-intersections and increased storage requirements. Manual intervention by artists using wrinkle maps and tension maps provides control but may fall short for complex deformations or where greater realism is required.
Our work highlights the potential of procedural methods to enhance the generation of dynamic deformation patterns, including wrinkles, with greater creative control and without reliance on captured data. Incorporating static procedural patterns improves realism, and the approach can be extended to other soft materials beyond skin.
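The pipeline described above, mesostructure wrinkles modulated by the underlying mesh deformation plus a static procedural microstructure layer, can be sketched as a simple heightmap blend. A toy illustration only; the function name, the per-texel compression input, and the blend weight are all assumptions, not the paper's formulation:

```python
import numpy as np

def dynamic_heightmap(wrinkle_pattern, compression, micro_detail, micro_scale=0.1):
    """Blend a dynamic mesostructure layer with static microstructure.

    wrinkle_pattern: (H, W) authored or procedural wrinkle heightfield
    compression: (H, W) per-texel compression of the mesh in [0, 1]
                 (0 = rest pose, 1 = fully compressed)
    micro_detail: (H, W) static procedural noise (e.g. Perlin/Worley)
    """
    # Wrinkles deepen as the surface compresses; microstructure stays fixed.
    meso = wrinkle_pattern * np.clip(compression, 0.0, 1.0)
    return meso + micro_scale * micro_detail
```

At rest pose the output reduces to the static microstructure alone, which matches the idea of dynamic mesostructure over a fixed micro layer.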
L0-Sampler: An L0 Model Guided Volume Sampling for NeRF
Since being proposed, Neural Radiance Fields (NeRF) have achieved great
success in related tasks, mainly adopting the hierarchical volume sampling
(HVS) strategy for volume rendering. However, the HVS of NeRF approximates
distributions using piecewise constant functions, which provides a relatively
rough estimation. Based on the observation that a well-trained weight function
w(t) and the L0 distance between points and the surface have very high
similarity, we propose L0-Sampler by incorporating the L0 model into w(t)
to guide the sampling process. Specifically, we propose to use piecewise
exponential functions rather than piecewise constant functions for
interpolation, which can not only approximate quasi-L0 weight distributions
along rays quite well but also can be easily implemented with few lines of code
without additional computational burden. Stable performance improvements can be
achieved by applying L0-Sampler to NeRF and its related tasks like 3D
reconstruction. Code is available at https://ustc3dv.github.io/L0-Sampler/ .
Comment: Project page: https://ustc3dv.github.io/L0-Sampler