64 research outputs found

    Relightable Neural Human Assets from Multi-view Gradient Illuminations

    Human modeling and relighting are two fundamental problems in computer vision and graphics, where high-quality datasets can largely facilitate related research. However, most existing human datasets only provide multi-view human images captured under the same illumination. Although valuable for modeling tasks, they cannot readily be used for relighting problems. To promote research in both fields, in this paper, we present UltraStage, a new 3D human dataset that contains more than 2,000 high-quality human assets captured under both multi-view and multi-illumination settings. Specifically, for each example, we provide 32 surrounding views illuminated with one white light and two gradient illuminations. In addition to regular multi-view images, gradient illuminations help recover detailed surface normals and spatially-varying material maps, enabling various relighting applications. Inspired by recent advances in neural representation, we further interpret each example into a neural human asset which allows novel view synthesis under arbitrary lighting conditions. We show our neural human assets can achieve extremely high capture performance and are capable of representing fine details such as facial wrinkles and cloth folds. We also validate UltraStage in single image relighting tasks, training neural networks with virtually relit data from neural assets and demonstrating realistic rendering improvements over prior art. UltraStage will be publicly available to the community to stimulate significant future developments in various human modeling and rendering tasks. The dataset is available at https://miaoing.github.io/RNHA. Comment: Project page: https://miaoing.github.io/RNH
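
    As a rough illustration of how gradient illuminations can yield surface normals, the sketch below assumes a Lambertian pixel observed under a color gradient pattern and its complement, with the R, G and B channels encoding the x, y and z axes. That pairing, and the helper names `normal_from_gradient_pair` and `simulate_pixel`, are assumptions made for this example; the abstract does not specify the two patterns or the recovery pipeline.

```python
import math


def normal_from_gradient_pair(lit_pos, lit_neg):
    """Estimate a unit normal for one pixel from two RGB observations.

    Assumption (not from the abstract): lit_pos is the pixel under a color
    gradient pattern and lit_neg under its complement, so that for a
    Lambertian surface (pos - neg) / (pos + neg) is proportional to the
    normal component on each axis, independent of per-channel albedo.
    """
    ratios = []
    for pos, neg in zip(lit_pos, lit_neg):
        total = pos + neg
        ratios.append((pos - neg) / total if total > 0 else 0.0)
    length = math.sqrt(sum(r * r for r in ratios))
    if length == 0.0:
        raise ValueError("pixel carries no gradient signal")
    return tuple(r / length for r in ratios)


def simulate_pixel(normal, albedo):
    """Synthesize the two observations for a Lambertian pixel (hypothetical helper)."""
    pos = tuple(math.pi * a * (0.5 + n / 3.0) for n, a in zip(normal, albedo))
    neg = tuple(math.pi * a * (0.5 - n / 3.0) for n, a in zip(normal, albedo))
    return pos, neg


true_normal = (0.6, 0.0, 0.8)
pos, neg = simulate_pixel(true_normal, albedo=(0.5, 0.3, 0.8))
estimate = normal_from_gradient_pair(pos, neg)
```

    Dividing by the sum of the two observations cancels the per-channel albedo, which is why the estimate does not depend on surface color in this idealized setting.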

    Recovering refined surface normals for relighting clothing in dynamic scenes

    In this paper we present a method to relight captured 3D video sequences of non-rigid, dynamic scenes, such as clothing of real actors, reconstructed from multiple-view video. A view-dependent approach is introduced to refine an initial coarse surface reconstruction using shape-from-shading to estimate detailed surface normals. The prior surface approximation is used to constrain the simultaneous estimation of surface normals and scene illumination, under the assumption of Lambertian surface reflectance. This approach enables detailed surface normals of a moving non-rigid object to be estimated from a single image frame. Refined normal estimates from multiple views are integrated into a single surface normal map. This approach allows highly non-rigid surfaces, such as creases in clothing, to be relit whilst preserving the detailed dynamics observed in video.
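
    The Lambertian assumption mentioned above can be made concrete with a minimal sketch: intensity is albedo times the clamped cosine between the surface normal and the light direction, so a refined normal at a crease shades differently from the coarse surface normal. The function `shade` and the sample vectors are illustrative assumptions, not the paper's implementation.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def shade(albedo, normal, light_dir):
    """Lambertian shading: albedo * max(0, n . l) with unit vectors."""
    n = normalize(normal)
    l = normalize(light_dir)
    cosine = sum(a * b for a, b in zip(n, l))
    return albedo * max(0.0, cosine)


light = (0.0, 0.0, 1.0)            # hypothetical new light direction
coarse_normal = (0.0, 0.0, 1.0)    # normal of the smooth prior surface
crease_normal = (0.0, 1.0, 1.0)    # refined normal tilted by a fold

flat_value = shade(0.8, coarse_normal, light)
crease_value = shade(0.8, crease_normal, light)
```

    Under the same light, the tilted crease normal produces a darker value than the coarse normal, which is the detail a relit result would otherwise lose.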

    LumiGAN: Unconditional Generation of Relightable 3D Human Faces

    Unsupervised learning of 3D human faces from unstructured 2D image data is an active research area. While recent works have achieved an impressive level of photorealism, they commonly lack control of lighting, which prevents the generated assets from being deployed in novel environments. To this end, we introduce LumiGAN, an unconditional Generative Adversarial Network (GAN) for 3D human faces with a physically based lighting module that enables relighting under novel illumination at inference time. Unlike prior work, LumiGAN can create realistic shadow effects using an efficient visibility formulation that is learned in a self-supervised manner. LumiGAN generates plausible physical properties for relightable faces, including surface normals, diffuse albedo, and specular tint without any ground truth data. In addition to relightability, we demonstrate significantly improved geometry generation compared to state-of-the-art non-relightable 3D GANs and notably better photorealism than existing relightable GANs. Comment: Project page: https://boyangdeng.com/projects/lumiga

    ILSH: The Imperial Light-Stage Head Dataset for Human Head View Synthesis

    This paper introduces the Imperial Light-Stage Head (ILSH) dataset, a novel light-stage-captured human head dataset designed to support view synthesis academic challenges for human heads. The ILSH dataset is intended to facilitate diverse approaches, such as scene-specific or generic neural rendering, multiple-view geometry, 3D vision, and computer graphics, to further advance the development of photo-realistic human avatars. This paper details the setup of a light-stage specifically designed to capture high-resolution (4K) human head images and describes the process of addressing challenges (preprocessing, ethical issues) in collecting high-quality data. In addition to the data collection, we address the split of the dataset into train, validation, and test sets. Our goal is to design and support a fair view synthesis challenge task for this novel dataset, such that a similar level of performance can be maintained and expected when using the test set, as when using the validation set. The ILSH dataset consists of 52 subjects captured using 24 cameras with all 82 lighting sources turned on, resulting in a total of 1,248 close-up head images, border masks, and camera pose pairs. Comment: ICCV 2023 Workshop, 9 pages, 6 figures
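
    The stated image count follows directly from the capture setup, assuming one image per subject per camera (an assumption consistent with, but not spelled out in, the abstract). The variable names are illustrative.

```python
# Figures taken from the abstract: 52 subjects, 24 cameras each.
subjects = 52
cameras_per_subject = 24

# Assumes exactly one image per (subject, camera) pair.
total_images = subjects * cameras_per_subject
print(total_images)  # 1248
```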

    Evaluation and Optimization of Rendering Techniques for Autonomous Driving Simulation

    In order to meet the demand for higher scene rendering quality from some autonomous driving teams (such as those focused on CV), we have decided to use an offline simulation industrial rendering framework instead of real-time rendering in our autonomous driving simulator. Our plan is to generate lower-quality scenes using a game engine, extract them, and then use an IQA algorithm to validate the improvement in scene quality achieved through offline rendering. The improved scenes will then be used for training.

    Intrinsic Textures for Relightable Free-Viewpoint Video

    This paper presents an approach to estimate the intrinsic texture properties (albedo, shading, normal) of scenes from multiple view acquisition under unknown illumination conditions. We introduce the concept of intrinsic textures, which are pixel-resolution surface textures representing the intrinsic appearance parameters of a scene. Unlike previous video relighting methods, the approach does not assume regions of uniform albedo, which makes it applicable to richly textured scenes. We show that intrinsic image methods can be used to refine an initial, low-frequency shading estimate based on a global lighting reconstruction from an original texture and coarse scene geometry in order to resolve the inherent global ambiguity in shading. The method is applied to relighting of free-viewpoint rendering from multiple view video capture. This demonstrates relighting with reproduction of fine surface detail. Quantitative evaluation on synthetic models with textured appearance shows accurate estimation of intrinsic surface reflectance properties. © 2014 Springer International Publishing
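
    A minimal sketch of the intrinsic decomposition idea, assuming the common per-texel model texture = albedo × shading: given a shading estimate, albedo follows by division, and relighting swaps in a new shading map. The functions `recover_albedo` and `recompose`, the `eps` guard, and the sample values are assumptions for illustration; the paper's refinement of the shading estimate is not reproduced here.

```python
def recover_albedo(texture, shading, eps=1e-6):
    """Divide each texel by its shading value (eps guards against division by zero)."""
    return [
        [t / max(s, eps) for t, s in zip(texture_row, shading_row)]
        for texture_row, shading_row in zip(texture, shading)
    ]


def recompose(albedo, shading):
    """Multiply albedo by a shading map to get a lit texture."""
    return [
        [a * s for a, s in zip(albedo_row, shading_row)]
        for albedo_row, shading_row in zip(albedo, shading)
    ]


# Hypothetical 2x2 grayscale texture and shading estimate.
texture = [[0.2, 0.4], [0.1, 0.3]]
shading = [[0.5, 0.5], [0.25, 0.5]]

albedo = recover_albedo(texture, shading)

# Relight by pairing the recovered albedo with a different shading map.
new_shading = [[1.0, 0.5], [0.5, 1.0]]
relit = recompose(albedo, new_shading)
```

    Because albedo is recovered per texel rather than per region, this form does not require areas of uniform albedo, matching the motivation given in the abstract.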

    Full Body Acting Rehearsal in a Networked Virtual Environment-A Case Study

    In order to rehearse for a play or a scene from a movie, it is generally required that the actors are physically present at the same time in the same place. In this paper we present an example and experience of a full body motion shared virtual environment (SVE) for rehearsal. The system allows actors and directors to meet in an SVE in order to rehearse scenes for a play or a movie, that is, to perform some dialogue and blocking (positions, movements, and displacements of actors in the scene) rehearsal through a full body interactive virtual reality (VR) system. The system combines immersive VR rendering techniques as well as network capabilities together with full body tracking. Two actors and a director rehearsed from separate locations. One actor and the director were in London (located in separate rooms) while the second actor was in Barcelona. The Barcelona actor used a wide field-of-view head-tracked head-mounted display, and wore a body suit for real-time motion capture and display. The London actor was in a Cave system, with head and partial body tracking. Each actor was presented to the other as an avatar in the shared virtual environment, and the director could see the whole scenario on a desktop display, and intervene by voice commands. A video stream in a window displayed in the virtual environment also represented the director. The London participant was a professional actor, who afterward commented on the utility of the system for acting rehearsal. It was concluded that full body tracking and corresponding real-time display of all the actors' movements would be a critical requirement, and that blocking was possible down to the level of detail of gestures. Details of the implementation, actors, and director experiences are provided.