485 research outputs found
Relightable Neural Human Assets from Multi-view Gradient Illuminations
Human modeling and relighting are two fundamental problems in computer vision
and graphics, where high-quality datasets can largely facilitate related
research. However, most existing human datasets only provide multi-view human
images captured under the same illumination. Although valuable for modeling
tasks, they are not readily used in relighting problems. To promote research in
both fields, in this paper, we present UltraStage, a new 3D human dataset that
contains more than 2,000 high-quality human assets captured under both
multi-view and multi-illumination settings. Specifically, for each example, we
provide 32 surrounding views illuminated with one white light and two gradient
illuminations. In addition to regular multi-view images, gradient illuminations
help recover detailed surface normal and spatially-varying material maps,
enabling various relighting applications. Inspired by recent advances in neural
representation, we further interpret each example into a neural human asset
which allows novel view synthesis under arbitrary lighting conditions. We show
our neural human assets can achieve extremely high capture performance and are
capable of representing fine details such as facial wrinkles and cloth folds.
We also validate UltraStage in single image relighting tasks, training neural
networks with virtual relighted data from neural assets and demonstrating
realistic rendering improvements over prior arts. UltraStage will be publicly
available to the community to stimulate significant future developments in
various human modeling and rendering tasks. The dataset is available at
https://miaoing.github.io/RNHA.Comment: Project page: https://miaoing.github.io/RNH
State of the Art in Dense Monocular Non-Rigid 3D Reconstruction
3D reconstruction of deformable (or non-rigid) scenes from a set of monocular
2D image observations is a long-standing and actively researched area of
computer vision and graphics. It is an ill-posed inverse problem,
since--without additional prior assumptions--it permits infinitely many
solutions leading to accurate projection to the input 2D images. Non-rigid
reconstruction is a foundational building block for downstream applications
like robotics, AR/VR, or visual content creation. The key advantage of using
monocular cameras is their omnipresence and availability to the end users as
well as their ease of use compared to more sophisticated camera set-ups such as
stereo or multi-view systems. This survey focuses on state-of-the-art methods
for dense non-rigid 3D reconstruction of various deformable objects and
composite scenes from monocular videos or sets of monocular views. It reviews
the fundamentals of 3D reconstruction and deformation modeling from 2D image
observations. We then start from general methods--that handle arbitrary scenes
and make only a few prior assumptions--and proceed towards techniques making
stronger assumptions about the observed objects and types of deformations (e.g.
human faces, bodies, hands, and animals). A significant part of this STAR is
also devoted to classification and a high-level comparison of the methods, as
well as an overview of the datasets for training and evaluation of the
discussed techniques. We conclude by discussing open challenges in the field
and the social aspects associated with the usage of the reviewed methods.Comment: 25 page
State of the Art in Dense Monocular Non-Rigid 3D Reconstruction
3D reconstruction of deformable (or non-rigid) scenes from a set of monocular2D image observations is a long-standing and actively researched area ofcomputer vision and graphics. It is an ill-posed inverse problem,since--without additional prior assumptions--it permits infinitely manysolutions leading to accurate projection to the input 2D images. Non-rigidreconstruction is a foundational building block for downstream applicationslike robotics, AR/VR, or visual content creation. The key advantage of usingmonocular cameras is their omnipresence and availability to the end users aswell as their ease of use compared to more sophisticated camera set-ups such asstereo or multi-view systems. This survey focuses on state-of-the-art methodsfor dense non-rigid 3D reconstruction of various deformable objects andcomposite scenes from monocular videos or sets of monocular views. It reviewsthe fundamentals of 3D reconstruction and deformation modeling from 2D imageobservations. We then start from general methods--that handle arbitrary scenesand make only a few prior assumptions--and proceed towards techniques makingstronger assumptions about the observed objects and types of deformations (e.g.human faces, bodies, hands, and animals). A significant part of this STAR isalso devoted to classification and a high-level comparison of the methods, aswell as an overview of the datasets for training and evaluation of thediscussed techniques. We conclude by discussing open challenges in the fieldand the social aspects associated with the usage of the reviewed methods.<br
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-opera- tive morphology and motion of soft-tissues. This information is prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilites by observ- ing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted in- struments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D opti- cal imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions
{3D} Morphable Face Models -- Past, Present and Future
In this paper, we provide a detailed survey of 3D Morphable Face Models over the 20 years since they were first proposed. The challenges in building and applying these models, namely capture, modeling, image formation, and image analysis, are still active research topics, and we review the state-of-the-art in each of these areas. We also look ahead, identifying unsolved challenges, proposing directions for future research and highlighting the broad range of current and future applications
Robust 3D face capture using example-based photometric stereo
We show that using example-based photometric stereo, it is possible to achieve realistic reconstructions of the human face. The method can handle non-Lambertian reflectance and attached shadows after a simple calibration step. We use spherical harmonics to model and de-noise the illumination functions from images of a reference object with known shape, and a fast grid technique to invert those functions and recover the surface normal for each point of the target object. The depth coordinate is obtained by weighted multi-scale integration of these normals, using an integration weight mask obtained automatically from the images themselves. We have applied these techniques to improve the PHOTOFACE system of Hansen et al. (2010). © 2013 Elsevier B.V. All rights reserved
From scans to models: Registration of 3D human shapes exploiting texture information
New scanning technologies are increasing the importance of 3D mesh data, and of algorithms that can reliably register meshes obtained from multiple scans. Surface registration is important e.g. for building full 3D models from partial scans, identifying and tracking objects in a 3D scene, creating statistical shape models.
Human body registration is particularly important for many applications, ranging from biomedicine and robotics to the production of movies and video games; but obtaining accurate and reliable registrations is challenging, given the articulated, non-rigidly deformable structure of the human body.
In this thesis, we tackle the problem of 3D human body registration. We start by analyzing the current state of the art, and find that: a) most registration techniques rely only on geometric information, which is ambiguous on flat surface areas; b) there is a lack of adequate datasets and benchmarks in the field. We address both issues.
Our contribution is threefold. First, we present a model-based registration technique for human meshes that combines geometry and surface texture information to provide highly accurate mesh-to-mesh correspondences. Our approach estimates scene lighting and surface albedo, and uses the albedo to construct a high-resolution textured 3D body model that is brought into registration with multi-camera image data using a robust matching term.
Second, by leveraging our technique, we present FAUST (Fine Alignment Using Scan Texture), a novel dataset collecting 300 high-resolution scans of 10 people in a wide range of poses. FAUST is the first dataset providing both real scans and automatically computed, reliable ground-truth correspondences between them.
Third, we explore possible uses of our approach in dermatology. By combining our registration technique with a melanocytic lesion segmentation algorithm, we propose a system that automatically detects new or evolving lesions over almost the entire body surface, thus helping dermatologists identify potential melanomas.
We conclude this thesis investigating the benefits of using texture information to establish frame-to-frame correspondences in dynamic monocular sequences captured with consumer depth cameras. We outline a novel approach to reconstruct realistic body shape and appearance models from dynamic human performances, and show preliminary results on challenging sequences captured with a Kinect
Inverse rendering techniques for physically grounded image editing
From a single picture of a scene, people can typically grasp the spatial layout immediately and even make good guesses at materials properties and where light is coming from to illuminate the scene. For example, we can reliably tell which objects occlude others, what an object is made of and its rough shape, regions that are illuminated or in shadow, and so on. It is interesting how little is known about our ability to make these determinations; as such, we are still not able to robustly "teach" computers to make the same high-level observations as people.
This document presents algorithms for understanding intrinsic scene properties from single images. The goal of these inverse rendering techniques is to estimate the configurations of scene elements (geometry, materials, luminaires, camera parameters, etc) using only information visible in an image. Such algorithms have applications in robotics and computer graphics. One such application is in physically grounded image editing: photo editing made easier by leveraging knowledge of the physical space. These applications allow sophisticated editing operations to be performed in a matter of seconds, enabling seamless addition, removal, or relocation of objects in images
- …