320 research outputs found
RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars
Synthesizing high-fidelity head avatars is a central problem for computer
vision and graphics. While head avatar synthesis algorithms have advanced
rapidly, the best ones still face great obstacles in real-world scenarios. One
of the vital causes is inadequate datasets -- 1) current public datasets can
only support researchers to explore high-fidelity head avatars in one or two
task directions; 2) these datasets usually contain digital head assets with
limited data volume, and narrow distribution over different attributes. In this
paper, we present RenderMe-360, a comprehensive 4D human head dataset to drive
advance in head avatar research. It contains massive data assets, with 243+
million complete head frames, and over 800k video sequences from 500 different
identities captured by synchronized multi-view cameras at 30 FPS. It is a
large-scale digital library for head avatars with three key attributes: 1) High
Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K
cameras in 360 degrees. 2) High Diversity: The collected subjects vary from
different ages, eras, ethnicities, and cultures, providing abundant materials
with distinctive styles in appearance and geometry. Moreover, each subject is
asked to perform various motions, such as expressions and head rotations, which
further extend the richness of assets. 3) Rich Annotations: we provide
annotations with different granularities: cameras' parameters, matting, scan,
2D/3D facial landmarks, FLAME fitting, and text description.
Based on the dataset, we build a comprehensive benchmark for head avatar
research, with 16 state-of-the-art methods performed on five main tasks: novel
view synthesis, novel expression synthesis, hair rendering, hair editing, and
talking head generation. Our experiments uncover the strengths and weaknesses
of current methods. RenderMe-360 opens the door for future exploration in head
avatars.Comment: Technical Report; Project Page: 36; Github Link:
https://github.com/RenderMe-360/RenderMe-36
Statistical Modeling of Craniofacial Shape and Texture
We present a fully-automatic statistical 3D shape modeling approach and apply it to a large dataset of 3D images, the Headspace dataset, thus generating the first public shape-and-texture 3D Morphable Model (3DMM) of the full human head. Our approach is the first to employ a template that adapts to the dataset subject before dense morphing. This is fully automatic and achieved using 2D facial landmarking, projection to 3D shape, and mesh editing. In dense template morphing, we improve on the well-known Coherent Point Drift algorithm, by incorporating iterative data-sampling and alignment. Our evaluations demonstrate that our method has better performance in correspondence accuracy and modeling ability when compared with other competing algorithms. We propose a texture map refinement scheme to build high quality texture maps and texture model. We present several applications that include the first clinical use of craniofacial 3DMMs in the assessment of different types of surgical intervention applied to a craniosynostosis patient group
Tex2Shape: Detailed Full Human Body Geometry From a Single Image
We present a simple yet effective method to infer detailed full human body shape from only a single photograph. Our model can infer full-body shape including face, hair, and clothing including wrinkles at interactive frame-rates. Results feature details even on parts that are occluded in the input image. Our main idea is to turn shape regression into an aligned image-to-image translation problem. The input to our method is a partial texture map of the visible region obtained from off-the-shelf methods. From a partial texture, we estimate detailed normal and vector displacement maps, which can be applied to a low-resolution smooth body model to add detail and clothing. Despite being trained purely with synthetic data, our model generalizes well to real-world photographs. Numerous results demonstrate the versatility and robustness of our method
Tex2Shape: Detailed Full Human Body Geometry From a Single Image
We present a simple yet effective method to infer detailed full human body
shape from only a single photograph. Our model can infer full-body shape
including face, hair, and clothing including wrinkles at interactive
frame-rates. Results feature details even on parts that are occluded in the
input image. Our main idea is to turn shape regression into an aligned
image-to-image translation problem. The input to our method is a partial
texture map of the visible region obtained from off-the-shelf methods. From a
partial texture, we estimate detailed normal and vector displacement maps,
which can be applied to a low-resolution smooth body model to add detail and
clothing. Despite being trained purely with synthetic data, our model
generalizes well to real-world photographs. Numerous results demonstrate the
versatility and robustness of our method
CGOF++: Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields
Capitalizing on the recent advances in image generation models, existing
controllable face image synthesis methods are able to generate high-fidelity
images with some levels of controllability, e.g., controlling the shapes,
expressions, textures, and poses of the generated face images. However,
previous methods focus on controllable 2D image generative models, which are
prone to producing inconsistent face images under large expression and pose
changes. In this paper, we propose a new NeRF-based conditional 3D face
synthesis framework, which enables 3D controllability over the generated face
images by imposing explicit 3D conditions from 3D face priors. At its core is a
conditional Generative Occupancy Field (cGOF++) that effectively enforces the
shape of the generated face to conform to a given 3D Morphable Model (3DMM)
mesh, built on top of EG3D [1], a recent tri-plane-based generative model. To
achieve accurate control over fine-grained 3D face shapes of the synthesized
images, we additionally incorporate a 3D landmark loss as well as a volume
warping loss into our synthesis framework. Experiments validate the
effectiveness of the proposed method, which is able to generate high-fidelity
face images and shows more precise 3D controllability than state-of-the-art
2D-based controllable face synthesis methods.Comment: This article is an extension of the NeurIPS'22 paper arXiv:2206.0836
- …