HeadOn: Real-time Reenactment of Human Portrait Videos
We propose HeadOn, the first real-time source-to-target reenactment approach
for complete human portrait videos that enables transfer of torso and head
motion, face expression, and eye gaze. Given a short RGB-D video of the target
actor, we automatically construct a personalized geometry proxy that embeds a
parametric head, eye, and kinematic torso model. A novel real-time reenactment
algorithm employs this proxy to photo-realistically map the captured motion
from the source actor to the target actor. On top of the coarse geometric
proxy, we propose a video-based rendering technique that composites the
modified target portrait video via view- and pose-dependent texturing, and
creates photo-realistic imagery of the target actor under novel torso and head
poses, facial expressions, and gaze directions. To this end, we propose a
robust method for tracking the face and torso of the source actor. We
extensively evaluate our approach and show significant improvements, enabling
much greater flexibility in creating realistic reenacted output videos.
Comment: Video: https://www.youtube.com/watch?v=7Dg49wv2c_g Presented at
Siggraph'1
The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions
The Metaverse offers a second world beyond reality, where boundaries are
non-existent, and possibilities are endless through engagement and immersive
experiences using virtual reality (VR) technology. Many disciplines can
benefit from the advancement of the Metaverse when accurately developed,
including the fields of technology, gaming, education, art, and culture.
Nevertheless, developing the Metaverse environment to its full potential is an
ambiguous task that needs proper guidance and directions. Existing surveys on
the Metaverse focus only on a specific aspect and discipline of the Metaverse
and lack a holistic view of the entire process. To this end, a more holistic,
multi-disciplinary, in-depth, and academic and industry-oriented review is
required to provide a thorough study of the Metaverse development pipeline. To
address these issues, we present in this survey a novel multi-layered pipeline
ecosystem composed of (1) the Metaverse computing, networking, communications
and hardware infrastructure, (2) environment digitization, and (3) user
interactions. For every layer, we discuss the components that detail the steps
of its development. Also, for each of these components, we examine the impact
of a set of enabling technologies and empowering domains (e.g., Artificial
Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on
its advancement. In addition, we explain the importance of these technologies
to support decentralization, interoperability, user experiences, interactions,
and monetization. Our presented study highlights the existing challenges for
each component, followed by research directions and potential solutions. To the
best of our knowledge, this survey is the most comprehensive to date and gives
users, scholars, and entrepreneurs an in-depth understanding of the Metaverse
ecosystem, helping them find opportunities to contribute.
Dressing Avatars: Deep Photorealistic Appearance for Physically Simulated Clothing
Despite recent progress in developing animatable full-body avatars, realistic
modeling of clothing - one of the core aspects of human self-expression -
remains an open challenge. State-of-the-art physical simulation methods can
generate realistically behaving clothing geometry at interactive rates.
Modeling photorealistic appearance, however, usually requires physically-based
rendering which is too expensive for interactive applications. On the other
hand, data-driven deep appearance models are capable of efficiently producing
realistic appearance, but struggle to synthesize the geometry of highly dynamic
clothing and handling challenging body-clothing configurations. To this end, we
introduce pose-driven avatars with explicit modeling of clothing that exhibit
both photorealistic appearance learned from real-world data and realistic
clothing dynamics. The key idea is to introduce a neural clothing appearance
model that operates on top of explicit geometry: at training time we use
high-fidelity tracking, whereas at animation time we rely on physically
simulated geometry. Our core contribution is a physically-inspired appearance
network, capable of generating photorealistic appearance with view-dependent
and dynamic shadowing effects even for unseen body-clothing configurations. We
conduct a thorough evaluation of our model and demonstrate diverse animation
results on several subjects and different types of clothing. Unlike previous
work on photorealistic full-body avatars, our approach can produce much richer
dynamics and more realistic deformations even for many examples of loose
clothing. We also demonstrate that our formulation naturally allows clothing to
be used with avatars of different people while staying fully animatable, thus
enabling, for the first time, photorealistic avatars with novel clothing.
Comment: SIGGRAPH Asia 2022 (ACM ToG) camera ready. The supplementary video
can be found at
https://research.facebook.com/publications/dressing-avatars-deep-photorealistic-appearance-for-physically-simulated-clothing
Neural Image-based Avatars: Generalizable Radiance Fields for Human Avatar Modeling
We present a method that enables synthesizing novel views and novel poses of
arbitrary human performers from sparse multi-view images. A key ingredient of
our method is a hybrid appearance blending module that combines the advantages
of the implicit body NeRF representation and image-based rendering. Existing
generalizable human NeRF methods that are conditioned on the body model have
shown robustness against the geometric variation of arbitrary human performers.
Yet they often exhibit blurry results when generalized to unseen identities.
Meanwhile, image-based rendering shows high-quality results when sufficient
observations are available, but it suffers from artifacts in sparse-view
settings. We propose Neural Image-based Avatars (NIA), which exploits the best of
both methods: it maintains robustness under new articulations and
self-occlusions while directly leveraging the available (sparse) source view
colors to preserve appearance details of new subject identities. Our hybrid
design outperforms recent methods on both in-domain identity generalization and
challenging cross-dataset generalization settings. Also, in terms of
pose generalization, our method outperforms even the per-subject optimized
animatable NeRF methods. The video results are available at
https://youngjoongunc.github.io/ni
TADA! Text to Animatable Digital Avatars
We introduce TADA, a simple-yet-effective approach that takes textual
descriptions and produces expressive 3D avatars with high-quality geometry and
lifelike textures that can be animated and rendered with traditional graphics
pipelines. Existing text-based character generation methods are limited in
terms of geometry and texture quality, and cannot be realistically animated due
to inconsistent alignment between the geometry and the texture, particularly in
the face region. To overcome these limitations, TADA leverages the synergy of a
2D diffusion model and an animatable parametric body model. Specifically, we
derive an optimizable high-resolution body model from SMPL-X with 3D
displacements and a texture map, and use hierarchical rendering with score
distillation sampling (SDS) to create high-quality, detailed, holistic 3D
avatars from text. To ensure alignment between the geometry and texture, we
render normals and RGB images of the generated character and exploit their
latent embeddings in the SDS training process. We further introduce various
expression parameters to deform the generated character during training,
ensuring that the semantics of our generated character remain consistent with
the original SMPL-X model, resulting in an animatable character. Comprehensive
evaluations demonstrate that TADA significantly surpasses existing approaches
on both qualitative and quantitative measures. TADA enables creation of
large-scale digital character assets that are ready for animation and
rendering, while also being easily editable through natural language. The code
will be public for research purposes.