HeadOn: Real-time Reenactment of Human Portrait Videos
We propose HeadOn, the first real-time source-to-target reenactment approach
for complete human portrait videos that enables transfer of torso and head
motion, face expression, and eye gaze. Given a short RGB-D video of the target
actor, we automatically construct a personalized geometry proxy that embeds a
parametric head, eye, and kinematic torso model. A novel real-time reenactment
algorithm employs this proxy to photo-realistically map the captured motion
from the source actor to the target actor. On top of the coarse geometric
proxy, we propose a video-based rendering technique that composites the
modified target portrait video via view- and pose-dependent texturing, and
creates photo-realistic imagery of the target actor under novel torso and head
poses, facial expressions, and gaze directions. To this end, we propose a
robust tracking method for the face and torso of the source actor. We extensively
evaluate our approach and show that it enables significantly greater
flexibility in creating realistic reenacted output videos.
Comment: Video: https://www.youtube.com/watch?v=7Dg49wv2c_g Presented at Siggraph'18
Capture, Learning, and Synthesis of 3D Speaking Styles
Audio-driven 3D facial animation has been widely explored, but achieving
realistic, human-like performance is still unsolved. This is due to the lack of
available 3D datasets, models, and standard evaluation metrics. To address
this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans
captured at 60 fps and synchronized audio from 12 speakers. We then train a
neural network on our dataset that factors identity from facial motion. The
learned model, VOCA (Voice Operated Character Animation), takes any speech
signal as input - even speech in languages other than English - and
realistically animates a wide range of adult faces. Conditioning on subject
labels during training allows the model to learn a variety of realistic
speaking styles. VOCA also provides animator controls to alter speaking style,
identity-dependent facial shape, and pose (i.e. head, jaw, and eyeball
rotations) during animation. To our knowledge, VOCA is the only realistic 3D
facial animation model that is readily applicable to unseen subjects without
retargeting. This makes VOCA suitable for tasks like in-game video, virtual
reality avatars, or any scenario in which the speaker, speech, or language is
not known in advance. We make the dataset and model available for research
purposes at http://voca.is.tue.mpg.de.
Comment: To appear in CVPR 2019
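The subject-label conditioning described above can be sketched as follows. This is an illustrative assumption only: the dimensions, the linear decoder, and the function names are invented for the sketch and do not reflect the actual VOCA architecture, which uses a deep network over audio features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, not those of the real VOCA model.
N_VERTS = 5023      # vertex count of the face template (assumption)
AUDIO_DIM = 29      # per-frame audio feature size (assumption)
N_SPEAKERS = 12     # subjects in the dataset described above

# Toy decoder weights: map (audio features + one-hot speaker label)
# to per-vertex 3D offsets from a static identity template.
W = rng.normal(scale=1e-3, size=(AUDIO_DIM + N_SPEAKERS, N_VERTS * 3))

def animate_frame(audio_feat, speaker_id, template):
    """Condition motion decoding on a speaker label: the same audio
    features yield different offsets for different style labels."""
    label = np.zeros(N_SPEAKERS)
    label[speaker_id] = 1.0
    offsets = np.concatenate([audio_feat, label]) @ W
    return template + offsets.reshape(N_VERTS, 3)

template = rng.normal(size=(N_VERTS, 3))   # neutral identity mesh
audio = rng.normal(size=AUDIO_DIM)         # one frame of audio features

frame_a = animate_frame(audio, speaker_id=0, template=template)
frame_b = animate_frame(audio, speaker_id=1, template=template)
# Same audio, different speaking-style conditioning -> different motion.
```

Because identity is factored into the template while the label steers the decoded motion, swapping the label at inference time changes the speaking style without changing the face.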
Effects of non-pharmacological or pharmacological interventions on cognition and brain plasticity of aging individuals.
Brain aging and aging-related neurodegenerative disorders are major health challenges faced by modern societies. Brain aging is associated with cognitive and functional decline and provides a favourable background for the onset and development of dementia. It is also associated with early and subtle anatomo-functional physiological changes that often precede the appearance of clinical signs of cognitive decline. Neuroimaging approaches have unveiled the functional correlates of these alterations and helped identify therapeutic targets that may be useful in counteracting age-dependent cognitive decline. A growing body of evidence supports the notion that cognitive stimulation and aerobic training can preserve and enhance operational skills in elderly individuals, as well as reduce the incidence of dementia. This review aims to provide an extensive and critical overview of the most recent data supporting the efficacy of non-pharmacological and pharmacological interventions aimed at enhancing cognition and brain plasticity in healthy elderly individuals, as well as delaying the cognitive decline associated with dementia.
Beyond Reality: The Pivotal Role of Generative AI in the Metaverse
Imagine stepping into a virtual world that's as rich, dynamic, and
interactive as our physical one. This is the promise of the Metaverse, and it's
being brought to life by the transformative power of Generative Artificial
Intelligence (AI). This paper offers a comprehensive exploration of how
generative AI technologies are shaping the Metaverse, transforming it into a
dynamic, immersive, and interactive virtual world. We delve into the
applications of text generation models like ChatGPT and GPT-3, which are
enhancing conversational interfaces with AI-generated characters. We explore
the role of image generation models such as DALL-E and MidJourney in creating
visually stunning and diverse content. We also examine the potential of 3D
model generation technologies like Point-E and Lumirithmic in creating
realistic virtual objects that enrich the Metaverse experience. But the journey
doesn't stop there. We also address the challenges and ethical considerations
of implementing these technologies in the Metaverse, offering insights into the
balance between user control and AI automation. This paper is not just a study,
but a guide to the future of the Metaverse, offering readers a roadmap to
harnessing the power of generative AI in creating immersive virtual worlds.
Comment: 8 pages, 4 figures
Integrating Technology With Student-Centered Learning
Reviews research on technology's role in personalizing learning, its integration into curriculum-based and school- or district-wide initiatives, and the potential of emerging digital technologies to expand student-centered learning. Outlines implications.
Improving Surgical Training Phantoms by Hyperrealism: Deep Unpaired Image-to-Image Translation from Real Surgeries
Current 'dry lab' surgical phantom simulators are a valuable tool that
allows surgeons to improve their dexterity and skill with surgical
instruments. These phantoms mimic the haptics and shape of the organs of
interest, but lack a realistic visual appearance. In this work, we present an innovative
application in which representations learned from real intraoperative
endoscopic sequences are transferred to a surgical phantom scenario. The term
hyperrealism is introduced in this field, which we regard as a novel subform of
surgical augmented reality for approaches that involve real-time object
transfigurations. For related tasks in the computer vision community, unpaired
cycle-consistent Generative Adversarial Networks (GANs) have shown excellent
results on still RGB images. However, applying this approach to continuous
video frames can result in flickering, which turned out to be especially
prominent for this application. Therefore, we propose an extension of
cycle-consistent GANs, named tempCycleGAN, to improve temporal consistency. The
novel method is evaluated on captures of a silicone phantom for training
endoscopic reconstructive mitral valve procedures. Synthesized videos show
highly realistic results with regard to 1) replacement of the silicone
appearance of the phantom valve by intraoperative tissue texture, while 2)
explicitly keeping crucial features in the scene, such as instruments, sutures
and prostheses. Compared to the original CycleGAN approach, tempCycleGAN
efficiently removes flickering between frames. The overall approach is expected
to change the future design of surgical training simulators since the generated
sequences clearly demonstrate the feasibility to enable a considerably more
realistic training experience for minimally-invasive procedures.
Comment: 8 pages, accepted at MICCAI 2018, supplemental material at
https://youtu.be/qugAYpK-Z4
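The two loss ideas at the heart of the approach above, cycle consistency plus a temporal term, can be sketched numerically. The generators below are invertible affine placeholders, and the temporal penalty is a simplified assumption in the spirit of tempCycleGAN, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in "generators": invertible affine maps between appearance
# domains A (phantom) and B (intraoperative tissue). Real CycleGAN
# generators are CNNs; these placeholders only illustrate the losses.
def G_ab(x):                  # phantom -> tissue
    return 2.0 * x + 1.0

def G_ba(y):                  # tissue -> phantom
    return 0.5 * (y - 1.0)

def cycle_loss(frames_a):
    """L1 cycle-consistency: A -> B -> A should reconstruct A."""
    return float(np.mean([np.abs(G_ba(G_ab(f)) - f).mean()
                          for f in frames_a]))

def flicker(frames):
    """Mean frame-to-frame L1 change of a clip."""
    return float(np.mean([np.abs(frames[t + 1] - frames[t]).mean()
                          for t in range(len(frames) - 1)]))

def temporal_penalty(frames_a):
    """Penalize flicker: translated frames should not change between
    time steps more than the input frames do (simplified stand-in)."""
    fake_b = [G_ab(f) for f in frames_a]
    return max(0.0, flicker(fake_b) - flicker(frames_a))

frames = [rng.normal(size=(4, 4)) for _ in range(3)]  # tiny "video"
total = cycle_loss(frames) + 0.1 * temporal_penalty(frames)
```

For these exact-inverse toy generators the cycle loss is essentially zero, while the temporal penalty is positive because the toy mapping amplifies frame-to-frame differences; training would push both terms down jointly with the adversarial losses.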
A Revolution of Personalized Healthcare: Enabling Human Digital Twin with Mobile AIGC
Mobile Artificial Intelligence-Generated Content (AIGC) technology refers to
the adoption of AI algorithms deployed at mobile edge networks to automate the
information creation process while fulfilling the requirements of end users.
Mobile AIGC has recently attracted phenomenal attention and can be a key
enabling technology for an emerging application, called human digital twin
(HDT). HDT empowered by the mobile AIGC is expected to revolutionize the
personalized healthcare by generating rare disease data, modeling high-fidelity
digital twin, building versatile testbeds, and providing 24/7 customized
medical services. To promote the development of this new paradigm, in
this article, we propose a system architecture of mobile AIGC-driven HDT and
highlight the corresponding design requirements and challenges. Moreover, we
illustrate two use cases, i.e., mobile AIGC-driven HDT in customized surgery
planning and personalized medication. In addition, we conduct an experimental
study to demonstrate the effectiveness of the proposed mobile AIGC-driven HDT
solution, which shows a particular application in a virtual physical therapy
teaching platform. Finally, we conclude this article by briefly discussing
several open issues and future directions.