
    Towards Streaming Speech-to-Avatar Synthesis

    Full text link
    Streaming speech-to-avatar synthesis creates real-time animations for a virtual character from audio data. Accurate avatar representations of speech are important for the visualization of sound in linguistics, phonetics, and phonology, for visual feedback to assist second language acquisition, and for virtual embodiment for paralyzed patients. Previous works have highlighted the capability of deep articulatory inversion to perform high-quality avatar animation using electromagnetic articulography (EMA) features. However, these models focus on offline avatar synthesis from recordings rather than real-time audio, which is necessary for live avatar visualization or embodiment. To address this issue, we propose a method using articulatory inversion for streaming high-quality facial and inner-mouth avatar animation from real-time audio. Our approach achieves 130 ms average streaming latency for every 0.1 seconds of audio, with a 0.792 correlation with ground-truth articulations. Finally, we show generated mouth and tongue animations to demonstrate the efficacy of our methodology.
    Comment: Submitted to ICASSP 202
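The streaming setup described above (chunked audio in, articulator trajectories out, with per-chunk latency and a correlation score against ground truth) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `invert_chunk` is a hypothetical placeholder for the articulatory-inversion model, and the chunk size and articulator dimensions are assumptions.

```python
import time
import numpy as np

CHUNK_SEC = 0.1           # the abstract streams audio in 0.1 s increments
SAMPLE_RATE = 16000       # assumed audio sample rate
CHUNK_SAMPLES = int(CHUNK_SEC * SAMPLE_RATE)

def invert_chunk(audio_chunk):
    """Hypothetical articulatory-inversion model: maps an audio chunk to
    EMA-like articulator trajectories (random here, for illustration only)."""
    rng = np.random.default_rng(0)
    return rng.standard_normal((10, 12))  # 10 frames x 12 articulator dims

def stream(audio, on_frames):
    """Feed audio chunk by chunk, timing each inversion call in milliseconds."""
    latencies = []
    for start in range(0, len(audio) - CHUNK_SAMPLES + 1, CHUNK_SAMPLES):
        t0 = time.perf_counter()
        frames = invert_chunk(audio[start:start + CHUNK_SAMPLES])
        latencies.append((time.perf_counter() - t0) * 1000.0)
        on_frames(frames)          # hand frames to the avatar renderer
    return float(np.mean(latencies))  # average streaming latency in ms

def correlation(pred, truth):
    """Pearson correlation between predicted and ground-truth trajectories."""
    return float(np.corrcoef(pred.ravel(), truth.ravel())[0, 1])
```

The reported 0.792 figure corresponds to `correlation` evaluated between model output and measured EMA data; the 130 ms figure corresponds to the per-chunk latency averaged by `stream`.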

    Framework development of real-time lip sync Animation on viseme based human speech

    Get PDF
    Real-time lip sync animation makes a virtual computer-generated character talk by synchronizing accurate lip movement with sound live. Based on the review, creating lip sync animation in real time is particularly challenging because lip movements and sounds must be mapped so that they remain synchronized. The fluidity and accuracy of natural speech are among the most difficult things to reproduce convincingly in facial animation, and viewers are highly sensitive to errors because we all focus on faces. Especially in real-time applications, the visual impact must be immediate, commanding, and convincing to the audience. Research on viseme-based human speech was conducted to develop a lip synchronization platform that achieves accurate lip motion synchronized with sound and improves the visual performance of the facial animation. Through this research, a usable automated digital speech system for lip sync animation was developed, designed around simple synchronization tricks that generally improve accuracy and the realism of the visual impression, together with advanced features implemented in the lip synchronization application. The system supports lip sync simulation in both real-time and offline applications, so it can be applied in areas such as entertainment, education, tutoring, animation, and live performances including theater, broadcasting, and live presentation.
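The core of a viseme-based approach like the one above is a many-to-one mapping from phonemes to mouth shapes, applied over a time-aligned phoneme sequence. A minimal sketch, with an illustrative (hypothetical) mapping table rather than the paper's actual viseme set:

```python
# Hypothetical phoneme-to-viseme table; real systems typically collapse
# ~40 phonemes into 10-20 visually distinct mouth shapes.
PHONEME_TO_VISEME = {
    "p": "MBP", "b": "MBP", "m": "MBP",   # bilabials share one closed shape
    "f": "FV",  "v": "FV",                # labiodentals
    "iy": "EE", "ih": "EE",
    "aa": "AA", "ae": "AA",
    "uw": "OO", "ow": "OO",
    "sil": "REST",
}

def visemes_from_phonemes(timed_phonemes):
    """Map a time-aligned phoneme sequence [(start_sec, phoneme), ...] to
    viseme keyframes, merging consecutive duplicates so the mouth does not
    retrigger the same shape."""
    keyframes = []
    for start, ph in timed_phonemes:
        vis = PHONEME_TO_VISEME.get(ph, "REST")
        if not keyframes or keyframes[-1][1] != vis:
            keyframes.append((start, vis))
    return keyframes
```

Merging adjacent identical visemes is one of the "simple synchronization tricks" that keeps the animation from looking jittery; a production system would also interpolate between keyframes.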

    A Mimetic Strategy to Engage Voluntary Physical Activity In Interactive Entertainment

    Full text link
    We describe the design and implementation of a vision-based interactive entertainment system that makes use of both involuntary and voluntary control paradigms. Unintentional input to the system from a potential viewer is used to drive attention-getting output and encourage the transition to voluntary interactive behaviour. The iMime system consists of a character animation engine based on the interaction metaphor of a mime performer that simulates non-verbal communication strategies, without spoken dialogue, to capture and hold the attention of a viewer. The system was developed in the context of a project studying care of dementia sufferers. Care for a dementia sufferer can place unreasonable demands on the time and attentional resources of their caregivers or family members. Our study contributes to the eventual development of a system aimed at providing relief to dementia caregivers, while at the same time serving as a source of pleasant interactive entertainment for viewers. The work reported here is also aimed at a more general study of the design of interactive entertainment systems involving a mixture of voluntary and involuntary control.
    Comment: 6 pages, 7 figures, ECAG08 worksho

    Investigating facial animation production through artistic inquiry

    Get PDF
    Studies into dynamic facial expressions tend to make use of experimental methods based on objectively manipulated stimuli. New techniques for displaying increasingly realistic facial movement and methods of measuring observer responses are typical of computer animation and psychology facial expression research. However, few projects focus on the artistic nature of performance production; instead, most concentrate on the naturalistic appearance of posed or acted expressions. In this paper, the authors discuss a method for exploring the creative process of emotional facial expression animation, and ask whether anything can be learned about authentic dynamic expressions through artistic inquiry.

    Piavca: a framework for heterogeneous interactions with virtual characters

    Get PDF
    This paper presents a virtual character animation system for real-time multimodal interaction in an immersive virtual reality setting. Human-to-human interaction is highly multimodal, involving features such as verbal language, tone of voice, facial expression, gestures, and gaze. This multimodality means that, in order to simulate social interaction, our characters must be able to handle many different types of interaction, and many different types of animation, simultaneously. Our system is based on a model of animation that represents different types of animations as instantiations of an abstract function representation. This makes it easy to combine different types of animation. It also encourages the creation of behavior out of basic building blocks, making it easy to create and configure new behaviors for novel situations. The model has been implemented in Piavca, an open source character animation system.
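The abstract function representation described above can be illustrated by treating an animation as a function from time to a pose, so that blending and sequencing become higher-order functions over animations. This is a minimal sketch of the idea, not Piavca's actual API; the function names and pose encoding are assumptions.

```python
# An "animation" is any callable t -> pose, where a pose is a dict of
# joint/channel values. Combinators then build new animations from old ones.

def blend(anim_a, anim_b, weight):
    """Weighted mix of two animations, evaluated lazily at time t."""
    def anim(t):
        pa, pb = anim_a(t), anim_b(t)
        return {k: (1 - weight) * pa[k] + weight * pb[k] for k in pa}
    return anim

def sequence(anim_a, anim_b, switch_t):
    """Play anim_a until switch_t, then anim_b on its own re-based clock."""
    def anim(t):
        return anim_a(t) if t < switch_t else anim_b(t - switch_t)
    return anim

# Two basic building blocks (hypothetical examples):
nod = lambda t: {"head_pitch": 0.2 * t}   # keyframe-like procedural motion
rest = lambda t: {"head_pitch": 0.0}      # idle pose

combined = blend(nod, rest, 0.5)          # a new animation, half-strength nod
```

Because every combinator returns another function of the same shape, heterogeneous animation types (motion capture, procedural gaze, facial expression) compose uniformly, which is the point the abstract makes about building behavior from basic blocks.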

    Capture, Learning, and Synthesis of 3D Speaking Styles

    Full text link
    Audio-driven 3D facial animation has been widely explored, but achieving realistic, human-like performance is still unsolved. This is due to the lack of available 3D datasets, models, and standard evaluation metrics. To address this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans captured at 60 fps and synchronized audio from 12 speakers. We then train a neural network on our dataset that factors identity from facial motion. The learned model, VOCA (Voice Operated Character Animation) takes any speech signal as input - even speech in languages other than English - and realistically animates a wide range of adult faces. Conditioning on subject labels during training allows the model to learn a variety of realistic speaking styles. VOCA also provides animator controls to alter speaking style, identity-dependent facial shape, and pose (i.e. head, jaw, and eyeball rotations) during animation. To our knowledge, VOCA is the only realistic 3D facial animation model that is readily applicable to unseen subjects without retargeting. This makes VOCA suitable for tasks like in-game video, virtual reality avatars, or any scenario in which the speaker, speech, or language is not known in advance. We make the dataset and model available for research purposes at http://voca.is.tue.mpg.de
    Comment: To appear in CVPR 201

    A motion system for social and animated robots

    Get PDF
    This paper presents an innovative motion system that is used to control the motions and animations of a social robot. The social robot Probo is used to study Human-Robot Interaction (HRI), with a special focus on Robot Assisted Therapy (RAT). When used for therapy, it is important that a social robot is able to create an "illusion of life" so as to become a believable character that can communicate with humans. The design of the motion system in this paper is based on insights from the animation industry. It combines operator-controlled animations with low-level autonomous reactions such as attention and emotional state. The motion system has a Combination Engine, which combines motion commands triggered by a human operator with motions that originate from different units of the cognitive control architecture of the robot. This results in an interactive robot that seems alive and has a certain degree of "likeability". The Godspeed Questionnaire Series is used to evaluate the animacy and likeability of the robot in China, Romania, and Belgium.
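The combination step described above, merging operator-triggered commands with autonomous reactions, can be sketched as a per-channel merge in which explicit operator input overrides the autonomous layer and unspecified channels fall back to it. This is a hypothetical illustration of the idea, not the Probo Combination Engine itself:

```python
def combine(operator_cmd, autonomous_cmd):
    """Merge two partial motion commands (dicts of actuator -> target value).
    Operator commands win on conflicting channels; autonomous reactions
    (e.g. attention, emotional state) fill in everything the operator
    left unspecified."""
    merged = dict(autonomous_cmd)   # start from the autonomous baseline
    merged.update(operator_cmd)     # operator overrides where present
    return merged
```

A real engine would also blend smoothly between the two sources over time rather than switching instantaneously, to preserve the "illusion of life" the abstract emphasizes.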