Neural Face Models for Example-Based Visual Speech Synthesis
Creating realistic animations of human faces with computer graphics models is
still a challenging task. It is often solved either with tedious manual work or
with motion-capture techniques that require specialised and costly hardware.
Example-based animation approaches circumvent these problems by re-using
captured data of real people. This data is split into short motion samples that
can be looped or concatenated in order to create novel motion sequences. The
obvious advantages of this approach are the simplicity of use and the high
realism, since the data exhibits only real deformations. Rather than tuning
the weights of a complex face rig, the animator works at a higher level,
arranging typical motion samples so that the desired facial performance is
achieved. Two difficulties with example-based approaches, however, are high
memory requirements and the creation of artefact-free, realistic transitions
between motion samples. We solve these problems by
combining the realism and simplicity of example-based animations with the
advantages of neural face models. Our neural face model is capable of
synthesising high-quality 3D face geometry and texture from a compact
latent parameter vector. This latent representation reduces memory requirements
by a factor of 100 and helps create seamless transitions between concatenated
motion samples. In this paper, we present a marker-less approach to facial
motion capture based on multi-view video. From the captured data, we learn
a neural representation of facial expressions, which is used to seamlessly
concatenate facial performances during the animation procedure. We demonstrate
the effectiveness of our approach by synthesising mouthings for Swiss-German
sign language based on viseme query sequences.
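
To make the pipeline concrete, below is a minimal PyTorch sketch of how a
compact latent vector might be decoded into face geometry and texture, and how
two concatenated motion samples could be joined by linear interpolation in
latent space. All names and dimensions (FaceDecoder, latent_dim=128,
blend_transition, the vertex count, the texture resolution) are illustrative
assumptions, not the paper's actual architecture.

    import torch
    import torch.nn as nn

    # Hypothetical decoder: maps a compact latent vector to 3D vertex
    # positions and an RGB texture map, mirroring the role of the neural
    # face model described above (names and sizes are assumptions).
    class FaceDecoder(nn.Module):
        def __init__(self, latent_dim=128, num_vertices=5000, tex_res=256):
            super().__init__()
            self.geometry_head = nn.Sequential(
                nn.Linear(latent_dim, 512), nn.ReLU(),
                nn.Linear(512, num_vertices * 3),       # x, y, z per vertex
            )
            self.texture_head = nn.Sequential(
                nn.Linear(latent_dim, 512), nn.ReLU(),
                nn.Linear(512, tex_res * tex_res * 3),  # RGB texture map
            )
            self.num_vertices = num_vertices
            self.tex_res = tex_res

        def forward(self, z):
            verts = self.geometry_head(z).view(-1, self.num_vertices, 3)
            tex = self.texture_head(z).view(-1, 3, self.tex_res, self.tex_res)
            return verts, tex

    def blend_transition(z_a, z_b, num_frames=10):
        """Linearly interpolate from the last latent of sample A to the
        first latent of sample B to bridge concatenated motion samples."""
        weights = torch.linspace(0.0, 1.0, num_frames).unsqueeze(1)
        return (1.0 - weights) * z_a + weights * z_b

    # Usage: decode a smooth transition between two motion samples.
    decoder = FaceDecoder()
    z_end_of_sample_a = torch.randn(1, 128)    # placeholder latents
    z_start_of_sample_b = torch.randn(1, 128)
    for z in blend_transition(z_end_of_sample_a, z_start_of_sample_b):
        vertices, texture = decoder(z.unsqueeze(0))

Storing each frame as latent_dim floats instead of a full mesh and texture map
is what plausibly yields the large memory reduction quoted above, and decoding
interpolated latents through a single shared network keeps the in-between
frames on the learned face manifold, which is one way transitions can be made
seamless.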