Computationally weak systems and demanding graphical applications still rely mostly on linear blendshapes for facial animation. The accompanying artifacts, such as self-intersections, loss of volume, and missing soft-tissue elasticity, can be avoided with physics-based animation models. However, these are cumbersome to implement and require immense computational effort. We propose neural volumetric blendshapes, an approach that combines the advantages of physics-based simulations with real-time performance even on consumer-grade CPUs. To this end, we present a neural network that efficiently approximates
the underlying volumetric simulations and generalizes across human identities as well as facial expressions. Our approach can be used on top of any linear blendshape system and can therefore be deployed straightforwardly. Furthermore, in the minimal setting it requires only a single neutral face mesh as input.
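To illustrate the deployment model, the following is a minimal sketch of a learned corrective applied on top of an arbitrary linear blendshape rig. The architecture, the names (VolumetricCorrective, animate), and the choice of a plain MLP are assumptions for illustration, not the network described here; in particular, this sketch omits the identity conditioning that allows the actual network to generalize across subjects.

```python
import torch
import torch.nn as nn


class VolumetricCorrective(nn.Module):
    """Hypothetical corrective: maps blendshape weights to per-vertex
    displacements that mimic the result of a volumetric simulation."""

    def __init__(self, n_blendshapes: int, n_vertices: int, hidden: int = 256):
        super().__init__()
        self.n_vertices = n_vertices
        self.mlp = nn.Sequential(
            nn.Linear(n_blendshapes, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_vertices * 3),
        )

    def forward(self, weights: torch.Tensor) -> torch.Tensor:
        # weights: (batch, n_blendshapes) -> (batch, n_vertices, 3)
        return self.mlp(weights).view(-1, self.n_vertices, 3)


def animate(neutral, deltas, weights, corrective):
    # Standard linear blendshape evaluation: neutral + weighted deltas.
    linear = neutral + torch.einsum("b,bvc->vc", weights, deltas)
    # Learned corrective adds the volumetric effects the linear model misses.
    return linear + corrective(weights.unsqueeze(0))[0]
```

Because the corrective only adds displacements to the output of the existing rig, the host blendshape system itself stays untouched, which is what makes drop-in deployment possible.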
Along with the design of the network, we introduce a pipeline for the challenging task of creating anatomically and physically plausible training data.
Part of the pipeline is a novel hybrid regressor that densely positions a skull
within a skin surface while avoiding intersections. We evaluate the fidelity of all parts of the data generation pipeline as well as the accuracy and efficiency of the network. Upon publication, the trained models and associated code will be released.
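The abstract does not detail the hybrid regressor; as a rough illustration of what the geometric half of such a scheme could look like, the sketch below projects skull vertices that lie outside of, or too close to, the skin back inward along skin normals. The precomputed nearest-point correspondence, the minimum-thickness heuristic, and all names are assumptions for illustration.

```python
import numpy as np


def enforce_min_tissue(skull_v, skin_v, skin_n, min_thickness=0.002):
    """Push skull vertices back inside the skin along skin normals.

    skull_v: (V, 3) skull vertices; skin_v / skin_n: (V, 3) position and
    outward unit normal of each skull vertex's nearest skin point
    (a hypothetical, precomputed correspondence).
    """
    # Signed offset of each skull vertex along its skin normal;
    # positive values lie outside the skin surface.
    d = np.einsum("vc,vc->v", skull_v - skin_v, skin_n)
    # Any vertex outside the skin or within min_thickness of it is
    # moved inward until a min_thickness tissue layer remains.
    violation = np.maximum(d + min_thickness, 0.0)
    return skull_v - violation[:, None] * skin_n
```

Such a projection step would complement a learned regressor: the network proposes a dense skull fit, and the geometric correction guarantees an intersection-free result regardless of regression error.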