Speech-driven facial motion synthesis is a well-explored research topic. However, little has been done to model expressive visual behavior during speech. We address this issue with a machine learning approach that relies on a database of speech-related high-fidelity facial motions. From this training set, we derive a generative model of expressive facial motion that incorporates emotion control while maintaining accurate lip-synching. The emotional content of the input speech can be manually specified by the user or automatically extracted from the audio signal using a Support Vector Machine classifier.
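The abstract mentions extracting emotional content from audio with a Support Vector Machine. As a minimal sketch of that idea, the snippet below trains an SVM on per-utterance acoustic feature vectors; the feature dimensionality, the synthetic two-class data, and the class names are illustrative assumptions, not details from the paper.

```python
import numpy as np
from sklearn.svm import SVC

# Assumption: each utterance is summarized as a fixed-length acoustic
# feature vector (e.g. pitch/energy statistics). Here two synthetic
# clusters stand in for real audio features of two emotion classes.
rng = np.random.default_rng(0)
n_per_class = 40
neutral = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 8))
happy = rng.normal(loc=2.0, scale=1.0, size=(n_per_class, 8))
X = np.vstack([neutral, happy])
y = np.array(["neutral"] * n_per_class + ["happy"] * n_per_class)

# Train an RBF-kernel SVM to map feature vectors to emotion labels.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)

# Classify the feature vector of a new utterance.
pred = clf.predict(rng.normal(loc=2.0, scale=1.0, size=(1, 8)))
print(pred[0])
```

In a real system the predicted label would drive the emotion-control input of the generative facial-motion model described above.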