We discuss a new approach for driving avatars using synthetic speech generated from pure text. Lip and face muscles are controlled by the information embedded in the utterance and its related expressiveness. Rule-based text-to-speech synthesis is used to generate phonetic and expression transcriptions of the text to be uttered by the avatar. Two artificial neural networks, one for text-to-phone transcription and the other for phone-to-viseme mapping, have been trained from phonetic transcription data. Two fuzzy-logic engines were tuned for the smoothed control of lip and face movements.
Simulations have been run to test the neural-fuzzy controls, using a parametric speech synthesizer to generate voices and a face synthesizer to generate facial movement. Experimental results show that soft computing affords a good solution for the smoothed control of avatars during the expressive utterance of text.
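To make the two mapping stages concrete, the following is a minimal sketch of a phone-to-viseme lookup followed by a tiny Mamdani-style fuzzy controller that smooths the viseme activation trajectory between animation frames. The phone labels, viseme classes, membership-function supports, and gains are all illustrative assumptions, not the paper's actual design.

```python
# Hypothetical phone-to-viseme table (small subset of English phones).
PHONE_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "aa": "open", "iy": "spread", "uw": "rounded",
}

def tri(x, a, b, c):
    """Triangular membership function peaking at b over the support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_step(current, target):
    """One control step: move `current` toward `target`, with the speed
    chosen by fuzzy rules on the error magnitude (assumed rule base)."""
    err = target - current
    e = abs(err)
    # Fuzzify the error into small / medium / large sets.
    mu_small = tri(e, -0.4, 0.0, 0.4)
    mu_med = tri(e, 0.2, 0.5, 0.8)
    mu_large = tri(e, 0.6, 1.0, 1.4)
    # Rules: small error -> slow gain, medium -> moderate, large -> fast.
    gains = (0.1, 0.4, 0.8)
    num = mu_small * gains[0] + mu_med * gains[1] + mu_large * gains[2]
    den = mu_small + mu_med + mu_large
    gain = num / den if den > 0 else 0.0  # weighted-average defuzzification
    return current + gain * err

def animate(phones, frames_per_phone=5):
    """Produce a smoothed activation trajectory for one viseme class
    (here, 'bilabial') over a phone sequence."""
    weight = 0.0
    trajectory = []
    for ph in phones:
        target = 1.0 if PHONE_TO_VISEME.get(ph) == "bilabial" else 0.0
        for _ in range(frames_per_phone):
            weight = fuzzy_step(weight, target)
            trajectory.append(round(weight, 3))
    return trajectory
```

For example, `animate(["m", "aa"])` ramps the bilabial weight up during the /m/ and decays it during the vowel, instead of snapping between 0 and 1; the fuzzy gain makes large errors close quickly while small residual errors are corrected gently, which is the smoothing effect attributed to the fuzzy engines above.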