We describe a generative model of ‘talking head’ facial behaviour, intended for use in both video synthesis and model-based interpretation. The model is learnt, without supervision, from talking head video, parameterised by tracking with an Active Appearance Model (AAM). We present an integrated probabilistic framework for capturing both the short-term visual dynamics and longer-term behavioural structure. We demonstrate that the approach leads to a compact model, capable of generating realistic and relatively subtle talking head behaviour in real time. The results of a forced-choice psychophysical experiment show that the quality of the generated sequences is significantly better than that obtained using alternative approaches, and is indistinguishable from that of the original training sequence.