The main objective of this thesis was the synthesis of speech-synchronised motion, in
particular head motion. The hypothesis that head motion can be estimated from the
speech signal was confirmed. In order to achieve satisfactory results, a motion capture
database was recorded, a definition of head motion in terms of articulation was formulated,
a continuous stream mapping procedure was developed, and finally the synthesis
was evaluated. Based on previous research into non-verbal behaviour, basic types of
head motion were defined that could function as modelling units. The stream mapping
method investigated in this thesis is based on Hidden Markov Models (HMMs), which
employ modelling units to map between continuous signals. The objective evaluation
of the modelling parameters confirmed that head motion types could be predicted from
the speech signal with an accuracy above chance, close to 70%. Furthermore, a special
type of HMM called trajectory HMM was used because it enables synthesis of continuous
output. However, head motion is a stochastic process; therefore, the trajectory HMM
was further extended to allow for non-deterministic output. Finally, the resulting head
motion synthesis was perceptually evaluated. The effects of the “uncanny valley” were
also considered in the evaluation, confirming that rendering quality has an influence on
our judgement of movement of virtual characters. In conclusion, a general method for
synthesising speech-synchronised behaviour was developed that can be applied to a whole
range of behaviours.