This thesis presents a novel method for performing multi-agent behaviour recognition
without requiring large training corpora. The reduced need for data means that robust
probabilistic recognition can be performed within domains where annotated datasets are
traditionally unavailable (e.g. surveillance, defence). Human behaviours are composed
of sequences of underlying activities that can be used as salient features. We do not
assume that the exact temporal ordering of such features is necessary, so we can represent
behaviours using an unordered “bag-of-features”. A weak temporal ordering is imposed
during inference to match behaviours to observations; this ordering replaces the learnt model parameters
used by competing methods. Our three-tier architecture comprises low-level video
tracking, event analysis and high-level inference. High-level inference is performed using
a new, cascading extension of the Rao-Blackwellised Particle Filter. Behaviours are
recognised at multiple levels of abstraction and can contain a mixture of solo and multi-agent
behaviour. We validate our framework using the PETS 2006 video surveillance
dataset and our own video sequences, in addition to a large corpus of simulated data.
We achieve a mean recognition precision of 96.4% on the simulated data and 89.3% on
the combined video data. Our “bag-of-features” framework detects when behaviours
terminate, accurately explains agent behaviour despite a significant number
of low-level classification errors in the input, and can even detect agents who change their
behaviour.