    Multi-modal aggression detection in trains

    In many public places, multiple sensing devices, such as cameras, are installed to help prevent unwanted situations such as aggression and violence. At the moment, the best solution for achieving a safe environment requires human operators to monitor the camera images and take appropriate action when necessary. In the wake of the terrorist attacks of September 11, 2001, there has been rapid growth in the number of security cameras and other sensing devices for anti-terrorism and other security purposes. The increased use of these, often multi-modal, sensors has caused a digital data explosion that human operators have difficulty keeping up with. The need for a fully or partially automated system is becoming all the more pressing. The main aim of this thesis is to report on our work addressing the complex challenges that arise in multi-modal automatic surveillance applications. In this thesis work, a multi-modal aggression detection system was built that fuses audio and video data from sensors located in a train compartment. Compared to previous work, we adopt a more human-centered approach to the detection problem by extracting knowledge and rules from security experts. The aggression detection system is based on many hours of observing and studying professional operators at work as they analyze and respond to surveillance data. Our aggression detection approach is essentially divided into two models: (1) the observation model, which describes how low-level features from observations are combined into high-level concepts, and (2) the reasoning model, in which high-level concepts are reasoned with in order to infer the presence of aggression. In the observation model, feature extraction algorithms are used to transform audio and video signals into features, which are combined by classification algorithms into high-level concepts.
In the thesis, an analysis is made of the train compartment, in particular of the objects and situations that may be encountered there. This analysis is formalized in a train aggression ontology. In addition, an overview of relevant audio and video feature extraction and classification algorithms is given. The JDL model is also introduced as a way to structure the wide range of available algorithms. In the reasoning model, the knowledge of the human expert and high-level reasoning are used to infer the presence of aggression. In essence, this boils down to combining the results of the observation model into a description of the current scenario and comparing this to known scenarios. If the current scenario is similar to a known unwanted scenario, or if it deviates too much from a known normal scenario, an alarm situation may be announced. There are a number of different approaches to accomplishing the inference. In this thesis, three different inference methods are explored for their merits in aggression detection: expert-system-based reasoning, Bayesian reasoning, and self-organization/emergent reasoning. To test and verify the results, several experiments were conducted in a real train. During the experiments, actors performed scenarios as described in storyboards. The storyboards had previously been validated by security experts for their realism. As the actors performed the scenarios, data was captured using multiple cameras and microphones. The acquired data was annotated using the vocabulary from the train aggression ontology and used as ground truth for the evaluation of the aggression detection system.
Mediamatics; Electrical Engineering, Mathematics and Computer Scienc