PhD Forum: Investigating the performance of a multi-modal approach to unusual event detection
In this paper, we investigate the parameters underpinning our previously presented system for detecting unusual events in surveillance applications [1]. The system identifies anomalous events using an unsupervised, data-driven approach. During a training period, typical activities within a surveilled environment are modeled using multi-modal sensor readings. Significant deviations from the established model of regular activity can then be flagged as anomalous at run-time. Using this approach, the system can be deployed in any environment and adapt automatically, without any manual adjustment. Experiments were carried out on two days of audio-visual data and evaluated against a manually annotated ground truth. We investigate sensor fusion and quantitatively evaluate the performance gains over single-modality models. We also investigate different formulations of our cluster-based model of usual scenes, as well as the impact of dynamic thresholding on identifying anomalous events. Experimental results are promising, even when modeling is performed using very simple audio and visual features.
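The cluster-based model of usual scenes with dynamic thresholding described above can be sketched roughly as follows. This is an illustrative reconstruction, not the paper's implementation: the feature dimensionality, number of clusters, and the mean-plus-three-sigma threshold rule are all assumptions.

```python
# Hedged sketch (assumed details): concatenated audio-visual feature vectors are
# clustered during training, and at run-time an observation is flagged as anomalous
# when its distance to the nearest cluster centre exceeds a dynamic threshold
# derived from recently observed distances.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 8))        # typical-activity features
model = KMeans(n_clusters=4, n_init=10, random_state=0).fit(train)

def anomaly_score(x, model):
    """Distance from x to its nearest learned cluster centre."""
    return np.min(np.linalg.norm(model.cluster_centers_ - x, axis=1))

# Dynamic threshold: mean + 3*std of scores over a window of recent observations.
recent = [anomaly_score(x, model) for x in train[:100]]
threshold = np.mean(recent) + 3.0 * np.std(recent)

unusual = np.full(8, 10.0)                          # far from any training cluster
print(anomaly_score(unusual, model) > threshold)    # → True: flagged as anomalous
```

Because the threshold is recomputed from recent scores, the detector can track slow drift in what counts as "usual" without retraining the cluster model itself.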
Automatic Workflow Monitoring in Industrial Environments
Robust automatic workflow monitoring using visual sensors in industrial environments is still an unsolved problem. This is mainly due to the difficulty of recording data in work settings and to environmental conditions (large occlusions, similar background/foreground) that prevent object detection and tracking algorithms from performing robustly. Approaches that analyse trajectories are therefore limited in such environments. However, workflow monitoring is especially needed to meet quality and safety requirements. In this paper we propose a robust approach for workflow classification in industrial environments, consisting of a robust scene descriptor and an efficient time-series analysis method. Experimental results on a challenging car-manufacturing dataset show that the proposed scene descriptor detects both human- and machinery-related motion robustly, and that the time-series analysis method can classify the tasks in a given workflow automatically.
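One common way to classify workflow tasks from per-frame scene descriptors is nearest-template matching under dynamic time warping, which tolerates tasks being executed at different speeds. The sketch below illustrates that general idea only; the paper's actual descriptor and time-series method are not reproduced here, and the task names and sequences are invented.

```python
# Hedged illustration: each task is a sequence of per-frame motion-descriptor values,
# compared to labelled templates with dynamic time warping (DTW). DTW is a stand-in
# for the paper's time-series analysis, chosen because it absorbs speed variation.
import numpy as np

def dtw(a, b):
    """Dynamic-time-warping distance between two 1-D descriptor sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def classify(sequence, templates):
    """Assign the label of the nearest template under DTW."""
    return min(templates, key=lambda label: dtw(sequence, templates[label]))

# Hypothetical task templates and an observed, slower execution of "weld".
templates = {"weld": [0, 1, 3, 3, 1, 0], "inspect": [0, 0, 1, 0, 0, 0]}
observed = [0, 1, 2, 3, 3, 2, 1, 0]
print(classify(observed, templates))  # → weld
```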
Temporally Consistent Snow Cover Estimation from Noisy, Irregularly Sampled Measurements
We propose a method for accurate and temporally consistent surface classification in the presence of noisy, irregularly sampled measurements, and apply it to the estimation of snow coverage over time. The input imagery is extremely challenging, with large variations in lighting and weather distorting the measurements. Initial snow cover estimates are obtained using a Gaussian Mixture Model of color. To achieve a temporally consistent snow cover estimation, we use a Markov Random Field that penalizes rapid fluctuations in the snow state, and show that the penalty term needs to be quite large, resulting in slow reactivity to changes. We therefore propose a classifier that separates good from uninformative images, which allows a smaller penalty term to be used. We show that incorporating domain knowledge to discard uninformative images leads to better reactivity to changes in snow coverage as well as more accurate snow cover estimates.
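The trade-off between the temporal penalty and reactivity can be made concrete with a minimal chain-MRF sketch. This is an assumed simplification: the per-image snow likelihoods below are synthetic (in the paper they come from the colour GMM), the state is collapsed to a single binary snow/no-snow value per image, and exact inference is done with Viterbi.

```python
# Hedged sketch: a two-state chain MRF whose unary costs come from per-image snow
# likelihoods and whose pairwise cost charges `penalty` for switching state between
# consecutive frames. Larger penalties suppress noise but also delay real changes.
import numpy as np

def smooth_states(likelihood_snow, penalty):
    """Viterbi over a two-state chain: unary cost = -log likelihood,
    pairwise cost = penalty for changing state between consecutive frames."""
    eps = 1e-9
    unary = -np.log(np.stack([1 - likelihood_snow, likelihood_snow], axis=1) + eps)
    cost = unary[0].copy()
    back = []
    for t in range(1, len(unary)):
        trans = cost[:, None] + penalty * (np.arange(2)[None, :] != np.arange(2)[:, None])
        back.append(np.argmin(trans, axis=0))
        cost = np.min(trans, axis=0) + unary[t]
    state = [int(np.argmin(cost))]
    for b in reversed(back):
        state.append(int(b[state[-1]]))
    return list(reversed(state))

# A noisy sequence: no snow, one spurious detection at frame 2, then a real change.
obs = np.array([0.1, 0.2, 0.9, 0.1, 0.2, 0.9, 0.95, 0.9, 0.85])
print(smooth_states(obs, penalty=2.0))  # → [0, 0, 0, 0, 0, 1, 1, 1, 1]
```

The single spurious detection is smoothed away while the sustained change from frame 5 onward survives; raising the penalty further would also delay that real transition, which is exactly the reactivity problem the paper's image classifier mitigates.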
Efficient and effective automated surveillance agents using kernel tricks
Many schemes have been presented over the years to develop automated visual surveillance systems. However, these schemes typically need custom equipment, or involve significant complexity and storage requirements. In this paper we present three software-based agents built using kernel machines to perform automated, real-time intruder detection in surveillance systems. Kernel machines provide a powerful data mining technique that may be used for pattern matching in the presence of complex data. They work by first mapping the raw input data onto an (often much) higher-dimensional feature space, and then clustering in that feature space instead. The reasoning is that mapping onto the higher-dimensional feature space enables additional, higher-order correlations between the raw data points to be compared when determining patterns. The agents proposed here are built using algorithms that are adaptive, portable, lightweight, and efficient, with run times of the order of hundredths of a second, and they do not require any expensive or sophisticated components. Through application to real image streams from a simple, run-of-the-mill closed-circuit television surveillance system, and direct quantitative performance comparison with some existing schemes, we show that high detection accuracy can easily be obtained with low computational and storage complexity.
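The kernel trick described above can be illustrated with a tiny example. This is not the paper's agents: it only shows how the distance from a new frame's feature vector to the centre of the training set in RBF feature space can be computed from kernel evaluations alone, via ||phi(x) - mu||^2 = k(x,x) - (2/n) sum_i k(x,x_i) + (1/n^2) sum_ij k(x_i,x_j), without ever forming the mapping explicitly.

```python
# Hedged illustration of the kernel trick (feature names and gamma are assumptions):
# the implicit RBF feature map is never computed; only kernel values are used.
import numpy as np

def rbf(a, b, gamma=0.5):
    return np.exp(-gamma * np.sum((a - b) ** 2, axis=-1))

def feature_space_distance(x, train, gamma=0.5):
    """||phi(x) - mean(phi(train))||^2 via kernel evaluations only."""
    k_xx = 1.0                                  # rbf(x, x) is always 1
    k_xt = np.mean(rbf(train, x, gamma))
    k_tt = np.mean([np.mean(rbf(train, t, gamma)) for t in train])
    return k_xx - 2.0 * k_xt + k_tt

rng = np.random.default_rng(1)
background = rng.normal(0.0, 0.3, size=(50, 4))  # "empty scene" features
intruder = np.full(4, 3.0)                       # far from the background cluster
d_bg = feature_space_distance(background[0], background)
d_in = feature_space_distance(intruder, background)
print(d_in > d_bg)  # → True: the intruder frame lies far from the learned cluster
```

A threshold on this distance yields a simple one-class detector; the higher-order correlations come in because the RBF kernel implicitly compares all polynomial interactions of the raw features.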
Activity understanding and unusual event detection in surveillance videos
Computer scientists have made ceaseless efforts to replicate the cognitive video understanding abilities of human brains in autonomous vision systems. As video surveillance cameras become ubiquitous, there is a surge in studies on automated activity understanding and unusual event detection in surveillance videos. Nevertheless, video content analysis in public scenes remains a formidable challenge due to intrinsic difficulties such as severe inter-object occlusion in crowded scenes and the poor quality of recorded surveillance footage. Moreover, it is nontrivial to achieve robust detection of unusual events, which are rare, ambiguous, and easily confused with noise. This thesis proposes solutions for resolving ambiguous visual observations and overcoming the unreliability of conventional activity analysis methods by exploiting multi-camera visual context and human feedback.
The thesis first demonstrates the importance of learning visual context for establishing reliable reasoning on observed activity in a camera network. In the proposed approach, a new Cross Canonical Correlation Analysis (xCCA) is formulated to discover and quantify time-delayed pairwise correlations of regional activities observed within and across multiple camera views. This thesis shows that learning time-delayed pairwise activity correlations offers valuable contextual information for (1) spatial and temporal topology inference of a camera network, (2) robust person re-identification, and (3) accurate activity-based video temporal segmentation. Crucially, in contrast to conventional methods, the proposed approach does not rely on either intra-camera or inter-camera object tracking; it can thus be applied to low-quality surveillance videos featuring severe inter-object occlusions.
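The core intuition behind time-delayed correlation of regional activities can be sketched with plain cross-correlation. This is a deliberately simplified stand-in for xCCA, which uses canonical correlation rather than the scalar correlation below; the camera signals here are synthetic.

```python
# Hedged sketch: estimate the time delay between two regional activity signals by
# scanning correlation over candidate lags. In a camera network, the best lag hints
# at the travel time between two views, supporting topology inference.
import numpy as np

def best_lag(a, b, max_lag):
    """Lag (in frames) at which signal b best correlates with signal a."""
    scores = {}
    for lag in range(1, max_lag + 1):
        x, y = a[:-lag], b[lag:]
        scores[lag] = np.corrcoef(x, y)[0, 1]
    return max(scores, key=scores.get)

rng = np.random.default_rng(2)
activity_cam1 = rng.random(300)
activity_cam2 = np.roll(activity_cam1, 7)  # camera 2 sees the activity 7 frames later
print(best_lag(activity_cam1, activity_cam2, max_lag=20))  # → 7
```

Note that this per-region signal needs no object tracking at all, which is why the approach tolerates heavy occlusion.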
Second, to detect global unusual events across multiple disjoint cameras, this thesis extends visual context learning from pairwise relationships to global time-delayed dependencies between regional activities. Specifically, a Time Delayed Probabilistic Graphical Model (TD-PGM) is proposed to model the multi-camera activities and their dependencies. Subtle global unusual events are detected and localised by the model as context-incoherent patterns across multiple camera views. In the model, different nodes represent activities in different decomposed regions from different camera views, and the directed links between nodes encode time-delayed dependencies between activities observed within and across camera views. In order to learn optimised time-delayed dependencies in a TD-PGM, a novel two-stage structure learning approach is formulated by combining constraint-based and score-based search structure learning methods.
Third, to cope with visual context changes over time, this two-stage structure learning approach is extended to permit tractable incremental updates of both the TD-PGM parameters and its structure. As opposed to most existing studies, which assume a static model once learned, the proposed incremental learning allows a model to adapt itself to reflect changes in the current visual context, such as subtle behaviour drift over time or the removal or addition of cameras. Importantly, the incremental structure learning is achieved without either exhaustive search in a large graph structure space or storing all past observations in memory, making the proposed solution memory and time efficient.
Fourth, an active learning approach is presented to incorporate human feedback for on-line unusual event detection. Contrary to most existing unsupervised methods, which perform passive mining for unusual events, the proposed approach automatically requests supervision for critical points to resolve ambiguities of interest, leading to more robust detection of subtle unusual events. The active learning strategy is formulated as a stream-based solution, i.e. it decides on-the-fly whether to request a label for each unlabelled sample observed in sequence. It adaptively selects between two active learning criteria, namely a likelihood criterion and an uncertainty criterion, to achieve (1) discovery of unknown event classes and (2) refinement of the classification boundary.
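A stream-based query rule combining a likelihood criterion and an uncertainty criterion can be sketched as below. The function name, thresholds, and the specific formulae are illustrative assumptions, not the thesis's exact formulation.

```python
# Hedged sketch: for each streamed sample, request a human label when its likelihood
# under all known classes is low (possible new event class), or when the posterior
# over known classes is ambiguous (sample near the classification boundary).
import numpy as np

def should_query(class_likelihoods, lik_thresh=0.05, unc_thresh=0.2):
    """class_likelihoods: per-known-class likelihoods for one streamed sample."""
    p = np.asarray(class_likelihoods, dtype=float)
    if np.max(p) < lik_thresh:               # likelihood criterion: maybe a new class
        return True, "likelihood"
    posterior = p / p.sum()
    top2 = np.sort(posterior)[-2:]
    if top2[1] - top2[0] < unc_thresh:       # uncertainty criterion: ambiguous sample
        return True, "uncertainty"
    return False, "confident"

print(should_query([0.01, 0.02]))  # → (True, 'likelihood')
print(should_query([0.40, 0.35]))  # → (True, 'uncertainty')
print(should_query([0.90, 0.05]))  # → (False, 'confident')
```

Because the rule inspects each sample once and keeps no history, it suits the on-the-fly, stream-based setting described above.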
The effectiveness of the proposed approaches is validated using videos captured from busy public scenes such as underground stations and traffic intersections.
Hunting Nessie - real-time abnormality detection from webcams
We present a data-driven, unsupervised method for unusual scene detection from static webcams. Such time-lapse data is usually captured at a very low or varying frame rate, which precludes the use of tools typically employed in surveillance (e.g., object tracking). Hence, our algorithm is based on simple image features. We define usual scenes based on the concept of meaningful nearest neighbours instead of building explicit models. To compare observations effectively, our algorithm adapts the data representation. Furthermore, we use incremental learning techniques to adapt to changes in the data stream. Experiments on several months of webcam data show that our approach detects plausible unusual scenes that have not been observed in the data stream before.
Breitenstein M.D., Grabner H., Van Gool L., ''Hunting Nessie - real-time abnormality detection from webcams'', 9th IEEE International Workshop on Visual Surveillance (VS2009), held in conjunction with ICCV 2009, October 3, 2009, Kyoto, Japan.
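The nearest-neighbour notion of "usual" in this abstract can be sketched minimally. The buffer size, distance threshold, and feature dimensionality below are illustrative assumptions, and the sketch omits the paper's adaptive data representation.

```python
# Hedged sketch: a frame is unusual if no stored past observation lies within a
# distance threshold; every frame is then remembered (in a bounded buffer), so the
# model adapts incrementally to the data stream, as the abstract describes.
import numpy as np
from collections import deque

class NearestNeighbourDetector:
    def __init__(self, threshold, max_memory=1000):
        self.threshold = threshold
        self.memory = deque(maxlen=max_memory)  # bounded store of past features

    def observe(self, feature):
        """Return True if the frame is unusual, then remember it."""
        unusual = True
        if self.memory:
            dists = np.linalg.norm(np.array(self.memory) - feature, axis=1)
            unusual = bool(np.min(dists) > self.threshold)
        self.memory.append(feature)
        return unusual

det = NearestNeighbourDetector(threshold=1.0)
rng = np.random.default_rng(3)
for _ in range(200):                          # typical scenes fill the memory
    det.observe(rng.normal(0.0, 0.2, size=6))
print(det.observe(rng.normal(0.0, 0.2, size=6)))  # another typical scene
print(det.observe(np.full(6, 5.0)))               # never-seen scene → True
```

Because flagged frames are also remembered, a repeated "unusual" scene stops being flagged once observed, matching the requirement that detections be scenes not observed in the stream before.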