
    Background subtraction by combining Temporal and Spatio-Temporal histograms in the presence of camera movement

    Background subtraction is the classical approach to differentiating moving objects in a scene from the static background when the camera is fixed. If the fixed-camera assumption does not hold, a frame registration step is followed by the background subtraction. However, this registration step cannot perfectly compensate for camera motion, so errors such as pixels being displaced from their true registered positions occur. In this paper, we overcome these errors with a simple but effective background subtraction algorithm that combines Temporal and Spatio-Temporal approaches. The former models the temporal intensity distribution of each individual pixel. The latter classifies foreground and background pixels, taking into account the intensity distribution of each pixel's neighborhood. The experimental results show that, in spite of its simplicity, our algorithm outperforms state-of-the-art systems in the presence of jitter.
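    A minimal sketch of the idea described in this abstract, assuming a per-pixel intensity-histogram background model combined with a small-neighborhood test so that pixels displaced by registration jitter can still match a nearby background model; the bin count, neighborhood radius and probability threshold below are illustrative assumptions, not the paper's parameters:

```python
# Minimal sketch (not the paper's exact formulation): a per-pixel temporal
# histogram model plus a spatio-temporal neighborhood test, so a pixel
# displaced by registration jitter can still match the background model of
# a nearby location. N_BINS, RADIUS and THRESH are illustrative assumptions.
import numpy as np

N_BINS = 16   # intensity histogram bins (assumed)
RADIUS = 2    # neighborhood half-width in pixels (assumed)
THRESH = 0.05 # minimum background probability (assumed)

class JitterTolerantBG:
    def __init__(self, height, width):
        # One intensity histogram per pixel, updated over time.
        self.hist = np.ones((height, width, N_BINS), dtype=np.float32)

    def update(self, frame):
        """frame: uint8 grayscale image; returns a boolean foreground mask."""
        h, w = frame.shape
        bins = (frame.astype(np.int32) * N_BINS) // 256
        rows, cols = np.indices((h, w))
        totals = self.hist.sum(axis=2)

        # Temporal test: probability of the observed bin at each pixel.
        p_temporal = self.hist[rows, cols, bins] / totals

        # Spatio-temporal test: best probability over a small neighborhood,
        # so small registration errors do not trigger false foreground.
        p_spatial = np.zeros_like(p_temporal)
        for dy in range(-RADIUS, RADIUS + 1):
            for dx in range(-RADIUS, RADIUS + 1):
                shifted = np.roll(self.hist, shift=(dy, dx), axis=(0, 1))
                shifted_tot = np.roll(totals, shift=(dy, dx), axis=(0, 1))
                p = shifted[rows, cols, bins] / shifted_tot
                p_spatial = np.maximum(p_spatial, p)

        # Foreground only if the observation is unlikely under BOTH models.
        fg = (p_temporal < THRESH) & (p_spatial < THRESH)

        # Reinforce the temporal model with the new observation.
        self.hist[rows, cols, bins] += 1.0
        return fg
```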

    Descriptive temporal template features for visual motion recognition

    In this paper, a human action recognition system is proposed. The system is based on new, descriptive 'temporal template' features in order to achieve high-speed recognition in real-time, embedded applications. The limitations of the well-known 'Motion History Image' (MHI) temporal template are addressed and a new 'Motion History Histogram' (MHH) feature is proposed to capture more motion information from the video. MHH not only provides rich motion information, but also remains computationally inexpensive. To further improve classification performance, we combine both MHI and MHH into a low-dimensional feature vector which is processed by a support vector machine (SVM). Experimental results show that our new representation achieves a significant improvement in human action recognition performance over existing comparable methods that use 2D temporal-template-based representations.
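    A minimal sketch of a temporal-template pipeline in this spirit: a Motion History Image plus a simplified per-pixel histogram of short binary motion patterns (a stand-in for the proposed MHH, whose exact definition is not reproduced here), concatenated into one feature vector and classified with an SVM. The decay length, motion threshold, pattern length and downsampling step are illustrative assumptions:

```python
# Minimal sketch, not the paper's exact MHH definition: MHI decay plus a
# per-pixel count of short binary motion patterns, fed to a linear SVM.
import numpy as np
from sklearn.svm import SVC

TAU = 30            # MHI decay length in frames (assumed)
MOTION_THRESH = 25  # per-pixel frame-difference threshold (assumed)
PATTERN_BITS = 3    # length of binary motion patterns counted per pixel (assumed)

def temporal_template_features(frames):
    """frames: list of equally sized uint8 grayscale images for one clip."""
    h, w = frames[0].shape
    rows, cols = np.indices((h, w))
    mhi = np.zeros((h, w), dtype=np.float32)
    # Histogram of short binary motion patterns per pixel (MHH-like).
    mhh = np.zeros((h, w, 2 ** PATTERN_BITS), dtype=np.float32)
    recent = np.zeros((h, w), dtype=np.uint8)  # packed last PATTERN_BITS motion bits

    for prev, curr in zip(frames, frames[1:]):
        motion = (np.abs(curr.astype(np.int16) - prev.astype(np.int16))
                  > MOTION_THRESH)
        # MHI: set to TAU where motion occurs, otherwise decay by 1.
        mhi = np.where(motion, float(TAU), np.maximum(mhi - 1.0, 0.0))
        # MHH-like: count occurrences of each short motion pattern per pixel.
        recent = ((recent << 1) | motion.astype(np.uint8)) & (2 ** PATTERN_BITS - 1)
        mhh[rows, cols, recent] += 1.0

    # Downsample and concatenate into one low-dimensional feature vector.
    step = 8  # spatial downsampling step (assumed)
    return np.concatenate([
        mhi[::step, ::step].ravel() / TAU,
        mhh[::step, ::step, :].ravel() / max(len(frames) - 1, 1),
    ])

# Usage: features from labelled clips train a standard SVM classifier.
# X = np.stack([temporal_template_features(clip) for clip in clips])
# clf = SVC(kernel="linear").fit(X, labels)
```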

    A spatio-temporal learning approach for crowd activity modelling to detect anomalies

    With security and surveillance gaining paramount importance in recent years, it has become important to reliably automate some surveillance tasks for monitoring crowded areas. The need to automate this process also supports human operators who are overwhelmed by the large number of security screens to monitor. Crowd events such as excess usage throughout the day, sudden peaks in crowd volume and chaotic motion (obvious to spot) all emerge over time, so being informed of an event build-up requires constant monitoring. To ease this task, the computer vision community has been addressing some surveillance tasks using image processing and machine learning techniques. Tasks currently being addressed include crowd density estimation or people counting, crowd detection and abnormal crowd event detection. Most of the work has focused on crowd detection and estimation, with the focus slowly shifting to crowd event learning for abnormality detection.

    This thesis addresses crowd abnormality detection; however, by way of the modelling approach used, the tasks of crowd detection and estimation are implicitly handled as well. The existing approaches in the literature have a number of drawbacks that keep them from being scalable to any public scene. Most pieces of work use simple scene settings where motion occurs wholly in the near-field or far-field of the camera view. Thus, with assumptions on the expected location of person motion, small blobs are arbitrarily filtered out as noise when they may be legitimate motion in the far-field. Such an approach makes it difficult to deal with complex scenes where entry/exit points occur in the centre of the scene, or where multiple pathways running from the near- to the far-field of the camera view produce blobs of differing sizes. Further, most authors assume the number of directions that people's motion should exhibit rather than discovering what these may be. Approaches with such assumptions lose accuracy when dealing with, say, a railway platform, which exhibits a number of motion directions: two-way, one-way, dispersive, etc. Finally, very few contributions use time as a video feature to model the human intuition of time-of-day abnormalities, that is, that certain motion patterns may be abnormal if they have not been seen at a given time of day. Most works use time only as an extra qualifier to spatial data for trajectory definition.

    In this thesis most of these drawbacks are addressed by dealing with them in the modelling of crowd activity. Firstly, no assumptions are made on scene structure or on the blob sizes resulting from it. The optical flow algorithm used is robust, and even the noise present (which is in fact unwanted motion of swaying hands and legs, as opposed to that of the torso) is fairly consistent and can therefore be factored into the modelling. Blobs, no matter what their size, are not discarded, as they may be legitimate emerging motion in the far-field. The modelling also deals with paths extending from the far- to the near-field of the camera view and segments these such that each segment contains self-comparable fields of motion. A normalisation factor for comparisons across near- and far-field motion would imply prior knowledge of the scene; as the system is intended for generic public locations with varying scene structures, normalisation is not used in the processing, yet the near- and far-field motion changes are still accounted for.

    Secondly, this thesis describes a system that learns the true distribution of motion along the detected paths and maintains it. The approach does not generalise the direction distributions, which would cause a loss in precision. No assumptions are imposed on the expected motion: if the underlying motion is well defined (one-way or two-way), it is represented as a well-defined distribution, and as a mixture of directions if the underlying motion presents itself as such.

    Finally, time is used as a video feature to allow activity to reinforce itself on a daily basis, such that motion patterns for a given time and space define themselves through reinforcement, which acts as the model for abnormality detection in time and space (spatio-temporal). The system has been tested with real-world datasets with varying fields of camera view. The testing has shown no false negatives and very few false positives, and the system detects crowd abnormalities well with respect to the ground truths of the datasets used.
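    A minimal sketch of the spatio-temporal reinforcement idea described above, assuming per-cell, per-time-of-day histograms of optical-flow directions whose accumulated support decides whether observed motion is anomalous; the grid size, bin counts, magnitude gate and support threshold are illustrative assumptions, not the thesis's parameters:

```python
# Minimal sketch: per spatial cell and per time-of-day slot, accumulate a
# histogram of optical-flow directions; motion whose dominant direction has
# little accumulated support for that cell and slot is flagged as anomalous.
import numpy as np
import cv2

GRID = 16              # spatial cells per image side (assumed)
DIR_BINS = 8           # flow-direction bins per cell (assumed)
TIME_SLOTS = 24        # one slot per hour of day (assumed)
SUPPORT_THRESH = 0.02  # minimum fraction of accumulated support (assumed)

class CrowdActivityModel:
    def __init__(self):
        # Reinforced counts: time slot x cell row x cell col x direction bin.
        self.counts = np.ones((TIME_SLOTS, GRID, GRID, DIR_BINS), dtype=np.float64)

    def _flow_directions(self, prev_gray, gray):
        # Dense optical flow between consecutive grayscale frames.
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag = np.linalg.norm(flow, axis=2)
        ang = np.arctan2(flow[..., 1], flow[..., 0])  # range -pi..pi
        bins = ((ang + np.pi) / (2 * np.pi) * DIR_BINS).astype(int) % DIR_BINS
        return mag, bins

    def observe(self, prev_gray, gray, hour, learn=True):
        """Returns a GRID x GRID boolean map of anomalous cells."""
        h, w = gray.shape
        mag, bins = self._flow_directions(prev_gray, gray)
        anomalous = np.zeros((GRID, GRID), dtype=bool)
        for i in range(GRID):
            for j in range(GRID):
                ys = slice(i * h // GRID, (i + 1) * h // GRID)
                xs = slice(j * w // GRID, (j + 1) * w // GRID)
                m, b = mag[ys, xs], bins[ys, xs]
                moving = m > 1.0  # ignore near-static pixels (assumed gate)
                if not moving.any():
                    continue
                hist = np.bincount(b[moving], minlength=DIR_BINS).astype(float)
                dominant = int(hist.argmax())
                model = self.counts[hour, i, j]
                support = model[dominant] / model.sum()
                anomalous[i, j] = support < SUPPORT_THRESH
                if learn:
                    # Daily reinforcement of observed motion for this time slot.
                    model += hist / max(hist.sum(), 1.0)
        return anomalous
```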