
    Two-stage sparse representation based abnormal crowd event detection in videos

    Ubiquitous surveillance has become part of our lives to increase security and safety. Despite the wide application of surveillance systems, their efficiency is limited by human factors such as boredom and fatigue, because most of the time nothing unusual happens. In safety-critical applications, time is essential and it is vital to act fast to prevent costly incidents. This thesis proposes a two-stage abnormal crowd event detection framework, based on k-means clustering in the first stage and sparse representation based methods in the second stage, to alleviate the laborious task of video monitoring. We conduct a literature review of 18 studies, focusing specifically on sparse representation based methods. Accordingly, we choose the spatio-temporal gradient feature due to its simplicity, efficiency, and effectiveness in motion representation. After extracting features only from normal events, k-means clustering is applied to separate different motion feature clusters. Then, clusters with fewer samples, which are deemed to contain mostly abnormal features, are removed according to a threshold. In the second stage, we learn a dictionary for each remaining cluster using the approximate K-SVD algorithm. In testing, the reconstruction error of a feature against a learned dictionary and its sparse representation is used to determine an abnormality. We conduct extensive experiments on a standard dataset to evaluate the detection performance of the method. Furthermore, the effect of hyper-parameters in our method is investigated. We also compare our method with different methods to examine its effectiveness. Results indicate that our abnormal event detection framework can successfully detect abnormal events in a scene while running in real time at 161 frames per second. With a few exceptions, no significant advantage of the two-stage sparse representation approach over a single large dictionary was found. We speculate that these results may be influenced by a small sample size. Nevertheless, our approach, due to its unsupervised nature, can be adapted to different contexts without additional annotation effort, using only normal events from videos. This motivates further development.
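The cluster-pruning step between the two stages can be illustrated with a short sketch. It is not the thesis code: the function name `prune_small_clusters` and the choice of a threshold expressed as a fraction of all samples (`min_fraction`) are assumptions made for illustration, since the abstract only says that small clusters are removed "according to a threshold".

```python
from collections import Counter


def prune_small_clusters(labels, min_fraction=0.05):
    """Return the ids of clusters that hold at least min_fraction of all samples.

    labels: one cluster id per feature vector, e.g. the output of k-means.
    min_fraction: hypothetical threshold; the abstract does not state its form.
    """
    counts = Counter(labels)
    total = len(labels)
    return sorted(c for c, n in counts.items() if n / total >= min_fraction)


# Example: cluster 2 holds 1 of 20 samples and falls below a 10% threshold.
labels = [0] * 10 + [1] * 9 + [2]
kept = prune_small_clusters(labels, min_fraction=0.1)
```

Under this reading, a dictionary would then be learned only for the cluster ids in `kept`.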

    Discriminative Dictionary Learning with Motion Weber Local Descriptor for Violence Detection

    © 1991-2012 IEEE. Automatic violence detection from video is a hot topic for many video surveillance applications. However, there has been little success in developing an algorithm that can detect violence in surveillance videos with high performance. In this paper, following our recently proposed idea of motion Weber local descriptor (WLD), we make two major improvements and propose a more effective and efficient algorithm for detecting violence from motion images. First, we propose an improved WLD (IWLD) to better depict low-level image appearance information, and then extend the spatial descriptor IWLD by adding a temporal component to capture local motion information and hence form the motion IWLD. Second, we propose a modified sparse-representation-based classification model to both control the reconstruction error of coding coefficients and minimize the classification error. Based on the proposed sparse model, a class-specific dictionary containing dictionary atoms corresponding to the class labels is learned using class labels of training samples. With this learned dictionary, not only the representation residual but also the representation coefficients become discriminative. A classification scheme integrating the modified sparse model is developed to exploit such discriminative information. The experimental results on three benchmark data sets demonstrate the superior performance of the proposed approach over the state of the art.
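As a rough illustration of how a class-specific dictionary makes the representation residual discriminative, the sketch below applies the standard residual rule from sparse-representation-based classification: keep only the coefficients of one class at a time and pick the class that reconstructs the sample best. This is not the paper's modified model, which also exploits the coefficients themselves; the function name, the class labels, and the toy dictionary are all hypothetical.

```python
import numpy as np


def classify_by_residual(D, atom_labels, x, y):
    """Pick the class whose atoms alone best reconstruct y.

    D: (n_features, n_atoms) dictionary; atom_labels: class label per atom;
    x: sparse code of y over the whole dictionary (computed elsewhere).
    """
    residuals = {}
    for c in sorted(set(atom_labels)):
        mask = np.array([label == c for label in atom_labels])
        x_c = np.where(mask, x, 0.0)  # zero out coefficients of other classes
        residuals[c] = float(np.linalg.norm(y - D @ x_c))
    best = min(residuals, key=residuals.get)
    return best, residuals


# Toy example: two "violent" atoms and one "nonviolent" atom.
D = np.eye(3)
atom_labels = ["violent", "violent", "nonviolent"]
x = np.array([1.0, 0.5, 0.1])
y = D @ x
label, residuals = classify_by_residual(D, atom_labels, x, y)
```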

    Abnormal behavior detection using sparse representations through sequential generalization of k-means

    The capability to automatically detect and classify human behavior as either normal or abnormal is an important aspect of intelligent monitoring/surveillance systems. This study presents a new high-performance framework for detecting behavioral abnormalities in video streams by utilizing only the patterns of normal behaviors. In this paper, we use a hybrid descriptor, called foreground optical flow energy (FGOFE), which combines two effective motion techniques in order to extract the most descriptive spatiotemporal features in video sequences. The FGOFE descriptor can effectively capture both weak and sudden incidents in a scene. The sequential generalization of k-means (SGK) algorithm is applied to generate the dictionary set that can sparsely represent each signal; in addition, the orthogonal matching pursuit algorithm is used to recover high-dimensional sparse features from a small number of noisy linear measurements. Using SGK yields a simpler and faster implementation than other dictionary learning methods. We conducted comprehensive experiments to analyze and evaluate the ability of our framework to detect abnormalities using several public benchmarks, which contain different abnormal samples and various contextual compositions. The experimental results show that the proposed framework achieves high detection accuracy (up to 95.33%) and low frame processing time (31 ms on average) compared to related work.
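The sparse-coding step can be sketched with a textbook orthogonal matching pursuit. This is a minimal illustration, not the authors' implementation: the function names, the fixed sparsity level `n_nonzero`, and the toy dictionary are assumptions, and the dictionary here is hand-written rather than learned with SGK.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy OMP: approximate y using at most n_nonzero columns (atoms) of D."""
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x


def reconstruction_error(D, y, n_nonzero):
    """L2 distance between y and its sparse reconstruction over D."""
    return float(np.linalg.norm(y - D @ omp(D, y, n_nonzero)))


# Toy dictionary spanning only the first two axes of a 3-D feature space.
D = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
err_normal = reconstruction_error(D, np.array([2.0, 1.0, 0.0]), 2)
err_abnormal = reconstruction_error(D, np.array([0.0, 0.0, 1.0]), 2)
```

A feature that the normal-behavior dictionary represents well gives a small error, while one outside its span gives a large error; comparing that error against a threshold is one common way to flag an abnormality.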