8 research outputs found

    Predicting Next Local Appearance for Video Anomaly Detection

    We present a local anomaly detection method for videos. As opposed to most existing methods, which are computationally expensive and do not generalize well across different video scenes, we propose an adversarial framework that learns temporal local appearance variations by predicting the appearance of a normally behaving object in the next frame of a scene, relying only on its current and past appearances. In the presence of an abnormally behaving object, the reconstruction error between the real and the predicted next appearance of that object indicates the likelihood of an anomaly. Our method is competitive with the existing state of the art while being significantly faster at both training and inference and better at generalizing to unseen video scenes.
    Comment: Accepted as an oral presentation for MVA'202
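
    The scoring rule described above boils down to a per-object prediction error. The following is a minimal sketch of that idea in PyTorch, assuming a trained predictor network G that maps an object's past appearance patches to its next one; G, the tensor shapes, and the L2 error are illustrative assumptions, not the authors' exact architecture or loss.

        import torch

        def anomaly_score(G, past_patches, next_patch):
            """Score one tracked object: reconstruction error between its
            predicted and real next appearance (higher = more anomalous).

            G            -- hypothetical trained predictor: past crops -> next crop
            past_patches -- tensor (T, C, H, W), the object's past crops
            next_patch   -- tensor (C, H, W), the object's real next-frame crop
            """
            with torch.no_grad():
                pred = G(past_patches.unsqueeze(0)).squeeze(0)
            # Per-object L2 reconstruction error serves as the anomaly likelihood.
            return torch.mean((pred - next_patch) ** 2).item()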

    CTR: Contrastive Training Recognition Classifier for Few-Shot Open-World Recognition


    Suspicious Behavior Detection with Temporal Feature Extraction and Time-Series Classification for Shoplifting Crime Prevention

    The rise in crime rates in many parts of the world, coupled with advancements in computer vision, has increased the need for automated crime detection services. To address this issue, we propose a new approach for detecting suspicious behavior as a means of preventing shoplifting. Existing methods are based on convolutional neural networks that extract spatial features from pixel values. In contrast, our proposed method employs object detection based on YOLOv5 with Deep Sort to track people through a video, using the resulting bounding box coordinates as temporal features. The extracted temporal features are then modeled as a time-series classification problem. The proposed method was tested on the popular UCF Crime dataset and benchmarked against the current state-of-the-art robust temporal feature magnitude (RTFM) method, which relies on the Inflated 3D ConvNet (I3D) preprocessing method. Our results demonstrate an 8.45-fold increase in detection inference speed compared to the state-of-the-art RTFM, along with an F1 score of 92%, outperforming RTFM by 3%. Furthermore, our method achieved these results without requiring expensive data augmentation or image feature extraction.
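
    As a concrete illustration of the pipeline, the sketch below turns one tracked person's per-frame bounding boxes into a feature series and classifies it. The feature choice (center, width, height), the GRU classifier, and the tracker output format are assumptions for illustration, not the paper's exact model.

        import numpy as np
        import torch
        import torch.nn as nn

        def track_to_series(boxes):
            """Convert one track's per-frame boxes [(x1, y1, x2, y2), ...]
            into a (T, 4) series of center-x, center-y, width, height."""
            b = np.asarray(boxes, dtype=np.float32)
            cx, cy = (b[:, 0] + b[:, 2]) / 2, (b[:, 1] + b[:, 3]) / 2
            w, h = b[:, 2] - b[:, 0], b[:, 3] - b[:, 1]
            return np.stack([cx, cy, w, h], axis=1)

        class TrackClassifier(nn.Module):
            """Illustrative time-series classifier over bounding-box features."""
            def __init__(self, n_features=4, hidden=64, n_classes=2):
                super().__init__()
                self.gru = nn.GRU(n_features, hidden, batch_first=True)
                self.head = nn.Linear(hidden, n_classes)

            def forward(self, x):        # x: (batch, T, n_features)
                _, h = self.gru(x)       # final hidden state summarizes the track
                return self.head(h[-1])  # logits: normal vs. suspicious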

    Anomaly Detection under Distribution Shift

    Anomaly detection (AD) is a crucial machine learning task that aims to learn patterns from a set of normal training samples in order to identify abnormal samples in test data. Most existing AD studies assume that the training and test data are drawn from the same distribution, but in many real-world applications the test data can exhibit large distribution shifts caused by natural variations such as new lighting conditions, object poses, or background appearances, rendering existing AD methods ineffective in such cases. In this paper, we consider the problem of anomaly detection under distribution shift and establish performance benchmarks on four widely used AD and out-of-distribution (OOD) generalization datasets. We demonstrate that simply adapting state-of-the-art OOD generalization methods to AD settings fails to work effectively due to the lack of labeled anomaly data. We further introduce a novel AD approach that is robust to diverse distribution shifts by minimizing the distribution gap between in-distribution and OOD normal samples in both the training and inference stages in an unsupervised way. Our extensive empirical results on the four datasets show that our approach substantially outperforms state-of-the-art AD methods and OOD generalization methods on data with various distribution shifts, while maintaining detection accuracy on in-distribution data. Code and data are available at https://github.com/mala-lab/ADShift.
    Comment: Accepted at ICCV 202
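
    The core idea of minimizing the gap between in-distribution and OOD normal samples can be illustrated with a generic moment-matching penalty. The sketch below is not the paper's actual objective (see the linked repository for that), just one simple, unsupervised way to penalize a feature-distribution gap between two batches of normal samples.

        import torch

        def moment_matching_loss(feat_id, feat_ood):
            """Penalize the gap between feature batches of in-distribution
            normal samples and shifted (OOD-like) normal samples.

            feat_id, feat_ood -- tensors of shape (batch, dim)
            Mean/variance matching is an illustrative stand-in for the
            paper's alignment objective.
            """
            mean_gap = (feat_id.mean(0) - feat_ood.mean(0)).pow(2).sum()
            var_gap = (feat_id.var(0) - feat_ood.var(0)).pow(2).sum()
            return mean_gap + var_gap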

    Generalized Video Anomaly Event Detection: Systematic Taxonomy and Comparison of Deep Models

    Video Anomaly Detection (VAD) is a pivotal technology in intelligent surveillance systems, enabling the temporal or spatial identification of anomalous events within videos. While existing reviews predominantly concentrate on conventional unsupervised methods, they often overlook the emergence of weakly-supervised and fully-unsupervised approaches. To address this gap, this survey extends the conventional scope of VAD beyond unsupervised methods, encompassing a broader spectrum termed Generalized Video Anomaly Event Detection (GVAED). By incorporating recent advances rooted in diverse assumptions and learning frameworks, this survey introduces an intuitive taxonomy that spans unsupervised, weakly-supervised, supervised, and fully-unsupervised VAD methodologies, elucidating the distinctions and interconnections among these research trajectories. In addition, this survey supports prospective researchers by assembling a compilation of research resources, including public datasets, available codebases, programming tools, and pertinent literature. Furthermore, this survey quantitatively assesses model performance, delves into research challenges and directions, and outlines potential avenues for future exploration.
    Comment: Accepted by ACM Computing Surveys. For more information, please see our project page: https://github.com/fudanyliu/GVAE

    Weakly-Supervised Anomaly Detection in Surveillance Videos Based on Two-Stream I3D Convolution Network

    The widespread adoption of city surveillance systems has led to an increase in the use of surveillance videos for maintaining public safety and security. This thesis tackles the problem of detecting anomalous events in surveillance videos. The goal is to automatically identify abnormal events by learning from both normal and abnormal videos. Most previous works treat any deviation from learned normal patterns as an anomaly, which may not always be valid, since the same activity could be normal or abnormal under different circumstances. To address this issue, the thesis utilizes Two-Stream Inflated 3D (I3D) Convolutional Networks to extract spatial and temporal video features and demonstrates that they outperform the 3D Convolutional Network (C3D) used as a feature extractor in prior work. To avoid annotating abnormal activities in training videos, a weakly supervised anomaly detection model is implemented based on the Multiple Instance Learning (MIL) framework. The model treats normal and abnormal videos as bags and video clips as instances, and learns a ranking model to predict high anomaly scores for video clips containing anomalies. The thesis further shows that the choice of input features, such as concatenating RGB and flow features, and careful choice of optimization settings, such as the optimizer, can significantly improve the performance of the anomaly detection model on some evaluation metrics.
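
    For readers unfamiliar with MIL ranking, the sketch below shows the common formulation this line of work builds on: a hinge loss pushes the top-scoring clip of an abnormal video above the top-scoring clip of a normal video, with smoothness and sparsity regularizers over the abnormal bag. The margin and regularization weights here are illustrative, not the thesis's exact settings.

        import torch

        def mil_ranking_loss(scores_abn, scores_nor, margin=1.0,
                             lambda_smooth=8e-5, lambda_sparse=8e-5):
            """MIL ranking loss over one abnormal and one normal bag.

            scores_abn, scores_nor -- per-clip anomaly scores, shape (T,)
            """
            # Rank the most anomalous abnormal clip above the most
            # anomalous normal clip by at least the margin.
            hinge = torch.relu(margin - scores_abn.max() + scores_nor.max())
            # Temporal smoothness and sparsity over the abnormal bag.
            smooth = ((scores_abn[1:] - scores_abn[:-1]) ** 2).sum()
            sparse = scores_abn.sum()
            return hinge + lambda_smooth * smooth + lambda_sparse * sparse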