Energy-based Models for Video Anomaly Detection
Automated detection of abnormalities in data has been an active research
area in recent years because of its diverse practical applications, including
video surveillance, industrial damage detection and network intrusion
detection. However, building an effective anomaly detection system is a
non-trivial task, since it requires tackling several challenging issues: the
shortage of annotated data, the difficulty of defining anomalous objects
explicitly, and the expensive cost of feature engineering. Unlike existing
approaches, which only partially solve these problems, we develop a unique
framework that copes with all of them simultaneously. Instead of handling the
ambiguous definition of anomalous objects, we propose to work with regular
patterns, for which unlabeled data are abundant and usually easy to collect in
practice. This allows our system to be trained in a completely unsupervised
manner and frees us from the need for costly data annotation. By learning a
generative model that captures the normality distribution of the data, we can
isolate abnormal data points that yield low normality scores (high abnormality
scores). Moreover, by leveraging the power of generative networks, i.e.,
energy-based models, we are also able to learn feature representations
automatically rather than relying on the hand-crafted features that have
dominated anomaly detection research for decades. We demonstrate our proposal
on the specific application of video anomaly detection, and the experimental
results indicate that our method performs better than the baselines and is
comparable with state-of-the-art methods on many benchmark video anomaly
detection datasets.
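The scoring idea in this abstract can be sketched in a few lines. This is a minimal illustration rather than the paper's model: a Gaussian fitted to unlabeled "normal" data stands in for a trained energy-based network, the negative log-density plays the role of the energy (high energy = low normality score), and the threshold is a quantile of the normal data's energies, so no anomaly labels are needed:

```python
import numpy as np

# Hypothetical stand-in for a trained energy function: a Gaussian fitted
# to unlabeled "normal" data, with the negative log-density (up to a
# constant) as the energy, so abnormal points get high energy.
rng = np.random.default_rng(0)
normal_train = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))

mu = normal_train.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(normal_train, rowvar=False))

def energy(x):
    # Half the squared Mahalanobis distance: high energy = low normality.
    d = x - mu
    return 0.5 * np.einsum('...i,ij,...j->...', d, cov_inv, d)

# The threshold is a high quantile of the normal data's energies, so no
# labeled anomalies are required (fully unsupervised).
threshold = np.quantile(energy(normal_train), 0.99)

test_points = np.array([[0.1, -0.2],    # typical point -> low energy
                        [6.0, 6.0]])    # far from normal data -> high energy
flags = energy(test_points) > threshold
```

Any density or energy model trained only on normal data fits this template; the paper's energy-based network replaces the Gaussian.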
A Review of Co-saliency Detection Technique: Fundamentals, Applications, and Challenges
Co-saliency detection is a newly emerging and rapidly growing research area
in computer vision community. As a novel branch of visual saliency, co-saliency
detection refers to the discovery of common and salient foregrounds from two or
more relevant images, and can be widely used in many computer vision tasks. The
existing co-saliency detection algorithms mainly consist of three components:
extracting effective features to represent the image regions, exploring the
informative cues or factors to characterize co-saliency, and designing
effective computational frameworks to formulate co-saliency. Although numerous
methods have been developed, the literature is still lacking a deep review and
evaluation of co-saliency detection techniques. In this paper, we aim at
providing a comprehensive review of the fundamentals, challenges, and
applications of co-saliency detection. Specifically, we provide an overview of
some related computer vision works, review the history of co-saliency
detection, summarize and categorize the major algorithms in this research area,
discuss some open issues in this area, present the potential applications of
co-saliency detection, and finally point out some unsolved challenges and
promising future works. We expect this review to be beneficial to both fresh
and senior researchers in this field, and give insights to researchers in other
related areas regarding the utility of co-saliency detection algorithms.
Comment: 28 pages, 12 figures, 3 tables
Crowded Scene Analysis: A Survey
Automated scene analysis has been a topic of great interest in computer
vision and cognitive science. Recently, with the growth of crowd phenomena in
the real world, crowded scene analysis has attracted much attention. However,
the visual occlusions and ambiguities in crowded scenes, as well as the complex
behaviors and scene semantics, make the analysis a challenging task. In the
past few years, an increasing number of works on crowded scene analysis have
been reported, covering different aspects including crowd motion pattern
learning, crowd behavior and activity analysis, and anomaly detection in
crowds. This paper surveys the state-of-the-art techniques on this topic. We
first provide the background knowledge and the available features related to
crowded scenes. Then, existing models, popular algorithms, evaluation
protocols, as well as system performance are provided corresponding to
different aspects of crowded scene analysis. We also outline the available
datasets for performance evaluation. Finally, some research problems and
promising future directions are presented with discussions.
Comment: 20 pages, in IEEE Transactions on Circuits and Systems for Video Technology, 201
Spatio-Temporal Data Mining: A Survey of Problems and Methods
Large volumes of spatio-temporal data are increasingly collected and studied
in diverse domains, including climate science, social sciences, neuroscience,
epidemiology, transportation, mobile health, and Earth sciences.
Spatio-temporal data differs from the relational data for which computational
approaches have been developed in the data mining community over multiple
decades, in that both spatial and temporal attributes are available in addition
to the actual measurements/attributes. The presence of these attributes
introduces additional challenges that need to be dealt with. Approaches for mining
spatio-temporal data have been studied for over a decade in the data mining
community. In this article we present a broad survey of this relatively young
field of spatio-temporal data mining. We discuss different types of
spatio-temporal data and the relevant data mining questions that arise in the
context of analyzing each of these datasets. Based on the nature of the data
mining problem studied, we classify literature on spatio-temporal data mining
into six major categories: clustering, predictive learning, change detection,
frequent pattern mining, anomaly detection, and relationship mining. We discuss
the various forms of spatio-temporal data mining problems in each of these
categories.
Comment: Accepted for publication at ACM Computing Surveys
AED-Net: An Abnormal Event Detection Network
Detecting anomalies in crowded scenes has long been a challenging task. In
this paper, a self-supervised framework, the abnormal event detection network
(AED-Net), composed of PCAnet and kernel principal component analysis (kPCA),
is proposed to address this problem. Using surveillance video sequences of
different scenes as raw data, PCAnet is trained to extract high-level semantics
of the crowd's situation. Next, kPCA, a one-class classifier, is trained to
determine whether the scene is anomalous. In contrast to some prevailing deep
learning methods, the framework is completely self-supervised because it
utilizes only video sequences of normal situations. Experiments on global and
local abnormal event detection are carried out on the UMN and UCSD datasets,
and competitive results with higher EER and AUC compared to other
state-of-the-art methods are observed. Furthermore, by adding a local response
normalization (LRN) layer, we propose an improvement to the original AED-Net;
the experiments show that it performs better by improving the framework's
generalization capacity.
Comment: 14 pages, 7 figures
Unsupervised Video Analysis Based on a Spatiotemporal Saliency Detector
Visual saliency, which predicts regions in the field of view that draw the
most visual attention, has attracted a lot of interest from researchers. It has
already been used in several vision tasks, e.g., image classification, object
detection, foreground segmentation. Recently, the spectrum analysis based
visual saliency approach has attracted a lot of interest due to its simplicity
and good performance, where the phase information of the image is used to
construct the saliency map. In this paper, we propose a new approach for
detecting spatiotemporal visual saliency based on the phase spectrum of the
videos, which is easy to implement and computationally efficient. With the
proposed algorithm, we also study how spatiotemporal saliency can be used in
two important vision tasks: abnormality detection and spatiotemporal interest
point detection. The proposed algorithm is evaluated on several commonly used
datasets with comparison to state-of-the-art methods from the literature. The
experiments demonstrate the effectiveness of the proposed approach to
spatiotemporal visual saliency detection and its application to the above
vision tasks.
Comment: 21 pages
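The general phase-spectrum idea is easy to sketch: take the 2D Fourier transform of a frame, keep only the phase, invert, and square the result to obtain a saliency map. The snippet below is a single-frame sketch only (the paper operates on video, and its exact pipeline may differ); a box filter stands in for the usual Gaussian smoothing:

```python
import numpy as np

def phase_saliency(frame):
    F = np.fft.fft2(frame)
    phase_only = np.exp(1j * np.angle(F))   # unit magnitude, original phase
    recon = np.fft.ifft2(phase_only)
    sal = np.abs(recon) ** 2                # squared magnitude = saliency
    # A Gaussian blur is usually applied here; a 3x3 box filter stands in
    # to keep the sketch dependency-free.
    pad = np.pad(sal, 1, mode='edge')
    smoothed = sum(pad[i:i + sal.shape[0], j:j + sal.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0
    return smoothed / smoothed.max()

img = np.zeros((64, 64))
img[30:34, 30:34] = 1.0                     # one bright blob on a flat field
sal = phase_saliency(img)
```

Discarding the magnitude suppresses repeated, "expected" structure and leaves energy concentrated at novel regions, which is why the blob dominates the map.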
Ego-Surfing: Person Localization in First-Person Videos Using Ego-Motion Signatures
We envision a future time when wearable cameras are worn by the masses,
recording first-person point-of-view videos of everyday life. While these
cameras can enable new assistive technologies and novel research challenges,
they also raise serious privacy concerns. For example, first-person videos
passively recorded by wearable cameras will necessarily include anyone who
comes into the view of a camera -- with or without consent. Motivated by these
benefits and risks, we developed a self-search technique tailored to
first-person videos. The key observation of our work is that the egocentric
head motion of a target person (i.e., the self) is observed both in the
point-of-view video of the target and observer. The motion correlation between
the target person's video and the observer's video can then be used to identify
instances of the self uniquely. We incorporate this feature into the proposed
approach that computes the motion correlation over densely-sampled trajectories
to search for a target individual in observer videos. Our approach
significantly improves self-search performance over several well-known face
detectors and recognizers. Furthermore, we show how our approach can enable
several practical applications such as privacy filtering, target video
retrieval, and social group clustering.
Comment: To appear in IEEE TPAMI
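The core cue here, correlating the target's ego-motion with motion observed in another camera's video, can be sketched with 1-D signals. This is an illustration only: the synthetic signals stand in for the densely-sampled trajectory features used in the paper:

```python
import numpy as np

def motion_correlation(sig_a, sig_b):
    # Normalized correlation at zero lag; values near 1 suggest the two
    # signals record the same person's ego-motion.
    a = (sig_a - sig_a.mean()) / (sig_a.std() + 1e-8)
    b = (sig_b - sig_b.mean()) / (sig_b.std() + 1e-8)
    return float(np.mean(a * b))

rng = np.random.default_rng(2)
ego = rng.normal(size=100)                # target's own head-motion signal
same = ego + 0.1 * rng.normal(size=100)   # that motion seen by an observer
other = rng.normal(size=100)              # an unrelated person's motion

c_same = motion_correlation(ego, same)    # close to 1
c_other = motion_correlation(ego, other)  # close to 0
```

Ranking candidate trajectories in the observer's video by this correlation is what lets the system find the self without relying on face recognition.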
Monitoring COVID-19 social distancing with person detection and tracking via fine-tuned YOLO v3 and Deepsort techniques
The rampant coronavirus disease 2019 (COVID-19) has brought a global crisis
with its deadly spread to more than 180 countries, with about 3,519,901
confirmed cases and 247,630 deaths globally as of May 4, 2020. The
absence of any active therapeutic agents and the lack of immunity against
COVID-19 increase the vulnerability of the population. Since there are no
vaccines available, social distancing is the only feasible approach to fight
against this pandemic. Motivated by this notion, this article proposes a deep
learning based framework for automating the task of monitoring social
distancing using surveillance video. The proposed framework utilizes the YOLO
v3 object detection model to segregate humans from the background and Deepsort
approach to track the identified people with the help of bounding boxes and
assigned IDs. The results of the YOLO v3 model are further compared with other
popular state-of-the-art models, e.g. faster region-based CNN (convolution
neural network) and single shot detector (SSD) in terms of mean average
precision (mAP), frames per second (FPS) and loss values defined by object
classification and localization. Later, the pairwise vectorized L2 norm is
computed based on the three-dimensional feature space obtained by using the
centroid coordinates and dimensions of the bounding box. The violation index
term is proposed to quantify non-adoption of the social distancing protocol.
From the experimental analysis, it is observed that YOLO v3 with the Deepsort
tracking scheme displayed the best results, with balanced mAP and FPS scores,
for monitoring social distancing in real time.
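The distance-checking step can be sketched as follows. This is a simplified 2-D version under stated assumptions: it uses only bounding-box centroids (the paper builds a three-dimensional feature space from centroids and box dimensions), and the `min_dist` threshold and coordinates are illustrative:

```python
import numpy as np

def count_violations(centroids, min_dist):
    # Pairwise L2 distances between person centroids; pairs closer than
    # min_dist violate the distancing protocol.
    diffs = centroids[:, None, :] - centroids[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    iu = np.triu_indices(len(centroids), k=1)   # each pair counted once
    close = dist[iu] < min_dist
    # Illustrative violation index: fraction of pairs too close together.
    return int(close.sum()), float(close.sum() / max(len(close), 1))

# Centroids of tracked bounding boxes, in pixels (illustrative values).
people = np.array([[100., 200.], [110., 205.], [400., 300.]])
violations, index = count_violations(people, min_dist=50.0)
```

In the full pipeline the centroids come from the tracker's per-ID boxes in each frame, so the index can be monitored over time.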
A survey on trajectory clustering analysis
This paper comprehensively surveys the development of trajectory clustering.
Considering the critical role of trajectory data mining in modern intelligent
systems for surveillance security, abnormal behavior detection, crowd behavior
analysis, and traffic control, trajectory clustering has attracted growing
attention. Existing trajectory clustering methods can be grouped into three
categories: unsupervised, supervised and semi-supervised algorithms. In spite
of achieving a certain level of development, trajectory clustering is limited
in its success by complex conditions such as application scenarios and data
dimensions. This paper provides a holistic understanding and deep insight into
trajectory clustering, and presents a comprehensive analysis of representative
methods and promising future directions.
Characterizing Human Behaviours Using Statistical Motion Descriptor
Identifying human behaviors is a challenging research problem due to the
complexity and variation of appearances and postures, the variation of camera
settings, and view angles. In this paper, we try to address the problem of
human behavior identification by introducing a novel motion descriptor based on
statistical features. The method first divides the video into N temporal
segments. Then, for each segment, we compute dense optical flow, which
provides instantaneous velocity information for all the pixels. We then
compute a Histogram of Optical Flow (HOOF) weighted by the norm and quantized
into 32 bins. We then compute statistical features from the obtained HOOF,
forming a descriptor vector of 192 dimensions. We then train a non-linear
multi-class SVM that classifies different human behaviors with an accuracy of
72.1%. We evaluate our method on a publicly available human action dataset.
Experimental results show that our proposed method outperforms
state-of-the-art methods.
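The HOOF step described above can be sketched directly. This is a sketch of one magnitude-weighted orientation histogram only; the paper additionally computes per-segment statistical features from such histograms to form the 192-dimensional descriptor:

```python
import numpy as np

def hoof(flow, n_bins=32):
    # Orientation histogram of a flow field, weighted by flow magnitude
    # and L1-normalized.
    fx, fy = flow[..., 0].ravel(), flow[..., 1].ravel()
    mag = np.hypot(fx, fy)                    # per-pixel speed
    ang = np.arctan2(fy, fx)                  # per-pixel direction in [-pi, pi]
    hist, _ = np.histogram(ang, bins=np.linspace(-np.pi, np.pi, n_bins + 1),
                           weights=mag)
    return hist / (hist.sum() + 1e-8)

# Toy flow field: every pixel moves in the same direction, so a single
# orientation bin receives all the weight.
flow = np.zeros((4, 4, 2))
flow[..., 0], flow[..., 1] = 1.0, 0.5
h = hoof(flow)
```

Weighting by magnitude means fast motion dominates the histogram, which makes the descriptor sensitive to vigorous actions rather than background jitter.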