Anomaly Crossing: New Horizons for Video Anomaly Detection as Cross-domain Few-shot Learning
Video anomaly detection aims to identify abnormal events that occurred in
videos. Since anomalous events are relatively rare, it is not feasible to
collect a balanced dataset and train a binary classifier to solve the task.
Thus, most previous approaches learn only from normal videos using unsupervised
or semi-supervised methods. Obviously, they are limited in capturing and
utilizing discriminative abnormal characteristics, which leads to compromised
anomaly detection performance. In this paper, to address this issue, we propose
a new learning paradigm by making full use of both normal and abnormal videos
for video anomaly detection. In particular, we formulate a new learning task:
cross-domain few-shot anomaly detection, which can transfer knowledge learned
from numerous videos in the source domain to help solve few-shot abnormality
detection in the target domain. Concretely, we leverage self-supervised
training on the target normal videos to reduce the domain gap and devise a meta
context perception module to explore the video context of the event in the
few-shot setting. Our experiments show that our method significantly
outperforms baseline methods on the DoTA and UCF-Crime datasets, and the new task
contributes to a more practical training paradigm for anomaly detection.
Active Authentication using an Autoencoder regularized CNN-based One-Class Classifier
Active authentication refers to the process in which users are unobtrusively
monitored and authenticated continuously throughout their interactions with
mobile devices. Generally, an active authentication problem is modelled as a
one-class classification problem due to the unavailability of data from the
impostor users. Normally, the enrolled user is considered as the target class
(genuine) and the unauthorized users are considered as unknown classes
(impostor). We propose a convolutional neural network (CNN) based approach for
one-class classification in which zero-centered Gaussian noise and an
autoencoder are used to model the pseudo-negative class and to regularize the
network to learn meaningful feature representations for one-class data,
respectively. The overall network is trained using a combination of the
cross-entropy and the reconstruction error losses. A key feature of the
proposed approach is that any pre-trained CNN can be used as the base network
for one-class classification. The effectiveness of the proposed framework is
demonstrated using three publicly available face-based active authentication
datasets, and it is shown that the proposed method achieves superior performance
compared to traditional one-class classification methods. The source code
is available at: github.com/otkupjnoz/oc-acnn.
Comment: Accepted and to appear at AFGR 201
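The combined objective described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see their repository for that). It assumes the pseudo-negative noise is drawn in feature space, that the classifier is a single linear layer, and that the reconstruction error is a mean squared error on the features. The function name `one_class_loss` and the parameters `noise_std` and `recon_weight` are hypothetical.

```python
import numpy as np

def one_class_loss(features, classifier_w, classifier_b, reconstructions,
                   noise_std=0.01, recon_weight=1.0, rng=None):
    """Cross-entropy + reconstruction loss for a batch of genuine-user features.

    Hypothetical helper: shapes and weighting are assumptions, not taken
    from the paper.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = features.shape

    # Pseudo-negative class: zero-centered Gaussian noise (assumed to live
    # in the same feature space as the genuine samples).
    pseudo_neg = rng.normal(0.0, noise_std, size=(n, d))

    x = np.vstack([features, pseudo_neg])
    y = np.concatenate([np.ones(n), np.zeros(n)])  # 1 = genuine, 0 = pseudo-negative

    # Linear classifier on top of the features (assumed form).
    logits = x @ classifier_w + classifier_b

    # Numerically stable binary cross-entropy computed from logits.
    ce = np.mean(np.maximum(logits, 0) - logits * y
                 + np.log1p(np.exp(-np.abs(logits))))

    # Autoencoder regularizer: mean squared reconstruction error on the
    # genuine features only.
    recon = np.mean((features - reconstructions) ** 2)

    return ce + recon_weight * recon, ce, recon
```

In a real training loop the features would come from a pre-trained CNN and the reconstructions from a decoder trained jointly; both are passed in here as plain arrays to keep the sketch self-contained.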
Architecture for automatic recognition of group activities using local motions and context
Currently, the ability to automatically detect human behavior in image sequences is one of the most important challenges in the area of computer vision. Within this broad field of knowledge, the recognition of the activities of groups of people in public areas is receiving special attention due to its importance in many respects, including safety and security. This paper proposes a generic computer vision architecture with the ability to learn and recognize different group activities, mainly using the group’s local movements. Specifically, a multi-stream deep learning architecture is proposed whose two main streams are built on a descriptor that represents the trajectory information in a sequence of images as a collection of local movements occurring in specific regions of the scene. Additional information (e.g. location and time) can be included as further streams to strengthen the classification of activities. The proposed architecture can robustly classify different group activities and can also handle one-class problems. Moreover, using a simple descriptor that transforms a sequence of color images into a sequence of two-image streams can reduce the curse of dimensionality in a deep learning approach. The generic deep learning architecture has been evaluated on different datasets, outperforming state-of-the-art approaches and providing an efficient architecture for single- and multi-class classification problems.