724 research outputs found
Understanding Vehicular Traffic Behavior from Video: A Survey of Unsupervised Approaches
Emerging trends in automatic behavior analysis and understanding from infrastructure video are reviewed. Research has shifted away from high-resolution estimation of vehicle state and has instead pushed machine learning approaches to extract meaningful patterns from aggregates in an unsupervised fashion. These patterns represent priors on observable motion, which can be used to describe a scene, to answer behavior questions such as where a vehicle is going or how many vehicles are performing the same action, and to detect abnormal events. The review focuses on two main methods for scene description: trajectory clustering and topic modeling. Example applications that use these behavioral modeling techniques are also presented, along with the most popular public datasets for behavioral analysis. Discussion and comments on future directions in the field are also provided.
Real-time Anomaly Detection and Localization in Crowded Scenes
In this paper, we propose a method for real-time anomaly
detection and localization in crowded scenes. Each video is
defined as a set of non-overlapping cubic patches, and is
described using two local and global descriptors. These
descriptors capture the video properties from different aspects.
By incorporating simple and cost-effective Gaussian
classifiers, we can distinguish normal activities and anomalies
in videos. The local and global features are based on
structure similarity between adjacent patches and the features
learned in an unsupervised way, using a sparse autoencoder.
Experimental results show that our algorithm is
comparable to a state-of-the-art procedure on UCSD ped2
and UMN benchmarks, but even more time-efficient. The
experiments confirm that our system can reliably detect and
localize anomalies as soon as they happen in a video.
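
The patch-plus-Gaussian idea in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' method: the `motion_feature` descriptor is a hypothetical stand-in for the paper's structure-similarity and sparse-autoencoder features, a single one-dimensional Gaussian replaces the paper's classifiers, and all function names and the `z_thresh` parameter are invented for the example.

```python
import copy
import random
from statistics import mean, pstdev


def cubic_patches(video, size):
    """Yield ((t, y, x), voxels) for non-overlapping size^3 patches.

    `video` is a nested list indexed as video[t][y][x]. Edge remainders
    that do not fill a whole patch are dropped.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    for t in range(0, T - size + 1, size):
        for y in range(0, H - size + 1, size):
            for x in range(0, W - size + 1, size):
                voxels = [video[t + dt][y + dy][x + dx]
                          for dt in range(size)
                          for dy in range(size)
                          for dx in range(size)]
                yield (t, y, x), voxels


def motion_feature(voxels, size):
    """Hypothetical descriptor: mean absolute frame-to-frame change.

    Assumes size >= 2 so that each patch spans at least two frames.
    """
    frame = size * size  # number of voxels in one frame of the patch
    diffs = [abs(voxels[i + frame] - voxels[i])
             for i in range(len(voxels) - frame)]
    return mean(diffs)


def fit_gaussian(values):
    """Fit a 1-D Gaussian (mean, std) to features from normal footage."""
    return mean(values), max(pstdev(values), 1e-6)


def detect(video, size, mu, sigma, z_thresh=3.0):
    """Return positions of patches whose feature is unlikely under the Gaussian."""
    return [pos for pos, vox in cubic_patches(video, size)
            if abs(motion_feature(vox, size) - mu) / sigma > z_thresh]


# Toy data (assumption): a 4x4x4 clip of low-amplitude noise is "normal".
rng = random.Random(0)
normal = [[[rng.random() for _ in range(4)] for _ in range(4)] for _ in range(4)]
mu, sigma = fit_gaussian([motion_feature(v, 2) for _, v in cubic_patches(normal, 2)])

# Inject a strong intensity jump into the patch starting at (t=0, y=2, x=0).
test_clip = copy.deepcopy(normal)
for t in range(2):
    for y in range(2, 4):
        for x in range(2):
            test_clip[t][y][x] = 50.0 * t

flagged = detect(test_clip, 2, mu, sigma)
print(flagged)
```

Because each patch is scored independently, the returned positions localize the anomaly as well as detect it, which is the property the abstract emphasizes.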
No Pattern, No Recognition: a Survey about Reproducibility and Distortion Issues of Text Clustering and Topic Modeling
Extracting knowledge from unlabeled texts using machine learning algorithms
can be complex. Document categorization and information retrieval are two
applications that may benefit from unsupervised learning (e.g., text clustering
and topic modeling), including exploratory data analysis. However, the
unsupervised learning paradigm poses reproducibility issues. The initialization
can lead to variability depending on the machine learning algorithm.
Furthermore, the distortions can be misleading when regarding cluster geometry.
Amongst the causes, the presence of outliers and anomalies can be a determining
factor. Despite the relevance of initialization and outlier issues for text
clustering and topic modeling, the authors did not find an in-depth analysis of
them. This survey provides a systematic literature review (2011-2022) of these
subareas and proposes a common terminology since similar procedures have
different terms. The authors describe research opportunities, trends, and open
issues. The appendices summarize the theoretical background of the text
vectorization, the factorization, and the clustering algorithms that are
directly or indirectly related to the reviewed works.
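
The initialization issue this abstract raises can be seen on a toy example. The sketch below is not taken from the survey: it uses plain Lloyd's k-means on four hand-picked 2-D points (an assumption chosen so the arithmetic is exact), and the function name `lloyd_kmeans` is hypothetical. Two different starting centroids each converge immediately, but to different partitions with very different within-cluster error.

```python
def lloyd_kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's k-means on 2-D points from explicit starting centroids.

    Returns (labels, inertia), where inertia is the sum of squared
    distances from each point to its assigned centroid.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda j: (p[0] - centroids[j][0]) ** 2
                                    + (p[1] - centroids[j][1]) ** 2)
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append((sum(p[0] for p in members) / len(members),
                                      sum(p[1] for p in members) / len(members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    inertia = sum((p[0] - centroids[lab][0]) ** 2 + (p[1] - centroids[lab][1]) ** 2
                  for p, lab in zip(points, labels))
    return labels, inertia


# Corners of a 4-by-1 rectangle: the natural split is left versus right.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

labels_a, inertia_a = lloyd_kmeans(points, [(0, 0.5), (4, 0.5)])  # left/right start
labels_b, inertia_b = lloyd_kmeans(points, [(2, 0), (2, 1)])      # bottom/top start

print(labels_a, inertia_a)
print(labels_b, inertia_b)
```

Both runs are stable fixed points of the algorithm, so neither is a bug; the second is simply a poor local optimum. Reporting the initialization (or the random seed that produced it) is what makes such results reproducible.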
Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework
Since collecting labels is prohibitively expensive and time-consuming, unsupervised
methods are preferred in applications such as fraud detection. Meanwhile, such
applications usually require modeling the intrinsic clusters in
high-dimensional data, which usually displays heterogeneous statistical
patterns as the patterns of different clusters may appear in different
dimensions. Existing methods propose to model the data clusters on selected
dimensions, yet globally omitting any dimension may damage the pattern of
certain clusters. To address the above issues, we propose a novel unsupervised
generative framework called FIRD, which utilizes adversarial distributions to
fit and disentangle the heterogeneous statistical patterns. When applied to
discrete spaces, FIRD effectively distinguishes the synchronized fraudsters
from normal users. Besides, FIRD also provides superior performance on anomaly
detection datasets compared with SOTA anomaly detection methods (over 5%
average AUC improvement). The experimental results on various
datasets verify that the proposed method can better model the heterogeneous
statistical patterns in high-dimensional data and benefit downstream
applications.