HYPA: Efficient Detection of Path Anomalies in Time Series Data on Networks
The unsupervised detection of anomalies in time series data has important
applications in user behavioral modeling, fraud detection, and cybersecurity.
Anomaly detection has, in fact, been extensively studied in categorical
sequences. However, we often have access to time series data that represent
paths through networks. Examples include transaction sequences in financial
networks, click streams of users in networks of cross-referenced documents, or
travel itineraries in transportation networks. To reliably detect anomalies, we
must account for the fact that such data contain a large number of independent
observations of paths constrained by a graph topology. Moreover, the
heterogeneity of real systems rules out frequency-based anomaly detection
techniques, which do not account for highly skewed edge and degree statistics.
To address this problem, we introduce HYPA, a novel framework for the
unsupervised detection of anomalies in large corpora of variable-length
temporal paths in a graph. HYPA provides an efficient analytical method to
detect paths with anomalous frequencies that result from nodes being traversed
in unexpected chronological order.
Comment: 11 pages with 8 figures and supplementary material. To appear at SIAM
Data Mining (SDM 2020).
BINet: Multi-perspective Business Process Anomaly Classification
In this paper, we introduce BINet, a neural network architecture for
real-time multi-perspective anomaly detection in business process event logs.
BINet is designed to handle both the control flow and the data perspective of a
business process. Additionally, we propose a set of heuristics for setting the
threshold of an anomaly detection algorithm automatically. We demonstrate that
BINet can be used to detect anomalies in event logs not only on a case level
but also on event attribute level. Finally, we demonstrate that a simple set of
rules can be used to utilize the output of BINet for anomaly classification. We
compare BINet to eight other state-of-the-art anomaly detection algorithms and
evaluate their performance on an elaborate data corpus of 29 synthetic and 15
real-life event logs. BINet outperforms all other methods on both the synthetic
and the real-life datasets.
Extending Dynamic Bayesian Networks for Anomaly Detection in Complex Logs
Checking various log files from different processes can be a tedious task as
these logs contain lots of events, each with a (possibly large) number of
attributes. We developed a way to automatically model log files and detect
outlier traces in the data. For that we extend Dynamic Bayesian Networks to
model the normal behavior found in log files. We introduce a new algorithm that
is able to learn a model of a log file starting from the data itself. The model
is capable of scoring traces even when new values or new combinations of values
appear in the log file.
Entropy Causal Graphs for Multivariate Time Series Anomaly Detection
Many multivariate time series anomaly detection frameworks have been proposed
and widely applied. However, most of these frameworks do not consider intrinsic
relationships between variables in multivariate time series data, thus ignoring
the causal relationship among variables and degrading anomaly detection
performance. This work proposes a novel framework called CGAD, an entropy
Causal Graph for multivariate time series Anomaly Detection. CGAD utilizes
transfer entropy to construct graph structures that unveil the underlying
causal relationships among time series data. Weighted graph convolutional
networks combined with causal convolutions are employed to model both the
causal graph structures and the temporal patterns within multivariate time
series data. Furthermore, CGAD applies anomaly scoring, leveraging median
absolute deviation-based normalization to improve the robustness of the anomaly
identification process. Extensive experiments demonstrate that CGAD outperforms
state-of-the-art methods on real-world datasets with a 15% average improvement
based on three different multivariate time series anomaly detection metrics.
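As a rough illustration of the median absolute deviation (MAD) normalization mentioned in the abstract above, the sketch below rescales raw anomaly scores by their median and MAD before thresholding. It is not CGAD's implementation: the function names `mad_normalize` and `flag_anomalies`, the `eps` stabilizer, the threshold of 3.0, and the sample scores are all assumptions made for this example.

```python
from statistics import median


def mad_normalize(scores, eps=1e-9):
    """Rescale scores as (score - median) / (MAD + eps).

    eps is a hypothetical stabilizer that avoids division by zero
    when more than half of the scores are identical.
    """
    med = median(scores)
    mad = median([abs(s - med) for s in scores])
    return [(s - med) / (mad + eps) for s in scores]


def flag_anomalies(scores, threshold=3.0):
    """Return the indices whose normalized score exceeds the threshold.

    The threshold value is an illustrative assumption, not a value
    taken from the paper.
    """
    normalized = mad_normalize(scores)
    return [i for i, z in enumerate(normalized) if z > threshold]


# Hypothetical raw anomaly scores; the last one is an obvious outlier.
raw_scores = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 8.0]
print(flag_anomalies(raw_scores))
```

Because the median and MAD are barely affected by a few extreme values, the outlier itself does not inflate the scale used to judge it, which is the usual motivation for preferring MAD over mean and standard deviation in this setting.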