Sketch-Based Streaming Anomaly Detection in Dynamic Graphs
Given a stream of graph edges from a dynamic graph, how can we assign anomaly
scores to edges and subgraphs in an online manner, for the purpose of detecting
unusual behavior, using constant time and memory? For example, in intrusion
detection, existing work seeks to detect either anomalous edges or anomalous
subgraphs, but not both. In this paper, we first extend the count-min sketch
data structure to a higher-order sketch. This higher-order sketch has the
useful property of preserving the dense subgraph structure (dense subgraphs in
the input turn into dense submatrices in the data structure). We then propose
four online algorithms that utilize this enhanced data structure, which (a)
detect both edge and graph anomalies; (b) process each edge and graph in
constant memory and constant update time per newly arriving edge; and (c)
outperform state-of-the-art baselines on four real-world datasets. Our method
is the first streaming approach that incorporates dense subgraph search to
detect graph anomalies in constant memory and time.
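The dense-submatrix property described above can be illustrated with a toy two-dimensional sketch. This is a minimal sketch under stated assumptions, not the paper's implementation: the class name `HigherOrderSketch`, the integer node ids, the linear hash family, and the default sizes are all hypothetical. Sources are hashed to rows and destinations to columns, so edges among a small group of nodes fall into a small block of cells.

```python
import random

PRIME = 2**31 - 1  # modulus for the (assumed) linear hash family


class HigherOrderSketch:
    """Toy 2-D count-min-style sketch: rows are hashed sources, columns hashed destinations."""

    def __init__(self, num_hashes=2, num_buckets=32, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        # One (a, b) coefficient pair per hash function.
        self.params = [
            (rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)
        ]
        # One num_buckets x num_buckets count matrix per hash function.
        self.tables = [
            [[0] * num_buckets for _ in range(num_buckets)]
            for _ in range(num_hashes)
        ]

    def _bucket(self, i, node):
        a, b = self.params[i]
        return ((a * node + b) % PRIME) % self.num_buckets

    def add(self, src, dst, weight=1):
        """Record one edge: a constant number of cell updates per hash function."""
        for i, table in enumerate(self.tables):
            table[self._bucket(i, src)][self._bucket(i, dst)] += weight

    def estimate(self, src, dst):
        """Count-min estimate: the minimum over hash functions never underestimates."""
        return min(
            table[self._bucket(i, src)][self._bucket(i, dst)]
            for i, table in enumerate(self.tables)
        )
```

Memory is fixed by `num_hashes` and `num_buckets` regardless of stream length, which is the sense in which such a structure supports constant-memory, constant-time updates.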
MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams
Given a stream of graph edges from a dynamic graph, how can we assign anomaly
scores to edges in an online manner, for the purpose of detecting unusual
behavior, using constant time and memory? Existing approaches aim to detect
individually surprising edges. In this work, we propose MIDAS, which focuses on
detecting microcluster anomalies, or suddenly arriving groups of suspiciously
similar edges, such as lockstep behavior, including denial of service attacks
in network traffic data. MIDAS has the following properties: (a) it detects
microcluster anomalies while providing theoretical guarantees about its false
positive probability; (b) it is online, thus processing each edge in constant
time and constant memory, and also processes the data 162-644 times faster than
state-of-the-art approaches; (c) it provides 42%-48% higher accuracy (in terms
of AUC) than state-of-the-art approaches.
Comment: 8 pages, Accepted at AAAI Conference on Artificial Intelligence
(AAAI), 2020 [oral paper]; minor fixes, updated experiments
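The idea of scoring suddenly arriving groups of similar edges can be sketched as follows. The abstract does not spell out the score, so the chi-squared-style formula here is an assumption, as are the class name `EdgeBurstScorer` and the convention that ticks start at 1 and never decrease. Plain dictionaries stand in for fixed-size sketches, so this toy version is not constant-memory.

```python
from collections import defaultdict


class EdgeBurstScorer:
    """Toy burst scorer: compares an edge's count in the current tick with its past average."""

    def __init__(self):
        self.current_tick = 1
        self.current = defaultdict(int)  # counts within the current tick
        self.total = defaultdict(int)    # counts over the whole stream so far

    def score(self, src, dst, tick):
        # Assumes ticks are integers starting at 1 and arriving in non-decreasing order.
        if tick > self.current_tick:
            self.current.clear()
            self.current_tick = tick
        edge = (src, dst)
        self.current[edge] += 1
        self.total[edge] += 1
        a, s, t = self.current[edge], self.total[edge], tick
        if t == 1:
            return 0.0  # no history to compare against yet
        mean = s / t  # average count per tick, including the current tick
        # Assumed chi-squared-style statistic: large when the current count far exceeds the mean.
        return (a - mean) ** 2 * t * t / (s * (t - 1))
```

An edge that arrives once per tick keeps a score near zero, while many copies of the same edge within one tick push the score up, which is the microcluster behavior the abstract describes.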
SNAPSKETCH: Graph Representation Approach for Anomaly Detection in Graph Stream
A novel unsupervised graph representation approach for anomaly detection in graph streams, called SNAPSKETCH, is proposed. It first performs a fixed-length random walk from each node in a network and constructs n-shingles from each walk path. The top discriminative n-shingles, identified using a frequency measure, are projected into a dimensional projection vector chosen uniformly at random. Finally, the network is sketched into a low-dimensional sketch vector using a simplified hashing of the projection vector and the cost of shingles. Using the learned sketch vector, anomaly detection is performed with the state-of-the-art anomaly detection approach called RRCF [1]. SNAPSKETCH has several advantages: fully unsupervised learning, constant memory usage, entire-graph embedding, and real-time anomaly detection.
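The first step described above, a fixed-length random walk followed by n-shingle construction, can be sketched as below. This is only an illustration under assumptions: the function names `random_walk` and `shingles`, the adjacency-dictionary input, and the early stop at dead-end nodes are hypothetical choices, and the projection, hashing, and RRCF stages are not shown.

```python
import random


def random_walk(adjacency, start, length, rng):
    """Walk up to `length` nodes from `start`, choosing a uniform random neighbor each step."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = adjacency.get(walk[-1], [])
        if not neighbors:
            break  # dead end: stop early (an assumption, not from the abstract)
        walk.append(rng.choice(neighbors))
    return walk


def shingles(walk, n):
    """Return every run of n consecutive nodes in the walk as a tuple."""
    return [tuple(walk[i:i + n]) for i in range(len(walk) - n + 1)]


# Usage on a small hypothetical graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
walk = random_walk(graph, "a", 5, random.Random(0))
walk_shingles = shingles(walk, 2)
```

Counting how often each shingle occurs across all walks would give the frequency measure from which the top discriminative shingles are selected.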
Anomaly detection in the dynamics of web and social networks
In this work, we propose a new, fast and scalable method for anomaly
detection in large time-evolving graphs. The input may be a static graph with dynamic
node attributes (e.g. time-series), or a graph evolving in time, such as a
temporal network. We define an anomaly as a localized increase in temporal
activity in a cluster of nodes. The algorithm is unsupervised. It is able to
detect and track anomalous activity in a dynamic network despite the noise from
multiple interfering sources. We use the Hopfield network model of memory to
combine the graph and time information. We show that anomalies can be spotted
with good precision using a memory network. The presented approach is
scalable and we provide a distributed implementation of the algorithm. To
demonstrate its efficiency, we apply it to two datasets: the Enron Email dataset
and Wikipedia page views. We show that the anomalous spikes are triggered by
the real-world events that impact the network dynamics. Besides, the structure
of the clusters and the analysis of the time evolution associated with the
detected events reveal interesting facts on how humans interact, exchange and
search for information, opening the door to new quantitative studies on
collective and social behavior on large and dynamic datasets.
Comment: The Web Conference 2019, 10 pages, 7 figures
Adapted K-Nearest Neighbors for Detecting Anomalies on Spatio–Temporal Traffic Flow
Outlier detection is an extensive research area that has been intensively studied in several domains, such as biological sciences, medical diagnosis, surveillance, and traffic anomaly detection. This paper explores advances in outlier detection by finding anomalies in spatio-temporal urban traffic flow. It proposes a new approach that considers the distribution of the flows in a given time interval. The flow distribution probability (FDP) databases are first constructed from the traffic flows by considering both spatial and temporal information. The outlier detection mechanism is then applied to the incoming flow distribution probabilities: the inliers are stored to enrich the FDP databases, while the outliers are excluded from them. Moreover, a k-nearest neighbor method for distance-based outlier detection is investigated and adapted for FDP outlier detection. To validate the proposed framework, real data from the Odense traffic flow case are evaluated at ten locations. The results reveal that the proposed framework is able to detect the real distribution of flow outliers. Another experiment has been carried out on Beijing data; the results show that our approach outperforms the baseline algorithms for high-urban traffic flow.
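The inlier/outlier update loop described above can be sketched with a generic distance-based k-nearest-neighbor test. This is a sketch under assumptions rather than the paper's adapted method: the function names, the Euclidean distance, and the values of `k` and `threshold` are hypothetical, and each flow distribution is represented as a plain list of probabilities.

```python
import math


def kth_neighbor_distance(point, database, k):
    """Distance from `point` to its k-th nearest stored distribution (Euclidean, assumed)."""
    distances = sorted(math.dist(point, other) for other in database)
    return distances[k - 1]


def process_flow(point, database, k=3, threshold=0.2):
    """Return True if `point` is flagged as an outlier; inliers are appended to `database`."""
    if len(database) < k:
        database.append(point)  # too little history to judge: accept as inlier
        return False
    is_outlier = kth_neighbor_distance(point, database, k) > threshold
    if not is_outlier:
        database.append(point)  # inliers enrich the database, outliers are excluded
    return is_outlier


# Usage with hypothetical two-bin flow distributions.
fdp_database = [[0.50, 0.50], [0.52, 0.48], [0.49, 0.51]]
typical_flagged = process_flow([0.51, 0.49], fdp_database)
unusual_flagged = process_flow([0.95, 0.05], fdp_database)
```

Keeping outliers out of the database prevents a run of abnormal intervals from gradually being treated as normal, which matches the enrichment rule stated in the abstract.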