Sketch-Based Streaming Anomaly Detection in Dynamic Graphs
Given a stream of graph edges from a dynamic graph, how can we assign anomaly
scores to edges and subgraphs in an online manner, for the purpose of detecting
unusual behavior, using constant time and memory? For example, in intrusion
detection, existing work seeks to detect either anomalous edges or anomalous
subgraphs, but not both. In this paper, we first extend the count-min sketch
data structure to a higher-order sketch. This higher-order sketch has the
useful property of preserving the dense subgraph structure (dense subgraphs in
the input turn into dense submatrices in the data structure). We then propose
four online algorithms that utilize this enhanced data structure, which (a)
detect both edge and graph anomalies; (b) process each edge and graph in
constant memory and constant update time per newly arriving edge; and (c)
outperform state-of-the-art baselines on four real-world datasets. Our method
is the first streaming approach that incorporates dense subgraph search to
detect graph anomalies in constant memory and time.
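To make the higher-order sketch idea concrete, here is a minimal illustration of one way a count-min sketch could be extended from rows of counters to matrices of counters, so that an edge (u, v) is hashed to the cell (h(u), h(v)). It is not the authors' implementation: the class name, method names, bucket count, and salted hashing scheme are all assumptions made for this example.

```python
import random


class HigherOrderSketch:
    """Hypothetical sketch: each hash function owns a square matrix of counters."""

    def __init__(self, num_hashes=2, num_buckets=32, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        # One random salt per hash function (assumed hashing scheme).
        self.salts = [rng.getrandbits(64) for _ in range(num_hashes)]
        # Fixed-size storage: memory does not grow with the number of edges.
        self.tables = [
            [[0] * num_buckets for _ in range(num_buckets)]
            for _ in range(num_hashes)
        ]

    def _bucket(self, salt, node):
        # The same node always maps to the same row/column index within a table.
        return hash((salt, node)) % self.num_buckets

    def add_edge(self, u, v, weight=1):
        # One counter update per hash function, independent of stream length.
        for salt, table in zip(self.salts, self.tables):
            table[self._bucket(salt, u)][self._bucket(salt, v)] += weight

    def estimate(self, u, v):
        # As in a count-min sketch, take the minimum over hash functions;
        # collisions can only inflate a counter, so this never underestimates.
        return min(
            table[self._bucket(salt, u)][self._bucket(salt, v)]
            for salt, table in zip(self.salts, self.tables)
        )


sketch = HigherOrderSketch()
for _ in range(3):
    sketch.add_edge(1, 2)
for u in (10, 11, 12):
    for v in (10, 11, 12):
        if u != v:
            sketch.add_edge(u, v)
```

Because source and destination nodes are hashed separately, all edges among the nodes 10, 11, and 12 fall within at most three rows and three columns of each matrix, which is the sense in which a dense subgraph turns into a dense submatrix.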
PRESTO: Simple and Scalable Sampling Techniques for the Rigorous Approximation of Temporal Motif Counts
The identification and counting of small graph patterns, called network
motifs, is a fundamental primitive in the analysis of networks, with
application in various domains, from social networks to neuroscience. Several
techniques have been designed to count the occurrences of motifs in static
networks, with recent work focusing on the computational challenges provided by
large networks. Modern networked datasets contain rich information, such as the
time at which the events modeled by the network's edges happened, which can
provide useful insights into the process modeled by the network. The analysis
of motifs in temporal networks, called temporal motifs, is becoming an
important component in the analysis of modern networked datasets. Several
methods have been recently designed to count the number of instances of
temporal motifs in temporal networks, which is even more challenging than its
counterpart for static networks. Such methods are either exact, and not
applicable to large networks, or approximate, but provide only weak guarantees
on the estimates they produce and do not scale to very large networks. In this
work we present an efficient and scalable algorithm to obtain rigorous
approximations of the count of temporal motifs. Our algorithm is based on a
simple but effective sampling approach, which renders our algorithm practical
for very large datasets. Our extensive experimental evaluation shows that our
algorithm provides estimates of temporal motif counts which are more accurate
than the state-of-the-art sampling algorithms, with significantly lower running
time than exact approaches, enabling the study of temporal motifs, of size
larger than the ones considered in previous works, on billion-edge networks.
Comment: 19 pages, 5 figures, to appear in SDM 202