Locality statistics for anomaly detection in time series of graphs
The ability to detect change-points in a dynamic network or a time series of
graphs is an increasingly important task in many applications of the emerging
discipline of graph signal processing. This paper formulates change-point
detection as a hypothesis testing problem in terms of a generative latent
position model, focusing on the special case of the Stochastic Block Model time
series. We analyze two classes of scan statistics, based on distinct underlying
locality statistics presented in the literature. Our main contribution is the
derivation of the limiting distributions and power characteristics of the
competing scan statistics. Performance is compared theoretically, on synthetic
data, and on the Enron email corpus. We demonstrate that both statistics are
admissible in one simple setting, while one of the statistics is inadmissible in a
second setting. Comment: 15 pages, 6 figures
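As a rough illustration of what a scan statistic built on a locality statistic can look like, the sketch below counts the edges in each vertex's closed 1-hop neighbourhood, standardises that count against the vertex's recent history, and takes the maximum over vertices at each time step. This is one common form from the literature and not necessarily either of the two classes analysed in the paper; the function names, the window length, and the toy graphs are all assumptions made for the example.

```python
from statistics import mean, stdev

def to_adjacency(edges, vertices):
    """Build a symmetric adjacency map {vertex: set of neighbours}."""
    adj = {v: set() for v in vertices}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    return adj

def locality_statistic(adj, v):
    """Number of edges in the subgraph induced by v and its neighbours."""
    ball = {v} | adj.get(v, set())
    # Each undirected edge is counted from both endpoints, hence the // 2.
    return sum(len(adj.get(u, set()) & ball) for u in ball) // 2

def scan_statistics(graphs, window=3):
    """For each t >= window, return the maximum over vertices of the locality
    statistic standardised by that vertex's previous `window` values
    (window must be at least 2 for the standard deviation to exist)."""
    vertices = set().union(*(g.keys() for g in graphs))
    history = {v: [locality_statistic(g, v) for g in graphs] for v in vertices}
    result = {}
    for t in range(window, len(graphs)):
        scores = []
        for v in vertices:
            past = history[v][t - window:t]
            # Floor the scale at 1 so a constant history does not divide by zero.
            scale = max(stdev(past), 1.0)
            scores.append((history[v][t] - mean(past)) / scale)
        result[t] = max(scores)
    return result

# Toy series (assumed data): a path 0-1-2-3-4-5 at every step,
# plus a 4-clique on {0, 1, 2, 3} injected at t = 4.
path = [(i, i + 1) for i in range(5)]
clique = [(i, j) for i in range(4) for j in range(i + 1, 4)]
graphs = [to_adjacency(path, range(6)) for _ in range(4)]
graphs.append(to_adjacency(path + clique, range(6)))
result = scan_statistics(graphs, window=3)
print(result)
```

On this toy series the statistic stays at zero while the graph is unchanged and jumps at the step where the clique appears, which is the behaviour a change-point test would threshold.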
Scan statistics for the online detection of locally anomalous subgraphs
Identifying anomalies in computer networks is a challenging and complex problem. Often, anomalies occur in extremely local areas of the network. Locality is complex in this setting, since we have an underlying graph structure. To identify local anomalies, we introduce a scan statistic for data extracted from the edges of a graph over time. In the computer network setting, the data on these edges are multivariate measures of the communications between two distinct machines, over time. We describe two shapes for capturing locality in the graph: the star and the k-path. While the star shape is not new to the literature, the path shape, when used as a scan window, appears to be novel. Both of these shapes are motivated by hacker behaviors observed in real attacks. A hacker who is using a single central machine to examine other machines creates a star-shaped anomaly on the edges emanating from the central node. Paths represent traversal of a hacker through a network, using a set of machines in sequence. To identify local anomalies, these shapes are enumerated over the entire graph, over a set of sliding time windows. Local statistics in each window are compared with their historic behavior to capture anomalies within the window. These local statistics are model-based. To capture the communications between computers, we have applied two different models, observed and hidden Markov models, to each edge in the network. These models have been effective in handling various aspects of this type of data, but do not completely describe the data. Therefore, we also present ongoing work in the modeling of host-to-host communications in a computer network. Data speeds on larger networks require online detection to be nimble. We describe a full anomaly detection system, which has been applied to a corporate-sized network and achieves better than real-time analysis speed. We present results on simulated data whose parameters were estimated from real network data.
In addition, we present a result from our analysis of a real, corporate-sized network data set. These results are very encouraging, since the detection corresponded to exactly the type of behavior we hope to detect.
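The enumeration of star and k-path windows described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the per-edge anomaly scores in `edge_score` are hypothetical stand-ins for the model-based local statistics (for example, a negative log-likelihood under a per-edge model), a window is scored by simply summing its edge scores, and only a single time window is shown.

```python
def star_windows(out_edges):
    """Yield one star per node: the list of all edges leaving that node."""
    for u in sorted(out_edges):
        if out_edges[u]:
            yield [(u, v) for v in sorted(out_edges[u])]

def k_paths(out_edges, k):
    """Yield every directed path with exactly k edges and no repeated node."""
    def extend(path):
        if len(path) == k + 1:
            yield list(zip(path, path[1:]))
            return
        for v in sorted(out_edges.get(path[-1], ())):
            if v not in path:
                yield from extend(path + [v])
    for start in sorted(out_edges):
        yield from extend([start])

def scan(windows, edge_score):
    """Return (score, window) for the window with the largest summed edge score."""
    scored = ((sum(edge_score.get(e, 0.0) for e in w), w) for w in windows)
    return max(scored, key=lambda pair: pair[0])

# Assumed toy network and hypothetical per-edge anomaly scores for one time window.
out_edges = {"a": {"b", "c"}, "b": {"c"}, "c": {"d"}, "d": set()}
edge_score = {("a", "b"): 0.1, ("a", "c"): 0.2, ("b", "c"): 3.0, ("c", "d"): 2.5}

best_star = scan(star_windows(out_edges), edge_score)
best_path = scan(k_paths(out_edges, 2), edge_score)
print(best_star)
print(best_path)
```

In an online setting the same enumeration would be repeated for each sliding time window, with the window scores compared against their historic behavior rather than ranked within a single window as here.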
A Graph Encoder-Decoder Network for Unsupervised Anomaly Detection
A key component of many graph neural networks (GNNs) is the pooling
operation, which seeks to reduce the size of a graph while preserving important
structural information. However, most existing graph pooling strategies rely on
an assignment matrix obtained by employing a GNN layer, which is characterized
by trainable parameters, often leading to significant computational complexity
and a lack of interpretability in the pooling process. In this paper, we
propose an unsupervised graph encoder-decoder model to detect abnormal nodes
from graphs by learning an anomaly scoring function to rank nodes based on
their degree of abnormality. In the encoding stage, we design a novel pooling
mechanism, named LCPool, which leverages locality-constrained linear coding for
feature encoding to find a cluster assignment matrix by solving a least-squares
optimization problem with a locality regularization term. By enforcing locality
constraints during the coding process, LCPool is designed to be free from
learnable parameters, capable of efficiently handling large graphs, and can
effectively generate a coarser graph representation while retaining the most
significant structural characteristics of the graph. In the decoding stage, we
propose an unpooling operation, called LCUnpool, to reconstruct both the
structure and nodal features of the original graph. We conduct empirical
evaluations of our method on six benchmark datasets using several evaluation
metrics, and the results demonstrate its superiority over state-of-the-art
anomaly detection approaches.
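To make the idea of a parameter-free assignment matrix more concrete, here is a minimal sketch of locality-constrained linear coding in its widely used k-nearest-neighbour approximation: each feature vector is reconstructed from its k closest anchor vectors by solving a small regularised least-squares problem with a sum-to-one constraint. How LCPool chooses its anchors and uses the resulting matrix is not stated in the abstract, so the function name `llc_codes`, the anchor matrix `B`, and the parameters `k` and `reg` are assumptions for illustration only.

```python
import numpy as np

def llc_codes(X, B, k=2, reg=1e-4):
    """Approximate locality-constrained linear coding.

    X: (n, d) array of node features.
    B: (m, d) array of anchor (codebook) vectors, assumed given.
    Returns an (n, m) array whose rows sum to 1 with at most k non-zeros.
    """
    n, m = X.shape[0], B.shape[0]
    codes = np.zeros((n, m))
    for i, x in enumerate(X):
        # Locality: only the k nearest anchors may receive weight.
        nearest = np.argsort(np.linalg.norm(B - x, axis=1))[:k]
        Z = B[nearest] - x                      # shift local anchors to the origin
        C = Z @ Z.T                             # local Gram matrix (k x k)
        C = C + reg * max(np.trace(C), 1e-12) * np.eye(k)  # stabilise the solve
        w = np.linalg.solve(C, np.ones(k))      # closed-form least-squares weights
        codes[i, nearest] = w / w.sum()         # enforce the sum-to-one constraint
    return codes

# Assumed toy data: one node lying midway between the first two anchors.
B = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
X = np.array([[1.0, 0.0]])
S = llc_codes(X, B, k=2)
print(S)
print(S @ B)
```

No parameters are trained here: the weights come from a linear solve per node, which is the sense in which such a coding step can be free of learnable parameters. Treating `S` as a soft cluster assignment for pooling is an assumption about how the coding would be used, not a detail given in the abstract.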