MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams
Given a stream of graph edges from a dynamic graph, how can we assign anomaly
scores to edges in an online manner, for the purpose of detecting unusual
behavior, using constant time and memory? Existing approaches aim to detect
individually surprising edges. In this work, we propose MIDAS, which focuses on
detecting microcluster anomalies, or suddenly arriving groups of suspiciously
similar edges, such as lockstep behavior, including denial of service attacks
in network traffic data. MIDAS has the following properties: (a) it detects
microcluster anomalies while providing theoretical guarantees about its false
positive probability; (b) it is online, thus processing each edge in constant
time and constant memory, and also processes the data 162-644 times faster than
state-of-the-art approaches; (c) it provides 42%-48% higher accuracy (in terms
of AUC) than state-of-the-art approaches.
Comment: 8 pages, Accepted at AAAI Conference on Artificial Intelligence
(AAAI), 2020 [oral paper]; minor fixes, updated experiment
Sketch-Based Streaming Anomaly Detection in Dynamic Graphs
Given a stream of graph edges from a dynamic graph, how can we assign anomaly
scores to edges and subgraphs in an online manner, for the purpose of detecting
unusual behavior, using constant time and memory? For example, in intrusion
detection, existing work seeks to detect either anomalous edges or anomalous
subgraphs, but not both. In this paper, we first extend the count-min sketch
data structure to a higher-order sketch. This higher-order sketch has the
useful property of preserving the dense subgraph structure (dense subgraphs in
the input turn into dense submatrices in the data structure). We then propose
four online algorithms that utilize this enhanced data structure, which (a)
detect both edge and graph anomalies; (b) process each edge and graph in
constant memory and constant update time per newly arriving edge; and (c)
outperform state-of-the-art baselines on four real-world datasets. Our method
is the first streaming approach that incorporates dense subgraph search to
detect graph anomalies in constant memory and time.
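As a rough illustration of the higher-order sketch idea in the abstract above, the following Python sketch hashes the source and destination of each edge separately, so that an edge lands in a cell of a small matrix and a dense subgraph tends to map to a dense submatrix. The class name `HigherOrderSketch`, the hash family, and the default sizes are assumptions made for illustration; this is not the authors' implementation.

```python
import random


class HigherOrderSketch:
    """Hypothetical higher-order count-min sketch for integer node ids.

    Each hash row owns an n_buckets x n_buckets count matrix. An edge (u, v)
    increments cell (h(u), h(v)), so edges sharing endpoints share rows or columns.
    """

    _PRIME = 2_147_483_647  # modulus for the (assumed) linear hash family

    def __init__(self, n_rows=2, n_buckets=32, seed=0):
        rng = random.Random(seed)
        self.n_buckets = n_buckets
        # One (a, b) pair per hash row: h(x) = ((a * x + b) mod p) mod n_buckets
        self.params = [
            (rng.randrange(1, self._PRIME), rng.randrange(self._PRIME))
            for _ in range(n_rows)
        ]
        self.tables = [
            [[0] * n_buckets for _ in range(n_buckets)] for _ in range(n_rows)
        ]

    def _bucket(self, row, node):
        a, b = self.params[row]
        return ((a * node + b) % self._PRIME) % self.n_buckets

    def update(self, u, v, weight=1):
        """Record one arriving edge; cost depends only on n_rows."""
        for row, table in enumerate(self.tables):
            table[self._bucket(row, u)][self._bucket(row, v)] += weight

    def estimate(self, u, v):
        """Count-min estimate: the minimum over rows never undercounts."""
        return min(
            table[self._bucket(row, u)][self._bucket(row, v)]
            for row, table in enumerate(self.tables)
        )
```

Memory is fixed by `n_rows` and `n_buckets` regardless of stream length, and each update touches one cell per row, which matches the constant-memory, constant-update-time setting the abstract describes. Searching the matrices for dense submatrices is left out of this sketch.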
Topological Anomaly Detection in Dynamic Multilayer Blockchain Networks
Motivated by the recent surge of criminal activities with
cross-cryptocurrency trades, we introduce a new topological perspective to
structural anomaly detection in dynamic multilayer networks. We postulate that
anomalies in the underlying blockchain transaction graph that are composed of
multiple layers are likely to also be manifested in anomalous patterns of the
network shape properties. As such, we invoke the machinery of clique persistent
homology on graphs to systematically and efficiently track evolution of the
network shape and, as a result, to detect changes in the underlying network
topology and geometry. We develop a new persistence summary for multilayer
networks, called stacked persistence diagram, and prove its stability under
input data perturbations. We validate our new topological anomaly detection
framework in application to dynamic multilayer networks from the Ethereum
Blockchain and the Ripple Credit Network, and demonstrate that our stacked PD
approach substantially outperforms state-of-the-art techniques.
Comment: 26 pages, 6 figures, 7 table
Raising the Bar in Graph-level Anomaly Detection
Graph-level anomaly detection has become a critical topic in diverse areas,
such as financial fraud detection and detecting anomalous activities in social
networks. While most research has focused on anomaly detection for visual data
such as images, where high detection accuracies have been obtained, existing
deep learning approaches for graphs currently show considerably worse
performance. This paper raises the bar on graph-level anomaly detection, i.e.,
the task of detecting abnormal graphs in a set of graphs. By drawing on ideas
from self-supervised learning and transformation learning, we present a new
deep learning approach that significantly improves existing deep one-class
approaches by fixing some of their known problems, including hypersphere
collapse and performance flip. Experiments on nine real-world data sets
involving nine techniques reveal that our method achieves an average
performance improvement of 11.8% AUC compared to the best existing approach.
Comment: To appear in IJCAI-ECAI 202
3D-IDS: Doubly Disentangled Dynamic Intrusion Detection
A network-based intrusion detection system (NIDS) monitors network traffic for
malicious activities, forming the frontline defense against increasing attacks
over information infrastructures. Although promising, our quantitative analysis
shows that existing methods perform inconsistently in declaring various unknown
attacks (e.g., 9% and 35% F1 respectively for two distinct unknown threats for
an SVM-based method) or detecting diverse known attacks (e.g., 31% F1 for the
Backdoor and 93% F1 for DDoS by a GCN-based state-of-the-art method), and
reveals that the underlying cause is entangled distributions of flow features.
This motivates us to propose 3D-IDS, a novel method that aims to tackle the
above issues through two-step feature disentanglements and a dynamic graph
diffusion scheme. Specifically, we first disentangle traffic features by a
non-parameterized optimization based on mutual information, automatically
differentiating tens and hundreds of complex features of various attacks. Such
differentiated features will be fed into a memory model to generate
representations, which are further disentangled to highlight the
attack-specific features. Finally, we use a novel graph diffusion method that
dynamically fuses the network topology for spatial-temporal aggregation in
evolving data streams. By doing so, we can effectively identify various attacks
in encrypted traffic, including unknown threats and known ones that are not
easily detected. Experiments show the superiority of our 3D-IDS. We also
demonstrate that our two-step feature disentanglements benefit the
explainability of NIDS.
Comment: Accepted and appeared in the proceedings of the KDD 2023 Research
Trac
A Survey of Imbalanced Learning on Graphs: Problems, Techniques, and Future Directions
Graphs represent interconnected structures prevalent in a myriad of
real-world scenarios. Effective graph analytics, such as graph learning
methods, enables users to gain profound insights from graph data, underpinning
various tasks including node classification and link prediction. However, these
methods often suffer from data imbalance, a common issue in graph data where
certain segments possess abundant data while others are scarce, thereby leading
to biased learning outcomes. This necessitates the emerging field of imbalanced
learning on graphs, which aims to correct these data distribution skews for
more accurate and representative learning outcomes. In this survey, we embark
on a comprehensive review of the literature on imbalanced learning on graphs.
We begin by providing a definitive understanding of the concept and related
terminologies, establishing a strong foundational understanding for readers.
Following this, we propose two comprehensive taxonomies: (1) the problem
taxonomy, which describes the forms of imbalance we consider, the associated
tasks, and potential solutions; (2) the technique taxonomy, which details key
strategies for addressing these imbalances, and aids readers in their method
selection process. Finally, we suggest prospective future directions for both
problems and techniques within the sphere of imbalanced learning on graphs,
fostering further innovation in this critical area.
Comment: The collection of awesome literature on imbalanced learning on
graphs: https://github.com/Xtra-Computing/Awesome-Literature-ILoG