Efficient Sampling Algorithms for Approximate Motif Counting in Temporal Graph Streams
A great variety of complex systems, from user interactions in communication
networks to transactions in financial markets, can be modeled as temporal
graphs consisting of a set of vertices and a series of timestamped and directed
edges. Temporal motifs generalize subgraph patterns in static graphs by
considering edge orderings and durations in addition to topologies. Counting
the number of occurrences of temporal motifs is a fundamental problem for
temporal network analysis. However, existing methods either cannot support
temporal motifs or suffer from performance issues. Moreover, they cannot work
in the streaming model where edges are observed incrementally over time. In
this paper, we focus on approximate temporal motif counting via random
sampling. We first propose two sampling algorithms for temporal motif counting
in the offline setting. The first is an edge sampling (ES) algorithm for
estimating the number of instances of any temporal motif. The second is an
improved edge-wedge sampling (EWS) algorithm that hybridizes edge sampling with
wedge sampling for counting temporal motifs with 3 vertices and 3 edges.
Furthermore, we propose two algorithms, referred to as SES and SEWS, that count
temporal motifs incrementally in temporal graph streams by extending the ES and
EWS algorithms. We provide comprehensive analyses of the theoretical bounds and
complexities of our proposed algorithms. Finally, we perform extensive
experimental evaluations of our proposed algorithms on several real-world
temporal graphs. The results show that ES and EWS have higher efficiency,
better accuracy, and greater scalability than state-of-the-art sampling methods
for temporal motif counting in the offline setting. Moreover, SES and SEWS
achieve up to three orders of magnitude speedups over ES and EWS while having
comparable estimation errors for temporal motif counting in the streaming
setting. Comment: 27 pages, 11 figures; overlapped with arXiv:2007.1402
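The edge-sampling idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's ES algorithm: it assumes a deliberately simple motif (a two-edge temporal path u->v->w whose second edge occurs within `delta` time units after the first), attributes each instance to its first edge, and uses hypothetical names (`estimate_motif_count`, `count_starting_at`). Each edge is kept with probability `p`, and the sampled total is scaled by `1/p` to give an unbiased estimate.

```python
import random
from bisect import bisect_right
from collections import defaultdict


def count_starting_at(edge, out_times, delta):
    """Count two-edge temporal paths u->v->w whose first edge is `edge`.

    The second edge must leave v at a time t2 with t < t2 <= t + delta.
    (Assumption for this sketch: w may be any vertex, including u.)
    """
    _, v, t = edge
    times = out_times.get(v, [])  # sorted timestamps of edges leaving v
    return bisect_right(times, t + delta) - bisect_right(times, t)


def estimate_motif_count(edges, delta, p, seed=0):
    """Estimate the motif count by sampling each edge with probability p."""
    rng = random.Random(seed)
    out_times = defaultdict(list)
    for u, _, t in edges:
        out_times[u].append(t)
    for times in out_times.values():
        times.sort()

    sampled_total = 0
    for edge in edges:
        if rng.random() < p:  # keep this edge with probability p
            sampled_total += count_starting_at(edge, out_times, delta)
    # Every instance is attributed to exactly one edge (its first one),
    # so dividing by p makes the expectation equal to the exact count.
    return sampled_total / p


edges = [("a", "b", 1), ("b", "c", 2), ("b", "d", 5), ("c", "a", 3)]
exact = estimate_motif_count(edges, delta=2, p=1.0)
approx = estimate_motif_count(edges, delta=2, p=0.5, seed=7)
```

With `p=1.0` every edge is kept and the function returns the exact count; smaller values of `p` trade accuracy for less counting work.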
Detecting Small Query Graphs in A Large Graph via Neural Subgraph Search
Recent advances have shown the success of using reinforcement learning and
search to solve NP-hard graph-related tasks, such as Traveling Salesman
Optimization, Graph Edit Distance computation, etc. However, it remains unclear
how one can efficiently and accurately detect the occurrences of a small query
graph in a large target graph, which is a core operation in graph database
search, biomedical analysis, social group finding, etc. This task is called
Subgraph Matching, which essentially performs a subgraph isomorphism check between
a query graph and a large target graph. One promising approach to this
classical problem is the "learning-to-search" paradigm, where a reinforcement
learning (RL) agent is designed with a learned policy to guide a search
algorithm to quickly find the solution without any solved instances for
supervision. However, for the specific task of Subgraph Matching, although the
query graph given by the user as input is usually small, the target graph is
often orders of magnitude larger. This poses challenges to the neural network
design and can lead to solution and reward sparsity. In this paper, we propose
NSUBS with two innovations to tackle the challenges: (1) A novel
encoder-decoder neural network architecture to dynamically compute the matching
information between the query and the target graphs at each search state; (2) A
novel look-ahead loss function for training the policy network. Experiments on
six large real-world target graphs show that NSUBS can significantly improve
the subgraph matching performance.
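For context, the kind of search that a learned policy would guide can be sketched as plain backtracking. The code below is not NSUBS: it has no neural component, uses a fixed degree-based vertex order as a stand-in for the policy, and the name `find_embeddings` is hypothetical. Graphs are assumed to be undirected and given as dictionaries mapping each vertex to its neighbor set.

```python
def find_embeddings(query, target):
    """Return all injective mappings of query vertices to target vertices
    that preserve every query edge (non-induced subgraph isomorphism)."""
    # Fixed heuristic order: match high-degree query vertices first.
    order = sorted(query, key=lambda q: -len(query[q]))
    results = []

    def extend(mapping):
        if len(mapping) == len(order):
            results.append(dict(mapping))
            return
        q = order[len(mapping)]
        used = set(mapping.values())
        for t in target:
            # Prune: t is already used, or its degree is too small for q.
            if t in used or len(target[t]) < len(query[q]):
                continue
            # Every already-matched neighbor of q must map to a neighbor of t.
            if all(mapping[n] in target[t] for n in query[q] if n in mapping):
                mapping[q] = t
                extend(mapping)
                del mapping[q]

    extend({})
    return results


triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
# Target: triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
target = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
embeddings = find_embeddings(triangle, target)
```

A learned approach would replace the fixed `order` and the candidate loop with choices scored by the policy at each search state.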
Efficient Algorithms for Node Disjoint Subgraph Homeomorphism Determination
Recently, great efforts have been dedicated to research on the management
of large-scale graph-based data such as the WWW, social networks, and biological
networks. In the study of graph-based data management, node disjoint subgraph
homeomorphism relation between graphs is more suitable than (sub)graph
isomorphism in many cases, especially in cases where node skipping and
node mismatching are allowed. However, no efficient node disjoint subgraph
homeomorphism determination (ndSHD) algorithms have been available. In this
paper, we propose two computationally efficient ndSHD algorithms based on
state-space search with backtracking, which employ many heuristics to prune the
search spaces. Experimental results on synthetic data sets show that the
proposed algorithms are efficient, require relatively little time in most of the
testing cases, can scale to large or dense graphs, and can accommodate more
complex fuzzy matching cases. Comment: 15 pages, 11 figures, submitted to DASFAA 200
GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra
We propose GraphMineSuite (GMS): the first benchmarking suite for graph
mining that facilitates evaluating and constructing high-performance graph
mining algorithms. First, GMS comes with a benchmark specification based on
extensive literature review, prescribing representative problems, algorithms,
and datasets. Second, GMS offers a carefully designed software platform for
seamless testing of different fine-grained elements of graph mining algorithms,
such as graph representations or algorithm subroutines. The platform includes
parallel implementations of more than 40 considered baselines, and it
facilitates developing complex and fast mining algorithms. High modularity is
possible by harnessing set algebra operations such as set intersection and
difference, which enables breaking complex graph mining algorithms into simple
building blocks that can be separately experimented with. GMS is supported with
a broad concurrency analysis for portability in performance insights, and a
novel performance metric to assess the throughput of graph mining algorithms,
enabling more insightful evaluation. As use cases, we harness GMS to rapidly
redesign and accelerate state-of-the-art baselines of core graph mining
problems: degeneracy reordering (by up to >2x), maximal clique listing (by up
to >9x), k-clique listing (by 1.1x), and subgraph isomorphism (by up to 2.5x),
also obtaining better theoretical performance bounds.
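As an illustration of how set intersection can serve as a building block for graph mining, here is a minimal triangle-counting sketch. It does not use the GMS platform or its API; the function name `count_triangles` and the adjacency-dictionary input are assumptions made for this example.

```python
def count_triangles(adjacency):
    """Count triangles in an undirected graph given as vertex -> neighbor set."""
    # Keep only neighbors with a larger id so each triangle u < v < w
    # is discovered exactly once, from its smallest vertex u.
    higher = {v: {w for w in nbrs if w > v} for v, nbrs in adjacency.items()}
    total = 0
    for u, nbrs in higher.items():
        for v in nbrs:
            # Common higher neighbors of u and v close a triangle.
            total += len(nbrs & higher.get(v, set()))
    return total


# Complete graph on four vertices: it contains exactly four triangles.
k4 = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
triangles = count_triangles(k4)
```

Because the core step is a single intersection, the set representation (hash set, sorted array, bitmap) can be swapped without touching the algorithm, which is the kind of modularity the abstract describes.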
Parallelizing Maximal Clique Enumeration on GPUs
We present a GPU solution for exact maximal clique enumeration (MCE) that
performs a search tree traversal following the Bron-Kerbosch algorithm. Prior
works on parallelizing MCE on GPUs perform a breadth-first traversal of the
tree, which has limited scalability because of the explosion in the number of
tree nodes at deep levels. We propose to parallelize MCE on GPUs by performing
depth-first traversal of independent subtrees in parallel. Since MCE suffers
from high load imbalance and memory capacity requirements, we propose a worker
list for dynamic load balancing, as well as partial induced subgraphs and a
compact representation of excluded vertex sets to regulate memory consumption.
Our evaluation shows that our GPU implementation on a single GPU outperforms
the state-of-the-art parallel CPU implementation by a geometric mean of 4.9x
(up to 16.7x), and scales efficiently to multiple GPUs. Our code has been
open-sourced to enable further research on accelerating MCE.
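The Bron-Kerbosch traversal mentioned in the abstract can be sketched sequentially as follows. This is a textbook CPU version with pivoting, not the paper's GPU implementation; it omits the worker list, partial induced subgraphs, and compact excluded-set representation, and the name `maximal_cliques` is hypothetical.

```python
def maximal_cliques(adjacency):
    """Enumerate all maximal cliques of an undirected graph
    given as a dict mapping each vertex to its neighbor set."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed (excluded) vertices.
        if not p and not x:
            cliques.append(sorted(r))
            return
        # Pivot on the vertex with the most neighbors among the candidates;
        # only non-neighbors of the pivot need to be branched on.
        pivot = max(p | x, key=lambda v: len(adjacency[v] & p))
        for v in list(p - adjacency[pivot]):
            expand(r | {v}, p & adjacency[v], x & adjacency[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adjacency), set())
    return sorted(cliques)


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cliques = maximal_cliques(graph)
```

Each recursive call corresponds to one node of the search tree; the subtrees under distinct top-level branches are independent, which is what makes depth-first parallelization across subtrees possible.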