
189 research outputs found

The Power of Pivoting for Exact Clique Counting

Author: Alon Noga
Benson A.
Finocchi Irene
Marcus Dror
Seshadhri C.
Sizemore Ann
Zhao Z.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/01/2020

Clique counting is a fundamental task in network analysis, and even the simplest setting of 3-cliques (triangles) has been the center of much recent research. Getting the count of k-cliques for larger k is algorithmically challenging, due to the exponential blowup in the search space of large cliques. But a number of recent applications (especially for community detection or clustering) use larger clique counts. Moreover, one often desires local counts, the number of k-cliques per vertex/edge. Our main result is Pivoter, an algorithm that exactly counts the number of k-cliques, for all values of k. It is surprisingly effective in practice, and is able to get clique counts of graphs that were beyond the reach of previous work. For example, Pivoter gets all clique counts in a social network with 100M edges within two hours on a commodity machine. Previous parallel algorithms do not terminate in days. Pivoter can also feasibly get local per-vertex and per-edge k-clique counts (for all k) for many public data sets with tens of millions of edges. To the best of our knowledge, this is the first algorithm that achieves such results. The main insight is the construction of a Succinct Clique Tree (SCT) that stores a compressed unique representation of all cliques in an input graph. It is built using a technique called pivoting, a classic approach by Bron-Kerbosch to reduce the recursion tree of backtracking algorithms for maximal cliques. Remarkably, the SCT can be built without actually enumerating all cliques, and provides a succinct data structure from which exact clique statistics (k-clique counts, local counts) can be read off efficiently. Comment: 10 pages, WSDM 202

arXiv.org e-Print Archive
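The abstract above builds on pivoting in the Bron-Kerbosch algorithm. As a rough illustration of that classic technique only, here is a minimal Python sketch that enumerates maximal cliques with a pivot. It is not the Pivoter algorithm and does not build the SCT or produce k-clique counts; the function name `maximal_cliques`, the adjacency-dict input, and the toy graph are assumptions made for this example.

```python
from typing import Dict, List, Set


def maximal_cliques(adj: Dict[int, Set[int]]) -> List[Set[int]]:
    """Enumerate maximal cliques with Bron-Kerbosch plus pivoting.

    `adj` maps each vertex to the set of its neighbours (undirected,
    no self-loops). Illustrative sketch, not the Pivoter/SCT method.
    """
    cliques: List[Set[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        # r: current clique, p: candidates, x: already-processed vertices
        if not p and not x:
            cliques.append(set(r))
            return
        # Pivot: the vertex of P ∪ X with the most neighbours in P.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        # Only branch on candidates that are NOT neighbours of the pivot;
        # this is what prunes the recursion tree.
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return cliques


# Toy graph (hypothetical): a triangle 0-1-2 with a pendant edge 2-3.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(sorted(sorted(c) for c in maximal_cliques(toy)))
```

On the toy graph the maximal cliques are the triangle {0, 1, 2} and the edge {2, 3}.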

Beyond Triangles: A Distributed Framework for Estimating 3-profiles of Large Graphs

Author: Borokhovich Michael
Dimakis Alexandros G.
Elenberg Ethan R.
Shanmugam Karthikeyan
Publication date: 22/06/2015

We study the problem of approximating the 3-profile of a large graph. 3-profiles are generalizations of triangle counts that specify the number of times a small graph appears as an induced subgraph of a large graph. Our algorithm uses the novel concept of 3-profile sparsifiers: sparse graphs that can be used to approximate the full 3-profile counts for a given large graph. Further, we study the problem of estimating local and ego 3-profiles, two graph quantities that characterize the local neighborhood of each vertex of a graph. Our algorithm is distributed and operates as a vertex program over the GraphLab PowerGraph framework. We introduce the concept of edge pivoting which allows us to collect 2-hop information without maintaining an explicit 2-hop neighborhood list at each vertex. This enables the computation of all the local 3-profiles in parallel with minimal communication. We test our implementation in several experiments scaling up to 640 cores on Amazon EC2. We find that our algorithm can estimate the 3-profile of a graph in approximately the same time as triangle counting. For the harder problem of ego 3-profiles, we introduce an algorithm that can estimate profiles of hundreds of thousands of vertices in parallel, in the timescale of minutes. Comment: To appear in part at KDD'1

arXiv.org e-Print Archive

CiteSeerX
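To make the notion of a 3-profile concrete, the following Python sketch computes the exact global 3-profile of a tiny graph by brute force: for every vertex triple it counts how many of the three possible edges are present (0 = empty, 1 = single edge, 2 = wedge, 3 = triangle). This is only an illustration of the quantity being estimated, not the sparsifier-based or distributed algorithm from the abstract; the name `three_profile` and the toy graph are assumptions.

```python
from itertools import combinations
from typing import Iterable, List, Tuple


def three_profile(n: int, edges: Iterable[Tuple[int, int]]) -> List[int]:
    """Return [empty, one-edge, wedge, triangle] induced-subgraph counts.

    Vertices are 0..n-1. Brute force over all triples, so O(n^3):
    suitable only for small illustrative graphs.
    """
    edge_set = {frozenset(e) for e in edges}
    profile = [0, 0, 0, 0]
    for triple in combinations(range(n), 3):
        # An induced 3-vertex subgraph is determined (up to isomorphism)
        # by how many of its three vertex pairs are edges.
        k = sum(frozenset(pair) in edge_set for pair in combinations(triple, 2))
        profile[k] += 1
    return profile


# Toy graph (hypothetical): triangle 0-1-2 plus a pendant edge 2-3.
print(three_profile(4, [(0, 1), (0, 2), (1, 2), (2, 3)]))  # [0, 1, 2, 1]
```

The last entry of the profile is the ordinary triangle count, which is why 3-profiles are described as a generalization of triangle counting.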

Parallelizing Maximal Clique Enumeration on GPUs

Author: Almasri Mohammad
Chang Yen-Hsiang
Hajj Izzat El
Hwu Wen-mei
Nagi Rakesh
Xiong Jinjun
Publication date: 10/06/2022

We present a GPU solution for exact maximal clique enumeration (MCE) that performs a search tree traversal following the Bron-Kerbosch algorithm. Prior works on parallelizing MCE on GPUs perform a breadth-first traversal of the tree, which has limited scalability because of the explosion in the number of tree nodes at deep levels. We propose to parallelize MCE on GPUs by performing depth-first traversal of independent subtrees in parallel. Since MCE suffers from high load imbalance and memory capacity requirements, we propose a worker list for dynamic load balancing, as well as partial induced subgraphs and a compact representation of excluded vertex sets to regulate memory consumption. Our evaluation shows that our GPU implementation on a single GPU outperforms the state-of-the-art parallel CPU implementation by a geometric mean of 4.9x (up to 16.7x), and scales efficiently to multiple GPUs. Our code has been open-sourced to enable further research on accelerating MCE

arXiv.org e-Print Archive

Exact Algorithms for Maximum Clique: a computational study

Author: David
David
Eugene
Garey
Janez
Knuth
Patrick Prosser
Randy
Renato
Zweig
Publication date: 01/01/2012

We investigate a number of recently reported exact algorithms for the maximum clique problem (MCQ, MCR, MCS, BBMC). The program code used is presented and critiqued, showing how small changes in implementation can have a drastic effect on performance. The computational study demonstrates how problem features and hardware platforms influence algorithm behaviour. The minimum width order (smallest-last) is investigated, and MCS is broken into its constituent parts; we discover that one of these parts degrades performance. It is shown that the standard procedure used for rescaling published results is unsafe. Comment: 40 pages, 14 figures, 10 tables, 12 short java program listings, code available to download at http://www.dcs.gla.ac.uk/~pat/maxClique/distribution

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

Enlighten
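The minimum width (smallest-last) order mentioned in the abstract above can be sketched as follows: repeatedly remove a vertex of minimum degree from the remaining graph, then reverse the removal sequence. This Python version is a simple quadratic-time illustration, not the paper's Java code; the function name, the tie-breaking rule (lowest vertex id), and the toy graph are assumptions.

```python
from typing import Dict, List, Set


def smallest_last_order(adj: Dict[int, Set[int]]) -> List[int]:
    """Return vertices in smallest-last (minimum width) order.

    The vertex removed first (lowest degree) ends up last in the order.
    Ties are broken by vertex id, an arbitrary choice for this sketch.
    """
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}  # working copy
    removed: List[int] = []
    while remaining:
        # Pick the vertex with minimum degree in the remaining graph.
        v = min(remaining, key=lambda u: (len(remaining[u]), u))
        for w in remaining[v]:
            remaining[w].discard(v)
        del remaining[v]
        removed.append(v)
    return removed[::-1]


# Toy graph (hypothetical): triangle 0-1-2 plus a pendant edge 2-3.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(smallest_last_order(toy))  # [2, 1, 0, 3]
```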

Efficient and Scalable Listing of Four-Vertex Subgraph

Author: Xia Xiangzhou
Publication date: 18/01/2019

Identifying four-vertex subgraphs has long been recognized as a fundamental technique in bioinformatics and social networks. However, listing these structures is a challenging task, especially for graphs that do not fit in RAM. To address this problem, we build a set of algorithms, models, and implementations that can handle massive graphs on commodity hardware. Our technique achieves 4 – 5 orders of magnitude speedup compared to the best prior methods on graphs with billions of edges, with external-memory operation equally efficient

GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra

Author: Balla Adrian
Beranek Jakub
Besta Maciej
Copik Marcin
Gianinazzi Lukas
Hoefler Torsten
Holenstein Tobias
Janda Kacper
Kalvoda Pavel
Konieczny Marek
Kwasniewski Grzegorz
Leisinger Sebastian
Lindenberger Philipp
Mutlu Onur
Ozdemir Esref
Schaffner Yannick
Schwarz Leonardo
Tatkowski Peter
Vonarburg-Shmaria Zur
Publication date: 05/03/2021

We propose GraphMineSuite (GMS): the first benchmarking suite for graph mining that facilitates evaluating and constructing high-performance graph mining algorithms. First, GMS comes with a benchmark specification based on extensive literature review, prescribing representative problems, algorithms, and datasets. Second, GMS offers a carefully designed software platform for seamless testing of different fine-grained elements of graph mining algorithms, such as graph representations or algorithm subroutines. The platform includes parallel implementations of more than 40 considered baselines, and it facilitates developing complex and fast mining algorithms. High modularity is possible by harnessing set algebra operations such as set intersection and difference, which enables breaking complex graph mining algorithms into simple building blocks that can be separately experimented with. GMS is supported with a broad concurrency analysis for portability in performance insights, and a novel performance metric to assess the throughput of graph mining algorithms, enabling more insightful evaluation. As use cases, we harness GMS to rapidly redesign and accelerate state-of-the-art baselines of core graph mining problems: degeneracy reordering (by up to >2x), maximal clique listing (by up to >9x), k-clique listing (by 1.1x), and subgraph isomorphism (by up to 2.5x), also obtaining better theoretical performance bounds

arXiv.org e-Print Archive

Repository for Publications and Research Data
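The abstract above describes decomposing graph mining algorithms into set algebra operations such as intersection. As a small illustration of that idea (not GMS's actual API or code), the Python sketch below counts triangles using only set intersections over "higher-numbered neighbour" sets; the names `count_triangles` and `higher` are assumptions for this example.

```python
from typing import Dict, Set


def count_triangles(adj: Dict[int, Set[int]]) -> int:
    """Count triangles via set intersection on an undirected graph.

    Each triangle a < b < c is counted exactly once: at edge (a, b),
    where c lies in both higher-neighbour sets.
    """
    # Orient every edge from the lower to the higher vertex id.
    higher = {u: {w for w in nbrs if w > u} for u, nbrs in adj.items()}
    total = 0
    for u, out_u in higher.items():
        for v in out_u:
            # The building block: a single set intersection.
            total += len(out_u & higher[v])
    return total


# Toy graph (hypothetical): triangle 0-1-2 plus a pendant edge 2-3.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(count_triangles(toy))  # 1
```

Because the intersection is isolated in one expression, swapping in a different set representation (sorted arrays, bitsets) would leave the rest of the routine unchanged, which is the kind of modularity the abstract refers to.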

Graph Sketching Against Adaptive Adversaries Applied to the Minimum Degree Algorithm

Author: Fahrbach Matthew
Miller Gary L.
Peng Richard
Sawlani Saurabh
Wang Junxing
Xu Shen Chen
Publication date: 11/04/2018

Motivated by the study of matrix elimination orderings in combinatorial scientific computing, we utilize graph sketching and local sampling to give a data structure that provides access to approximate fill degrees of a matrix undergoing elimination in $O(\text{polylog}(n))$ time per elimination and query. We then study the problem of using this data structure in the minimum degree algorithm, which is a widely-used heuristic for producing elimination orderings for sparse matrices by repeatedly eliminating the vertex with (approximate) minimum fill degree. This leads to a nearly-linear time algorithm for generating approximate greedy minimum degree orderings. Despite extensive studies of algorithms for elimination orderings in combinatorial scientific computing, our result is the first rigorous incorporation of randomized tools in this setting, as well as the first nearly-linear time algorithm for producing elimination orderings with provable approximation guarantees. While our sketching data structure readily works in the oblivious adversary model, by repeatedly querying and greedily updating itself, it enters the adaptive adversarial model where the underlying sketches become prone to failure due to dependency issues with their internal randomness. We show how to use an additional sampling procedure to circumvent this problem and to create an independent access sequence. Our technique for decorrelating the interleaved queries and updates to this randomized data structure may be of independent interest. Comment: 58 pages, 3 figures. This is a substantially revised version of arXiv:1711.08446 with an emphasis on the underlying theoretical problem

arXiv.org e-Print Archive
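For orientation, the exact greedy minimum degree heuristic that the abstract above approximates can be sketched in a few lines of Python: repeatedly eliminate the vertex of minimum fill degree and connect its remaining neighbours into a clique. This sketch computes exact fill degrees and does not use sketching or sampling, so it illustrates the baseline heuristic, not the paper's nearly-linear time algorithm; the function name, tie-breaking by vertex id, and the toy graph are assumptions.

```python
from typing import Dict, List, Set


def min_degree_ordering(adj: Dict[int, Set[int]]) -> List[int]:
    """Return a greedy minimum degree elimination ordering.

    `adj` is the adjacency structure of a symmetric sparse matrix
    (undirected graph, no self-loops). Exact fill degrees are kept
    explicitly, which can be expensive on large inputs.
    """
    g = {v: set(nbrs) for v, nbrs in adj.items()}  # working fill graph
    order: List[int] = []
    while g:
        # Vertex with minimum current (fill) degree; ties by vertex id.
        v = min(g, key=lambda u: (len(g[u]), u))
        nbrs = g.pop(v)
        for a in nbrs:
            g[a].discard(v)
            # Elimination turns the neighbours of v into a clique (fill-in).
            g[a] |= nbrs - {a}
        order.append(v)
    return order


# Toy graph (hypothetical): a star with centre 0 and leaves 1, 2, 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(min_degree_ordering(star))  # [1, 2, 0, 3]
```

On the star, eliminating leaves first creates no fill-in, whereas eliminating the centre first would connect all three leaves.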

Exact and approximate route set generation for resilient partial observability in sensor location problems

Author: Rinaldi Marco
Viti Francesco
Publication date: 01/01/2017

Sensor positioning is a fundamental problem in transportation networks, as the location of sensors strongly determines how traffic flows are observable and hence manageable. This paper aims to develop a methodology to determine sensor locations on a network such that an optimal trade-off solution is found between the number of sensors installed and the resilience of the sensor set. In particular, we propose exact and heuristic solutions for identifying the optimal route sets such that no other route would include any additional information for finding optimal full and partial observability solutions. This is an important contribution to sensor location problems, as route-based link flow inference problems have non-unique solutions, strongly depending on the link-route information used. The properties of the new methodology are analyzed and illustrated through different case studies, and the advantages of the algorithms are quantified both for full and for partial observability solutions. Using the route sets found by our approach, we are able to find full observability solutions characterized by a small number of sensors that remain efficient in terms of partial observability. We perform validation tests on both small and real-life sized network instances. © 2017 Elsevier Lt