Search CORE

5 research outputs found

Distributed Triangle Counting in the Graphulo Matrix Math Library

Author: Hutchison Dylan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/09/2017
Field of study

Triangle counting is a key algorithm for large graph analysis. The Graphulo library provides a framework for implementing graph algorithms on the Apache Accumulo distributed database. In this work we adapt two algorithms for counting triangles, one that uses the adjacency matrix and another that also uses the incidence matrix, to the Graphulo library for server-side processing inside Accumulo. Cloud-based experiments show a similar performance profile for these different approaches on the family of power law Graph500 graphs, for which data skew increasingly bottlenecks. These results motivate the design of skew-aware hybrid algorithms that we propose for future work.Comment: Honorable mention in the 2017 IEEE HPEC's Graph Challeng

arXiv.org e-Print Archive

Crossref

GraphChallenge.org: Raising the Bar on Graph Analytic Performance

Author: Gadepally Vijay
Hurley Michael
Jones Michael
Kao Edward
Kepner Jeremy
Mohindra Sanjeev
Monticciolo Paul
Reuther Albert
Samsi Siddharth
Smith Steven
Song William
Staheli Diane
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/05/2018
Field of study

The rise of graph analytic systems has created a need for new ways to measure and compare the capabilities of graph processing systems. The MIT/Amazon/IEEE Graph Challenge has been developed to provide a well-defined community venue for stimulating research and highlighting innovations in graph analysis software, hardware, algorithms, and systems. GraphChallenge.org provides a wide range of pre-parsed graph data sets, graph generators, mathematically defined graph algorithms, example serial implementations in a variety of languages, and specific metrics for measuring performance. Graph Challenge 2017 received 22 submissions by 111 authors from 36 organizations. The submissions highlighted graph analytic innovations in hardware, software, algorithms, systems, and visualization. These submissions produced many comparable performance measurements that can be used for assessing the current state of the art of the field. There were numerous submissions that implemented the triangle counting challenge and resulted in over 350 distinct measurements. Analysis of these submissions show that their execution time is a strong function of the number of edges in the graph,

N_e

, and is typically proportional to

N_e^{4/3}

for large values of

N_e

. Combining the model fits of the submissions presents a picture of the current state of the art of graph analysis, which is typically

10^8

edges processed per second for graphs with

10^8

edges. These results are

30

times faster than serial implementations commonly used by many graph analysts and underscore the importance of making these performance benefits available to the broader community. Graph Challenge provides a clear picture of current graph analysis systems and underscores the need for new innovations to achieve high performance on very large graphs.Comment: 7 pages, 6 figures; submitted to IEEE HPEC Graph Challenge. arXiv admin note: text overlap with arXiv:1708.0686

arXiv.org e-Print Archive

Crossref

On Large-Scale Graph Generation with Validation of Diverse Triangle Statistics at Edges and Vertices

Author: Kepner Jeremy
La Fond Timothy
Pearce Roger
Sanders Geoffrey
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/03/2018
Field of study

Researchers developing implementations of distributed graph analytic algorithms require graph generators that yield graphs sharing the challenging characteristics of real-world graphs (small-world, scale-free, heavy-tailed degree distribution) with efficiently calculable ground-truth solutions to the desired output. Reproducibility for current generators used in benchmarking are somewhat lacking in this respect due to their randomness: the output of a desired graph analytic can only be compared to expected values and not exact ground truth. Nonstochastic Kronecker product graphs meet these design criteria for several graph analytics. Here we show that many flavors of triangle participation can be cheaply calculated while generating a Kronecker product graph. Given two medium-sized scale-free graphs with adjacency matrices

A

and

B

, their Kronecker product graph has adjacency matrix

C = A \otimes B

. Such graphs are highly compressible:

|{\cal E}|

edges are represented in

{\cal O}(|{\cal E}|^{1/2})

memory and can be built in a distributed setting from small data structures, making them easy to share in compressed form. Many interesting graph calculations have worst-case complexity bounds

{\cal O}(|{\cal E}|^p)

and often these are reduced to

{\cal O}(|{\cal E}|^{p/2})

for Kronecker product graphs, when a Kronecker formula can be derived yielding the sought calculation on

C

in terms of related calculations on

A

and

B

. We focus on deriving formulas for triangle participation at vertices,

{\bf t}_C

, a vector storing the number of triangles that every vertex is involved in, and triangle participation at edges,

\Delta_C

, a sparse matrix storing the number of triangles at every edge.Comment: 10 pages, 7 figures, IEEE IPDPS Graph Algorithms Building Block

arXiv.org e-Print Archive

Crossref

A Systematic Survey of General Sparse Matrix-Matrix Multiplication

Author: Gao Jianhua
Ji Weixing
Tan Zhaonian
Zhao Yueyan
Publication venue
Publication date: 25/02/2020
Field of study

SpGEMM (General Sparse Matrix-Matrix Multiplication) has attracted much attention from researchers in fields of multigrid methods and graph analysis. Many optimization techniques have been developed for certain application fields and computing architecture over the decades. The objective of this paper is to provide a structured and comprehensive overview of the research on SpGEMM. Existing optimization techniques have been grouped into different categories based on their target problems and architectures. Covered topics include SpGEMM applications, size prediction of result matrix, matrix partitioning and load balancing, result accumulating, and target architecture-oriented optimization. The rationales of different algorithms in each category are analyzed, and a wide range of SpGEMM algorithms are summarized. This survey sufficiently reveals the latest progress and research status of SpGEMM optimization from 1977 to 2019. More specifically, an experimentally comparative study of existing implementations on CPU and GPU is presented. Based on our findings, we highlight future research directions and how future studies can leverage our findings to encourage better design and implementation.Comment: 19 pages, 11 figures, 2 tables, 4 algorithm

arXiv.org e-Print Archive

High-Performance and Power-Aware Graph Processing on GPUs

Author: Busato Federico
Publication venue
Publication date: 01/01/2018
Field of study

Graphs are a common representation in many problem domains, including engineering, finance, medicine, and scientific applications. Different problems map to very large graphs, often involving millions of vertices. Even though very efficient sequential implementations of graph algorithms exist, they become impractical when applied on such actual very large graphs. On the other hand, graphics processing units (GPUs) have become widespread architectures as they provide massive parallelism at low cost. Parallel execution on GPUs may achieve speedup up to three orders of magnitude with respect to the sequential counterparts. Nevertheless, accelerating efficient and optimized sequential algorithms and porting (i.e., parallelizing) their implementation to such many-core architectures is a very challenging task. The task is made even harder since energy and power consumption are becoming constraints in addition, or in same case as an alternative, to performance. This work aims at developing a platform that provides (I) a library of parallel, efficient, and tunable implementations of the most important graph algorithms for GPUs, and (II) an advanced profiling model to analyze both performance and power consumption of the algorithm implementations. The platform goal is twofold. Through the library, it aims at saving developing effort in the parallelization task through a primitive-based approach. Through the profiling framework, it aims at customizing such primitives by considering both the architectural details and the target efficiency metrics (i.e., performance or power)

Catalogo dei prodotti della ricerca