
    Fast Möbius and Zeta Transforms

    Möbius inversion of functions on partially ordered sets (posets) $\mathcal{P}$ is a classical tool in combinatorics. For finite posets it consists of two mutually inverse linear transformations, called the zeta and Möbius transform, respectively. In this paper we provide novel fast algorithms for both that require $O(nk)$ time and space, where $n = |\mathcal{P}|$ and $k$ is the width (length of the longest antichain) of $\mathcal{P}$, compared to $O(n^2)$ for a direct computation. Our approach assumes that $\mathcal{P}$ is given as a directed acyclic graph (DAG) $(\mathcal{E}, \mathcal{P})$. The algorithms are then constructed using a chain decomposition for a one-time cost of $O(|\mathcal{E}| + |\mathcal{E}_\text{red}|\,k)$, where $\mathcal{E}_\text{red}$ is the edge set of the DAG's transitive reduction. We show benchmarks with implementations of all algorithms, including parallelized versions. The results show that our algorithms enable Möbius inversion on posets with millions of nodes in seconds if the defining DAGs are sufficiently sparse.

    Comment: 16 pages, 7 figures, submitted for review
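
    For reference, the direct computation the paper improves on is straightforward to state. Below is a minimal, illustrative C++ sketch of the O(n^2) zeta transform and its inverse on a finite poset given as a reachability matrix whose indices follow a linear extension; the representation and names are ours for illustration, not the paper's API, and the paper's O(nk) algorithms instead exploit a chain decomposition of the DAG.

        #include <cstdio>
        #include <vector>

        // Direct O(n^2) zeta and Moebius transforms on a finite poset given as
        // a boolean matrix leq[x][y] meaning "x <= y". Assumes indices form a
        // linear extension (x <= y implies x's index comes first), so the
        // Moebius transform can invert bottom-up.
        std::vector<double> zeta(const std::vector<std::vector<bool>>& leq,
                                 const std::vector<double>& f) {
            size_t n = f.size();
            std::vector<double> g(n, 0.0);
            for (size_t y = 0; y < n; ++y)       // g(y) = sum of f(x) over x <= y
                for (size_t x = 0; x < n; ++x)
                    if (leq[x][y]) g[y] += f[x];
            return g;
        }

        std::vector<double> moebius(const std::vector<std::vector<bool>>& leq,
                                    const std::vector<double>& g) {
            size_t n = g.size();
            std::vector<double> f(n, 0.0);
            for (size_t y = 0; y < n; ++y) {     // f(y) = g(y) - sum of f(x), x < y
                f[y] = g[y];
                for (size_t x = 0; x < y; ++x)
                    if (leq[x][y]) f[y] -= f[x];
            }
            return f;
        }

        int main() {
            // Example poset: {1, 2, 3, 6} ordered by divisibility.
            std::vector<int> v = {1, 2, 3, 6};
            std::vector<std::vector<bool>> leq(4, std::vector<bool>(4));
            for (int i = 0; i < 4; ++i)
                for (int j = 0; j < 4; ++j)
                    leq[i][j] = (v[j] % v[i] == 0);
            std::vector<double> f = {1, 2, 3, 4};
            auto g = zeta(leq, f);
            auto h = moebius(leq, g);            // recovers f exactly
            for (int i = 0; i < 4; ++i)
                std::printf("f=%g zeta=%g inverse=%g\n", f[i], g[i], h[i]);
            return 0;
        }

    Both transforms above visit every comparable pair, which is exactly the O(n^2) cost the chain-decomposition algorithms avoid.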

    Sparsifying Synchronization for High-Performance Shared-Memory Sparse Triangular Solver


    A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication

    The symmetric sparse matrix-vector multiplication (SymmSpMV) is an important building block for many numerical linear algebra kernel operations and graph traversal applications. Parallelizing SymmSpMV on today's multicore platforms with up to 100 cores is difficult due to the need to manage conflicting updates on the result vector. Coloring approaches can be used to solve this problem without data duplication, but existing coloring algorithms do not take load balancing and deep memory hierarchies into account, hampering scalability and full-chip performance. In this work, we propose the recursive algebraic coloring engine (RACE), a novel coloring algorithm and open-source library implementation, which eliminates the shortcomings of previous coloring methods in terms of hardware efficiency and parallelization overhead. We describe the level construction, distance-k coloring, and load balancing steps in RACE, use it to parallelize SymmSpMV, and compare its performance on 31 sparse matrices with other state-of-the-art coloring techniques and Intel MKL on two modern multicore processors. RACE outperforms all other approaches substantially and behaves in accordance with the Roofline model. Outliers are discussed and analyzed in detail. While we focus on SymmSpMV in this paper, our algorithm and software are applicable to any sparse matrix operation with data dependencies that can be resolved by distance-k coloring.
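
    To make the conflicting updates concrete: exploiting symmetry means each stored upper-triangle entry a_ij contributes to both y_i and y_j, so threads working on different rows can race on the same y_j. The C++ sketch below shows the generic color-by-color execution scheme that such a coloring enables, assuming the rows of each color (computed beforehand by a coloring library such as RACE) touch pairwise disjoint result entries; the data structures and names here are illustrative, not the RACE API.

        #include <algorithm>
        #include <cstdio>
        #include <vector>
        // Compile with -fopenmp to enable the parallel pragma.

        struct CsrUpper {                  // upper triangle incl. diagonal, CSR
            std::vector<int> rowPtr, col;
            std::vector<double> val;
        };

        // colors[c] holds the rows of color c; rows sharing a color must not
        // touch the same y entries (this is what the coloring guarantees).
        void symmSpMV(const CsrUpper& A,
                      const std::vector<std::vector<int>>& colors,
                      const std::vector<double>& x, std::vector<double>& y) {
            std::fill(y.begin(), y.end(), 0.0);
            for (const auto& rows : colors) {  // colors run one after another
                #pragma omp parallel for schedule(static)
                for (size_t r = 0; r < rows.size(); ++r) {
                    int i = rows[r];
                    for (int k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
                        int j = A.col[k];
                        double a = A.val[k];
                        y[i] += a * x[j];              // upper-triangle part
                        if (j != i) y[j] += a * x[i];  // mirrored lower part
                    }
                }   // implicit barrier separates the colors
            }
        }

        int main() {
            // Symmetric 3x3 matrix [[2,1,0],[1,2,1],[0,1,2]], upper part stored.
            CsrUpper A{{0, 2, 4, 5}, {0, 1, 1, 2, 2}, {2, 1, 2, 1, 2}};
            // Rows 0 and 2 touch disjoint y entries, so they share a color.
            std::vector<std::vector<int>> colors = {{0, 2}, {1}};
            std::vector<double> x = {1, 1, 1}, y(3);
            symmSpMV(A, colors, x, y);
            std::printf("y = %g %g %g\n", y[0], y[1], y[2]);  // expected: 3 4 3
            return 0;
        }

    The barrier between colors is the entire synchronization cost, so fewer and better-balanced colors mean fewer idle threads, which is precisely the load-balancing shortcoming of earlier coloring methods that RACE targets.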

    Doctor of Philosophy

    Sparse matrix codes are found in numerous applications ranging from iterative numerical solvers to graph analytics. Achieving high performance on these codes has, however, been a significant challenge, mainly due to array access indirection, for example of the form A[B[i]]. Indirect accesses make precise dependence analysis impossible at compile time, and hence prevent many parallelizing and locality-optimizing transformations from being applied. The expert user relies on manually written libraries to tailor the sparse code and data representations best suited to the target architecture from a general sparse matrix representation. However, libraries have limited composability, address very specific optimization strategies, and have to be rewritten as new architectures emerge. In this dissertation, we explore the use of the inspector/executor methodology to accomplish the code and data transformations that tailor high-performance sparse matrix representations. We devise and embed abstractions for such inspector/executor transformations within a compiler framework so that they can be composed with a rich set of existing polyhedral compiler transformations to derive complex transformation sequences for high performance. We demonstrate the automatic generation of inspector/executor code, which orchestrates code and data transformations to derive high-performance representations for the Sparse Matrix Vector Multiply kernel in particular. We also show how the same transformations may be integrated into sparse matrix and graph applications such as Sparse Matrix Matrix Multiply and Stochastic Gradient Descent, respectively. The specific constraints of these applications, such as problem size and dependence structure, necessitate unique sparse matrix representations that can be realized using our transformations. Computations such as Gauss-Seidel, with loop-carried dependences at the outermost loop, necessitate different strategies for high performance. Specifically, we organize the computation into level sets or wavefronts of irregular size, such that iterations within a wavefront may be scheduled in parallel but different wavefronts have to be synchronized. We demonstrate automatic code generation of high-performance inspectors that do explicit dependence testing and level-set construction at runtime, as well as high-performance executors, which are the actual parallelized computations. For the above sparse matrix applications, we automatically generate inspector/executor code comparable in performance to manually tuned libraries.
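
    As a concrete illustration of the inspector/executor split for the wavefront case: the inspector follows the indirect accesses once at runtime and assigns each row to a level one deeper than the rows it depends on; the executor then solves level by level, with all rows inside a level independent. The C++ sketch below does this by hand for a sparse lower triangular solve (Gauss-Seidel has the same structure); in the dissertation both parts are generated automatically, and all names here are ours.

        #include <algorithm>
        #include <cstdio>
        #include <vector>

        struct CsrLower {                  // strictly lower triangle in CSR,
            std::vector<int> rowPtr, col;  // diagonal stored separately
            std::vector<double> val, diag;
        };

        // Inspector: explicit dependence testing and level-set construction,
        // done once per sparsity pattern. Row i lands one level above the
        // deepest row it reads through the indirection L.col[k].
        std::vector<std::vector<int>> buildLevels(const CsrLower& L) {
            int n = (int)L.diag.size();
            std::vector<int> level(n, 0);
            int maxLevel = 0;
            for (int i = 0; i < n; ++i) {
                for (int k = L.rowPtr[i]; k < L.rowPtr[i + 1]; ++k)
                    level[i] = std::max(level[i], level[L.col[k]] + 1);
                maxLevel = std::max(maxLevel, level[i]);
            }
            std::vector<std::vector<int>> levels(maxLevel + 1);
            for (int i = 0; i < n; ++i) levels[level[i]].push_back(i);
            return levels;
        }

        // Executor: the actual solve of L z = b. Rows within a level have no
        // mutual dependences, so the inner loop could run under an OpenMP
        // pragma with synchronization only at level boundaries.
        void lowerSolve(const CsrLower& L,
                        const std::vector<std::vector<int>>& levels,
                        const std::vector<double>& b, std::vector<double>& z) {
            for (const auto& rows : levels)
                for (int i : rows) {       // parallelizable within a level
                    double s = b[i];
                    for (int k = L.rowPtr[i]; k < L.rowPtr[i + 1]; ++k)
                        s -= L.val[k] * z[L.col[k]];
                    z[i] = s / L.diag[i];
                }
        }

        int main() {
            // L = [[2,0,0],[1,2,0],[0,1,2]] with the diagonal kept separately.
            CsrLower L;
            L.rowPtr = {0, 0, 1, 2};
            L.col    = {0, 1};
            L.val    = {1, 1};
            L.diag   = {2, 2, 2};
            std::vector<double> b = {2, 4, 6}, z(3);
            auto levels = buildLevels(L);  // inspector, run once
            lowerSolve(L, levels, b, z);   // executor, run per solve
            std::printf("z = %g %g %g\n", z[0], z[1], z[2]);  // 1 1.5 2.25
            return 0;
        }

    The inspector's cost is linear in the number of nonzeros and is paid once per sparsity pattern, so it amortizes over the repeated solves typical of iterative methods.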