2,234 research outputs found
A Fast and Scalable Graph Coloring Algorithm for Multi-core and Many-core Architectures
Irregular computations on unstructured data are an important class of
problems for parallel programming. Graph coloring is often an important
preprocessing step, e.g. as a way to perform dependency analysis for safe
parallel execution. The total run time of a coloring algorithm adds to the
overall parallel overhead of the application whereas the number of colors used
determines the amount of exposed parallelism. A fast and scalable coloring
algorithm using as few colors as possible is vital for the overall parallel
performance and scalability of many irregular applications that depend upon
runtime dependency analysis.
Catalyurek et al. have proposed a graph coloring algorithm which relies on
speculative, local assignment of colors. In this paper we present an improved
version which runs even more optimistically with less thread synchronization
and reduced number of conflicts compared to Catalyurek et al.'s algorithm. We
show that the new technique scales better on multi-core and many-core systems
and performs up to 1.5x faster than its predecessor on graphs with high-degree
vertices, while keeping the number of colors at the same near-optimal levels.Comment: To appear in the proceedings of Euro Par 201
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication
Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many
high-performance graph algorithms as well as for some linear solvers, such as
algebraic multigrid. The scaling of existing parallel implementations of SpGEMM
is heavily bound by communication. Even though 3D (or 2.5D) algorithms have
been proposed and theoretically analyzed in the flat MPI model on Erdos-Renyi
matrices, those algorithms had not been implemented in practice and their
complexities had not been analyzed for the general case. In this work, we
present the first ever implementation of the 3D SpGEMM formulation that also
exploits multiple (intra-node and inter-node) levels of parallelism, achieving
significant speedups over the state-of-the-art publicly available codes at all
levels of concurrencies. We extensively evaluate our implementation and
identify bottlenecks that should be subject to further research
Optimal Collision/Conflict-free Distance-2 Coloring in Synchronous Broadcast/Receive Tree Networks
This article is on message-passing systems where communication is (a)
synchronous and (b) based on the "broadcast/receive" pair of communication
operations. "Synchronous" means that time is discrete and appears as a sequence
of time slots (or rounds) such that each message is received in the very same
round in which it is sent. "Broadcast/receive" means that during a round a
process can either broadcast a message to its neighbors or receive a message
from one of them. In such a communication model, no two neighbors of the same
process, nor a process and any of its neighbors, must be allowed to broadcast
during the same time slot (thereby preventing message collisions in the first
case, and message conflicts in the second case). From a graph theory point of
view, the allocation of slots to processes is know as the distance-2 coloring
problem: a color must be associated with each process (defining the time slots
in which it will be allowed to broadcast) in such a way that any two processes
at distance at most 2 obtain different colors, while the total number of colors
is "as small as possible". The paper presents a parallel message-passing
distance-2 coloring algorithm suited to trees, whose roots are dynamically
defined. This algorithm, which is itself collision-free and conflict-free, uses
colors where is the maximal degree of the graph (hence
the algorithm is color-optimal). It does not require all processes to have
different initial identities, and its time complexity is , where d
is the depth of the tree. As far as we know, this is the first distributed
distance-2 coloring algorithm designed for the broadcast/receive round-based
communication model, which owns all the previous properties.Comment: 19 pages including one appendix. One Figur
- …