19,012 research outputs found
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication
Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many
high-performance graph algorithms as well as for some linear solvers, such as
algebraic multigrid. The scaling of existing parallel implementations of SpGEMM
is heavily bound by communication. Even though 3D (or 2.5D) algorithms have
been proposed and theoretically analyzed in the flat MPI model on Erdos-Renyi
matrices, those algorithms had not been implemented in practice and their
complexities had not been analyzed for the general case. In this work, we
present the first ever implementation of the 3D SpGEMM formulation that also
exploits multiple (intra-node and inter-node) levels of parallelism, achieving
significant speedups over the state-of-the-art publicly available codes at all
levels of concurrencies. We extensively evaluate our implementation and
identify bottlenecks that should be subject to further research
The Parallelism Motifs of Genomic Data Analysis
Genomic data sets are growing dramatically as the cost of sequencing
continues to decline and small sequencing devices become available. Enormous
community databases store and share this data with the research community, but
some of these genomic data analysis problems require large scale computational
platforms to meet both the memory and computational requirements. These
applications differ from scientific simulations that dominate the workload on
high end parallel systems today and place different requirements on programming
support, software libraries, and parallel architectural design. For example,
they involve irregular communication patterns such as asynchronous updates to
shared data structures. We consider several problems in high performance
genomics analysis, including alignment, profiling, clustering, and assembly for
both single genomes and metagenomes. We identify some of the common
computational patterns or motifs that help inform parallelization strategies
and compare our motifs to some of the established lists, arguing that at least
two key patterns, sorting and hashing, are missing
Prospects and limitations of full-text index structures in genome analysis
The combination of incessant advances in sequencing technology producing large amounts of data and innovative bioinformatics approaches, designed to cope with this data flood, has led to new interesting results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article brings a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, the trade-offs they impose, but also their practical limitations, are explained and compared
Faster Geometric Algorithms via Dynamic Determinant Computation
The computation of determinants or their signs is the core procedure in many
important geometric algorithms, such as convex hull, volume and point location.
As the dimension of the computation space grows, a higher percentage of the
total computation time is consumed by these computations. In this paper we
study the sequences of determinants that appear in geometric algorithms. The
computation of a single determinant is accelerated by using the information
from the previous computations in that sequence.
We propose two dynamic determinant algorithms with quadratic arithmetic
complexity when employed in convex hull and volume computations, and with
linear arithmetic complexity when used in point location problems. We implement
the proposed algorithms and perform an extensive experimental analysis. On one
hand, our analysis serves as a performance study of state-of-the-art
determinant algorithms and implementations. On the other hand, we demonstrate
the supremacy of our methods over state-of-the-art implementations of
determinant and geometric algorithms. Our experimental results include a 20 and
78 times speed-up in volume and point location computations in dimension 6 and
11 respectively.Comment: 29 pages, 8 figures, 3 table
Deterministic Communication in Radio Networks
In this paper we improve the deterministic complexity of two fundamental
communication primitives in the classical model of ad-hoc radio networks with
unknown topology: broadcasting and wake-up. We consider an unknown radio
network, in which all nodes have no prior knowledge about network topology, and
know only the size of the network , the maximum in-degree of any node
, and the eccentricity of the network .
For such networks, we first give an algorithm for wake-up, based on the
existence of small universal synchronizers. This algorithm runs in
time, the
fastest known in both directed and undirected networks, improving over the
previous best -time result across all ranges of parameters, but
particularly when maximum in-degree is small.
Next, we introduce a new combinatorial framework of block synchronizers and
prove the existence of such objects of low size. Using this framework, we
design a new deterministic algorithm for the fundamental problem of
broadcasting, running in time. This is
the fastest known algorithm for the problem in directed networks, improving
upon the -time algorithm of De Marco (2010) and the
-time algorithm due to Czumaj and Rytter (2003). It is also the
first to come within a log-logarithmic factor of the lower
bound due to Clementi et al.\ (2003).
Our results also have direct implications on the fastest \emph{deterministic
leader election} and \emph{clock synchronization} algorithms in both directed
and undirected radio networks, tasks which are commonly used as building blocks
for more complex procedures
Fine-grained Search Space Classification for Hard Enumeration Variants of Subset Problems
We propose a simple, powerful, and flexible machine learning framework for
(i) reducing the search space of computationally difficult enumeration variants
of subset problems and (ii) augmenting existing state-of-the-art solvers with
informative cues arising from the input distribution. We instantiate our
framework for the problem of listing all maximum cliques in a graph, a central
problem in network analysis, data mining, and computational biology. We
demonstrate the practicality of our approach on real-world networks with
millions of vertices and edges by not only retaining all optimal solutions, but
also aggressively pruning the input instance size resulting in several fold
speedups of state-of-the-art algorithms. Finally, we explore the limits of
scalability and robustness of our proposed framework, suggesting that
supervised learning is viable for tackling NP-hard problems in practice.Comment: AAAI 201
- …