Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication

Author: Azad Ariful
Ballard Grey
Buluc Aydin
Demmel James
Grigori Laura
Schwartz Oded
Toledo Sivan
Williams Samuel
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2016

Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdos-Renyi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first-ever implementation of the 3D SpGEMM formulation that also exploits multiple (intra-node and inter-node) levels of parallelism, achieving significant speedups over the state-of-the-art publicly available codes at all levels of concurrency. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research.
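For readers unfamiliar with the primitive, the following is a minimal serial sketch of SpGEMM using a row-wise (Gustavson-style) formulation. It is not the 3D distributed algorithm described in the abstract; the dictionary-of-dictionaries storage and the function name `spgemm` are assumptions made purely for illustration.

```python
from collections import defaultdict


def spgemm(a, b):
    """Multiply two sparse matrices stored as {row: {col: value}} dictionaries.

    Illustrative serial sketch only: each output row is built by scaling and
    merging the rows of b selected by the nonzeros in the matching row of a.
    """
    c = {}
    for i, a_row in a.items():
        acc = defaultdict(float)  # sparse accumulator for row i of the product
        for k, a_ik in a_row.items():
            for j, b_kj in b.get(k, {}).items():
                acc[j] += a_ik * b_kj
        if acc:
            c[i] = dict(acc)
    return c


# Hypothetical 2x3 and 3x2 inputs, chosen only to exercise the routine.
a = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}}
b = {0: {1: 4.0}, 1: {0: 5.0}, 2: {1: 6.0}}
product = spgemm(a, b)
print(product)
```

Only nonzero entries are ever touched, which is why the cost depends on the sparsity structure rather than on the matrix dimensions.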

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

eScholarship - University of California

Hal-Diderot

The Parallelism Motifs of Genomic Data Analysis

Author: Awan Muaaz
Azad Ariful
Brock Benjamin
Buluc Aydin
Egan Rob
Ekanayake Saliya
Ellis Marquita
Georganas Evangelos
Guidi Giulia
Hofmeyr Steven
Oliker Leonid
Selvitopi Oguz
Teodoropol Cristina
Yelick Katherine
Publication venue: 'The Royal Society'
Publication date: 20/01/2020

Genomic data sets are growing dramatically as the cost of sequencing continues to decline and small sequencing devices become available. Enormous community databases store and share this data with the research community, but some of these genomic data analysis problems require large-scale computational platforms to meet both the memory and computational requirements. These applications differ from scientific simulations that dominate the workload on high-end parallel systems today and place different requirements on programming support, software libraries, and parallel architectural design. For example, they involve irregular communication patterns such as asynchronous updates to shared data structures. We consider several problems in high-performance genomics analysis, including alignment, profiling, clustering, and assembly for both single genomes and metagenomes. We identify some of the common computational patterns or motifs that help inform parallelization strategies and compare our motifs to some of the established lists, arguing that at least two key patterns, sorting and hashing, are missing.

arXiv.org e-Print Archive

eScholarship - University of California

Prospects and limitations of full-text index structures in genome analysis

Author: Dawyndt Peter
De Baets Bernard
Fack Veerle
Vyverman Michaël
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2012

The combination of incessant advances in sequencing technology producing large amounts of data and innovative bioinformatics approaches, designed to cope with this data flood, has led to new interesting results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article presents a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, the trade-offs they impose, but also their practical limitations, are explained and compared.
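As a concrete picture of what a full-text index does, the sketch below builds a suffix array and uses it to locate a pattern by binary search. The abstract does not say which structures the article covers, so the choice of a suffix array, the naive construction, and the names `build_suffix_array` and `find_occurrences` are all assumptions for illustration.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order.

    Naive construction that sorts the suffixes directly; practical tools use
    far more memory- and time-efficient algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return every position where pattern occurs, via two binary searches."""
    m = len(pattern)

    def prefix(rank):
        start = suffix_array[rank]
        return text[start:start + m]

    # First rank whose length-m prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # First rank whose length-m prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(suffix_array[first:lo])


# Hypothetical toy sequence, not taken from the article.
text = "GATTACATTA"
suffix_array = build_suffix_array(text)
hits = find_occurrences(text, suffix_array, "TTA")
print(hits)
```

The index is built once and then answers each query in time logarithmic in the text length (times the pattern length), at the cost of storing one integer per text position. This is a small instance of the memory-time trade-offs the abstract refers to.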

Ghent University Academic Bibliography

PubMed Central

Faster Geometric Algorithms via Dynamic Determinant Computation

Author: Fisikopoulos Vissarion
Peñaranda Luis
Publication venue: 'Elsevier BV'
Publication date: 12/01/2016

The computation of determinants or their signs is the core procedure in many important geometric algorithms, such as convex hull, volume and point location. As the dimension of the computation space grows, a higher percentage of the total computation time is consumed by these computations. In this paper we study the sequences of determinants that appear in geometric algorithms. The computation of a single determinant is accelerated by using the information from the previous computations in that sequence. We propose two dynamic determinant algorithms with quadratic arithmetic complexity when employed in convex hull and volume computations, and with linear arithmetic complexity when used in point location problems. We implement the proposed algorithms and perform an extensive experimental analysis. On one hand, our analysis serves as a performance study of state-of-the-art determinant algorithms and implementations. On the other hand, we demonstrate the superiority of our methods over state-of-the-art implementations of determinant and geometric algorithms. Our experimental results include speed-ups of 20 and 78 times in volume and point location computations in dimensions 6 and 11, respectively. Comment: 29 pages, 8 figures, 3 tables
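One standard way to reuse earlier work across a sequence of determinants is the matrix determinant lemma combined with a Sherman-Morrison-style inverse update. The sketch below assumes that consecutive matrices differ in a single column; the abstract does not confirm that this is how the paper's two algorithms work, and the function name `replace_column` is hypothetical.

```python
from fractions import Fraction


def replace_column(inv, det, k, u):
    """Return (new_inv, new_det) after column k of a matrix A is replaced by u.

    inv and det are the inverse and determinant of the current matrix A.
    The update uses O(n^2) arithmetic operations, versus O(n^3) for a fresh
    Gaussian elimination. Exact rational arithmetic avoids rounding issues.
    """
    n = len(inv)
    # w = A^{-1} u
    w = [sum(inv[i][j] * u[j] for j in range(n)) for i in range(n)]
    if w[k] == 0:
        raise ValueError("updated matrix is singular; recompute from scratch")
    new_det = det * w[k]  # matrix determinant lemma for a column replacement
    pivot_row = [x / w[k] for x in inv[k]]
    new_inv = [
        pivot_row if i == k
        else [inv[i][j] - w[i] * pivot_row[j] for j in range(n)]
        for i in range(n)
    ]
    return new_inv, new_det


# Start from the 2x2 identity, whose inverse is itself and determinant is 1.
identity = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
inv, det = replace_column(identity, Fraction(1), 0, [Fraction(2), Fraction(1)])
# The matrix is now [[2, 0], [1, 1]].
inv, det = replace_column(inv, det, 1, [Fraction(3), Fraction(4)])
# The matrix is now [[2, 3], [1, 4]].
print(det)
```

Each call costs a matrix-vector product plus a rank-one correction, which is consistent with the quadratic arithmetic complexity quoted in the abstract, although the paper's actual algorithms may differ.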

arXiv.org e-Print Archive

Crossref

DI-fusion

Deterministic Communication in Radio Networks

Author: Czumaj Artur
Davies Peter
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2018

In this paper we improve the deterministic complexity of two fundamental communication primitives in the classical model of ad-hoc radio networks with unknown topology: broadcasting and wake-up. We consider an unknown radio network, in which all nodes have no prior knowledge about network topology, and know only the size of the network $n$, the maximum in-degree of any node $\Delta$, and the eccentricity of the network $D$. For such networks, we first give an algorithm for wake-up, based on the existence of small universal synchronizers. This algorithm runs in $O(\frac{\min\{n, D \Delta\} \log n \log \Delta}{\log\log \Delta})$ time, the fastest known in both directed and undirected networks, improving over the previous best $O(n \log^2 n)$-time result across all ranges of parameters, but particularly when maximum in-degree is small. Next, we introduce a new combinatorial framework of block synchronizers and prove the existence of such objects of low size. Using this framework, we design a new deterministic algorithm for the fundamental problem of broadcasting, running in $O(n \log D \log\log\frac{D \Delta}{n})$ time. This is the fastest known algorithm for the problem in directed networks, improving upon the $O(n \log n \log \log n)$-time algorithm of De Marco (2010) and the $O(n \log^2 D)$-time algorithm due to Czumaj and Rytter (2003). It is also the first to come within a log-logarithmic factor of the $\Omega(n \log D)$ lower bound due to Clementi et al. (2003). Our results also have direct implications on the fastest deterministic leader election and clock synchronization algorithms in both directed and undirected radio networks, tasks which are commonly used as building blocks for more complex procedures.

arXiv.org e-Print Archive

Warwick Research Archives Portal Repository

Fine-grained Search Space Classification for Hard Enumeration Variants of Subset Problems

Author: Dutta Sourav
Lauri Juho
Publication date: 22/02/2019

We propose a simple, powerful, and flexible machine learning framework for (i) reducing the search space of computationally difficult enumeration variants of subset problems and (ii) augmenting existing state-of-the-art solvers with informative cues arising from the input distribution. We instantiate our framework for the problem of listing all maximum cliques in a graph, a central problem in network analysis, data mining, and computational biology. We demonstrate the practicality of our approach on real-world networks with millions of vertices and edges by not only retaining all optimal solutions, but also aggressively pruning the input instance size, resulting in several-fold speedups of state-of-the-art algorithms. Finally, we explore the limits of scalability and robustness of our proposed framework, suggesting that supervised learning is viable for tackling NP-hard problems in practice. Comment: AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications