Preparing sparse solvers for exascale computing.
Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing Project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current successes and upcoming challenges. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.
The Parallelism Motifs of Genomic Data Analysis
Genomic data sets are growing dramatically as the cost of sequencing
continues to decline and small sequencing devices become available. Enormous
community databases store and share this data with the research community, but
some of these genomic data analysis problems require large scale computational
platforms to meet both the memory and computational requirements. These
applications differ from scientific simulations that dominate the workload on
high end parallel systems today and place different requirements on programming
support, software libraries, and parallel architectural design. For example,
they involve irregular communication patterns such as asynchronous updates to
shared data structures. We consider several problems in high performance
genomics analysis, including alignment, profiling, clustering, and assembly for
both single genomes and metagenomes. We identify some of the common
computational patterns or motifs that help inform parallelization strategies
and compare our motifs to some of the established lists, arguing that at least
two key patterns, sorting and hashing, are missing.
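To make the hashing and sorting motifs named above concrete, here is a minimal sketch of k-mer counting, a typical step in assembly and profiling pipelines. It is an illustration under stated assumptions, not the authors' implementation: the function names (`kmers`, `owner`, `count_kmers`, `top_kmers`) are hypothetical, and the list of `Counter` objects only simulates, within a single process, a hash table whose keys are spread across partitions.

```python
from collections import Counter
import zlib


def kmers(seq, k):
    """Yield every length-k substring (k-mer) of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def owner(kmer, n_parts):
    """Hashing motif: map a k-mer to the partition that owns it.

    crc32 is used because it is deterministic across runs, unlike the
    built-in hash() on strings. The choice of hash is an assumption.
    """
    return zlib.crc32(kmer.encode("ascii")) % n_parts


def count_kmers(reads, k, n_parts):
    """Count k-mers, keeping one Counter per (simulated) partition."""
    parts = [Counter() for _ in range(n_parts)]
    for read in reads:
        for km in kmers(read, k):
            # On a real machine this would be an asynchronous update to
            # a remote partition; here it is a local dictionary update.
            parts[owner(km, n_parts)][km] += 1
    return parts


def top_kmers(parts, n):
    """Sorting motif: merge the partitions and rank k-mers by count."""
    merged = Counter()
    for part in parts:
        merged.update(part)
    # Descending count, ties broken alphabetically.
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

Because every occurrence of a given k-mer hashes to the same partition, each partition can be counted independently, and only the final ranking needs a global sort.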
Recent Advances in Graph Partitioning
We survey recent trends in practical algorithms for balanced graph
partitioning together with applications and future research directions.
Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies
We explore the trade-offs of performing linear algebra using Apache Spark,
compared to traditional C and MPI implementations on HPC platforms. Spark is
designed for data analytics on cluster computing platforms with access to local
disks and is optimized for data-parallel tasks. We examine three widely-used
and important matrix factorizations: NMF (for physical plausibility), PCA (for
its ubiquity) and CX (for data interpretability). We apply these methods to
TB-sized problems in particle physics, climate modeling and bioimaging. The
data matrices are tall and skinny, which enables the algorithms to map
conveniently into Spark's data-parallel model. We perform scaling experiments
on up to 1600 Cray XC40 nodes, describe the sources of slowdowns, and provide
tuning guidance to obtain high performance.
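One common reason a tall-and-skinny matrix fits a data-parallel model is that its small Gram matrix A^T A is the sum of the Gram matrices of its row blocks, so each worker can process its own rows and only small n-by-n results need to be combined. The sketch below shows that map-and-reduce pattern in plain Python. It is an assumption about why the mapping is convenient, not the paper's Spark or C+MPI code, and the helper names (`gram`, `add`, `split_rows`, `distributed_gram`) are hypothetical.

```python
from functools import reduce


def gram(block):
    """Return block^T * block for a block given as a list of rows."""
    n = len(block[0])
    g = [[0.0] * n for _ in range(n)]
    for row in block:
        for i in range(n):
            for j in range(n):
                g[i][j] += row[i] * row[j]
    return g


def add(a, b):
    """Element-wise sum of two equally sized square matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split_rows(rows, n_blocks):
    """Split rows into at most n_blocks contiguous blocks."""
    size = -(-len(rows) // n_blocks)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def distributed_gram(rows, n_blocks):
    """Map: one small Gram matrix per row block. Reduce: sum them."""
    partials = map(gram, split_rows(rows, n_blocks))
    return reduce(add, partials)
```

For a matrix with m rows and n columns where m is much larger than n, each partial result has only n * n entries regardless of how many rows a worker holds, which keeps the reduce step cheap.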
A randomised primal-dual algorithm for distributed radio-interferometric imaging
Next generation radio telescopes, like the Square Kilometre Array, will
acquire an unprecedented amount of data for radio astronomy. The development of
fast, parallelisable or distributed algorithms for handling such large-scale
data sets is of prime importance. Motivated by this, we investigate herein a
convex optimisation algorithmic structure, based on primal-dual
forward-backward iterations, for solving the radio interferometric imaging
problem. It can encompass any convex prior of interest. It allows for the
distributed processing of the measured data and introduces further flexibility
by employing a probabilistic approach for the selection of the data blocks used
at a given iteration. We study the reconstruction performance with respect to
the data distribution and we propose the use of nonuniform probabilities for
the randomised updates. Our simulations show the feasibility of the
randomisation given a limited computing infrastructure as well as important
computational advantages when compared to state-of-the-art algorithmic
structures.
Comment: 5 pages, 3 figures, Proceedings of the European Signal Processing
Conference (EUSIPCO) 2016. Related journal publication available at
https://arxiv.org/abs/1601.0402
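The probabilistic selection of data blocks described in this abstract can be illustrated with a small sketch. It assumes that each block is activated independently at every iteration with its own probability, which is one simple way to realise nonuniform randomised updates. It is not taken from the paper, and the names `select_blocks` and `activation_counts` are hypothetical.

```python
import random


def select_blocks(probs, rng):
    """Return the indices of the data blocks active in one iteration.

    Block j is chosen independently with probability probs[j], so the
    number of active blocks varies from iteration to iteration.
    """
    return [j for j, p in enumerate(probs) if rng.random() < p]


def activation_counts(probs, n_iters, seed=0):
    """Count how often each block is selected over n_iters iterations."""
    rng = random.Random(seed)  # seeded for reproducibility
    counts = [0] * len(probs)
    for _ in range(n_iters):
        for j in select_blocks(probs, rng):
            # A real solver would update the dual variable tied to
            # block j here; this sketch only records the selection.
            counts[j] += 1
    return counts
```

Lowering a block's probability reduces how often its data must be processed, trading per-iteration cost against the number of iterations needed.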