Search CORE

519 research outputs found

Scalable Breadth-First Search on a GPU Cluster

Author: Owens John D.
Pan Yuechao
Pearce Roger
Publication venue
Publication date: 13/03/2018
Field of study

On a GPU cluster, the ratio of high computing power to communication bandwidth makes scaling breadth-first search (BFS) on a scale-free graph extremely challenging. By separating high and low out-degree vertices, we present an implementation with scalable computation and a model for scalable communication for BFS and direction-optimized BFS. Our communication model uses global reduction for high-degree vertices, and point-to-point transmission for low-degree vertices. Leveraging the characteristics of degree separation, we reduce the graph size to one third of the conventional edge list representation. With several other optimizations, we observe linear weak scaling as we increase the number of GPUs, and achieve 259.8 GTEPS on a scale-33 Graph500 RMAT graph with 124 GPUs on the latest CORAL early access system.Comment: 12 pages, 13 figures. To appear at IPDPS 201

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Data compression for climate data

Author: Kuhn Michael
Kunkel Julian M.
Ludwig Thomas
Publication venue: 'FSAEIHE South Ural State University (National Research University)'
Publication date: 01/01/2016
Field of study

The different rates of increase for computational power and storage capabilities of supercomputers turn data storage into a technical and economical problem. Because storage capabilities are lagging behind, investments and operational costs for storage systems have increased to keep up with the supercomputers' I/O requirements. One promising approach is to reduce the amount of data that is stored. In this paper, we take a look at the impact of compression on performance and costs of high performance systems. To this end, we analyze the applicability of compression on all layers of the I/O stack, that is, main memory, network and storage. Based on the Mistral system of the German Climate Computing Center (Deutsches Klimarechenzentrum, DKRZ), we illustrate potential performance improvements and cost savings. Making use of compression on a large scale can decrease investments and operational costs by 50% without negatively impacting performance. Additionally, we present ongoing work for supporting enhanced adaptive compression in the parallel distributed file system Lustre and application-specific compression

Central Archive at the University of Reading

GekkoFS: A temporary burst buffer file system for HPC applications

Author: Brinkmann Andre
Cortés Toni
Miranda Bueno Alberto
Moti Nafiseh
Nou Ramon
Süb Tim
Tacke Markus
Tocci Tommaso
Vef Marc-André
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020
Field of study

Many scientific fields increasingly use high-performance computing (HPC) to process and analyze massive amounts of experimental data while storage systems in today’s HPC environments have to cope with new access patterns. These patterns include many metadata operations, small I/O requests, or randomized file I/O, while general-purpose parallel file systems have been optimized for sequential shared access to large files. Burst buffer file systems create a separate file system that applications can use to store temporary data. They aggregate node-local storage available within the compute nodes or use dedicated SSD clusters and offer a peak bandwidth higher than that of the backend parallel file system without interfering with it. However, burst buffer file systems typically offer many features that a scientific application, running in isolation for a limited amount of time, does not require. We present GekkoFS, a temporary, highly-scalable file system which has been specifically optimized for the aforementioned use cases. GekkoFS provides relaxed POSIX semantics which only offers features which are actually required by most (not all) applications. GekkoFS is, therefore, able to provide scalable I/O performance and reaches millions of metadata operations already for a small number of nodes, significantly outperforming the capabilities of common parallel file systems.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC