13 research outputs found

GekkoFS: A temporary distributed file system for HPC applications

Author: Brinkmann Andre
Cortés Toni
Miranda Alberto
Moti Nafiseh
Nou Castell Ramon
Süß Tim
Tocci Tommaso
Vef Marc-André
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

We present GekkoFS, a temporary, highly scalable burst buffer file system that has been specifically optimized for new access patterns of data-intensive High-Performance Computing (HPC) applications. The file system provides relaxed POSIX semantics, offering only the features that are actually required by most (not all) applications. It is able to provide scalable I/O performance and reaches millions of metadata operations even with a small number of nodes, significantly outperforming the capabilities of general-purpose parallel file systems. The work has been funded by the German Research Foundation (DFG) through the ADA-FS project as part of the Priority Programme 1648. It is also supported by the Spanish Ministry of Science and Innovation (TIN2015–65316), the Generalitat de Catalunya (2014–SGR–1051), as well as the European Union’s Horizon 2020 Research and Innovation Programme (NEXTGenIO, 671951) and the European Commission’s BigStorage project (H2020-MSCA-ITN-2014-642963). This research was conducted using the supercomputer MOGON II and services offered by the Johannes Gutenberg University Mainz. Peer reviewed. Postprint (author's final draft).

UPCommons. Portal del coneixement obert de la UPC

GekkoFS: A temporary burst buffer file system for HPC applications

Author: Brinkmann Andre
Cortés Toni
Miranda Bueno Alberto
Moti Nafiseh
Nou Ramon
Süß Tim
Tacke Markus
Tocci Tommaso
Vef Marc-André
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Many scientific fields increasingly use high-performance computing (HPC) to process and analyze massive amounts of experimental data, while storage systems in today’s HPC environments have to cope with new access patterns. These patterns include many metadata operations, small I/O requests, or randomized file I/O, whereas general-purpose parallel file systems have been optimized for sequential shared access to large files. Burst buffer file systems create a separate file system that applications can use to store temporary data. They aggregate node-local storage available within the compute nodes or use dedicated SSD clusters and offer a peak bandwidth higher than that of the backend parallel file system without interfering with it. However, burst buffer file systems typically offer many features that a scientific application, running in isolation for a limited amount of time, does not require. We present GekkoFS, a temporary, highly scalable file system that has been specifically optimized for the aforementioned use cases. GekkoFS provides relaxed POSIX semantics, offering only the features that are actually required by most (not all) applications. GekkoFS is, therefore, able to provide scalable I/O performance and reaches millions of metadata operations even with a small number of nodes, significantly outperforming the capabilities of common parallel file systems. Peer reviewed. Postprint (author's final draft).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Arbitration policies for on-demand user-level I/O forwarding on HPC platforms

Author: Bez Jean Luca
Cortés Toni
Miranda Alberto
Navaux Philippe O.A.
Nou Castell Ramon
Zanon Boito Francieli
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021

I/O forwarding is a well-established and widely adopted technique in HPC to reduce contention in the access to storage servers and transparently improve I/O performance. Rather than having applications directly access the shared parallel file system, the forwarding technique defines a set of I/O nodes responsible for receiving application requests and forwarding them to the file system, thus reshaping the flow of requests. The typical approach is to statically assign I/O nodes to applications depending on the number of compute nodes they use, which is not necessarily related to their I/O requirements. Thus, this approach leads to inefficient usage of these resources. This paper investigates arbitration policies based on the applications' I/O demands, represented by their access patterns. We propose a policy based on the Multiple-Choice Knapsack problem that seeks to maximize global bandwidth by giving more I/O nodes to applications that will benefit the most. Furthermore, we propose a user-level I/O forwarding solution as an on-demand service capable of applying different allocation policies at runtime for machines where this layer is not present. We demonstrate our approach's applicability through extensive experimentation and show it can transparently improve global I/O bandwidth by up to 85% in a live setup compared to the default static policy. This study was financed by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. It has also received support from the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil. It is also partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under grants PID2019-107255GB; and the Generalitat de Catalunya under contract 2014-SGR-1051. The authors thankfully acknowledge the computer resources, technical expertise and assistance provided by the Barcelona Supercomputing Center.
Experiments presented in this paper were carried out using the Grid’5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations (see https://www.grid5000.fr). Peer reviewed. Postprint (author's final draft).
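
The abstract above describes a Multiple-Choice Knapsack policy only at a high level. As a rough illustration of the general problem shape (not the authors' implementation), the sketch below picks exactly one I/O-node count per application so that the summed bandwidth is maximal within a node budget. The function name `allocate_io_nodes`, the profile format, and the bandwidth figures are all hypothetical.

```python
from typing import Dict, List, Tuple


def allocate_io_nodes(
    profiles: List[Dict[int, float]], total_nodes: int
) -> Tuple[float, List[int]]:
    """Choose one I/O-node count per application, maximizing total bandwidth.

    profiles[i] maps a candidate number of I/O nodes for application i to the
    bandwidth it is assumed to reach (hypothetical values, arbitrary units).
    """
    neg = float("-inf")
    # best[c] = best total bandwidth when exactly c nodes are handed out so far
    best = [neg] * (total_nodes + 1)
    best[0] = 0.0
    picks: List[List[int]] = []  # picks[i][c] = nodes given to app i to reach c

    for profile in profiles:
        nxt = [neg] * (total_nodes + 1)
        pick = [-1] * (total_nodes + 1)
        for used, value in enumerate(best):
            if value == neg:
                continue  # this node count was not reachable
            for nodes, bandwidth in profile.items():
                c = used + nodes
                if c <= total_nodes and value + bandwidth > nxt[c]:
                    nxt[c] = value + bandwidth
                    pick[c] = nodes
        best = nxt
        picks.append(pick)

    end = max(range(total_nodes + 1), key=lambda c: best[c])
    if best[end] == neg:
        raise ValueError("no feasible allocation within the node budget")

    # Walk back through the per-application choices to recover the allocation.
    allocation: List[int] = []
    c = end
    for pick in reversed(picks):
        allocation.append(pick[c])
        c -= pick[c]
    allocation.reverse()
    return best[end], allocation


if __name__ == "__main__":
    # Two applications, two I/O nodes to share; the second benefits far more.
    demo = [{0: 10.0, 1: 50.0, 2: 60.0}, {0: 5.0, 1: 20.0, 2: 90.0}]
    print(allocate_io_nodes(demo, 2))
```

In this toy input, a static split of one node each would yield 70 units, while giving both nodes to the second application yields 100, which is the kind of gap a demand-aware policy aims to exploit.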

UPCommons. Portal del coneixement obert de la UPC

Arbitration Policies for On-Demand User-Level I/O Forwarding on HPC Platforms

Author: Bez Jean Luca
Cortes Toni
Miranda Alberto
Navaux Philippe
Nou Ramon
Zanon Boito Francieli
Publication venue: HAL CCSD
Publication date: 17/05/2021

International audience. I/O forwarding is a well-established and widely adopted technique in HPC to reduce contention in the access to storage servers and transparently improve I/O performance. Rather than having applications directly access the shared parallel file system, the forwarding technique defines a set of I/O nodes responsible for receiving application requests and forwarding them to the file system, thus reshaping the flow of requests. The typical approach is to statically assign I/O nodes to applications depending on the number of compute nodes they use, which is not necessarily related to their I/O requirements. Thus, this approach leads to inefficient usage of these resources. This paper investigates arbitration policies based on the applications' I/O demands, represented by their access patterns. We propose a policy based on the Multiple-Choice Knapsack problem that seeks to maximize global bandwidth by giving more I/O nodes to applications that will benefit the most. Furthermore, we propose a user-level I/O forwarding solution as an on-demand service capable of applying different allocation policies at runtime for machines where this layer is not present. We demonstrate our approach's applicability through extensive experimentation and show it can transparently improve global I/O bandwidth by up to 85% in a live setup compared to the default static policy.

INRIA a CCSD electronic archive server

Software for Exascale Computing - SPPEXA 2016-2019

Publication venue: 'Springer Science and Business Media LLC'

This open access book summarizes the research done and results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG) presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer’s series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA’s first funding phase, and provides an overview of SPPEXA’s contributions towards exascale computing in today's supercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest.

OAPEN Library

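Performance evaluation of the Expand Ad-Hoc parallel file system on MareNostrum 4 (original title: Evaluación de rendimiento del sistema de ficheros paralelo Expand Ad-Hoc en MareNostrum 4)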

Author: Alejandro Calderon-Mateos
Diego Camarmas-Alonso
Felix Garcia-Carballeira
Jesus Carretero
Publication date: 26/09/2023

In recent years, applications used in science have been evolving toward massive data analysis through workflows, driven by the growth of areas such as Artificial Intelligence and big data. However, the main bottleneck when running this type of application lies in I/O operations. To address this problem, the Expand Ad-Hoc parallel file system is being developed. It allows ad-hoc virtual partitions to be created to increase I/O performance in supercomputing environments. The aim of this work is to present an evaluation of this file system on the MareNostrum 4 supercomputer. The evaluation of Expand Ad-Hoc carried out on MareNostrum 4 showed that its performance and scalability are overall superior to those of the GPFS parallel file system.

ZENODO

Exploring Scheduling for On-demand File Systems and Data Management within HPC Environments

Author: Soysal Mehmet
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 15/03/2021

KITopen

A Framework for Large Scale Particle Filters Validated with Data Assimilation for Weather Simulation

Author: Bautista Gomez Leonardo
Friedemann Sebastian
Keller Kai
Lu Yen-Sen
Raffin Bruno
Publication venue: HAL CCSD
Publication date: 06/01/2023

Particle filters are a group of algorithms to solve inverse problems through statistical Bayesian methods when the model does not comply with the linear and Gaussian hypothesis. Particle filters are used in domains like data assimilation, probabilistic programming, neural network optimization, localization and navigation. Particle filters estimate the probability distribution of model states by running a large number of model instances, the so-called particles. The ability to handle a very large number of particles is critical for high dimensional models. This paper proposes a novel paradigm to run very large ensembles of parallel model instances on supercomputers. The approach combines an elastic and fault tolerant runner/server model minimizing data movements while enabling dynamic load balancing. Particle weights are computed locally on each runner and transmitted when available to a server that normalizes them, resamples new particles based on their weight, and dynamically redistributes the work to runners to react to load imbalance. Our approach relies on an asynchronously managed distributed particle cache permitting particles to move from one runner to another in the background while particle propagation goes on. This also enables the number of runners to vary during the execution either in reaction to failures and restarts, or to adapt to changing resource availability dictated by external decision processes. The approach is evaluated with the Weather Research and Forecasting (WRF) model, to assess its performance for probabilistic weather forecasting. Up to 2555 particles on 20442 compute cores are used to assimilate cloud cover observations into short-range weather forecasts over Europe.
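
The server-side step mentioned in the abstract (normalize the received weights, then resample particles by weight) can be sketched generically as follows. This is a minimal illustration that assumes plain multinomial resampling; the abstract does not say which resampling scheme the authors use, and the function names `normalize` and `resample` are hypothetical.

```python
import random
from typing import List, Sequence


def normalize(weights: Sequence[float]) -> List[float]:
    """Scale non-negative particle weights so that they sum to one."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in weights]


def resample(weights: Sequence[float], rng: random.Random) -> List[int]:
    """Draw len(weights) parent indices with probability proportional to weight.

    Multinomial resampling is an assumption made for this sketch.
    """
    n = len(weights)
    return rng.choices(range(n), weights=weights, k=n)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same parents
    probs = normalize([1.0, 1.0, 2.0])
    parents = resample(probs, rng)
    # Each entry names the particle whose state the next propagation starts from.
    print(probs, parents)
```

Heavily weighted particles tend to appear several times in the parent list while lightly weighted ones drop out, which is why the work per runner changes after each resampling step and has to be rebalanced.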

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server