Search CORE

54 research outputs found

Development of an oceanographic application in HPC

Author: Basciano Davide
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2016
Field of study

High Performance Computing (HPC) is used for running advanced application programs efficiently, reliably, and quickly. In earlier decades, performance analysis of HPC applications was evaluated based on speed, scalability of threads, memory hierarchy. Now, it is essential to consider the energy or the power consumed by the system while executing an application. In fact, the High Power Consumption (HPC) is one of biggest problems for the High Performance Computing (HPC) community and one of the major obstacles for exascale systems design. The new generations of HPC systems intend to achieve exaflop performances and will demand even more energy to processing and cooling. Nowadays, the growth of HPC systems is limited by energy issues Recently, many research centers have focused the attention on doing an automatic tuning of HPC applications which require a wide study of HPC applications in terms of power efficiency. In this context, this paper aims to propose the study of an oceanographic application, named OceanVar, that implements Domain Decomposition based 4D Variational model (DD-4DVar), one of the most commonly used HPC applications, going to evaluate not only the classic aspects of performance but also aspects related to power efficiency in different case of studies. These work were realized at Bsc (Barcelona Supercomputing Center), Spain within the Mont-Blanc project, performing the test first on HCA server with Intel technology and then on a mini-cluster Thunder with ARM technology. In this work of thesis it was initially explained the concept of assimilation date, the context in which it is developed, and a brief description of the mathematical model 4DVAR. After this problem’s close examination, it was performed a porting from Matlab description of the problem of data-assimilation to its sequential version in C language. Secondly, after identifying the most onerous computational kernels in order of time, it has been developed a parallel version of the application with a parallel multiprocessor programming style, using the MPI (Message Passing Interface) protocol. The experiments results, in terms of performance, have shown that, in the case of running on HCA server, an Intel architecture, values of efficiency of the two most onerous functions obtained, growing the number of process, are approximately equal to 80%. In the case of running on ARM architecture, specifically on Thunder mini-cluster, instead, the trend obtained is labeled as "SuperLinear Speedup" and, in our case, it can be explained by a more efficient use of resources (cache memory access) compared with the sequential case. In the second part of this paper was presented an analysis of the some issues of this application that has impact in the energy efficiency. After a brief discussion about the energy consumption characteristics of the Thunder chip in technological landscape, through the use of a power consumption detector, the Yokogawa Power Meter, values of energy consumption of mini-cluster Thunder were evaluated in order to determine an overview on the power-to-solution of this application to use as the basic standard for successive analysis with other parallel styles. Finally, a comprehensive performance evaluation, targeted to estimate the goodness of MPI parallelization, is conducted using a suitable performance tool named Paraver, developed by BSC. Paraver is such a performance analysis and visualisation tool which can be used to analyse MPI, threaded or mixed mode programmes and represents the key to perform a parallel profiling and to optimise the code for High Performance Computing. A set of graphical representation of these statistics make it easy for a developer to identify performance problems. Some of the problems that can be easily identified are load imbalanced decompositions, excessive communication overheads and poor average floating operations per second achieved. Paraver can also report statistics based on hardware counters, which are provided by the underlying hardware. This project aimed to use Paraver configuration files to allow certain metrics to be analysed for this application. To explain in some way the performance trend obtained in the case of analysis on the mini-cluster Thunder, the tracks were extracted from various case of studies and the results achieved is what expected, that is a drastic drop of cache misses by the case ppn (process per node) = 1 to case ppn = 16. This in some way explains a more efficient use of cluster resources with an increase of the number of processes

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Resilience for large ensemble computations

Author: Keller Kai Rasmus
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/07/2022
Field of study

With the increasing power of supercomputers, ever more detailed models of physical systems can be simulated, and ever larger problem sizes can be considered for any kind of numerical system. During the last twenty years the performance of the fastest clusters went from the teraFLOPS domain (ASCI RED: 2.3 teraFLOPS) to the pre-exaFLOPS domain (Fugaku: 442 petaFLOPS), and we will soon have the first supercomputer with a peak performance cracking the exaFLOPS (El Capitan: 1.5 exaFLOPS). Ensemble techniques experience a renaissance with the availability of those extreme scales. Especially recent techniques, such as particle filters, will benefit from it. Current ensemble methods in climate science, such as ensemble Kalman filters, exhibit a linear dependency between the problem size and the ensemble size, while particle filters show an exponential dependency. Nevertheless, with the prospect of massive computing power come challenges such as power consumption and fault-tolerance. The mean-time-between-failures shrinks with the number of components in the system, and it is expected to have failures every few hours at exascale. In this thesis, we explore and develop techniques to protect large ensemble computations from failures. We present novel approaches in differential checkpointing, elastic recovery, fully asynchronous checkpointing, and checkpoint compression. Furthermore, we design and implement a fault-tolerant particle filter with pre-emptive particle prefetching and caching. And finally, we design and implement a framework for the automatic validation and application of lossy compression in ensemble data assimilation. Altogether, we present five contributions in this thesis, where the first two improve state-of-the-art checkpointing techniques, and the last three address the resilience of ensemble computations. The contributions represent stand-alone fault-tolerance techniques, however, they can also be used to improve the properties of each other. For instance, we utilize elastic recovery (2nd contribution) for mitigating resiliency in an online ensemble data assimilation framework (3rd contribution), and we built our validation framework (5th contribution) on top of our particle filter implementation (4th contribution). We further demonstrate that our contributions improve resilience and performance with experiments on various architectures such as Intel, IBM, and ARM processors.Amb l’increment de les capacitats de còmput dels supercomputadors, es poden simular models de sistemes físics encara més detallats, i es poden resoldre problemes de més grandària en qualsevol tipus de sistema numèric. Durant els últims vint anys, el rendiment dels clústers més ràpids ha passat del domini dels teraFLOPS (ASCI RED: 2.3 teraFLOPS) al domini dels pre-exaFLOPS (Fugaku: 442 petaFLOPS), i aviat tindrem el primer supercomputador amb un rendiment màxim que sobrepassa els exaFLOPS (El Capitan: 1.5 exaFLOPS). Les tècniques d’ensemble experimenten un renaixement amb la disponibilitat d’aquestes escales tan extremes. Especialment les tècniques més noves, com els filtres de partícules, se¿n beneficiaran. Els mètodes d’ensemble actuals en climatologia, com els filtres d’ensemble de Kalman, exhibeixen una dependència lineal entre la mida del problema i la mida de l’ensemble, mentre que els filtres de partícules mostren una dependència exponencial. No obstant, juntament amb les oportunitats de poder computar massivament, apareixen desafiaments com l’alt consum energètic i la necessitat de tolerància a errors. El temps de mitjana entre errors es redueix amb el nombre de components del sistema, i s’espera que els errors s’esdevinguin cada poques hores a exaescala. En aquesta tesis, explorem i desenvolupem tècniques per protegir grans càlculs d’ensemble d’errors. Presentem noves tècniques en punts de control diferencials, recuperació elàstica, punts de control totalment asincrònics i compressió de punts de control. A més, dissenyem i implementem un filtre de partícules tolerant a errors amb captació i emmagatzematge en caché de partícules de manera preventiva. I finalment, dissenyem i implementem un marc per la validació automàtica i l’aplicació de compressió amb pèrdua en l’assimilació de dades d’ensemble. En total, en aquesta tesis presentem cinc contribucions, les dues primeres de les quals milloren les tècniques de punts de control més avançades, mentre que les tres restants aborden la resiliència dels càlculs d’ensemble. Les contribucions representen tècniques independents de tolerància a errors; no obstant, també es poden utilitzar per a millorar les propietats de cadascuna. Per exemple, utilitzem la recuperació elàstica (segona contribució) per a mitigar la resiliència en un marc d’assimilació de dades d’ensemble en línia (tercera contribució), i construïm el nostre marc de validació (cinquena contribució) sobre la nostra implementació del filtre de partícules (quarta contribució). A més, demostrem que les nostres contribucions milloren la resiliència i el rendiment amb experiments en diverses arquitectures, com processadors Intel, IBM i ARM.Postprint (published version

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Multilevel Algebraic Approach for Performance Analysis of Parallel Algorithms

Author: D'Amore Luisa
Laccetti Giuliano
Mele Valeria
Romano Diego
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 30/12/2019
Field of study

In order to solve a problem in parallel we need to undertake the fundamental step of splitting the computational tasks into parts, i.e. decomposing the problem solving. A whatever decomposition does not necessarily lead to a parallel algorithm with the highest performance. This topic is even more important when complex parallel algorithms must be developed for hybrid or heterogeneous architectures. We present an innovative approach which starts from a problem decomposition into parts (sub-problems). These parts will be regarded as elements of an algebraic structure and will be related to each other according to a suitably defined dependency relationship. The main outcome of such framework is to define a set of block matrices (dependency, decomposition, memory accesses and execution) which simply highlight fundamental characteristics of the corresponding algorithm, such as inherent parallelism and sources of overheads. We provide a mathematical formulation of this approach, and we perform a feasibility analysis for the performance of a parallel algorithm in terms of its time complexity and scalability. We compare our results with standard expressions of speed up, efficiency, overhead, and so on. Finally, we show how the multilevel structure of this framework eases the choice of the abstraction level (both for the problem decomposition and for the algorithm description) in order to determine the granularity of the tasks within the performance analysis. This feature is helpful to better understand the mapping of parallel algorithms on novel hybrid and heterogeneous architectures

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

CIRA annual report FY 2011/2012

Author
Publication venue: Colorado State University. Cooperative Institute for Research in the Atmosphere
Publication date: 01/01/2012
Field of study

Mountain Scholar (Digital Collections of Colorado and Wyoming)

CIRA annual report FY 2016/2017

Author
Publication venue: Colorado State University. Cooperative Institute for Research in the Atmosphere
Publication date: 01/01/2017
Field of study

Reporting period April 1, 2016-March 31, 2017

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Recommended from our members

Data assimilation methods and applications

Author: Alves O.
Buehner M.
Dee D.
Gelaro R.
Kalnay E.
Lorenc A.
Miyoshi T.
Saunders R.
Schiller A.
Van Leeuwen Peter Jan
Publication venue: World Meteorological Organization
Publication date: 01/01/2015
Field of study

Central Archive at the University of Reading

Recommended from our members

The WWRP Polar Prediction Project (PPP)

Author: Bauer Peter
Bromwich David H.
Day Jonathan J.
Doblas-Reyes Francisco
Fairall Cristopher W.
Goessling Helge F.
Gordon Neil
Holland Marika
Iverson Trond
Jung Thomas
Klebe Stefanie
Lemke Peter
Mills Brian
Nurmi Pertti
Perovich Donald K.
Reid Philip
Renfrew Ian
Smith Gregory
Svensson Gunilla
Tolstykh Mikhail
Publication venue: World Meteorological Organisation
Publication date: 01/06/2015
Field of study

Mission statement: “Promote cooperative international research enabling development of improved weather and environmental prediction services for the polar regions, on time scales from hours to seasonal”. Increased economic, transportation and research activities in polar regions are leading to more demands for sustained and improved availability of predictive weather and climate information to support decision-making. However, partly as a result of a strong emphasis of previous international efforts on lower and middle latitudes, many gaps in weather, sub-seasonal and seasonal forecasting in polar regions hamper reliable decision making in the Arctic, Antarctic and possibly the middle latitudes as well. In order to advance polar prediction capabilities, the WWRP Polar Prediction Project (PPP) has been established as one of three THORPEX (THe Observing System Research and Predictability EXperiment) legacy activities. The aim of PPP, a ten year endeavour (2013-2022), is to promote cooperative international research enabling development of improved weather and environmental prediction services for the polar regions, on hourly to seasonal time scales. In order to achieve its goals, PPP will enhance international and interdisciplinary collaboration through the development of strong linkages with related initiatives; strengthen linkages between academia, research institutions and operational forecasting centres; promote interactions and communication between research and stakeholders; and foster education and outreach. Flagship research activities of PPP include sea ice prediction, polar-lower latitude linkages and the Year of Polar Prediction (YOPP) - an intensive observational, coupled modelling, service-oriented research and educational effort in the period mid-2017 to mid-2019

Central Archive at the University of Reading

Electronic Publication Information Center

Storms and the formation of high salinity shelf water (HSSW) in Antarctic polynyas

Author: Boeira Dias Fabio
Nie Yafei
Uotila Petteri
Publication venue: Ohio Academy of Science
Publication date: 26/07/2021
Field of study

Helsingin yliopiston digitaalinen arkisto

Modelling the Eddy Pattern in the harbour of Zeebrugge

Author: Breugem Alexander
Da Silva Julien
Decrop Boudewijn
Martens Chantal
van Holland Gijsbert
Publication venue: Bundesanstalt fur Wasserbau
Publication date: 01/01/2013
Field of study

Hydrodynamic

Ghent University Academic Bibliography

Hydraulic Engineering Repository

Open Marine Archive

Teaching and Learning of Fluid Mechanics, Volume II

Author
Publication venue: 'MDPI AG'
Publication date: 11/01/2022
Field of study

This book is devoted to the teaching and learning of fluid mechanics. Fluid mechanics occupies a privileged position in the sciences; it is taught in various science departments including physics, mathematics, mechanical, chemical and civil engineering and environmental sciences, each highlighting a different aspect or interpretation of the foundation and applications of fluids. While scholarship in fluid mechanics is vast, expanding into the areas of experimental, theoretical and computational fluid mechanics, there is little discussion among scientists about the different possible ways of teaching this subject. We think there is much to be learned, for teachers and students alike, from an interdisciplinary dialogue about fluids. This volume therefore highlights articles which have bearing on the pedagogical aspects of fluid mechanics at the undergraduate and graduate level

Directory of Open Access Books (DOAB)