
    Weather Projections and Dynamical Downscaling for the Republic of Panama: Evaluation of Implementation Methods via GPGPU Acceleration

    Climate change could have a critical impact on the Republic of Panama, where a major segment of the economy depends on the operation of the Panama Canal. New capabilities for targeted research on climate change impacts in Panama are therefore being established, including a new GPU-cluster infrastructure called Iberogun, built around two DGX-1 servers (16 NVIDIA Tesla P100 GPUs in total). This infrastructure will be used to evaluate candidate climate models and models of extreme weather events. In this review we therefore present an evaluation of GPGPU (general-purpose graphics processing unit, hereafter GPU) implementation methods for the study of weather projections and dynamical downscaling in the Republic of Panama. Several methods are discussed, including domain-specific languages (DSLs), directive-based porting methods, granularity optimization methods, and memory layout transformation methods. One approach that has yielded promising previous results is discussed further: a directive-based code transformation method called ‘Hybrid Fortran’ that enables high-performance GPU ports of structured-grid Fortran codes. Finally, we suggest a method akin to previous climate change investigations carried out for the Republic of Panama, but accelerated with GPU capabilities.

    We acknowledge a scientific fund from the Sistema Nacional de Investigación de Panamá (SNI) and Projects FID-2016-275 and EIE-2018-16 of the Convocatorias públicas of the Secretaría Nacional de Ciencia y Tecnología e Innovación (SENACYT). We acknowledge funds and support from JSPS Grant-in-Aid for Specially Promoted Research 16H06291, and Theme C of the TOUGOU program granted by the Japanese Ministry of Education, Culture, Sports, Science and Technology. The authors thank the Universidad Tecnológica de Panamá for their extensive support and for the use of their CIHH-group HPC cluster Iberogun. We also acknowledge NVIDIA Corporation for the donation of the Titan Xp GPU used in this research.
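    One of the surveyed method families, memory layout transformation, can be illustrated with a minimal sketch (not code from the review; all names are hypothetical). Converting an array-of-structs (AoS) grid-cell layout into a struct-of-arrays (SoA) layout makes each field contiguous, which on a GPU lets consecutive threads read consecutive addresses (coalesced access):

```python
# Illustrative AoS -> SoA layout transformation (conceptual sketch only).
def aos_to_soa(cells):
    """Convert [{'t': ..., 'p': ..., 'q': ...}, ...] (one dict per grid
    cell) into {'t': [...], 'p': [...], 'q': [...]} (one array per field)."""
    if not cells:
        return {}
    return {field: [cell[field] for cell in cells] for field in cells[0]}

# Hypothetical grid cells holding temperature, pressure, and humidity.
aos = [{'t': 300.0, 'p': 1000.0, 'q': 0.010},
       {'t': 298.5, 'p': 1002.0, 'q': 0.012}]
soa = aos_to_soa(aos)
print(soa['t'])  # all temperatures now contiguous: [300.0, 298.5]
```

    In a real Fortran port, the analogous step is reordering array dimensions so that the parallelized index becomes the fastest-varying one.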

    The ICON-A model for direct QBO simulations on GPUs (version icon-cscs:baf28a514)

    Classical numerical models for the global atmosphere, as used for numerical weather forecasting or climate research, have been developed for conventional central processing unit (CPU) architectures. This hinders the deployment of such models on current top-performing supercomputers, which achieve their computing power with hybrid architectures, mostly using graphics processing units (GPUs); scientific applications of such models are thus restricted to the lesser computing power of CPUs. Here we present the development of a GPU-enabled version of the ICON atmosphere model (ICON-A), motivated by a research project on the quasi-biennial oscillation (QBO), a global-scale wind oscillation in the equatorial stratosphere that depends on a broad spectrum of atmospheric waves originating from tropical deep convection. Resolving the relevant scales, from a few kilometers to the size of the globe, is a formidable computational problem that has only now become feasible on top-performing supercomputers. This motivated porting ICON-A, in the specific configuration needed for the research project, first to the GPU architecture of the Piz Daint computer at the Swiss National Supercomputing Centre and then to the JUWELS Booster computer at the Forschungszentrum Jülich. On Piz Daint, the ported code achieves a single-node GPU vs. CPU speedup factor of 6.4 and allows global experiments at a horizontal resolution of 5 km on 1024 computing nodes with 1 GPU per node, with a turnover of 48 simulated days per day. On JUWELS Booster, the more modern hardware in combination with an upgraded code base allows simulations at the same resolution on 128 computing nodes with 4 GPUs per node and a turnover of 133 simulated days per day. Additionally, the code remains functional on CPUs, as demonstrated by additional experiments on the Levante compute system at the German Climate Computing Center. While the application shows good weak scaling over the tested 16-fold increase in grid size and node count, making higher-resolution global simulations possible as well, the strong scaling on GPUs is relatively poor, which limits the options for increasing turnover with more nodes. Initial experiments demonstrate that the ICON-A model can simulate downward-propagating QBO jets, which are driven by wave–mean flow interaction.
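    As a quick sanity check on the turnover figures quoted above, the per-GPU throughput on each system follows directly from the reported node counts (this arithmetic is ours, not the paper's):

```python
# Simulated days per day (SDPD) per GPU, from the figures in the abstract.
def sdpd_per_gpu(sdpd, nodes, gpus_per_node):
    return sdpd / (nodes * gpus_per_node)

piz_daint = sdpd_per_gpu(48.0, 1024, 1)  # 1024 nodes, 1 GPU per node
juwels = sdpd_per_gpu(133.0, 128, 4)     # 128 nodes, 4 GPUs per node
print(round(piz_daint, 4), round(juwels, 4))  # 0.0469 0.2598
```

    By this measure, the JUWELS Booster configuration delivers roughly 5.5 times the per-GPU throughput of the Piz Daint run, consistent with the abstract's point about more modern hardware and an upgraded code base.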

    GPU parallelization of a hybrid pseudospectral geophysical turbulence framework using CUDA

    An existing hybrid MPI-OpenMP scheme is augmented with a CUDA-based fine-grain parallelization approach for multidimensional distributed Fourier transforms in a well-characterized pseudospectral fluid turbulence code. Basics of the hybrid scheme are reviewed, and heuristics are provided to show the potential benefit of the CUDA implementation. The method draws heavily on the CUDA runtime library to handle memory management and on the cuFFT library for computing local FFTs. The construction of the interfaces to these libraries, and the use of ISO bindings to facilitate platform portability, are discussed. CUDA streams are implemented to overlap data transfer with cuFFT computation. Testing with a baseline solver demonstrated significant aggregate speed-up over the hybrid MPI-OpenMP solver by offloading to GPUs on an NVLink-based test system. While the batch-streamed approach provided little benefit with NVLink, we saw a performance gain of 30% when tuned for the optimal number of streams on a PCIe-based system. Strong GPU scaling was found to be nearly ideal in all cases. Profiling of the CUDA kernels shows that the transform computation achieves 15% of the attainable peak FLOP rate based on a roofline model for the system. In addition to speed-up measurements for the fiducial solver, we also considered several other solvers with different numbers of transform operations and found that aggregate speed-ups are nearly constant for all solvers.

    Author affiliations: Duane Rosenberg (Colorado State University, Fort Collins, USA); Pablo Daniel Mininni (Instituto de Física de Buenos Aires, CONICET / Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Argentina); Raghu Reddy (Environmental Modeling Center, USA); Annick Pouquet (University of Colorado Boulder and National Center for Atmospheric Research, USA).
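    The roofline model mentioned above bounds a kernel's attainable FLOP rate by either peak compute or memory bandwidth times arithmetic intensity. A minimal sketch, with hypothetical machine numbers rather than the paper's measured ones:

```python
# Roofline bound: attainable rate = min(peak compute,
#                                       bandwidth * arithmetic intensity).
def roofline(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)

peak = 7000.0  # hypothetical peak compute, GFLOP/s
bw = 900.0     # hypothetical memory bandwidth, GB/s

# FFTs have low arithmetic intensity, so they typically sit on the
# bandwidth-limited slope of the roofline rather than the compute roof.
print(roofline(peak, bw, 2.0))   # bandwidth-bound: 1800.0
print(roofline(peak, bw, 16.0))  # compute-bound: 7000.0
```

    This is why a transform kernel can be well optimized yet still reach only a modest fraction of the machine's nominal peak FLOP rate.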

    Evaluation of GPU Acceleration for WRF–SFIRE

    WRF–SFIRE is an open-source coupled atmosphere–wildfire model that combines the WRF model with a level-set fire-spread model to simulate wildfires in real time. The model has many applications, and more scientific questions can be asked and answered if it can be run faster. NVIDIA has put substantial effort into lowering the barrier to entry for accelerating applications on its GPUs, and various physical simulations have been successfully ported to GPUs and benefited from the speed increase. In this research, we examine WRF-SFIRE and use the NVIDIA tools to accelerate portions of the code. We were successful in offloading work to the GPU. However, the WRF-SFIRE codebase contains too many data dependencies, deeply nested function calls, and I/O operations to effectively utilize the GPU's resources. We examine specific examples and run them on a Titan V GPU. In the end, the compute-intensive portions of WRF-SFIRE need to be rewritten to avoid data dependencies in order to leverage GPUs and improve execution time.
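    The data-dependency problem described above can be illustrated with a minimal sketch (conceptual only, not WRF-SFIRE code): an elementwise loop whose iterations are independent can be distributed across GPU threads, while a loop whose iteration i needs the result of iteration i-1 cannot be naively parallelized.

```python
# Independent iterations: out[i] depends only on xs[i], so each
# iteration could run on its own GPU thread.
def independent(xs):
    return [2.0 * x for x in xs]

# Loop-carried dependency: out[i] depends on out[i-1] (a running sum),
# forcing sequential execution unless the loop is restructured
# (e.g. as a parallel prefix scan).
def loop_carried(xs):
    out, acc = [], 0.0
    for x in xs:
        acc += x
        out.append(acc)
    return out

print(independent([1.0, 2.0, 3.0]))   # [2.0, 4.0, 6.0]
print(loop_carried([1.0, 2.0, 3.0]))  # [1.0, 3.0, 6.0]
```

    Rewriting compute-intensive loops into the first form, or into known parallel patterns such as scans and reductions, is the kind of restructuring the abstract concludes WRF-SFIRE would need.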