AIMES: advanced computation and I/O methods for earth-system simulations
Dealing with extreme scale Earth-system models is challenging from the computer science perspective, as the required computing power and storage capacity are steadily increasing.
Scientists perform runs with growing resolution or aggregate results from many similar smaller-scale runs with slightly different initial conditions (the so-called ensemble runs).
In the fifth Coupled Model Intercomparison Project (CMIP5), the produced datasets require more than three petabytes of storage, and the compute and storage requirements are increasing significantly for CMIP6.
Climate scientists across the globe are developing next-generation models based on improved numerical formulation leading to grids that are discretized in alternative forms such as an icosahedral (geodesic) grid.
The developers of these models face similar problems in scaling, maintaining and optimizing code.
Performance portability and the maintainability of code are key concerns of scientists as, compared to industry projects, model code is continuously revised and extended to incorporate further levels of detail.
This leads to a rapidly growing code base that is rarely refactored.
However, code modernization is important to maintain the productivity of the scientists working
with the code and to exploit the performance offered by modern and future architectures.
The need for performance optimization is motivated by the evolution of the parallel architecture landscape from
homogeneous flat machines to heterogeneous combinations of processors with deep memory hierarchies.
Notably, the rise of many-core, throughput-oriented accelerators, such as GPUs, requires non-trivial code changes at minimum and, even worse, may necessitate a substantial rewrite of the existing codebase.
At the same time, the code complexity increases the difficulty for computer scientists and vendors to understand and optimize the code for a given system.
Storing the products of climate predictions requires a large and expensive storage and archival system.
Often, scientists restrict the number of output variables and the write interval to keep the costs
manageable.
Compression algorithms can reduce these costs significantly and can also increase the scientific yield of simulation runs.
In the AIMES project, we addressed the key issues of programmability, computational efficiency and I/O limitations that are common in next-generation icosahedral earth-system models.
The project focused on the separation of concerns between domain scientists, computational scientists, and computer scientists.
GGDML: icosahedral models language extensions
The optimization opportunities of a code base are not completely exploited by compilers. In fact, some optimizations must be applied within the source code itself, so performance is lost if the code developers omit such details. Thus, using a general-purpose language to develop performance-demanding software, e.g. climate models, demands more care from the developers: they have to take the hardware details of the target machine into account.
Besides, code written for high performance on one machine will usually perform worse on another. Developers therefore often write multiple optimized sections, or even separate code versions, for the different target machines. Such codes are complex and hard to maintain.
In this article we introduce a higher-level code development approach, where we develop a set of extensions to the language that is used to write a model’s code. Our extensions form a domain-specific language (DSL) that abstracts domain concepts and leaves the lower level details to a configurable source-to-source translation process.
The purpose of the developed extensions is to support the development of icosahedral climate/atmospheric models. We started with three icosahedral models: DYNAMICO, ICON, and NICAM. Collaboration with scientists from the weather/climate sciences enabled agreed-upon extensions. Whenever we suggested an extension, we made sure that it represents a higher-level, domain-based concept and carries no lower-level details.
The introduced DSL (GGDML, the General Grid Definition and Manipulation Language) hides optimization details such as memory layout. It reduces the code size of a model to less than one third of its original size in terms of lines of code. The development costs of a model written with GGDML are therefore reduced significantly.
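
As a rough illustration only (the abstract itself contains no code), the following small C sketch shows the kind of memory-layout decision, array-of-structures versus structure-of-arrays, that such a configurable source-to-source translation process can hide from the model developer. The field names and grid size are invented for this sketch and are not GGDML syntax.

/* Hypothetical illustration of a memory-layout detail that a DSL and its
 * source-to-source translator can hide: the same cell-wise update written
 * against an array-of-structures layout and a structure-of-arrays layout.
 * Field names and sizes are made up for this sketch. */
#include <stdio.h>

#define NCELLS 16

/* Layout 1: array of structures (fields interleaved per cell). */
struct cell_aos { double temperature; double pressure; };

/* Layout 2: structure of arrays (each field stored contiguously). */
struct grid_soa { double temperature[NCELLS]; double pressure[NCELLS]; };

int main(void) {
    struct cell_aos aos[NCELLS] = {0};
    struct grid_soa soa = {{0}, {0}};

    /* The same domain-level operation, "update the temperature in every
     * cell", produces different memory access patterns in the two layouts;
     * a translator can generate whichever suits the target machine. */
    for (int c = 0; c < NCELLS; c++)
        aos[c].temperature += 0.5 * aos[c].pressure;

    for (int c = 0; c < NCELLS; c++)
        soa.temperature[c] += 0.5 * soa.pressure[c];

    printf("aos[0].temperature = %g, soa.temperature[0] = %g\n",
           aos[0].temperature, soa.temperature[0]);
    return 0;
}
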
Weather Projections and Dynamical Downscaling for the Republic of Panama: Evaluation of Implementation Methods via GPGPU Acceleration
Climate change could have a critical impact on the Republic of Panama, where a major segment of the economy is dependent
on the operation of the Panama Canal. New capabilities for targeted research on climate change impacts on Panama are
therefore being established. These include a new GPU-cluster infrastructure called Iberogun, based on two DGX-1 servers
(together providing 16 NVIDIA Tesla P100 GPUs). This infrastructure will be used to evaluate potential climate models and models
of extreme weather events. In this review we therefore present an evaluation of GPGPU (general-purpose graphics
processing unit, here abbreviated GPU) implementation methods for the study of weather projections and dynamical
downscaling in the Republic of Panama. Different methods are discussed, including domain-specific languages (DSLs),
directive-based porting methods, granularity optimization methods, and memory layout transformation methods. One of these
approaches that has yielded interesting previous results is discussed further: a directive-based code transformation method
called 'Hybrid Fortran' that permits a high-performance GPU port for structured-grid Fortran codes. Finally, we suggest a
method akin to previous investigations of climate change done for the Republic of Panama, but with acceleration via
GPU capabilities. We acknowledge scientific funding from the Sistema Nacional de Investigación de Panamá (SNI) and projects
FID-2016-275 and EIE-2018-16 of the Convocatorias públicas of the Secretaría Nacional de Ciencia, Tecnología e Innovación
(SENACYT). We acknowledge funds and support from JSPS Grant-in-Aid for Specially Promoted Research
16H06291. We acknowledge Theme C of the TOUGOU program granted by the Japanese Ministry of
Education, Culture, Sports, Science and Technology. The authors thank the Universidad Tecnológica de Panamá for
their extensive support and for the use of their CIHH-group HPC cluster Iberogun. We also acknowledge NVIDIA
Corporation for the donation of the Titan Xp GPU used for this research.
Progress towards accelerating the unified model on hybrid multi-core systems
The cloud microphysics scheme, CASIM, and the radiation scheme, SOCRATES, are two computationally intensive parts of the Met Office's Unified Model (UM). This study enables CASIM and SOCRATES to use accelerated multi-core systems for optimal computational performance of the UM. Using profiling to guide our efforts, we refactored the code for optimal threading and kernel arrangement and implemented OpenACC directives manually or through the CLAW source-to-source translator. Initial porting results achieved 10.02x and 9.25x speedups in CASIM and SOCRATES respectively on 1 GPU compared with 1 CPU core. A granular performance analysis of the strategy and the bottlenecks is discussed. These improvements will enable the UM to run on heterogeneous computers, and a path forward for further improvements is provided.
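
For readers unfamiliar with the directive-based approach mentioned above, the following minimal C/OpenACC sketch illustrates what such an annotation looks like in general; the kernel and variable names are hypothetical and are not taken from CASIM, SOCRATES, or the CLAW-generated code.

/* Minimal, hypothetical sketch of directive-based GPU offloading with
 * OpenACC, in the spirit of the porting strategy described above.
 * The kernel is illustrative only; it is not CASIM or SOCRATES code. */
#include <stdio.h>
#include <stdlib.h>

#define N 1000000

int main(void) {
    double *t = malloc(N * sizeof(double));   /* e.g. a temperature field */
    double *q = malloc(N * sizeof(double));   /* e.g. a moisture field    */

    for (int i = 0; i < N; i++) { t[i] = 250.0 + i % 50; q[i] = 1e-3; }

    /* The directive asks the compiler to offload the loop to an accelerator
     * and manage data movement; on a CPU-only build the pragma is ignored. */
    #pragma acc parallel loop copyin(t[0:N]) copy(q[0:N])
    for (int i = 0; i < N; i++) {
        q[i] = q[i] * (1.0 + 0.01 * (t[i] - 273.15));
    }

    printf("q[0] = %g\n", q[0]);
    free(t); free(q);
    return 0;
}
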
Parallel Implementation of Lossy Data Compression for Temporal Data Sets
Many scientific data sets contain temporal dimensions, i.e., data storing
information at the same spatial locations but at different time stamps.
Some of the biggest temporal datasets are produced by parallel computing
applications such as simulations of climate change and fluid dynamics. Temporal
datasets can be very large and take a long time to transfer between
storage locations. Using data compression techniques, files can be transferred
faster and storage space can be saved. NUMARCK is a lossy data compression algorithm
for temporal data sets that learns emerging distributions of element-wise
change ratios along the temporal dimension and encodes them into an index table
for a concise representation. This paper presents a parallel implementation of
NUMARCK. Evaluated with six data sets obtained from climate and astrophysics
simulations, parallel NUMARCK achieved scalable speedups of up to 8788 when
running 12800 MPI processes on a parallel computer. We also compare the
compression ratios against two lossy data compression algorithms, ISABELA and
ZFP. The results show that NUMARCK achieved higher compression ratios than
ISABELA and ZFP.
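
As a loose sketch of the idea described above (not the NUMARCK implementation itself), the following C program computes element-wise change ratios between two consecutive time steps, quantizes them into a small table of bins, and stores one bin index per element; the data values and bin count are invented for illustration.

/* Minimal, hypothetical sketch of change-ratio index coding: compute
 * element-wise change ratios between two consecutive time steps, quantize
 * them into a small table of bins, and keep one small index per element
 * instead of the full value. Illustration only, not NUMARCK itself. */
#include <stdio.h>
#include <math.h>

#define N     8      /* elements per time step (toy size)    */
#define NBINS 4      /* number of entries in the index table */

int main(void) {
    double t0[N] = {1.0, 2.0, 4.0, 8.0, 1.5, 3.0, 6.0, 9.0};
    double t1[N] = {1.1, 2.1, 4.3, 8.2, 1.6, 3.2, 6.1, 9.5};

    double ratio[N];
    double rmin = INFINITY, rmax = -INFINITY;

    /* Element-wise change ratio along the temporal dimension. */
    for (int i = 0; i < N; i++) {
        ratio[i] = (t1[i] - t0[i]) / t0[i];
        if (ratio[i] < rmin) rmin = ratio[i];
        if (ratio[i] > rmax) rmax = ratio[i];
    }

    /* Uniform binning of the observed ratios: the bin centres form the
     * index table, and each element is encoded by its bin index. */
    double width = (rmax - rmin) / NBINS;
    double table[NBINS];
    int index[N];
    for (int b = 0; b < NBINS; b++) table[b] = rmin + (b + 0.5) * width;
    for (int i = 0; i < N; i++) {
        int b = (int)((ratio[i] - rmin) / width);
        if (b >= NBINS) b = NBINS - 1;
        index[i] = b;
    }

    /* Lossy reconstruction: apply the table entry picked by each index. */
    for (int i = 0; i < N; i++) {
        double approx = t0[i] * (1.0 + table[index[i]]);
        printf("elem %d: exact %.3f, reconstructed %.3f (bin %d)\n",
               i, t1[i], approx, index[i]);
    }
    return 0;
}
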
The ICON-A model for direct QBO simulations on GPUs (version icon-cscs:baf28a514)
Classical numerical models for the global atmosphere, as used for numerical weather forecasting or climate research, have been developed for conventional central processing unit (CPU) architectures. This hinders the employment of such models on current top-performing supercomputers, which achieve their computing power with hybrid architectures, mostly using graphics processing units (GPUs). Thus, scientific applications of such models are also restricted to the lesser computing power of CPUs. Here we present the development of a GPU-enabled version of the ICON atmosphere model (ICON-A), motivated by a research project on the quasi-biennial oscillation (QBO), a global-scale wind oscillation in the equatorial stratosphere that depends on a broad spectrum of atmospheric waves, which originates from tropical deep convection. Resolving the relevant scales, from a few kilometers to the size of the globe, is a formidable computational problem, which can now be realized only on top-performing supercomputers. This motivated porting ICON-A, in the specific configuration needed for the research project, in a first step to the GPU architecture of the Piz Daint computer at the Swiss National Supercomputing Centre and in a second step to the JUWELS Booster computer at the Forschungszentrum Jülich. On Piz Daint, the ported code achieves a single-node GPU vs. CPU speedup factor of 6.4 and allows for global experiments at a horizontal resolution of 5 km on 1024 computing nodes with 1 GPU per node with a turnover of 48 simulated days per day. On JUWELS Booster, the more modern hardware in combination with an upgraded code base allows for simulations at the same resolution on 128 computing nodes with 4 GPUs per node and a turnover of 133 simulated days per day. Additionally, the code still remains functional on CPUs, as is demonstrated by additional experiments on the Levante compute system at the German Climate Computing Center. While the application shows good weak scaling over the tested 16-fold increase in grid size and node count, also making more highly resolved global simulations possible, the strong scaling on GPUs is relatively poor, which limits the options to increase turnover with more nodes. Initial experiments demonstrate that the ICON-A model can simulate downward-propagating QBO jets, which are driven by wave–mean flow interaction.
THOR 2.0: Major Improvements to the Open-Source General Circulation Model
THOR is the first open-source general circulation model (GCM) developed from
scratch to study the atmospheres and climates of exoplanets, free from Earth-
or Solar System-centric tunings. It solves the general non-hydrostatic Euler
equations (instead of the primitive equations) on a sphere using the
icosahedral grid. In the current study, we report major upgrades to THOR,
building upon the work of Mendonça et al. (2016). First, while the
Horizontally Explicit Vertically Implicit (HEVI) integration scheme is the same
as that described in Mendonça et al. (2016), we provide a clearer
description of the scheme and have improved its implementation in the code. The
differences in implementation between the hydrostatic shallow (HSS),
quasi-hydrostatic deep (QHD) and non-hydrostatic deep (NHD) treatments are
fully detailed. Second, standard physics modules are added: two-stream,
double-gray radiative transfer and dry convective adjustment. Third, THOR is
tested on additional benchmarks: tidally-locked Earth, deep hot Jupiter,
acoustic wave, and gravity wave. Fourth, we report that differences between the
hydrostatic and non-hydrostatic simulations are negligible in the Earth case,
but pronounced in the hot Jupiter case. Finally, the effects of the so-called
"sponge layer", a form of drag implemented in most GCMs to provide numerical
stability, are examined. Overall, these upgrades have improved the flexibility,
user-friendliness, and stability of THOR.