27 research outputs found

    A Performance Evaluation Method for Climate Coupled Models

    In the High Performance Computing context, the performance of a parallel algorithm is evaluated mainly by considering the elapsed time for running the parallel application with different numbers of cores and different problem sizes (for scaled speedup). Typically, parallel applications embed mechanisms to use the allocated resources efficiently, guaranteeing for example good load balancing and reducing the parallel overhead. Unfortunately, this assumption does not hold for coupled models. These models were born from the coupling of stand-alone climate models. The component models are developed independently of each other and follow different development roadmaps. Moreover, they are characterized by different levels of parallelization and different workload requirements, and each has its own scalability curve. Considering a coupled model as a single parallel application, we note the lack of a policy for balancing the computational load across the available resources. This work tries to address the issues related to the performance evaluation of a coupled model and to answer the following questions: once a given number of processors has been allocated to the whole coupled model, how must the run be configured in order to balance the workload? How many processors must be assigned to each of the component models? The methodology described here has been applied to evaluate the scalability of the CMCC-MED coupled model designed by the ANS Division of the CMCC. The evaluation has been carried out on two different computational architectures: a scalar cluster, based on IBM Power6 processors, and a vector cluster, based on NEC-SX9 processors
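The allocation question raised in this abstract can be illustrated with a minimal sketch. It is not the method of the paper: it assumes two components that run concurrently and synchronise at each coupling step, so the step time is the slower of the two, and it assumes each component follows a hypothetical Amdahl-style timing curve. The function names and all numbers are illustrative only.

```python
def amdahl_time(t1, serial_fraction, procs):
    """Hypothetical timing model: run time on `procs` processors under Amdahl's law."""
    return t1 * (serial_fraction + (1.0 - serial_fraction) / procs)


def best_split(total_procs, time_a, time_b):
    """Try every split of `total_procs` between two components.

    Assumes the components run concurrently, so the coupled step
    costs max(time_a, time_b). Returns (procs_a, procs_b, step_time).
    """
    best = None
    for procs_a in range(1, total_procs):
        procs_b = total_procs - procs_a
        step_time = max(time_a(procs_a), time_b(procs_b))
        if best is None or step_time < best[2]:
            best = (procs_a, procs_b, step_time)
    return best


# Illustrative (made-up) single-processor times and serial fractions.
def atmosphere(procs):
    return amdahl_time(100.0, 0.05, procs)


def ocean(procs):
    return amdahl_time(300.0, 0.02, procs)


procs_atm, procs_ocn, step = best_split(64, atmosphere, ocean)
print(f"atmosphere: {procs_atm} procs, ocean: {procs_ocn} procs, step time: {step:.2f}")
```

In practice the timing curves would come from measured scalability runs of each component rather than from an analytic model, and an exhaustive search like this one is only practical for a small number of components.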

    Parallel implementation of the SHYFEM (System of HydrodYnamic Finite Element Modules) model

    This paper presents the message passing interface (MPI)-based parallelization of the three-dimensional hydrodynamic model SHYFEM (System of HydrodYnamic Finite Element Modules). The original sequential version of the code was parallelized in order to reduce the execution time of high-resolution configurations using state-of-the-art high-performance computing (HPC) systems. A distributed memory approach was used, based on the MPI. Optimized numerical libraries were used to partition the unstructured grid (with a focus on load balancing) and to solve the sparse linear system of equations in parallel in the case of semi-to-fully implicit time stepping. The parallel implementation of the model was validated by comparing the outputs with those obtained from the sequential version. The performance assessment demonstrates a good level of scalability with a realistic configuration used as benchmark

    The NEMO Oceanic Model: Computational Performance Analysis and Optimization

    The NEMO (Nucleus for European Modeling of the Ocean) oceanic model is one of the most widely used by the climate community. It is exploited with different configurations in more than 50 research projects for both long and short-term simulations. The computational requirements of the model and its implementation limit the exploitation of the emerging computational infrastructure at peta and exascale. A deep revision and analysis of the model and its implementation were needed. The paper describes the performance evaluation of the latest release of the model, based on MPI parallelization, on the MareNostrum platform at the Barcelona Supercomputing Centre. The analysis of the scalability has been carried out taking into account different factors, i.e. the I/O system available on the platform, the domain decomposition of the model and the level of parallelism. The analysis highlighted different bottlenecks due to the communication overhead. The code has been optimized by reducing the communication cost within some frequently called functions, and the parallelization has been improved by introducing a second level of parallelism based on the OpenMP shared memory paradigm

    Experience on the parallelization of the OASIS3 coupler

    This work describes the optimization and parallelization of the OASIS3 coupler. Performance evaluation and profiling have been carried out by means of the CMCC-MED coupled model, developed at the Euro-Mediterranean Centre for Climate Change (CMCC) and currently running on a NEC SX9 cluster. Our experiments highlighted that extrapolation (accomplished by the extrap function) and interpolation (implemented from the scriprmp function) transformations take the most time. Optimization concerned I/O operations reducing coupling time by 27%. Parallelization of OASIS3 represents a further step towards overall improvement of the whole coupled model. Our proposed parallel approach distributes fields over a pool of available processes. Each process applies coupling transformations to its assigned fields. This approach restricts parallelization level to the number of coupling fields. However, it can be fully combined with a parallelization approach considering the geographical domain distribution. Finally a quantitative comparison of the parallel coupler with the OASIS3 pseudo-parallel version is proposed
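The field-based distribution described in this abstract can be sketched as follows. The abstract does not say how fields are assigned to processes, so the round-robin policy here is an assumption, and the field names are hypothetical. The sketch shows why the level of parallelism is capped by the number of coupling fields: any process beyond that count receives no work.

```python
def assign_fields(fields, n_procs):
    """Assign coupling fields to processes round-robin (assumed policy).

    Field i goes to process i % n_procs. Returns {rank: [fields]}.
    """
    assignment = {rank: [] for rank in range(n_procs)}
    for index, field in enumerate(fields):
        assignment[index % n_procs].append(field)
    return assignment


# Hypothetical coupling fields, for illustration only.
fields = ["sst", "heat_flux", "wind_stress_u", "wind_stress_v", "freshwater_flux"]

plan = assign_fields(fields, 8)
busy_ranks = [rank for rank, assigned in plan.items() if assigned]
idle_ranks = [rank for rank, assigned in plan.items() if not assigned]
print(f"{len(busy_ranks)} of 8 processes have work; idle ranks: {idle_ranks}")
```

With five fields and eight processes, three processes stay idle, which is the restriction the abstract mentions and the reason for combining this scheme with a geographical domain decomposition.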

    Industrial Problem Optimization in a Grid Environment

    The goal of the paper is to show how the Grid and its infrastructure can provide valid support for reducing the time and costs related to the execution of a generic optimization process in the industrial world. An optimization model, the micro-GA algorithm, based on Genetic Algorithm theory, has been designed as the combination of a central generic optimization manager and several specific modules, characterized by the nature of the optimization problem, to be executed on a set of distributed and heterogeneous resources. A case study for the optimization of Diesel Engine performance, in terms of emission levels and fuel consumption, is presented as an example of the real applicability of the method. Finally, a prototype implementation of the described algorithm in a Grid Environment is provided

    The Roofline Model for Oceanic Climate Applications

    The present work describes the analysis and optimisation of the PELAGOS025 configuration, based on the coupling of the NEMO physical component of the ocean dynamics with the BFM (Biogeochemical Flux Model), a sophisticated biogeochemical model that can simulate both pelagic and benthic processes. The methodology followed here consists of the performance analysis of the original parallel code in terms of strong scalability, the identification of the bottlenecks that limit scalability as the number of processes increases, and the analysis of the most computationally intensive kernels through the Roofline model, which provides an insightful visual performance model for multicore architectures and makes it possible to measure and compare the performance of one or more computational kernels run on different hardware architectures
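The Roofline model mentioned in this abstract bounds the attainable performance of a kernel by the lower of two limits: the peak floating-point rate, and the memory bandwidth multiplied by the kernel's arithmetic intensity (floating-point operations per byte moved). A minimal sketch follows; the hardware figures are made up for illustration and do not describe any machine named in the abstract.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, intensity_flop_per_byte):
    """Roofline bound: min(compute roof, memory roof)."""
    return min(peak_gflops, bandwidth_gb_s * intensity_flop_per_byte)


def ridge_point(peak_gflops, bandwidth_gb_s):
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_gflops / bandwidth_gb_s


# Illustrative (made-up) machine: 100 GFLOP/s peak, 50 GB/s memory bandwidth.
PEAK = 100.0
BANDWIDTH = 50.0

for intensity in (0.5, 2.0, 4.0):
    bound = attainable_gflops(PEAK, BANDWIDTH, intensity)
    regime = "memory-bound" if intensity < ridge_point(PEAK, BANDWIDTH) else "compute-bound"
    print(f"intensity {intensity} flop/byte: at most {bound} GFLOP/s ({regime})")
```

A kernel whose measured performance sits well below its bound at a given intensity is a candidate for optimisation, while a kernel to the left of the ridge point gains more from reducing memory traffic than from adding arithmetic throughput.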