9 research outputs found

Optimizing an MPI weather forecasting model via processor virtualization

Author: Celso L. Mendes
Eduardo R. Rodrigues
Jairo Panetta
Laxmikant V. Kalé
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

Abstract: Weather forecasting models are computationally intensive applications. These models are typically executed on parallel machines, and a major obstacle to their scalability is load imbalance. The causes of such imbalance are either static (e.g. topography) or dynamic (e.g. shortwave radiation, moving thunderstorms). Various techniques, often embedded in the application's source code, have been used to address both sources. However, these techniques are inflexible and hard to use in legacy codes. In this paper, we demonstrate the effectiveness of processor virtualization for dynamically balancing the load in BRAMS, a mesoscale weather forecasting model based on MPI parallelization. We use the Charm++ infrastructure, with its over-decomposition and object-migration capabilities, to move subdomains across processors during execution of the model. Processor virtualization enables better overlap between computation and communication and improved cache efficiency. Furthermore, by employing an appropriate load balancer, we achieve better processor utilization while requiring minimal changes to the model's code.

CiteSeerX

Crossref

Dynamic Load Balancing for Compressible Multiphase Turbulence

Author: Banerjee Tania
Hackl Jason
Ranka Sanjay
Zhai Keke
Zwick David
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 06/07/2018

CMT-nek is a new scientific application for performing high-fidelity predictive simulations of particle-laden, explosively dispersed turbulent flows. CMT-nek involves detailed simulations, is compute intensive, and is targeted for deployment on exascale platforms. The moving particles are the main source of load imbalance when the application is executed on parallel processors. In a demonstration problem, all the particles are initially in a closed container until a detonation occurs and the particles move apart. If all processors get an equal share of the fluid domain, then only some of the processors get sections of the domain that are initially laden with particles, leading to disparate load on the processors. To eliminate load imbalance across processors and reduce the makespan, we present different load balancing algorithms for CMT-nek on large-scale multi-core platforms consisting of hundreds of thousands of cores. The detailed process of each load balancing algorithm is presented. The performance of the different load balancing algorithms is compared and the associated overheads are analyzed. Evaluations of the application with and without load balancing show that, with load balancing, simulation time improves by a factor of up to 9.97.

Comment: This paper has been accepted by ACM International Conference on Supercomputing (ICS) 201

arXiv.org e-Print Archive

Crossref

Hierarchical Load Balancing for Charm++ Applications on Large Supercomputers

Author: Abhinav Bhatelé
Esteban Meneses
Gengbin Zheng
Laxmikant V. Kalé
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

Abstract: Large parallel machines with hundreds of thousands of processors are being built. Recent studies have shown that ensuring good load balance is critical for scaling certain classes of parallel applications on even thousands of processors. Centralized load balancing algorithms suffer from scalability problems, especially on machines with a relatively small amount of memory. Fully distributed load balancing algorithms, on the other hand, tend to yield poor load balance on very large machines. In this paper, we present an automatic dynamic hierarchical load balancing method that overcomes the scalability challenges of centralized schemes and the poor solutions of traditional distributed schemes. This is done by creating multiple levels of aggressive load balancing domains which form a tree. This hierarchical method is demonstrated within a measurement-based load balancing framework in Charm++. We present techniques to deal with scalability challenges of load balancing at very large scale. We show performance data of the hierarchical load balancing method on up to 16,384 cores of Ranger (at TACC) for a synthetic benchmark. We also demonstrate the successful deployment of the method in a scientific application, NAMD, with results on the Blue Gene/P machine at ANL.

CiteSeerX

Crossref

Automated mapping of regular communication graphs on mesh interconnects

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date

Crossref

Predicting application performance using supervised learning on communication features

Author: Bhatele A
Gamblin T
Jain N
Kale L V
Robson M P
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013

Abstract not provided

Crossref

UNT Digital Library

Smith College: Smith ScholarWorks

Lastbalancierungsverfahren für dynamische und heterogene Linked-Cell Molekülsimulation (Load balancing methods for dynamic and heterogeneous linked-cell molecular simulation)

Author: Hirschmann Steffen
Publication venue
Publication date: 01/01/2015

This thesis examines load balancing for molecular and particle simulations with short-range potentials. Load balancing is necessary in order to simulate inhomogeneous scenarios efficiently over long periods on parallel computers. To this end, the types of load that occur are analyzed and quantified in so-called load models, with a focus on computational and communication load. The load balancing problem is then described. Several load balancing methods known from the literature are considered, examined, and evaluated. The focus is not on a single application scenario but on the general feasibility, properties, and limitations of each method.

Dynamic load balancing of parallel road traffic simulation

Author: Igbe Damian
Publication venue
Publication date: 01/01/2010

The objective of this research was to investigate, develop and evaluate dynamic load-balancing strategies for parallel execution of microscopic road traffic simulations. Urban road traffic simulation presents an irregular and dynamically varying distributed computational load for a parallel processor system. The dynamic nature of road traffic simulation systems leads to uneven load distribution during simulation, even for a system that starts off with an even load distribution. Load balancing is a potential way of achieving improved performance by reallocating work from highly loaded processors to lightly loaded processors, leading to a reduction in the overall computational time. In dynamic load balancing, workloads are adjusted continually or periodically throughout the computation. In this thesis, load balancing strategies were evaluated and some load balancing policies were developed. A load index and a profitability determination algorithm were developed. These were used to enhance two load balancing algorithms. One of the algorithms exhibits local communications and distributed load evaluation between neighbouring partitions (diffusion algorithm), and the other exhibits both local and global communications while the decision making is centralized (MaS algorithm). The enhanced algorithms were implemented and integrated with a research parallel traffic simulator. The performance of the research parallel traffic simulator, optimized with the two modified dynamic load balancing strategies, was studied.

EThOS - Electronic Theses Online Service, United Kingdom

OpenGrey Repository