Search CORE

5 research outputs found

Analyse statique/dynamique pour la validation et l'amélioration des applications parallèles multi-modèles

Author: Saillard Emmanuelle
Publication venue: HAL CCSD
Publication date: 24/09/2015
Field of study

Supercomputing plays an important role in several innovative fields, speeding up prototyping or validating scientific theories. However, supercomputers are evolving rapidly with now millions of processing units, posing the questions of their programmability. Despite the emergence of more widespread and functional parallel programming models, developing correct and effective parallel applications still remains a complex task. Although debugging solutions have emerged to address this issue, they often come with restrictions. However programming model evolutions stress the requirement for a convenient validation tool able to handle hybrid applications. Indeed as current scientific applications mainly rely on the Message Passing Interface (MPI) parallel programming model, new hardwares designed for Exascale with higher node-level parallelism clearly advocate for an MPI+X solutions with X a thread-based model such as OpenMP. But integrating two different programming models inside the same application can be error-prone leading to complex bugs - mostly detected unfortunately at runtime. In an MPI+X program not only the correctness of MPI should be ensured but also its interactions with the multi-threaded model, for example identical MPI collective operations cannot be performed by multiple nonsynchronized threads. This thesis aims at developing a combination of static and dynamic analysis to enable an early verification of hybrid HPC applications. The first pass statically verifies the thread level required by an MPI+OpenMP application and outlines execution paths leading to potential deadlocks. Thanks to this analysis, the code is selectively instrumented, displaying an error and synchronously interrupting all processes if the actual scheduling leads to a deadlock situation.L’utilisation du parallélisme des architectures actuelles dans le domaine du calcul hautes performances, oblige à recourir à différents langages parallèles. Ainsi, l’utilisation conjointe de MPI pour le parallélisme gros grain, à mémoire distribuée et OpenMP pour du parallélisme de thread, fait partie des pratiques de développement d’applications pour supercalculateurs. Des erreurs, liées à l’utilisation conjointe de ces langages de parallélisme, sont actuellement difficiles à détecter et cela limite l’écriture de codes, permettant des interactions plus poussées entre ces niveaux de parallélisme. Des outils ont été proposés afin de palier ce problème. Cependant, ces outils sont généralement focalisés sur un type de modèle et permettent une vérification dite statique (à la compilation) ou dynamique (à l’exécution). Pourtant une combinaison statique/- dynamique donnerait des informations plus pertinentes. En effet, le compilateur est en mesure de donner des informations relatives au comportement général du code, indépendamment du jeu d’entrée. C’est par exemple le cas des problèmes liés aux communications collectives du modèle MPI. Cette thèse a pour objectif de développer des analyses statiques/dynamiques permettant la vérification d’une application parallèle mélangeant plusieurs modèles de programmation, afin de diriger les développeurs vers un code parallèle multi-modèles correct et performant. La vérification se fait en deux étapes. Premièrement, de potentielles erreurs sont détectées lors de la phase de compilation. Ensuite, un test au runtime est ajouté pour savoir si le problème va réellement se produire. Grâce à ces analyses combinées, nous renvoyons des messages précis aux utilisateurs et évitons les situations de blocage

Thèses en Ligne

Theses.fr

The MPI BUGS INITIATIVE: a Framework for MPI Verification Tools Evaluation

Author: Laurent Mathieu
Quinson Martin
Saillard Emmanuelle
Publication venue: HAL CCSD
Publication date: 19/11/2021
Field of study

International audienceEnsuring the correctness of MPI programs becomes as challenging and important as achieving the best performance. Many tools have been proposed in the literature to detect incorrect usages of MPI in a given program. However, the limited set of code samples each tool provides and the lack of metadata stating the intent of each test make it difficult to assess the strengths and limitations of these tools. In this paper, we present the MPI BUGS INITIATIVE, a complete collection of MPI codes to assess the status of MPI verification tools. We introduce a classification of MPI errors and provide correct and incorrect codes covering many MPI features and our categorization of errors. The resulting suite comprises 1,668 codes, each coming with a well-formatted header that clarifies the intent of each code and specifies how to execute and evaluate it. We evaluated the completeness of the MPI BUGS INITIATIVE against eight stateof-the-art MPI verification tools

INRIA a CCSD electronic archive server

PARCOACH: Combining static and dynamic validation of MPI collective communications

Author: Barthou Denis
Carribault Patrick
Saillard Emmanuelle
Publication venue: 'SAGE Publications'
Publication date: 01/01/2013
Field of study

International audienceNowadays most scientific applications are parallelized based on MPI communications. Collective MPI communications have to be executed in the same order by all processes in their communicator and the same number of times, otherwise it is not conforming to the standard and a deadlock or other undefined behavior can occur. As soon as the control-flow involving these collective operations becomes more complex, in particular including conditionals on process ranks, ensuring the correction of such code is error-prone. We propose in this paper a static analysis to detect when such situation occurs, combined with a code transformation that prevents from deadlocking. We focus on blocking MPI collective operations in SPMD applications, assuming MPI calls are not nested in multithreaded regions. We show on several benchmarks the small impact on performance and the ease of integration of our techniques in the development process

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

HAL-CEA

Exploration efficace de l'espace d'état pour les programmes distribués asynchrones: Adaptation de l’UDPOR aux programes MPI.

Author: Pham The Anh
Publication venue: HAL CCSD
Publication date: 06/12/2019
Field of study

Distributed message passing applications are in the mainstream of information technology since they exploit the power of parallel computer systems to produce higher performance. Designing distributed programs remains challenging because developers have to reason about concurrency, non-determinism, data distribution… that are main characteristics of distributed programs. Besides, it is virtually impossible to ensure the correctness of such programs via classical testing approaches since one may never successfully reach the execution that leads to unwanted behaviors in the programs. There is thus a need for more powerful verification techniques. Model-checking is one of the formal methods that allows to verify automatically and effectively some properties on models of computer systems by exploring all possible behaviors (states and transitions) of the system model. However, state spaces increase exponentially with the number of concurrent processes, leading to “state space explosion”.Unfolding-based Dynamic Partial Order Reduction (UDPOR) is a recent technique mixing Dynamic Partial Order Reduction (DPOR) with concepts of concurrency theory such as unfoldings to efficiently mitigate state space explosion in model-checking of concurrent programs. It is optimal in the sense that each Mazurkiewicz trace, i.e. a class of interleavings equivalent by commuting adjacent independent actions, is explored exactly once. And it is applicable to running programs, not only models of programs.The thesis aims at adapting UDPOR to verify asynchronous distributed programs (e.g. MPI programs) in the setting of the SIMGRID simulator of distributed applications. To do so, an abstract programming model of asynchronous distributed programs is defined and formalized in the TLA+ language, allowing to precisely define an independence relation, a main ingredient of the concurrency semantics. Then, the adaptation of UDPOR, involving the construction of an unfolding, is made efficient by a precise analysis of dependencies in the programming model, allowing efficient computations of usually costly operation. A prototype implementation of UDPOR adapted to distributed asynchronous programs has been developed, giving promising experimental results on a significant set of benchmarks.Distributed message passing applications are in the mainstream of information technology since they exploit the power of parallel computer systems to produce higher performance. Designing distributed programs remains challenging because developers have to reason about concurrency, non-determinism, data distribution… that are main characteristics of distributed programs. Besides, it is virtually impossible to ensure the correctness of such programs via classical testing approaches since one may never successfully reach the execution that leads to unwanted behaviors in the programs. There is thus a need for more powerful verification techniques. Model-checking is one of the formal methods that allows to verify automatically and effectively some properties on models of computer systems by exploring all possible behaviors (states and transitions) of the system model. However, state spaces increase exponentially with the number of concurrent processes, leading to “state space explosion”.Unfolding-based Dynamic Partial Order Reduction (UDPOR) is a recent technique mixing Dynamic Partial Order Reduction (DPOR) with concepts of concurrency theory such as unfoldings to efficiently mitigate state space explosion in model-checking of concurrent programs. It is optimal in the sense that each Mazurkiewicz trace, i.e. a class of interleavings equivalent by commuting adjacent independent actions, is explored exactly once. And it is applicable to running programs, not only models of programs.The thesis aims at adapting UDPOR to verify asynchronous distributed programs (e.g. MPI programs) in the setting of the SIMGRID simulator of distributed applications. To do so, an abstract programming model of asynchronous distributed programs is defined and formalized in the TLA+ language, allowing to precisely define an independence relation, a main ingredient of the concurrency semantics. Then, the adaptation of UDPOR, involving the construction of an unfolding, is made efficient by a precise analysis of dependencies in the programming model, allowing efficient computations of usually costly operation. A prototype implementation of UDPOR adapted to distributed asynchronous programs has been developed, giving promising experimental results on a significant set of benchmarks

INRIA a CCSD electronic archive server

Efficient Simulation of Rare Events in one-dimensional systems using a parallelised cloning algorithm

Author: Brewer Tobias
Publication venue
Publication date: 01/05/2019
Field of study

OPUS