5 research outputs found

    Analyse statique/dynamique pour la validation et l'amélioration des applications parallÚles multi-modÚles

    Get PDF
    Supercomputing plays an important role in several innovative fields, speeding up prototyping or validating scientific theories. However, supercomputers are evolving rapidly with now millions of processing units, posing the questions of their programmability. Despite the emergence of more widespread and functional parallel programming models, developing correct and effective parallel applications still remains a complex task. Although debugging solutions have emerged to address this issue, they often come with restrictions. However programming model evolutions stress the requirement for a convenient validation tool able to handle hybrid applications. Indeed as current scientific applications mainly rely on the Message Passing Interface (MPI) parallel programming model, new hardwares designed for Exascale with higher node-level parallelism clearly advocate for an MPI+X solutions with X a thread-based model such as OpenMP. But integrating two different programming models inside the same application can be error-prone leading to complex bugs - mostly detected unfortunately at runtime. In an MPI+X program not only the correctness of MPI should be ensured but also its interactions with the multi-threaded model, for example identical MPI collective operations cannot be performed by multiple nonsynchronized threads. This thesis aims at developing a combination of static and dynamic analysis to enable an early verification of hybrid HPC applications. The first pass statically verifies the thread level required by an MPI+OpenMP application and outlines execution paths leading to potential deadlocks. Thanks to this analysis, the code is selectively instrumented, displaying an error and synchronously interrupting all processes if the actual scheduling leads to a deadlock situation.L’utilisation du parallĂ©lisme des architectures actuelles dans le domaine du calcul hautes performances, oblige Ă  recourir Ă  diffĂ©rents langages parallĂšles. Ainsi, l’utilisation conjointe de MPI pour le parallĂ©lisme gros grain, Ă  mĂ©moire distribuĂ©e et OpenMP pour du parallĂ©lisme de thread, fait partie des pratiques de dĂ©veloppement d’applications pour supercalculateurs. Des erreurs, liĂ©es Ă  l’utilisation conjointe de ces langages de parallĂ©lisme, sont actuellement difficiles Ă  dĂ©tecter et cela limite l’écriture de codes, permettant des interactions plus poussĂ©es entre ces niveaux de parallĂ©lisme. Des outils ont Ă©tĂ© proposĂ©s afin de palier ce problĂšme. Cependant, ces outils sont gĂ©nĂ©ralement focalisĂ©s sur un type de modĂšle et permettent une vĂ©rification dite statique (Ă  la compilation) ou dynamique (Ă  l’exĂ©cution). Pourtant une combinaison statique/- dynamique donnerait des informations plus pertinentes. En effet, le compilateur est en mesure de donner des informations relatives au comportement gĂ©nĂ©ral du code, indĂ©pendamment du jeu d’entrĂ©e. C’est par exemple le cas des problĂšmes liĂ©s aux communications collectives du modĂšle MPI. Cette thĂšse a pour objectif de dĂ©velopper des analyses statiques/dynamiques permettant la vĂ©rification d’une application parallĂšle mĂ©langeant plusieurs modĂšles de programmation, afin de diriger les dĂ©veloppeurs vers un code parallĂšle multi-modĂšles correct et performant. La vĂ©rification se fait en deux Ă©tapes. PremiĂšrement, de potentielles erreurs sont dĂ©tectĂ©es lors de la phase de compilation. Ensuite, un test au runtime est ajoutĂ© pour savoir si le problĂšme va rĂ©ellement se produire. GrĂące Ă  ces analyses combinĂ©es, nous renvoyons des messages prĂ©cis aux utilisateurs et Ă©vitons les situations de blocage

    The MPI BUGS INITIATIVE: a Framework for MPI Verification Tools Evaluation

    Get PDF
    International audienceEnsuring the correctness of MPI programs becomes as challenging and important as achieving the best performance. Many tools have been proposed in the literature to detect incorrect usages of MPI in a given program. However, the limited set of code samples each tool provides and the lack of metadata stating the intent of each test make it difficult to assess the strengths and limitations of these tools. In this paper, we present the MPI BUGS INITIATIVE, a complete collection of MPI codes to assess the status of MPI verification tools. We introduce a classification of MPI errors and provide correct and incorrect codes covering many MPI features and our categorization of errors. The resulting suite comprises 1,668 codes, each coming with a well-formatted header that clarifies the intent of each code and specifies how to execute and evaluate it. We evaluated the completeness of the MPI BUGS INITIATIVE against eight stateof-the-art MPI verification tools

    PARCOACH: Combining static and dynamic validation of MPI collective communications

    Get PDF
    International audienceNowadays most scientific applications are parallelized based on MPI communications. Collective MPI communications have to be executed in the same order by all processes in their communicator and the same number of times, otherwise it is not conforming to the standard and a deadlock or other undefined behavior can occur. As soon as the control-flow involving these collective operations becomes more complex, in particular including conditionals on process ranks, ensuring the correction of such code is error-prone. We propose in this paper a static analysis to detect when such situation occurs, combined with a code transformation that prevents from deadlocking. We focus on blocking MPI collective operations in SPMD applications, assuming MPI calls are not nested in multithreaded regions. We show on several benchmarks the small impact on performance and the ease of integration of our techniques in the development process

    Exploration efficace de l'espace d'Ă©tat pour les programmes distribuĂ©s asynchrones: Adaptation de l’UDPOR aux programes MPI.

    Get PDF
    Distributed message passing applications are in the mainstream of information technology since they exploit the power of parallel computer systems to produce higher performance. Designing distributed programs remains challenging because developers have to reason about concurrency, non-determinism, data distribution
 that are main characteristics of distributed programs. Besides, it is virtually impossible to ensure the correctness of such programs via classical testing approaches since one may never successfully reach the execution that leads to unwanted behaviors in the programs. There is thus a need for more powerful verification techniques. Model-checking is one of the formal methods that allows to verify automatically and effectively some properties on models of computer systems by exploring all possible behaviors (states and transitions) of the system model. However, state spaces increase exponentially with the number of concurrent processes, leading to “state space explosion”.Unfolding-based Dynamic Partial Order Reduction (UDPOR) is a recent technique mixing Dynamic Partial Order Reduction (DPOR) with concepts of concurrency theory such as unfoldings to efficiently mitigate state space explosion in model-checking of concurrent programs. It is optimal in the sense that each Mazurkiewicz trace, i.e. a class of interleavings equivalent by commuting adjacent independent actions, is explored exactly once. And it is applicable to running programs, not only models of programs.The thesis aims at adapting UDPOR to verify asynchronous distributed programs (e.g. MPI programs) in the setting of the SIMGRID simulator of distributed applications. To do so, an abstract programming model of asynchronous distributed programs is defined and formalized in the TLA+ language, allowing to precisely define an independence relation, a main ingredient of the concurrency semantics. Then, the adaptation of UDPOR, involving the construction of an unfolding, is made efficient by a precise analysis of dependencies in the programming model, allowing efficient computations of usually costly operation. A prototype implementation of UDPOR adapted to distributed asynchronous programs has been developed, giving promising experimental results on a significant set of benchmarks.Distributed message passing applications are in the mainstream of information technology since they exploit the power of parallel computer systems to produce higher performance. Designing distributed programs remains challenging because developers have to reason about concurrency, non-determinism, data distribution
 that are main characteristics of distributed programs. Besides, it is virtually impossible to ensure the correctness of such programs via classical testing approaches since one may never successfully reach the execution that leads to unwanted behaviors in the programs. There is thus a need for more powerful verification techniques. Model-checking is one of the formal methods that allows to verify automatically and effectively some properties on models of computer systems by exploring all possible behaviors (states and transitions) of the system model. However, state spaces increase exponentially with the number of concurrent processes, leading to “state space explosion”.Unfolding-based Dynamic Partial Order Reduction (UDPOR) is a recent technique mixing Dynamic Partial Order Reduction (DPOR) with concepts of concurrency theory such as unfoldings to efficiently mitigate state space explosion in model-checking of concurrent programs. It is optimal in the sense that each Mazurkiewicz trace, i.e. a class of interleavings equivalent by commuting adjacent independent actions, is explored exactly once. And it is applicable to running programs, not only models of programs.The thesis aims at adapting UDPOR to verify asynchronous distributed programs (e.g. MPI programs) in the setting of the SIMGRID simulator of distributed applications. To do so, an abstract programming model of asynchronous distributed programs is defined and formalized in the TLA+ language, allowing to precisely define an independence relation, a main ingredient of the concurrency semantics. Then, the adaptation of UDPOR, involving the construction of an unfolding, is made efficient by a precise analysis of dependencies in the programming model, allowing efficient computations of usually costly operation. A prototype implementation of UDPOR adapted to distributed asynchronous programs has been developed, giving promising experimental results on a significant set of benchmarks
    corecore