
108 research outputs found

SimGrid: a Sustained Effort for the Versatile Simulation of Large Scale Distributed Systems

Author: Casanova Henri
Giersch Arnaud
Legrand Arnaud
Quinson Martin
Suter Frédéric
Publication date: 01/01/2013

In this paper we present SimGrid, a toolkit for the versatile simulation of large-scale distributed systems, whose development effort has been sustained for the last fifteen years. Over this time period SimGrid has evolved from a one-laboratory project in the U.S. into a scientific instrument developed by an international collaboration. The keys to making this evolution possible have been securing funding, improving the quality of the software, and increasing the user base. In this paper we describe how we have been able to make advances on all three fronts, on which we plan to intensify our efforts over the upcoming years. Comment: 4 pages, submission to WSSSPE'1

arXiv.org e-Print Archive

HAL-ENS-LYON

CiteSeerX

HAL-IN2P3

HAL - Université de Franche-Comté

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Hal-Diderot

GRAS: a Research and Development framework for Grid services

Author: Quinson Martin
Publication venue: HAL CCSD
Publication date: 01/01/2006

Grid platforms federate large numbers of resources across several organizations. While their promises are great, these platforms have proven challenging to use because of their inherent heterogeneity and dynamic characteristics. Grid application development is therefore possible only if robust distributed service infrastructures are available, e.g. for resource and data discovery, resource monitoring or application deployment. These infrastructures, which are large-scale, loosely coupled distributed applications, are very difficult to design, develop and tune. This paper presents the Grid Reality And Simulation (GRAS) framework, which allows grid developers to first implement and experiment with such an infrastructure in simulation, benefiting from a controlled and fast environment. The infrastructure can then be deployed in situ without code modification. We first detail the design goals and the implementation of GRAS, and contrast them with the state of the art. We then present a case study to highlight the fundamentals of GRAS and illustrate its ease of use. In addition, we quantify the complexity of a code example written with GRAS and with several other communication solutions. We also conduct tests over LAN and WAN networks to assess performance. We find that the code using GRAS is simpler and shorter than with any other solution, while achieving better performance than most of them.

INRIA a CCSD electronic archive server

SimGrid: a Generic Framework for Large-Scale Distributed Experiments

Author: Quinson Martin
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 09/09/2009

In this paper we describe a comprehensive simulation framework, SimGrid, for the simulation of distributed applications on distributed platforms. Our goal is to describe the salient capabilities of SimGrid and explain how they allow users to perform simulations for a wide range of applications and platforms.

Crossref

INRIA a CCSD electronic archive server

Dynamic Performance Forecasting for Network-Enabled Servers in a Heterogeneous Environment

Author: Desprez Frédéric
Quinson Martin
Suter Frédéric
Publication venue: HAL CCSD
Publication date: 01/01/2001

This paper presents a tool for dynamically forecasting the performance of Network-Enabled Servers. FAST (Fast Agent's System Timer) is a software package allowing client applications to get an accurate forecast of communication and computation times and memory use in a heterogeneous environment. It relies on low-level software packages, i.e., network and host monitoring tools, and on some of our developments in the modeling of computation routines. The FAST internals and user interface are presented, and the execution time predicted by FAST is compared with the measured time of complex matrix multiplication executed on a heterogeneous platform.

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

GRAS: a Research and Development Framework for Grid and P2P Infrastructures

Author: Quinson Martin
Publication venue: HAL CCSD
Publication date: 01/11/2006

Distributed service architectures are mandatory to handle the platform scale and dynamicity that hinder the development of grid and P2P applications. These large-scale distributed applications are difficult to design, develop and tune because of both theoretical and practical issues. This paper presents the GRAS framework, which allows developers to first implement and experiment with such an infrastructure in simulation, benefiting from a controlled environment. The infrastructure can then be deployed in situ without code modification. We detail our design goals and contrast them with the state of the art. We study the exchange of a message (from the Pastry protocol) using either GRAS or several other solutions. We quantify both the code complexity and the performance, and find that GRAS performs better according to both metrics.

INRIA a CCSD electronic archive server

A Simple Model of Communication APIs – Application to Dynamic Partial-order Reduction

Author: Merz Stephan
Quinson Martin
Rosa Cristian Daniel
Publication venue: European Association of Software Science and Technology
Publication date: 03/05/2011

We are interested in the verification, using model checking, of distributed programs that communicate asynchronously over standard communication APIs such as MPI. This is feasible only if the set of executions that the model checker explores is aggressively reduced to a subset of representative executions, using techniques such as dynamic partial-order reduction (DPOR). We propose a small set of core primitives in terms of which such APIs can be defined, and formally specify these primitives in TLA+. From this specification we derive theorems about the (in)dependence of invocations of the primitives, and use them in a DPOR-based verifier that runs within SimGrid, a simulation framework for distributed programming. Our preliminary experimental results indicate that we obtain good reductions, even though complex network operations are implemented in terms of the core communication primitives.

Electronic Communications of the EASST (European Association of Software Science and Technology)

Assessing the Performance of MPI Applications Through Time-Independent Trace Replay

Author: Desprez Frédéric
Markomanolis George
Quinson Martin
Suter Frédéric
Publication venue: HAL CCSD
Publication date: 01/01/2010

Simulation is a popular approach to obtain objective performance indicators on platforms that are not at one's disposal. It may help in dimensioning compute clusters in large computing centers. In this work we present a framework for the off-line simulation of MPI applications. Its main originality with regard to the literature is that it relies on time-independent execution traces. This allows us to completely decouple the acquisition process from the actual replay of the traces in a simulation context. We are then able to acquire traces for large application instances without being limited to an execution on a single compute cluster. Finally, our framework is built on top of a scalable, fast, and validated simulation kernel. In this paper, we present the time-independent trace format we use, investigate several acquisition strategies, detail the trace replay tool we developed, and assess the quality of our simulation framework in terms of accuracy, acquisition time, simulation time, and trace size.

HAL-ENS-LYON

CiteSeerX

HAL-IN2P3

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

System-level State Equality Detection for the Dynamic Verification of Distributed Applications

Author: Guthmuller Marion
Quinson Martin
Publication venue: American College of Medical Physics (ACMP)
Publication date: 01/04/2014

This poster presents our solution to detect state equality of legacy MPI applications directly at system level, which is important to formally verify these applications.

INRIA a CCSD electronic archive server

Byte-Range Asynchronous Locking in Distributed Settings

Author: Quinson Martin
Vernier Flavien
Publication venue: HAL CCSD
Publication date: 18/02/2009

This paper investigates a mutual exclusion algorithm for distributed systems. We introduce a new algorithm based on the Naimi-Trehel algorithm, taking advantage of its distributed approach while allowing partial locks to be requested. Such ranged locks offer semantics close to POSIX file locking, where threads lock some parts of a shared file. We evaluate our algorithm by comparing its performance to that of the original Naimi-Trehel algorithm and of a centralized mutual exclusion algorithm. The performance metric considered is the average time to obtain a lock.

INRIA a CCSD electronic archive server

The Java Learning Machine: A Learning Management System Dedicated To Computer Science Education

Author: Oster Gérald
Quinson Martin
Publication venue: HAL CCSD
Publication date: 11/02/2011

This paper presents the Java Learning Machine (JLM), a platform dedicated to computer programming education. This generic platform helps teachers create programming microworlds suitable for their courses. It features an integrated graphical environment that provides students with a short feedback loop, in order to improve the effectiveness of the autonomous learning process. This paper presents the motivations behind the platform and its main functionalities.

INRIA a CCSD electronic archive server