49 research outputs found
Relaxed Synchronization with Ordered Read-Write Locks
This article is accepted for publication in the post-proceedings of the Workshop on Algorithms and Programming Tools for Next-Generation High-Performance Scientific Software (HPSS) 2011, held in the context of Euro-Par 2011, August 29, 2011, Bordeaux, France. This paper presents the first stand-alone implementation of our adaptive synchronization tool, "ordered read-write locks" (ORWL). It provides new synchronization methods for resource-oriented parallel or distributed algorithms, for which it allows implicit, deadlock-free and equitable control of a protected resource, and it provides means to couple lock objects and data tightly. A typical application that uses this framework runs a number of loosely coupled tasks that are regulated exclusively by the data flow. We conducted experiments to demonstrate the validity, efficiency and scalability of our implementation
Kadeploy3: Efficient and Scalable Operating System Provisioning for HPC Clusters
Operating system provisioning is a common and critical task in cluster computing environments. The required low-level operations involved in provisioning can drastically decrease the performance of a given solution, and maintaining a reasonable provisioning time on clusters of 1000+ nodes is a significant challenge. We present Kadeploy3, a tool built to efficiently and reliably deploy a large number of cluster nodes. Since it is a keystone of the Grid'5000 experimental testbed, it has been designed not only to help system administrators install and manage clusters but also to provide testbed users with a flexible way to deploy their own operating systems on nodes for their own experimentation needs, on a very frequent basis. In this paper we detail the design principles of Kadeploy3 and its main features, and evaluate its capabilities in several contexts. We also share the lessons we have learned during the design and deployment of Kadeploy3 in the hope that this will help system administrators and developers of similar solutions
Emulation at Very Large Scale with Distem
Prospective exascale systems and large-scale cloud infrastructures are composed of tens of thousands of nodes. Evaluating applications that target such environments is extremely difficult. In this paper, we present an extension of the Distem emulator that allows experiments on very large scale emulated platforms through the use of a VXLAN overlay network. We demonstrate that Distem is capable of emulating 40,000 virtual nodes on 168 physical nodes, and use the resulting emulated environment to compare two efficient parallel command runners: TakTuk and ClusterShell
Efficient and Scalable OS Provisioning with Kadeploy 3
Kadeploy3 is a software tool for deploying sets of machines efficiently and reliably, particularly in the context of high-performance computing clusters. Kadeploy3 can deploy thousands of machines thanks to image broadcast and command execution mechanisms that are specifically optimized for large scale. It also copes with the failures and errors that are inevitable at this scale, thanks to a workflow engine that provides error recovery mechanisms. Close attention is also paid to the usability and adaptability of Kadeploy3, notably through the management of a library of environments and fine-grained rights management. The poster details several aspects of Kadeploy, in particular: its main features; its scalability; its fault tolerance; and some evaluation results. Kadeploy is distributed under a free software license, and is actively developed and maintained by Inria Nancy Grand-Est
Kadeploy3: Efficient and Scalable Operating System Provisioning for Clusters
Installing an operating system can be very tedious when it must be reproduced on several computers, for instance, on large scale clusters. Since it is not realistic to install the nodes independently, disk cloning or imaging with tools such as Clonezilla [1], Rocks [5], SystemImager [6] or xCAT [8] is a common approach. In that case the administrator must keep only one node up to date (sometimes called the golden node), which is then replicated to the other nodes. This article presents Kadeploy3, a tool designed to perform operating system provisioning using disk imaging and cloning. Thanks to its efficiency, scalability, and reliability, it is particularly suited for large scale clusters
Performance evaluation of containers for HPC
Container-based virtualization technologies such as LXC or Docker have gained a lot of interest recently, especially in the HPC context, where they could help to address a number of long-running issues. Even though they have been shown to perform better than full-fledged, hypervisor-based virtualization solutions, many questions remain about the use of container solutions in the HPC context. This paper evaluates the performance of Linux-based container solutions that rely on cgroups and namespaces, using the NAS parallel benchmarks in various configurations. We show that container technology has matured over the years, and that performance issues are being solved
Porting the Distem Emulator to the CloudLab and Chameleon testbeds
The Distem emulator was designed in the context of the Grid'5000 testbed. In this paper, we describe the experience of porting Distem to two testbeds, CloudLab and Chameleon, in order to uncover possible issues when deploying Distem on platforms different from Grid'5000. It also provides some insight into the differences between the designs of those three testbeds, and their impact on experimenters
Grid'5000: A Production-grade Testbed for Experiment-driven Computer Science on HPC and Clouds
This poster was presented on the Inria booth at SuperComputing'1
Grid'5000: A Production-grade Testbed for Experiment-driven Computer Science on HPC and Clouds
This poster was presented on the Inria booth at SuperComputing'1
Design and Evaluation of a Virtual Experimental Environment for Distributed Systems
Between simulation and experiments on real-scale testbeds, the combined use of emulation and virtualization provides a useful alternative for performing experiments on distributed systems such as clusters, grids, cloud computing or P2P systems. In this paper, we present Distem, a software tool to build distributed virtual experimental environments. Using a homogeneous set of nodes, Distem emulates a platform composed of heterogeneous nodes (in terms of number and performance of CPU cores), connected to a virtual network described using a realistic topology model. Distem relies on LXC, a low-overhead container-based virtualization solution, to achieve scalability and enable experiments with thousands of virtual nodes. Distem provides a set of user interfaces to accommodate different needs (command-line for interactive use, Ruby and REST APIs), and is freely available and well documented. After a detailed description of Distem, we perform an experimental evaluation of several of its features