47 research outputs found

Containers : A Sound Basis For a True Single System Image

Author: Lottiaux Renaud
Morin Christine
Publication venue: HAL CCSD
Publication date: 01/01/2000

Clusters of SMPs are attractive for executing shared-memory parallel applications, but reconciling high performance and ease of programming remains an open issue. A possible approach is to provide an efficient Single System Image (SSI) operating system giving the illusion of an SMP machine. In this paper, we introduce the concept of container as a mechanism to unify global resource management at the lowest operating system level. Higher-level operating system services such as the virtual memory system and the file cache can be easily implemented on top of containers and transparently benefit from the whole memory resource available in the cluster.

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

HAL-Rennes 1

Snooze: A Scalable, Fault-Tolerant and Distributed Consolidation Manager for Large-Scale Clusters

Author: Feller Eugen
Leprince Daniel
Lottiaux Renaud
Morin Christine
Rilling Louis
Publication venue: HAL CCSD
Publication date: 27/09/2010

International audience. Intelligent workload consolidation and dynamic cluster adaptation offer a great opportunity for energy savings in current large-scale clusters. Because of the heterogeneous nature of these environments, scalable, fault-tolerant and distributed consolidation managers are necessary in order to efficiently manage their workload and thus conserve energy and reduce operating costs. However, most of the consolidation managers available nowadays do not fulfill these requirements. Hence, they are mostly centralized and solely designed to be operated in virtualized environments. In this work, we present the architecture of a novel scalable, fault-tolerant and distributed consolidation manager called Snooze that is able to dynamically consolidate the workload of a software and hardware heterogeneous large-scale cluster composed of resources using the virtualization and Single System Image (SSI) technologies. To this end, a common cluster monitoring and management API is introduced, which provides uniform and transparent access to the features of the underlying platforms. Our architecture is open to support any future technologies and can be easily extended with monitoring metrics and algorithms. Finally, a comprehensive use case study demonstrates the feasibility of our approach to manage the energy consumption of a large-scale cluster.

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

Ghost Process: a Sound Basis to Implement Process Duplication, Migration and Checkpoint/Restart in Linux Clusters

Author: Berthou Jean-Yves
Lottiaux Renaud
Margery David
Morin Christine
Vallée Geoffroy
Publication venue: HAL CCSD
Publication date: 01/01/2005

Process management mechanisms (process duplication, migration and checkpoint/restart) are very useful for high performance and high availability in clustering systems. The single system image approach aims at providing a global process management service with mechanisms for process checkpoint, process migration and process duplication. In this context, a common mechanism for process virtualization is highly desirable, but traditional operating systems do not provide such a mechanism. This paper presents a kernel service for process virtualization called ghost process, extending the Linux kernel. The ghost process mechanism has been implemented in Kerrighed, a single system image operating system based on Linux.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Kerrighed: A SSI Cluster OS Running OpenMP

Author: Berthou Jean-Yves
Lottiaux Renaud
Margery David
Morin Christine
Vallée Geoffroy
Publication venue: HAL CCSD
Publication date: 01/01/2003

Writing parallel programs for clusters of workstations is still a challenging task. In this paper, we present Kerrighed, a Single System Image (SSI) operating system giving the illusion of an SMP machine and providing the standard POSIX thread interface to developers. It is therefore possible to use Kerrighed to run OpenMP programs compiled for SMP machines using the POSIX thread interface. We explain how we achieved that goal, and present the benefits of providing OpenMP support through the SSI approach as opposed to a dedicated run-time environment.

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

HAL-Rennes 1

OpenMosix, OpenSSI and Kerrighed: A Comparative Study

Author: Boissinot Benoit
Gallard Pascal
Lottiaux Renaud
Morin Christine
Vallée Geoffroy
Publication venue: HAL CCSD
Publication date: 01/01/2004

This paper presents a comparative study of Kerrighed, openMosix and OpenSSI, three Single System Image (SSI) operating systems for clusters. This experimental study gives an overview of the SSI features offered by these systems and evaluates the performance of those features.

HAL-ENS-LYON

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Reducing Kernel Development Complexity In Distributed Environments

Author: Focht Erich
Lebre Adrien
Lottiaux Renaud
Morin Christine
Publication venue: HAL CCSD
Publication date: 01/01/2008

Setting up generic and fully transparent distributed services for clusters implies complex and tedious kernel developments. More flexible approaches such as user-space libraries are usually preferred, with the drawback of requiring application recompilation. A second approach consists in using specific kernel modules (such as FUSE in GNU/Linux systems) to transfer kernel complexity into user space. In this paper, we present a new way to design and implement kernel distributed services for clusters by using a cluster-wide consistent data management service. This system, entitled kDDM for "kernel Distributed Data Management", offers flexible kernel mechanisms to transparently manage remote accesses, caching and coherency. We show how kDDM simplifies distributed kernel developments by presenting the design and the implementation of a service as complex as a fully symmetric distributed file system. The innovative approach of kDDM has the potential to boost the development of distributed kernel services because it relieves developers of the burden of dealing with distributed protocols and explicit data transfers. Instead, it allows them to focus on implementing services in a manner very similar to parallel programming on SMP systems. More generally, kDDM could be used in almost all local kernel services to extend them to cluster scale. Cluster-wide IPC, distributed namespaces (such as /proc) and process migration are some potential examples.

INRIA a CCSD electronic archive server

Snooze: A Scalable, Fault-Tolerant and Distributed Consolidation Manager for Large-Scale Clusters

Author: Feller Eugen
Leprince Daniel
Lottiaux Renaud
Morin Christine
Rilling Louis
Publication venue: HAL CCSD
Publication date: 27/09/2010

Intelligent workload consolidation and dynamic cluster adaptation offer a great opportunity for energy savings in current large-scale clusters. Because of the heterogeneous nature of these environments, scalable, fault-tolerant and distributed consolidation managers are necessary in order to efficiently manage their workload and thus conserve energy and reduce operating costs. However, most of the consolidation managers available nowadays do not fulfill these requirements. Hence, they are mostly centralized and solely designed to be operated in virtualized environments. In this work, we present the architecture of a novel scalable, fault-tolerant and distributed consolidation manager called Snooze that is able to dynamically consolidate the workload of a software and hardware heterogeneous large-scale cluster composed of resources using the virtualization and Single System Image (SSI) technologies. To this end, a common cluster monitoring and management API is introduced, which provides uniform and transparent access to the features of the underlying platforms. Our architecture is open to support any future technologies and can be easily extended with monitoring metrics and algorithms. Finally, a comprehensive use case study demonstrates the feasibility of our approach to manage the energy consumption of a large-scale cluster.

INRIA a CCSD electronic archive server

File Mapping as a New Parallel File System Interface

Author: Lottiaux Renaud
Morin Christine
Publication venue: HAL CCSD
Publication date: 01/01/1999

Publication in abstract form only. International audience. No abstract.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Gestion globale de la mémoire physique d'une grappe pour un système à image unique : (mise en oeuvre dans le système Gobelins)

Author: LOTTIAUX Renaud
MORIN Christine
Publication venue
Publication date: 01/01/2001

This thesis addresses the design of an operating system dedicated to clusters of computers. The goal is to provide a single system image on top of a cluster. To this end, we propose a software mechanism called a container, based on global management of the physical memory of the cluster's nodes. This mechanism makes it possible to store and share data among the kernels of a host operating system. Containers are integrated into the host system through a set of linkers, which are software components inserted between the device managers and the system services. This makes it possible to implement, very simply, a shared virtual memory, a cooperative file cache system and a distributed file system, and to significantly simplify process migration mechanisms. An operating system named Gobelins was built on top of a Linux system to validate our concept.

OpenGrey Repository

Gestion globale et unifiée des couches mémoires sur une grappe de machines

Author: Lottiaux Renaud
Morin Christine
Publication venue: HAL CCSD
Publication date: 01/01/2000

International audience. No abstract.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1