47 research outputs found

    Containers : A Sound Basis For a True Single System Image

    Clusters of SMPs are attractive for executing shared memory parallel applications, but reconciling high performance and ease of programming remains an open issue. A possible approach is to provide an efficient Single System Image (SSI) operating system giving the illusion of an SMP machine. In this paper, we introduce the concept of container as a mechanism to unify global resource management at the lowest operating system level. Higher level operating system services such as the virtual memory system and the file cache can be easily implemented on top of containers and transparently take advantage of the whole memory resource available in the cluster.

    Snooze: A Scalable, Fault-Tolerant and Distributed Consolidation Manager for Large-Scale Clusters

    Intelligent workload consolidation and dynamic cluster adaptation offer a great opportunity for energy savings in current large-scale clusters. Because of the heterogeneous nature of these environments, scalable, fault-tolerant and distributed consolidation managers are necessary in order to efficiently manage their workload and thus conserve energy and reduce the operating costs. However, most of the consolidation managers available nowadays do not fulfill these requirements: they are mostly centralized and solely designed to be operated in virtualized environments. In this work, we present the architecture of a novel scalable, fault-tolerant and distributed consolidation manager called Snooze that is able to dynamically consolidate the workload of a software and hardware heterogeneous large-scale cluster composed of resources using the virtualization and Single System Image (SSI) technologies. To this end, a common cluster monitoring and management API is introduced, which provides uniform and transparent access to the features of the underlying platforms. Our architecture is open to support any future technologies and can be easily extended with monitoring metrics and algorithms. Finally, a comprehensive use case study demonstrates the feasibility of our approach to manage the energy consumption of a large-scale cluster.

    Ghost Process: a Sound Basis to Implement Process Duplication, Migration and Checkpoint/Restart in Linux Clusters

    Process management mechanisms (process duplication, migration and checkpoint/restart) are very useful for high performance and high availability in clustering systems. The single system image approach aims at providing a global process management service with mechanisms for process checkpoint, process migration and process duplication. In this context, a common mechanism for process virtualization is highly desirable, but traditional operating systems do not provide such a mechanism. This paper presents a kernel service for process virtualization called ghost process, which extends the Linux kernel. The ghost process mechanism has been implemented in Kerrighed, a single system image operating system based on Linux.

    Kerrighed: A SSI Cluster OS Running OpenMP

    Writing parallel programs for clusters of workstations is still a challenging task. In this paper, we present Kerrighed, a Single System Image (SSI) operating system giving the illusion of an SMP machine and providing the standard POSIX thread interface to developers. It is therefore possible to use Kerrighed to run OpenMP programs compiled for SMP machines using the POSIX thread interface. We explain how we achieved that goal, and present the benefits of providing OpenMP support through the SSI approach as opposed to a dedicated run-time environment.

    OpenMosix, OpenSSI and Kerrighed: A Comparative Study

    This paper presents a comparative study of Kerrighed, openMosix and OpenSSI, three Single System Image (SSI) operating systems for clusters. This experimental study gives an overview of the SSI features offered by these systems and evaluates the performance of those features.

    Reducing Kernel Development Complexity In Distributed Environments

    Setting up generic and fully transparent distributed services for clusters implies complex and tedious kernel development. More flexible approaches such as user-space libraries are usually preferred, with the drawback of requiring application recompilation. A second approach consists in using specific kernel modules (such as FUSE in GNU/Linux systems) to transfer kernel complexity into user space. In this paper, we present a new way to design and implement distributed kernel services for clusters by using a cluster-wide consistent data management service. This system, named kDDM for "kernel Distributed Data Management", offers flexible kernel mechanisms to transparently manage remote accesses, caching and coherency. We show how kDDM simplifies distributed kernel development by presenting the design and the implementation of a service as complex as a fully symmetric distributed file system. The kDDM approach has the potential to boost the development of distributed kernel services because it relieves developers of the burden of dealing with distributed protocols and explicit data transfers. Instead, it lets them focus on implementing services in a manner very similar to parallel programming on SMP systems. More generally, kDDM could be used in almost all local kernel services to extend them to cluster scale. Cluster-wide IPC, distributed namespaces (such as /proc) and process migration are some potential examples.

    Snooze: A Scalable, Fault-Tolerant and Distributed Consolidation Manager for Large-Scale Clusters

    Intelligent workload consolidation and dynamic cluster adaptation offer a great opportunity for energy savings in current large-scale clusters. Because of the heterogeneous nature of these environments, scalable, fault-tolerant and distributed consolidation managers are necessary in order to efficiently manage their workload and thus conserve energy and reduce the operating costs. However, most of the consolidation managers available nowadays do not fulfill these requirements: they are mostly centralized and solely designed to be operated in virtualized environments. In this work, we present the architecture of a novel scalable, fault-tolerant and distributed consolidation manager called Snooze that is able to dynamically consolidate the workload of a software and hardware heterogeneous large-scale cluster composed of resources using the virtualization and Single System Image (SSI) technologies. To this end, a common cluster monitoring and management API is introduced, which provides uniform and transparent access to the features of the underlying platforms. Our architecture is open to support any future technologies and can be easily extended with monitoring metrics and algorithms. Finally, a comprehensive use case study demonstrates the feasibility of our approach to manage the energy consumption of a large-scale cluster.

    File Mapping as a New Parallel File System Interface

    No full text
    Publication in abstract form only. No abstract.

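    Gestion globale de la mémoire physique d'une grappe pour un système à image unique : (mise en oeuvre dans le système Gobelins) [Global management of the physical memory of a cluster for a single system image: implementation in the Gobelins system]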

    No full text
    This thesis addresses the design of an operating system dedicated to clusters of computers. The goal is to provide a single system image on top of a cluster. To this end, we propose a software mechanism called a container, based on global management of the physical memory of the cluster's nodes. This mechanism makes it possible to store and share data among the kernels of a host operating system. Containers are integrated into the host system through a set of linkers, which are software components inserted between the device managers and the system services. This makes it very simple to implement a shared virtual memory, a cooperative file cache system and a distributed file system, and significantly simplifies process migration mechanisms. An operating system named Gobelins was built on top of a Linux system to validate our concept.