29 research outputs found

    An Algorithm for File Transfer Scheduling in Grid Environments

    This paper addresses the data transfer scheduling problem for Grid environments, presenting a centralized scheduler developed with dynamic and adaptive features. The algorithm offers a reservation system for user transfer requests that allocates them transfer times and bandwidth, according to the network topology and the constraints the user specified for the requests. This paper presents the projects related to the data transfer field, the design of the framework for which the scheduler was built, the main features of the scheduler, the steps for rescheduling transfer requests, and two tests that illustrate the system's behavior for different types of transfer requests. Comment: Proceedings of the International Workshop on High Performance Grid Middleware (HiPerGrid), pp. 33-40, Bucharest, Romania, 21-22 November, 2008. (ISSN: 2065-0701)

    Enabling Large-Scale Testing of IaaS Cloud Platforms on the Grid'5000 Testbed

    Almost ten years after its inception, the Grid'5000 platform has become one of the most complete testbeds for designing or evaluating large-scale distributed systems. Initially dedicated to the study of High Performance Computing, the infrastructure has evolved to address wider concerns related to Desktop Computing, the Internet of Services and more recently the Cloud Computing paradigm. In this paper, we present the latest mechanisms we designed to enable the automated deployment of the major open-source IaaS cloudkits (i.e., Nimbus, OpenNebula, CloudStack, and OpenStack) on Grid'5000. Providing automatic, isolated and reproducible deployments of cloud environments lets end-users study and compare each solution or simply leverage one of them to perform higher-level cloud experiments (such as investigating Map/Reduce frameworks or applications).

    Bringing Introspection Into the BlobSeer Data-Management System Using the MonALISA Distributed Monitoring Framework

    Held in conjunction with CISIS 2010 Conference. Introspection is the prerequisite of an autonomic behavior, the first step towards a performance improvement and a resource-usage optimization for large-scale distributed systems. In grid environments, the task of observing the application behavior is assigned to monitoring systems. However, most of them are designed to provide general resource information and do not consider specific information for higher-level services. More specifically, in the context of data-intensive applications, a specific introspection layer is required in order to collect data about the usage of storage resources, about data access patterns, etc. This paper discusses the requirements for an introspection layer in a data-management system for large-scale distributed infrastructures. We focus on the case of BlobSeer, a large-scale distributed system for storing massive data. The paper explains why and how to enhance BlobSeer with introspective capabilities and proposes a three-layered architecture relying on the MonALISA monitoring framework. This approach has been evaluated on the Grid'5000 testbed, with experiments that prove the feasibility of generating relevant information related to the state and the behavior of the system.

    Bringing Introspection into BlobSeer: Towards a Self-Adaptive Distributed Data Management System

    Introspection is the prerequisite of an autonomic behavior, the first step towards a performance improvement and a resource-usage optimization for large-scale distributed systems. In Grid environments, the task of observing the application behavior is assigned to monitoring systems. However, most of them are designed to provide general resource information and do not consider specific information for higher-level services. More precisely, in the context of data-intensive applications, a specific introspection layer is required to collect data about the usage of storage resources, about data access patterns, etc. This paper discusses the requirements for an introspection layer in a data-management system for large-scale distributed infrastructures. We focus on the case of BlobSeer, a large-scale distributed system for storing massive data. The paper explains why and how to enhance BlobSeer with introspective capabilities and proposes a three-layered architecture relying on the MonALISA monitoring framework. We illustrate the autonomic behavior of BlobSeer with a self-configuration component aiming to provide storage elasticity by dynamically scaling the number of data providers. Then we propose a preliminary approach for enabling self-protection for the BlobSeer system, through a malicious clients detection component. The introspective architecture has been evaluated on the Grid'5000 testbed, with experiments that prove the feasibility of generating relevant information related to the state and the behavior of the system.

    Adding Virtualization Capabilities to Grid'5000

    This revised report has since been published; see hal-00946971. Almost ten years after its inception, the Grid'5000 testbed has become one of the most complete testbeds for designing or evaluating large-scale distributed systems. Initially dedicated to the study of High Performance Computing, the infrastructure has evolved to address wider concerns related to Desktop Computing, the Internet of Services and more recently the Cloud Computing paradigm. This report presents recent improvements of the Grid'5000 software and services stack to support large-scale experiments using virtualization technologies as building blocks. Such contributions include the deployment of customized software environments, the reservation of dedicated network domains and the possibility to isolate them from the others, and the automation of experiments with a REST API. We illustrate the interest of these contributions by describing three different use cases of large-scale experiments on the Grid'5000 testbed. The first one leverages virtual machines to conduct larger experiments spread over 4000 peers. The second one describes the deployment of 10000 KVM instances over 4 Grid'5000 sites. Finally, the last use case introduces a one-click deployment tool to easily deploy major IaaS solutions. The conclusion highlights some important challenges of Grid'5000 related to the use of OpenFlow and to the management of applications dealing with tremendous amounts of data.

    Utilisation de BlobSeer pour le stockage de données dans les clouds : auto-adaptation, intégration, évaluation

    The emergence of Cloud computing brings forward many challenges that may limit the adoption rate of the Cloud paradigm. As data volumes processed by Cloud applications increase exponentially, designing efficient and secure solutions for data management emerges as a crucial requirement. The goal of this thesis is to enhance a distributed data-management system with self-management capabilities, so that it can meet the requirements of the Cloud storage services in terms of scalability, data availability, reliability and security. Furthermore, we aim at building a Cloud data service both compatible with state-of-the-art Cloud interfaces and able to deliver high-throughput data storage. To meet these goals, we proposed generic self-awareness, self-protection and self-configuration components targeted at distributed data-management systems. We validated them on top of BlobSeer, a large-scale data-management system designed to optimize highly concurrent data accesses. Next, we devised and implemented a BlobSeer-based file system optimized to efficiently serve as a storage backend for Cloud services. We then integrated it within a real-world Cloud environment, the Nimbus platform. The benefits and drawbacks of using Cloud storage for real-life applications have been emphasized in evaluations carried out on Grid'5000 that involved data-intensive applications such as MapReduce and tightly coupled high-performance computing applications such as atmospheric simulations.