19 research outputs found


    Deploying Large-Scale Datasets on-Demand in the Cloud: Treats and Tricks on Data Distribution

    Public clouds have democratised access to analytics for virtually any institution in the world. Virtual Machines (VMs) can be provisioned on demand and used to crunch data once it has been uploaded into them. While this task is trivial for a few tens of VMs, it becomes increasingly complex and time consuming when the scale grows to hundreds or thousands of VMs crunching tens or hundreds of TB. Moreover, the elapsed time comes at a price: the cost of provisioning VMs in the cloud and keeping them waiting to load the data. In this paper we present a big data provisioning service that incorporates hierarchical and peer-to-peer data distribution techniques to speed up data loading into the VMs used for data processing. The system dynamically mutates the sources of the data for the VMs to accelerate data loading. We tested this solution with 1000 VMs and 100 TB of data, reducing the load time by at least 30% over current state-of-the-art techniques. This dynamic topology mechanism is tightly coupled with classic declarative machine configuration techniques: the system takes a single high-level declarative configuration file and configures both the software and the data loading. Together, these two techniques simplify the deployment of big data in the cloud for end users who may not be experts in infrastructure management.
    Index Terms: large-scale data transfer, flash crowd, big data, BitTorrent, p2p overlay, provisioning, big data distribution.
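
    As a rough illustration of the dynamic-source idea in this abstract, the Python sketch below has each VM fetch data chunks preferably from peer VMs that already hold them, falling back to a central origin store. Names such as pick_source and chunk_holders are invented for the example and do not come from the paper.

    # Minimal sketch (not the paper's implementation): VMs pull data chunks,
    # preferring peers that already hold a chunk over the central origin store,
    # which is the intuition behind the hierarchical/peer-to-peer distribution
    # described in the abstract. All names are hypothetical.
    import random

    ORIGIN = "origin-store"                            # central seed / blob store
    chunk_holders = {c: {ORIGIN} for c in range(8)}    # chunk id -> set of sources

    def pick_source(chunk_id: int, requester: str) -> str:
        """Prefer a peer VM as the source; fall back to the origin."""
        peers = [s for s in chunk_holders[chunk_id] if s not in (ORIGIN, requester)]
        return random.choice(peers) if peers else ORIGIN

    def load_data(vm: str) -> None:
        for chunk in chunk_holders:
            src = pick_source(chunk, vm)
            # ... transfer the chunk from src to vm ...
            chunk_holders[chunk].add(vm)   # vm becomes a source for later VMs

    for vm in (f"vm-{i}" for i in range(5)):
        load_data(vm)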

    Automated planning for cloud service configurations

    The declarative approach has been widely accepted as an appropriate way to manage the configurations of large-scale systems: administrators describe the specification of the "desired" configuration state of the system, and the tool computes and executes the necessary actions to bring the system from its current state into this desired state. However, none of the state-of-the-art declarative configuration tools makes any guarantee about the order of the changes performed across the system when implementing configuration changes. This thesis presents a technique that addresses this issue: it uses the SFP language to allow administrators to specify the desired configuration state and the global constraints of the system, compiles the specified reconfiguration task into a classical planning problem, and then uses an automated planning technique to automatically generate the workflow. Executing the workflow brings the system into the desired state while preserving the global constraints during the configuration changes. The thesis also presents an alternative approach to deploying the configurations: the workflow is used to automatically choreograph a set of reactive agents that are capable of autonomously reconfiguring a computing system into a specified desired state. The agent interactions are guaranteed to be deadlock- and livelock-free, preserve pre-specified global constraints during their execution, and automatically maintain the desired state once it has been achieved (self-healing). We present the formal semantics of the SFP language, the technique that compiles SFP reconfiguration tasks into classical planning problems, and the algorithms for the automatic generation and execution of the reactive agent models. In addition, we present the formal semantics of the core subset of the SmartFrog language, which is the foundation of SFP. Moreover, we present a domain-independent technique for compiling a planning problem with extended goals into a classical planning problem. As a proof of concept, the techniques have been implemented in a prototype configuration tool called Nuri, which has been used to configure typical use cases in cloud environments. The experimental results demonstrate that Nuri is capable of planning and deploying the configurations in a reasonable time, with the constraints on the system guaranteed throughout the reconfiguration process.
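
    The toy Python sketch below illustrates only the general idea of casting a reconfiguration task as a classical planning problem: states, actions with preconditions and effects, and a search for an ordered workflow. The action names and the breadth-first planner are invented for the example and are not SFP's or Nuri's actual encoding.

    # Toy sketch of compiling a reconfiguration task into a planning problem:
    # states are dicts of component -> status, actions have preconditions and
    # effects, and a breadth-first search finds an ordered workflow that reaches
    # the desired state. Illustrative only; all names are hypothetical.
    from collections import deque

    ACTIONS = {
        "stop_app":   ({"app": "running"},               {"app": "stopped"}),
        "upgrade_db": ({"app": "stopped", "db": "v1"},   {"db": "v2"}),
        "start_app":  ({"app": "stopped", "db": "v2"},   {"app": "running"}),
    }

    def applicable(state, pre):
        return all(state.get(k) == v for k, v in pre.items())

    def plan(initial, goal):
        frontier = deque([(initial, [])])
        seen = {tuple(sorted(initial.items()))}
        while frontier:
            state, steps = frontier.popleft()
            if applicable(state, goal):
                return steps
            for name, (pre, eff) in ACTIONS.items():
                if applicable(state, pre):
                    nxt = {**state, **eff}
                    key = tuple(sorted(nxt.items()))
                    if key not in seen:
                        seen.add(key)
                        frontier.append((nxt, steps + [name]))
        return None

    print(plan({"app": "running", "db": "v1"}, {"app": "running", "db": "v2"}))
    # -> ['stop_app', 'upgrade_db', 'start_app']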

    Administration d'applications réparties à grande échelle [Administration of large-scale distributed applications]

    Administering an application is an increasingly complex task that is costly in both human and hardware resources. This thesis focuses on administration at large scale: deploying and managing a distributed legacy application made up of several thousand software entities on a grid infrastructure composed of hundreds or thousands of geographically dispersed machines. Administration on this type of infrastructure raises many problems: expressiveness in describing the elements to administer, performance related to the load on the administration processes and the geographical distribution of the grid sites, hardware and software heterogeneity, and dynamicity (machine failures, network outages, etc.). Our contributions address these problems. We propose higher-level description formalisms, taking scale into account, to describe the hardware and software infrastructure. To reduce the load and improve the performance of administration, we distribute the deployment system hierarchically, using several administration systems and customising the installation phase of the deployment. We show how to describe the hierarchy of systems and how to take the specificities of the hardware infrastructure, notably its topology and the characteristics and types of machines, into account at deployment time. We also define a process language that lets administrators describe the installation process and express their own installation constraints according to their needs and preferences, and we address the management of hardware and software heterogeneity during deployment. This work is carried out within the scope of the TUNe autonomic management system project: we propose a hierarchical extension of TUNe and an implementation of these contributions in order to validate our approach through a full-scale experiment.
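
    The following Python sketch illustrates the hierarchical idea only: a root administration system delegates the deployment of a software entity to one sub-manager per cluster, so that no single manager handles every machine. The grid layout, function names and thread-based simulation are assumptions made for the example and are not TUNe's API.

    # Hedged sketch of hierarchical administration: a root manager delegates
    # per-cluster deployment to sub-managers (simulated here with threads),
    # distributing the load instead of driving every machine from one place.
    # All names and structures are hypothetical.
    from concurrent.futures import ThreadPoolExecutor

    GRID = {
        "cluster-A": ["a1", "a2", "a3"],
        "cluster-B": ["b1", "b2"],
    }

    def deploy_on_machine(machine: str, entity: str) -> str:
        # ... copy binaries, configure, and start the entity on `machine` ...
        return f"{entity} deployed on {machine}"

    def cluster_manager(cluster: str, machines: list, entity: str) -> list:
        # Each sub-manager only deals with its own cluster's machines.
        return [deploy_on_machine(m, entity) for m in machines]

    def root_manager(entity: str) -> None:
        with ThreadPoolExecutor() as pool:
            futures = {pool.submit(cluster_manager, c, ms, entity): c
                       for c, ms in GRID.items()}
            for f in futures:
                print(futures[f], f.result())

    root_manager("legacy-app")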

    Intention-oriented programming support for runtime adaptive autonomic cloud-based applications

    The continuing high rate of advances in information and communication systems technology creates many new commercial opportunities, but also engenders a range of new technical challenges around maximising systems' dependability, availability, adaptability, and auditability. These challenges are under active research, with notable progress made in the support for dependable software design and management. Runtime support, however, is still in its infancy and requires further research. This paper focuses on a requirements model for the runtime execution and control of an intention-oriented cloud-based application. To this end, a novel requirements modelling process, referred to as Provision, Assurance and Auditing, and an associated framework are defined and developed, in which a given system's functional and non-functional requirements are modelled in terms of intentions and encoded in a standard open mark-up language. An autonomic intention-oriented programming model, using the Neptune language, then handles its deployment and execution.
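
    As a hedged illustration of encoding an intention for the Provision, Assurance and Auditing phases in an open mark-up language, the Python sketch below parses a small, invented XML document; the element names and the example intention are assumptions and do not reproduce the paper's schema or the Neptune language.

    # Illustrative sketch only: one way a Provision/Assurance/Auditing intention
    # might be encoded in mark-up and read back at runtime. The schema is
    # invented for this example.
    import xml.etree.ElementTree as ET

    INTENTION_XML = """
    <intention name="keep-service-responsive">
      <provision>deploy 2 replicas of the web tier</provision>
      <assurance>p95 latency below 200 ms</assurance>
      <auditing>log every scaling decision with its trigger</auditing>
    </intention>
    """

    root = ET.fromstring(INTENTION_XML)
    print("Intention:", root.get("name"))
    for phase in ("provision", "assurance", "auditing"):
        print(f"  {phase}: {root.findtext(phase).strip()}")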

    Déploiement de systèmes répartis multi-échelles : processus, langage et outils intergiciels [Deployment of multiscale distributed systems: process, language, and middleware tools]

    With the proliferation of connected objects, multiscale systems are increasingly widespread. These systems are highly distributed, heterogeneous, dynamic and open; they may be composed of hundreds of software components deployed on thousands of devices. Deployment is a complex post-production process whose goal is to make a software system available for use and then to keep it operational. For multiscale systems, expressing the deployment plan, as well as carrying out and managing the deployment, are tasks beyond human capability because of the heterogeneity, dynamics and sheer number of the elements involved, and because the deployment domain is not necessarily known in advance. The purpose of this thesis is to study and propose solutions for the deployment of distributed multiscale software systems. We first provide up-to-date terminology and definitions related to software deployment, together with a state of the art on the automatic deployment of distributed software systems. The rest of the contribution consists of: a complete process for the autonomic deployment of multiscale systems; a domain-specific language (DSL), MuScADeL, which simplifies the deployment designer's task and allows the expression of deployment properties as well as information for probing the state of the deployment domain; and a middleware, MuScADeM, which ensures the automatic generation of a deployment plan according to the domain state, its realisation, and the maintenance of the system in an operational condition.
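
    The Python sketch below illustrates, under invented names and properties, the kind of plan generation described above: components declare deployment properties, devices expose a probed state, and a plan maps each component to the devices that satisfy its properties. It is a conceptual illustration only, not MuScADeL or MuScADeM.

    # Conceptual sketch: match components' deployment properties against probed
    # device states to generate a deployment plan. All names are hypothetical.
    COMPONENTS = {
        "sensor-gateway": {"has_bluetooth": True},
        "aggregator":     {"min_ram_mb": 2048},
    }

    DEVICES = {
        "raspberry-1": {"has_bluetooth": True,  "ram_mb": 1024},
        "edge-box-1":  {"has_bluetooth": False, "ram_mb": 4096},
    }

    def satisfies(device: dict, props: dict) -> bool:
        checks = {
            "has_bluetooth": lambda d, v: d.get("has_bluetooth") == v,
            "min_ram_mb":    lambda d, v: d.get("ram_mb", 0) >= v,
        }
        return all(checks[p](device, v) for p, v in props.items())

    def generate_plan(components: dict, devices: dict) -> dict:
        return {c: [name for name, dev in devices.items() if satisfies(dev, props)]
                for c, props in components.items()}

    print(generate_plan(COMPONENTS, DEVICES))
    # {'sensor-gateway': ['raspberry-1'], 'aggregator': ['edge-box-1']}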

    Research challenges in nextgen service orchestration

    Fog/edge computing, function as a service, and programmable infrastructures, such as software-defined networking and network function virtualisation, are becoming ubiquitous in modern Information Technology infrastructures. These technologies change the characteristics and capabilities of the underlying computational substrate where services run (e.g. higher volatility, scarcer computational power, or programmability). As a consequence, the nature of the services that can run on them changes too (smaller codebases, more fragmented state, etc.). These changes bring new requirements for service orchestrators, which need to evolve to support new scenarios where a close interaction between service and infrastructure becomes essential to deliver a seamless user experience. Here, we present the challenges brought forward by this new breed of technologies and examine where current orchestration techniques stand with regard to them. We also present a set of promising technologies that can help tame this brave new world.