
    How To Build a Better Testbed: Lessons From a Decade of Network Experiments on Emulab

    The Emulab network testbed provides an environment in which researchers and educators can evaluate networked systems. Available to the public since 2000, Emulab is used by thousands of experimenters at hundreds of institutions around the world, and the research conducted on it has led to hundreds of publications. The original Emulab facility at the University of Utah has been replicated at dozens of other sites. The physical design of the Emulab facility, and of many other testbeds like it, has been based on the facility operators' expectations regarding user needs and behavior. If the operators' assumptions are incorrect, the resulting facility can exhibit inefficient use patterns and sub-optimal resource allocation. Our study, the first of its kind, gains insight into the needs and behaviors of networking researchers by analyzing more than 500,000 topologies from 13,000 experiments submitted to Emulab. Using this dataset, we revisit the assumptions that went into the physical design of the Emulab facility and consider improvements to it. Through extensive simulations with real workloads, we evaluate alternative testbed designs for their ability to improve testbed utilization and reduce hardware costs.

    Memory and Network Aware Scheduling of Virtual Machine Migrations

    Live-migration has become a common operation on virtualized infrastructures. It is widely used by resource management algorithms to distribute the load between servers and to reduce energy consumption. Operators also rely on migrations to prepare production servers for critical maintenance by relocating their running VMs elsewhere. To apply new VM placement decisions, live-migrations must be scheduled by selecting, for each migration, the moment to start it and the bandwidth to allocate to it. Long migrations violate SLAs and reduce the practical benefits of placement algorithms, so VMs should be migrated as fast as possible. To do so, the migration scheduler must be able to predict migration durations accurately and schedule the migrations accordingly. Dynamic VM placement algorithms focus extensively on computing a high-quality placement, but their practical reactivity is lowered by restrictive assumptions that underestimate migration durations. For example, Entropy assumes a non-blocking homogeneous network coupled with a null dirty page rate, while we have already demonstrated that the network topology and the live memory usage of the workload are dominating factors. Recently, some migration models have been developed and integrated into simulators to evaluate VM placement algorithms properly. While these models reproduce migrations finely, they are devoted only to simulation and are not used to compute scheduling decisions. We propose here a migration scheduler that considers the network topology, the migration routes, the VM memory usage, and the dirty page rates to compute precise migration durations and infer better schedules. We implemented our scheduler on top of BtrPlace, an extensible version of Entropy whose scheduling decision capabilities can be enriched through plug-ins. To assess the flexibility of our scheduler, we also implemented constraints to synchronize migrations, establish precedence rules, and respect power budgets, along with an objective that minimizes energy consumption. We evaluated the accuracy of our model and its resulting benefits by executing migration scenarios on a real testbed including a blocking network, mixed VM memory workloads, and collocation settings. Our model predicted migration durations with at least 94% accuracy and an absolute error of 1 second, while vanilla BtrPlace was only 30% accurate. This gain in precision led to wiser scheduling decisions. In practice, migrations completed on average 3.5 times faster compared to an execution based on vanilla BtrPlace. Thanks to a better control of migrations and power-switching actions, we also reduced the power consumption of a server decommissioning scenario according to different power budgets.
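    The dependence of a migration's duration on bandwidth and dirty page rate can be made concrete with a small model. The sketch below is our own illustration of a deliberately simplified pre-copy model, with hypothetical names and thresholds, not the paper's actual model:

    ```python
    # Simplified pre-copy model: each round re-sends the memory dirtied during
    # the previous round, until the remainder is small enough for a final
    # stop-and-copy round. All names and thresholds here are illustrative.

    def predict_migration_duration(memory_bytes, bandwidth_bps, dirty_rate_bps,
                                   stop_threshold=50 * 2**20, max_rounds=30):
        if dirty_rate_bps >= bandwidth_bps:
            return float("inf")  # pages dirty faster than they transfer: no convergence
        total, to_send = 0.0, float(memory_bytes)
        for _ in range(max_rounds):
            round_time = to_send / bandwidth_bps
            total += round_time
            to_send = dirty_rate_bps * round_time  # memory dirtied during this round
            if to_send <= stop_threshold:
                break
        return total + to_send / bandwidth_bps  # final stop-and-copy round

    # A 4 GiB VM on a 1 Gb/s path with a 200 Mb/s dirty rate:
    print(predict_migration_duration(4 * 2**30, 125e6, 25e6))  # ~43 seconds
    ```

    Even this toy model shows why assuming a null dirty page rate underestimates durations: with the dirty rate set to 0, the estimate collapses to memory_bytes / bandwidth_bps.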

    Scheduling Live-Migrations for Fast, Adaptable and Energy-Efficient Relocation Operations

    Every day, numerous VMs are migrated inside a datacenter to balance the load, save energy, or prepare production servers for maintenance. Although VM placement problems are carefully studied, the underlying migration schedulers rely on vague, ad-hoc models. This leads to unnecessarily long and energy-intensive migrations. We present mVM, a new and extensible migration scheduler. mVM takes into account the VM memory workload and the network topology to estimate migration durations precisely and make wiser scheduling decisions. mVM is implemented as a plugin of BtrPlace and can be customized with additional scheduling constraints to finely control the migrations. Experiments on a real testbed show that mVM outperforms schedulers that cap migration parallelism by a constant to reduce the completion time. Compared to an optimal capping, mVM reduces the migration duration by 20.4% on average and the completion time by 28.1%. In a maintenance operation involving 96 VMs to migrate between 72 servers, mVM saves 21.5% of the energy consumed compared to BtrPlace. Finally, its current library of 6 constraints allows administrators to address temporal and energy concerns, for example to adapt the schedule to fit a power budget.
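    Among the constraint families mentioned above, precedence rules are the easiest to illustrate. As a minimal sketch (assumed semantics and hypothetical names; this is not mVM's or BtrPlace's actual API), starting migrations under "must finish before" rules reduces to a topological ordering of the precedence graph:

    ```python
    # Order migrations so that each one starts only after the migrations it
    # depends on have completed. Illustrative only; a real scheduler also
    # picks start times and bandwidth shares.

    from graphlib import TopologicalSorter

    def order_migrations(precedences):
        """precedences maps each migration to the set of migrations that
        must complete before it may start."""
        return list(TopologicalSorter(precedences).static_order())

    # Example: vm2 and vm3 may only move once vm1 has vacated its server.
    print(order_migrations({"vm1": set(), "vm2": {"vm1"}, "vm3": {"vm1"}}))
    # e.g. ['vm1', 'vm2', 'vm3']
    ```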

    Cluster-Wide Context Switch of Virtualized Jobs

    Clusters are mostly used through Resource Management Systems (RMS) with a static allocation of resources for a bounded amount of time. Such approaches are known to be insufficient for an efficient use of clusters. To provide a finer RMS, job preemption, migration, and dynamic allocation of resources are required. However, due to the complexity of developing and using such mechanisms, advanced scheduling strategies have rarely been deployed. This trend is currently evolving thanks to the migration and preemption capabilities of Virtual Machines (VMs). However, although manipulating jobs composed of VMs makes it possible to change the state of the jobs according to the scheduling objective, changing the state and the location of numerous VMs at each decision is tedious and degrades the overall performance. In addition to implementing the scheduling policy, developers have to ensure the feasibility of the actions while executing them in the most efficient way. In this paper, we argue that such an operation is independent of the policy itself and can be addressed through a generic mechanism, the cluster-wide context switch. Thanks to it, developers can implement sophisticated algorithms to schedule jobs without handling the issues related to their manipulation. They only focus on implementing their algorithm to select the jobs to run, while the cluster-wide context switch system performs the actions necessary to move from the current situation to the new one. As a proof of concept, we evaluate the interest of the cluster-wide context switch through a sample scheduler that executes jobs as early as possible, even partially, according to their current resource requirements and their priority.
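    To make the mechanism concrete, here is a minimal sketch (our own illustration, with assumed job states and action names) of the core of a cluster-wide context switch: diffing the current situation against the one chosen by the scheduling policy and emitting the VM-level actions that bridge the two:

    ```python
    # Compute the actions needed to move the jobs from their current state to
    # the target state. States are ('running', node) or ('suspended', None).
    # Suspensions are emitted first so they free capacity for the resumes and
    # migrations that follow. Illustrative semantics only.

    def context_switch(current, target):
        actions = []
        for job, tgt in target.items():
            cur = current.get(job, ("suspended", None))
            if cur == tgt:
                continue
            if tgt[0] == "suspended":
                actions.insert(0, ("suspend", job))
            elif cur[0] == "suspended":
                actions.append(("resume", job, tgt[1]))
            else:
                actions.append(("migrate", job, cur[1], tgt[1]))
        return actions

    current = {"j1": ("running", "n1"), "j2": ("running", "n2")}
    target = {"j1": ("suspended", None), "j2": ("running", "n1")}
    print(context_switch(current, target))
    # -> [('suspend', 'j1'), ('migrate', 'j2', 'n2', 'n1')]
    ```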

    Context Switching for Virtualized Jobs at Cluster Scale

    Nowadays, the resources of a cluster are managed by allocating time slices to applications, specified statically by the users. For a user, either the requested resources are over-estimated and the cluster is under-used, or they are under-estimated and the computations are in most cases lost. The advent of virtualization has brought some flexibility to the management of applications and cluster resources. However, to optimize the use of these resources and free users from hazardous estimations, it becomes necessary to allocate resources dynamically according to the real needs of the applications: to be able to start an application dynamically when a resource becomes free, or to suspend it when the resource must be re-assigned. In other words, to be able to develop a system comparable to the context switch on standard computers, but for applications running on a cluster. Building on virtualization, developing such a mechanism in a generic way becomes feasible. In this article we propose an infrastructure providing the notion of context switching for virtualized applications at cluster scale. This solution made it possible to develop a scheduler that runs as many virtualized applications simultaneously as possible. We show that such a solution increases the occupancy rate of our cluster and reduces the processing time of the applications.

    Controlled Scheduling of Live Migrations

    Live-migrating a virtual machine (VM) is a basic operation in a datacenter. Every day, VMs are migrated to balance the load, save energy, or prepare production servers for maintenance. Although VM placement problems are widely studied, the management of the migrations that move the system to these new placements remains a secondary concern. We thus observe high-quality placement algorithms coupled with scheduling algorithms that make poor decisions because of unrealistic assumptions. In this paper we present mVM, a migration scheduler relying on a precise model of the network and of the live-migration protocol. This scheduler has been integrated in place of the one used by the BtrPlace VM manager. Our first experiments show that migration durations are estimated within 1.5 seconds. This precision allowed better schedules to be computed, reducing migration durations by a factor of 3.5 compared to BtrPlace.

    Entropy: a Consolidation Manager for Clusters

    Clusters provide powerful computing environments, but in practice much of this power goes to waste due to the static allocation of tasks to nodes, regardless of their changing computational requirements. Consolidation is an approach that migrates tasks within a cluster as their computational requirements change, both to reduce the number of nodes that need to be active and to eliminate temporary overload situations. Previous consolidation strategies have relied on task placement heuristics that use only local optimization and typically do not take migration overhead into account. However, heuristics based on only local optimization may miss the globally optimal solution, resulting in unnecessary resource usage, and the overhead for migration may nullify the benefits of consolidation. In this paper, we propose the Entropy resource manager for homogeneous clusters, which performs consolidation based on constraint programming and takes migration overhead into account. The use of constraint programming allows Entropy to find mappings of tasks to nodes that are better than those found by heuristics based on local optimizations, and that are frequently globally optimal in the number of nodes. Because migration overhead is taken into account, Entropy chooses migrations that can be implemented efficiently, incurring a low performance overhead.
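    The objective Entropy optimizes, minimizing active nodes first and migration cost second, can be made explicit in a few lines. The brute-force search below is our own illustration of that objective; Entropy itself relies on constraint programming, not enumeration:

    ```python
    # Find a capacity-respecting mapping of tasks to nodes that minimizes the
    # number of active nodes, breaking ties by the number of migrations away
    # from the current mapping. Exponential enumeration: illustration only.

    from itertools import product

    def consolidate(tasks, nodes, capacity, demand, current):
        best_key, best_map = None, None
        for assign in product(nodes, repeat=len(tasks)):
            mapping = dict(zip(tasks, assign))
            load = {n: 0 for n in nodes}
            for t, n in mapping.items():
                load[n] += demand[t]
            if any(load[n] > capacity[n] for n in nodes):
                continue  # violates a node's capacity
            key = (sum(1 for n in nodes if load[n] > 0),               # active nodes
                   sum(1 for t in tasks if mapping[t] != current[t]))  # migrations
            if best_key is None or key < best_key:
                best_key, best_map = key, mapping
        return best_map

    print(consolidate(["t1", "t2"], ["n1", "n2"],
                      {"n1": 4, "n2": 4}, {"t1": 2, "t2": 2},
                      {"t1": "n1", "t2": "n2"}))
    # -> {'t1': 'n1', 't2': 'n1'}: one active node, at the cost of one migration
    ```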

    The Increasing NValue Constraint

    This article introduces the Increasing NValue constraint, which restricts the number of distinct values assigned to a sequence of variables, such that each variable in the sequence is less than or equal to the variable immediately following it. This constraint is a specialization of the NValue constraint, motivated by the need to break symmetries. It is well known that propagating the NValue constraint is an NP-hard problem. We show that the specialization to an ordered sequence of variables makes the problem polynomial. We propose an arc-consistency algorithm with a time complexity of O(ΣD), where ΣD is the sum of the domain sizes. This algorithm is a significant improvement, in terms of complexity, over the algorithms derived from representing the Increasing NValue constraint with automata or with the SLIDE constraint. We use our constraint in the context of a resource allocation problem.
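    The constraint itself is easy to state operationally. The minimal checker below is our own illustration of its semantics on fully assigned variables; the paper's contribution is the O(ΣD) arc-consistency propagator over domains, not this test:

    ```python
    # Increasing NValue(N, x1..xk) holds iff the sequence is non-decreasing
    # and N equals the number of distinct values it takes.

    def increasing_nvalue(n, xs):
        return all(a <= b for a, b in zip(xs, xs[1:])) and n == len(set(xs))

    assert increasing_nvalue(3, [1, 1, 2, 5, 5])   # distinct values: {1, 2, 5}
    assert not increasing_nvalue(3, [1, 5, 2])     # not non-decreasing
    ```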

    Planning Live-Migrations to Prepare Servers for Maintenance

    In a virtualized data center, server maintenance is a common but still critical operation. A prerequisite is indeed to relocate the Virtual Machines (VMs) running on the production servers elsewhere, to prepare the servers for the maintenance. When the maintenance involves several servers, this may lead to a costly relocation of many VMs, so the migration plan must be chosen wisely. This, however, implies mastering the numerous human, technical, and economical aspects that play a role in the design of a quality migration plan. In this paper, we study migration plans that can be decided by an operator to prepare for a hardware upgrade or a server refresh over multiple servers. We exhibit performance bottlenecks and pitfalls that reduce the plan's efficiency. We then discuss and validate possible improvements deduced from the knowledge of the environment's peculiarities.

    Dynamic Consolidation of Highly Available Web Applications

    Datacenters provide an economical and practical solution for hosting large-scale n-tier Web applications. When scalability and high availability are required, each tier can be implemented as multiple replicas, which can absorb extra load and avoid a single point of failure. Realizing these benefits in practice, however, requires that replicas be assigned to datacenter nodes according to certain placement constraints. To provide the required quality of service to all of the hosted applications, the datacenter must consider all of their specific constraints. When the constraints are not satisfied, the datacenter must quickly adjust the mappings of applications to nodes, taking all of the applications' constraints into account. This paper presents Plasma, an approach for hosting highly available Web applications based on dynamic consolidation of virtual machines and placement constraint descriptions. The placement constraint descriptions allow the datacenter administrator to describe the datacenter infrastructure, and each application administrator to describe his requirements on the VM placement. Based on the descriptions, Plasma continuously optimizes the placement of the VMs in order to provide the required quality of service. Experiments on simulated configurations show that the Plasma reconfiguration algorithm is able to manage a datacenter with up to 2000 nodes running 4000 VMs with 800 placement constraints. Real experiments on a small cluster of 8 working nodes running 3 instances of the RUBiS benchmark with a total of 21 VMs show that continuous consolidation is able to reach 85% of the load of a cluster of 21 working nodes.
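    A typical example of the placement constraints involved here is anti-affinity: the replicas of a tier must run on distinct nodes so that a single node failure cannot take the whole tier down. The checker below is our own sketch of that semantics, not Plasma's actual configuration language:

    ```python
    # A 'spread' constraint: the replicas of one tier must all be placed on
    # different nodes. Illustrative semantics only.

    def spread_satisfied(placement, replicas):
        """placement maps a VM to its node; replicas lists the VMs of one tier."""
        nodes = [placement[vm] for vm in replicas]
        return len(nodes) == len(set(nodes))

    placement = {"web1": "n1", "web2": "n2", "db1": "n1", "db2": "n1"}
    assert spread_satisfied(placement, ["web1", "web2"])
    assert not spread_satisfied(placement, ["db1", "db2"])  # both on n1
    ```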