Search CORE

718 research outputs found

Utility-based Reinforcement Learning for Reactive Grids

Author: Germain-Renaud Cécile
Kégl Balázs
Loomis C.
Perez Julien
Publication venue: HAL CCSD
Publication date: 01/05/2008
Field of study

International audienceLarge scale production grids are an important case for autonomic computing. They follow a mutualization paradigm: decision-making (human or automatic) is distributed and largely independent, and, at the same time, it must implement the highlevel goals of the grid management. This paper deals with the scheduling problem with two partially conflicting goals: fairshare and Quality of Service (QoS). Fair sharing is a wellknown issue motivated by return on investment for participating institutions. Differentiated QoS has emerged as an important and unexpected requirement in the current usage of production grids. In the framework of the EGEE grid (one of the largest existing grids), applications from diverse scientific communities require a pseudo-interactive response time. More generally, seamless integration of the grid power into everyday use calls for unplanned and interactive access to grid resources, which defines reactive grids. The major result of this paper is that the combination of utility functions and reinforcement learning (RL) provides a general and efficient method for dynamically allocating grid resources in order to satisfy both end users with differentiated requirements and participating institutions. Combining RL methods and utility functions for resource allocation was pioneered by Tesauro and Vengerov. While the application contexts are different, the resource allocation issues are very similar. The main difference in our work is that we consider a multi-criteria optimization problem that includes a fair-share objective. A first contribution of our work is the definition of a set of variables describing states and actions that allows us to formulate the grid scheduling problem as a continuous action-state space reinforcement learning problem. To capture the immediate goals of end users and the long-term objectives of administrators, we propose automatically derived utility functions. Finally, our experimental results on a synthetic workload and a real EGEE trace show that RL clearly outperforms the classical schedulers, so it is a realistic alternative to empirical scheduler design

HAL-CentraleSupelec

HAL-IN2P3

INRIA a CCSD electronic archive server

HAL-Rennes 1

Flexible Grid service management through resource partitioning

Author: De Leenheer Marc
De Turck Filip
Demeester Piet
Dhoedt Bart
Thysebaert Pieter
Volckaert Bruno
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2006
Field of study

Ghent University Academic Bibliography

A service broker for Intercloud computing

Author: Jrad Foued
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2014
Field of study

This thesis aims at assisting users in finding the most suitable Cloud resources taking into account their functional and non-functional SLA requirements. A key feature of the work is a Cloud service broker acting as mediator between consumers and Clouds. The research involves the implementation and evaluation of two SLA-aware match-making algorithms by use of a simulation environment. The work investigates also the optimal deployment of Multi-Cloud workflows on Intercloud environments

KITopen

Integrative Dynamic Reconfiguration in a Parallel Stream Processing Engine

Author: Cao Jianneng
Madsen Kasper Grud Skat
Zhou Yongluan
Publication venue
Publication date: 11/02/2016
Field of study

Load balancing, operator instance collocations and horizontal scaling are critical issues in Parallel Stream Processing Engines to achieve low data processing latency, optimized cluster utilization and minimized communication cost respectively. In previous work, these issues are typically tackled separately and independently. We argue that these problems are tightly coupled in the sense that they all need to determine the allocations of workloads and migrate computational states at runtime. Optimizing them independently would result in suboptimal solutions. Therefore, in this paper, we investigate how these three issues can be modeled as one integrated optimization problem. In particular, we first consider jobs where workload allocations have little effect on the communication cost, and model the problem of load balance as a Mixed-Integer Linear Program. Afterwards, we present an extended solution called ALBIC, which support general jobs. We implement the proposed techniques on top of Apache Storm, an open-source Parallel Stream Processing Engine. The extensive experimental results over both synthetic and real datasets show that our techniques clearly outperform existing approaches

arXiv.org e-Print Archive

Crossref

University of Southern Denmark Research Output

Resource management of replicated service systems provisioned in the cloud

Author: Binder Walter
Björkqvist Mathias
Publication venue
Publication date: 08/06/2015
Field of study

Service providers seek scalable and cost-effective cloud solutions for hosting their applications. Despite significant recent advances facilitating the deployment and management of services on cloud platforms, a number of challenges still remain. Service providers are confronted with time-varying requests for the provided applications, inter- dependencies between different components, performance variability of the procured virtual resources, and cost structures that differ from conventional data centers. Moreover, fulfilling service level agreements, such as the throughput and response time percentiles, becomes of paramount importance for ensuring business advantages.In this thesis, we explore service provisioning in clouds from multiple points of view. The aim is to best provide service replicas in the form of VMs to various service applications, such that their tail throughput and tail response times, as well as resource utilization, meet the service level agreements in the most cost effective manner. In particular, we develop models, algorithms and replication strategies that consider multi-tier composed services provisioned in clouds. We also investigate how a service provider can opportunistically take advantage of observed performance variability in the cloud. Finally, we provide means of guaranteeing tail throughput and response times in the face of performance variability of VMs, using Markov chain modeling and large deviation theory. We employ methods from analytical modeling, event-driven simulations and experiments. Overall, this thesis provides not only a multi-faceted approach to exploring several crucial aspects of hosting services in clouds, i.e., cost, tail throughput, and tail response times, but our proposed resource management strategies are also rigorously validated via trace-driven simulation and extensive experiment

RERO DOC Digital Library

A load-sharing architecture for high performance optimistic simulations on multi-core machines

Author: Pellegrini Alessandro
Quaglia Francesco
Vitali Roberto
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012
Field of study

In Parallel Discrete Event Simulation (PDES), the simulation model is partitioned into a set of distinct Logical Processes (LPs) which are allowed to concurrently execute simulation events. In this work we present an innovative approach to load-sharing on multi-core/multiprocessor machines, targeted at the optimistic PDES paradigm, where LPs are speculatively allowed to process simulation events with no preventive verification of causal consistency, and actual consistency violations (if any) are recovered via rollback techniques. In our approach, each simulation kernel instance, in charge of hosting and executing a specific set of LPs, runs a set of worker threads, which can be dynamically activated/deactivated on the basis of a distributed algorithm. The latter relies in turn on an analytical model that provides indications on how to reassign processor/core usage across the kernels in order to handle the simulation workload as efficiently as possible. We also present a real implementation of our load-sharing architecture within the ROme OpTimistic Simulator (ROOT-Sim), namely an open-source C-based simulation platform implemented according to the PDES paradigm and the optimistic synchronization approach. Experimental results for an assessment of the validity of our proposal are presented as well

Crossref

ART

Archivio della ricerca- Università di Roma La Sapienza