121 research outputs found

    Correlated Resource Models of Internet End Hosts

    Understanding and modelling the resources of Internet end hosts is essential for the design of desktop software and Internet-distributed applications. In this paper we develop a correlated resource model of Internet end hosts based on real trace data from the SETI@home project. This data covers a 5-year period with statistics for 2.7 million hosts. The resource model is based on statistical analysis of host computational power, memory, and storage, as well as how these resources change over time and the correlations between them. We find that resources with few discrete values (core count, memory) are well modelled by exponential laws governing the change of relative resource quantities over time. Resources with a continuous range of values are well modelled with either correlated normal distributions (processor speed for integer and floating-point operations) or log-normal distributions (available disk space). We validate and show the utility of the models by applying them to a resource allocation problem for Internet-distributed applications, and demonstrate their value over other models. We also make our trace data and our tool for automatically generating realistic Internet end hosts publicly available.
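
    Purely as an illustration of the modelling approach described above, the sketch below samples synthetic hosts from correlated normal distributions (integer and floating-point speed) and a log-normal distribution (available disk space). All numeric parameters are assumed placeholders, not the values fitted from the SETI@home traces.

```python
# Minimal sketch of a synthetic host generator in the spirit of the model:
# correlated normals for integer/floating-point speed, log-normal for disk.
# The parameters below are illustrative placeholders, NOT fitted values.
import numpy as np

rng = np.random.default_rng(42)

def sample_hosts(n):
    # Integer and floating-point speeds (MIPS/MFLOPS-like units), correlated.
    mean_speed = [2500.0, 2200.0]                # assumed means
    cov_speed = [[400.0**2, 0.8 * 400 * 380],    # assumed correlation ~0.8
                 [0.8 * 400 * 380, 380.0**2]]
    int_speed, fp_speed = rng.multivariate_normal(mean_speed, cov_speed, size=n).T

    # Available disk space (GB), heavy-tailed: log-normal.
    disk_gb = rng.lognormal(mean=4.0, sigma=1.0, size=n)  # assumed parameters

    return np.column_stack([int_speed, fp_speed, disk_gb])

hosts = sample_hosts(1000)
print(hosts[:3])  # three synthetic hosts: [int_speed, fp_speed, disk_gb]
```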

    On Correlated Availability in Internet Distributed Systems

    As computer networks rapidly increase in size and speed, Internet-distributed systems such as P2P, volunteer computing, and Grid systems are increasingly common. A precise and accurate characterization of Internet resources is important for the design and evaluation of such Internet-distributed systems, yet our picture of the Internet landscape is not perfectly clear. To improve this picture, we measure and characterize the time dynamics of availability in a large-scale Internet-distributed system with over 110,000 hosts. Our characterization focuses on identifying patterns of correlated availability. We determine scalable and accurate clustering techniques and distance metrics for automatically detecting significant availability patterns. By means of clustering, we identify groups of resources with correlated availability that exhibit similar time effects. We then show how these correlated clusters of resources can be used to improve resource management for parallel applications in the context of volunteer computing.
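
    As an illustration of the clustering step, the sketch below groups hosts by the similarity of their binary availability traces using a Hamming distance and average-linkage hierarchical clustering; the paper evaluates several techniques and metrics, so this particular combination and the toy data are assumptions.

```python
# Illustrative sketch only: cluster hosts by the similarity of their binary
# availability traces (1 = available in that hour, 0 = not), using a plain
# Hamming distance with average-linkage hierarchical clustering.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Toy data: 200 hosts x 168 hourly availability samples (one week),
# standing in for real measurements.
traces = rng.integers(0, 2, size=(200, 168))

# Pairwise distances between availability vectors, then agglomerative clustering.
dist = pdist(traces, metric="hamming")
tree = linkage(dist, method="average")
labels = fcluster(tree, t=10, criterion="maxclust")  # ask for up to 10 clusters

for c in np.unique(labels)[:3]:
    members = np.where(labels == c)[0]
    print(f"cluster {c}: {len(members)} hosts, "
          f"mean availability {traces[members].mean():.2f}")
```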

    Decision Model for Cloud Computing under SLA Constraints

    With the recent introduction of Spot Instances in the Amazon Elastic Compute Cloud (EC2), users can bid for resources and thus control the balance of reliability versus monetary cost. A critical challenge is to determine bid prices that minimize monetary costs for a user while meeting Service Level Agreement (SLA) constraints (for example, sufficient resource availability to complete a computation within a desired deadline). We propose a probabilistic model for the optimization of monetary costs, performance, and reliability, given user and application requirements and dynamic conditions. Using real instance price traces and workload models, we evaluate our model and demonstrate how users should bid optimally on Spot Instances to reach different objectives with desired levels of confidence.
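
    The sketch below illustrates the underlying bidding question in a heavily simplified, empirical form: given a historical price trace, pick the cheapest bid whose observed probability of keeping the instance alive for the whole job length meets a confidence target. It is not the paper's probabilistic model, and the trace and candidate bids are made up.

```python
# Hedged sketch of bid selection from a historical spot-price trace.
from typing import List, Optional

def choose_bid(prices: List[float], job_hours: int,
               candidate_bids: List[float], confidence: float) -> Optional[float]:
    """prices: one spot price per hour; the job needs `job_hours` consecutive
    hours with price <= bid. Returns the cheapest bid whose empirical success
    probability over all windows in the trace is >= confidence."""
    n = len(prices)
    windows = [prices[i:i + job_hours] for i in range(n - job_hours + 1)]
    for bid in sorted(candidate_bids):
        ok = sum(all(p <= bid for p in w) for w in windows)
        if windows and ok / len(windows) >= confidence:
            return bid
    return None  # no candidate bid reaches the requested confidence

# Example with a made-up price trace (USD/hour):
trace = [0.031, 0.030, 0.035, 0.029, 0.052, 0.031, 0.030, 0.030, 0.033, 0.031]
print(choose_bid(trace, job_hours=3,
                 candidate_bids=[0.030, 0.035, 0.040, 0.060], confidence=0.7))
```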

    Intermediate QoS Prototype for the EDGI Infrastructure

    This document provides the first deliverable of EDGI JRA2. It is produced by the INRIA team, the SZTAKI team, the LAL/IN2P3 team and the University of Coimbra team. It describes the achievements and results of the JRA2 tasks "Advanced QoS Scheduler and Oracle" and "Support In Science Gateway". Hybrid Distributed Computing Infrastructures (DCIs) allow users to combine Grids, Desktop Grids, Clouds, and similar systems to obtain large computing capabilities; the EDGI infrastructure belongs to this kind of DCI. The document presents the SpeQuloS framework, which provides quality of service (QoS) for applications executed on the EDGI infrastructure. It also introduces the EDGI QoS portal, a user-friendly, integrated access point to QoS features for users of the EDGI infrastructure. We first introduce new results from task JRA2.1, which collected and analyzed batch executions on Desktop Grids. We then present the advanced Cloud Scheduling and Oracle strategies designed inside the SpeQuloS framework (task JRA2.2), and demonstrate the efficiency of these strategies through a performance evaluation carried out with simulations. Next, we introduce the Credit System architecture and the QoS user portal as part of the JRA2 Support In Science Gateway (task JRA2.3). Finally, we conclude and provide references to JRA2 production.
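
    As a rough illustration of the kind of decision a QoS scheduler and oracle can take for a batch running on a Desktop Grid, the sketch below extrapolates completion time from observed progress and provisions cloud workers only when the deadline would otherwise be missed; the function name, inputs, and thresholds are hypothetical and do not reproduce SpeQuloS's actual strategies.

```python
# Hypothetical sketch: decide how many cloud workers to add so a batch on a
# desktop grid can still meet its deadline. Not SpeQuloS's actual algorithm.
def cloud_workers_needed(tasks_total: int, tasks_done: int,
                         elapsed_s: float, deadline_s: float,
                         cloud_task_rate: float) -> int:
    """Return the number of cloud workers to start (0 if the desktop grid
    alone is predicted to meet the deadline). cloud_task_rate = tasks/s/worker."""
    if tasks_done == 0:
        return 0  # not enough progress information yet
    grid_rate = tasks_done / elapsed_s                       # observed throughput
    remaining = tasks_total - tasks_done
    predicted_finish = elapsed_s + remaining / grid_rate
    if predicted_finish <= deadline_s:
        return 0
    time_left = max(deadline_s - elapsed_s, 1.0)
    deficit_rate = remaining / time_left - grid_rate         # extra throughput needed
    return max(0, int(-(-deficit_rate // cloud_task_rate)))  # ceiling division

print(cloud_workers_needed(tasks_total=10_000, tasks_done=4_000,
                           elapsed_s=3_600, deadline_s=7_200,
                           cloud_task_rate=0.05))
```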

    Characterizing Result Errors in Internet Desktop Grids

    Desktop grids use the free resources in Intranet and Internet environments for large-scale computation and storage. While desktop grids offer a high return on investment, one critical issue is the validation of results returned by participating hosts. Several mechanisms for result validation have been previously proposed; however, the characterization of errors remains poorly understood. To study error rates, we implemented and deployed a desktop grid application across several thousand hosts distributed over the Internet. We then analyzed the results to give a quantitative, empirical characterization of error rates. We find that in practice, error rates are widespread across hosts but occur relatively infrequently. Moreover, we find that error rates tend to be neither stationary over time nor correlated between hosts. In light of these characterization results, we evaluate state-of-the-art error detection mechanisms and describe the trade-offs of using each mechanism. Finally, based on our empirical results, we conduct a benefit analysis of a recently proposed error detection mechanism tailored for long-running applications. This mechanism is based on digests of intermediate checkpoints, and we show in theory and simulation that its relative benefit compared to the state of the art is as high as 45%.
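
    To make the checkpoint-digest idea concrete, the sketch below hashes intermediate checkpoints and flags the first point at which two replicas of a task diverge; the file handling and the two-replica setup are illustrative assumptions, not the mechanism's exact protocol.

```python
# Illustrative sketch of error detection via digests of intermediate
# checkpoints: replicas of the same task report a hash of each checkpoint, and
# a mismatch flags a divergent (possibly erroneous) replica before the final
# result arrives. File names and the two-replica setup are assumptions.
import hashlib

def checkpoint_digest(path: str) -> str:
    """SHA-256 digest of a checkpoint file, streamed in 1 MiB blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def first_divergence(digests_a: list, digests_b: list):
    """Compare the ordered digests reported by two replicas; return the index
    of the first mismatching checkpoint, or None if they agree so far."""
    for i, (a, b) in enumerate(zip(digests_a, digests_b)):
        if a != b:
            return i
    return None

# Usage sketch: digests collected from two replicas after each checkpoint.
replica1 = ["a3f...", "9bc...", "77d..."]
replica2 = ["a3f...", "9bc...", "f01..."]
print(first_divergence(replica1, replica2))  # -> 2: diverged at third checkpoint
```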

    Mining for Availability Models in Large-Scale Distributed Systems: A Case Study of SETI@home

    In the age of cloud, Grid, P2P, and volunteer distributed computing, large-scale systems with tens of thousands of unreliable hosts are increasingly common. Invariably, these systems are composed of heterogeneous hosts whose individual availability often exhibits different statistical properties (for example, stationary versus non-stationary behaviour) and fits different models (for example, Exponential, Weibull, or Pareto probability distributions). In this paper, we describe an effective method for discovering subsets of hosts whose availability has similar statistical properties and can be modelled with similar probability distributions. We apply this method to about 230,000 host availability traces obtained from a real large-scale Internet-distributed system, namely SETI@home. We find that about 34% of hosts exhibit availability that is a truly random process, and that these hosts can often be modelled accurately with a few distinct distributions from different families. We believe that this characterization is fundamental to the design of stochastic scheduling algorithms for large-scale systems where host availability is uncertain.
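
    As an illustration of the model-fitting step, the sketch below fits Exponential, Weibull, and Pareto distributions to a synthetic sample of availability durations and compares them with a Kolmogorov-Smirnov statistic; the data and the goodness-of-fit choice are assumptions standing in for the paper's traces and tests.

```python
# Sketch of fitting candidate distributions to availability interval lengths
# and comparing them with a Kolmogorov-Smirnov statistic. Synthetic data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
intervals = rng.weibull(0.7, size=500) * 10.0  # synthetic availability durations (hours)

candidates = {
    "exponential": stats.expon,
    "weibull": stats.weibull_min,
    "pareto": stats.pareto,
}

for name, dist in candidates.items():
    params = dist.fit(intervals, floc=0)       # fix location at 0 for durations
    ks_stat, p_value = stats.kstest(intervals, dist.cdf, args=params)
    print(f"{name:12s} KS={ks_stat:.3f} p={p_value:.3f}")
```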

    DSL-Lab: a Low-power Lightweight Platform to Experiment on Domestic Broadband Internet

    This article presents the design and construction of DSL-Lab, a platform for experimenting with distributed computing over broadband domestic Internet. Experimental platforms such as PlanetLab and Grid'5000 are promising methodological approaches for studying distributed systems. However, both platforms focus on high-end service and network deployments available on only a restricted part of the Internet, leaving aside the possibility for researchers to experiment in conditions close to what is usually available with a domestic connection to the Internet. DSL-Lab is a complementary approach to PlanetLab and Grid'5000 for experimenting with distributed computing in an environment closer to how the Internet appears when applications run on end-user PCs. DSL-Lab is a set of 40 low-power and low-noise nodes, hosted by participants and using the participants' xDSL or cable access to the Internet. The objective is to provide a validation and experimentation platform for new protocols, services, simulators, and emulators for these systems. In this paper, we report on the software design (security, resource allocation, power management) as well as on the first experiments carried out.

    DSL-Lab: a Platform to Experiment on Domestic Broadband Internet

    This report presents the design and construction of DSL-Lab, a platform for distributed computing and peer-to-peer experiments over the domestic broadband Internet. Experimental platforms such as PlanetLab and Grid'5000 are promising methodological approaches for studying distributed systems. However, both platforms focus on high-end services and network deployments on only a restricted part of the Internet, and as such they do not provide the experimental conditions of residential broadband networks. DSL-Lab is composed of 40 low-power and low-noise nodes, hosted by participants and using the participants' xDSL or cable access to the Internet. The objective is twofold: 1) to provide accurate and customized measurements of availability, activity, and performance in order to characterize and tune models of such resources; 2) to provide an experimental platform for new protocols, services, and applications, as well as a validation tool for simulators and emulators targeting these systems. In this report, we describe the software infrastructure (security, resource allocation, power management) as well as the first results and experiments.