1,789 research outputs found

    On Correlated Availability in Internet Distributed Systems

    No full text
    International audienceAs computer networks rapidly increase in size and speed, Internet-distributed systems such as P2P, volunteer computing, and Grid systems are increasingly common. A precise and accurate characterization of Internet resources is important for the design and evaluation of such Internet-distributed systems, yet our picture of the Internet landscape is not perfectly clear. To improve this picture, we measure and characterize the time dynamics of availability in a large-scale Internet-distributed system with over 110,000 hosts. Our characterization focuses on identifying patterns of correlated availability. We determine scalable and accurate clustering techniques and distance metrics for automatically detecting significant availability patterns. By means of clustering, we identify groups of resources with correlated availability that exhibit similar time effects. Then we show how these correlated clusters of resources can be used to improve resource management for parallel applications in the context of volunteer computing

    Statistical Modeling of Resource Availability in Desktop Grids

    Get PDF
    Desktop grids are compute platforms that aggregate and harvest the idle CPU cycles of individually owned personal computers and workstations. A challenge for using these platforms is that the compute resources are volatile. Due to this volatility the vast majority of desktop grid applications are embarrassingly parallel and high-throughput. Deeper understanding of the nature of resource availability is needed to enable the use of desktop grids for a broader class of applications. In this document we further this understanding thanks to statistical analysis of availability traces collected on real-world desktop grid platforms

    PFS: A Productivity Forecasting System For Desktop Computers To Improve Grid Applications Performance In Enterprise Desktop Grid

    Get PDF
    An Enterprise Desktop Grid (EDG) is a low cost platform that gathers desktop computers spread over different institutions. This platform uses desktop computers idle time to run Grid applications. We argue that computers in these environments have a predictable productivity that affects a Grid application execution time. In this paper, we propose a system called PFS for computer productivity forecasting that improves Grid applications performance. We simulated 157.500 applications and compared the performance achieved by our proposal against two recent strategies. Our experiments show that a Grid scheduler based on PFS runs applications faster than schedulers based on other selection strategies.Fil: Salinas, Sergio Ariel. Universidad Nacional de Cuyo; ArgentinaFil: Garcia Garino, Carlos Gabriel. Universidad Nacional de Cuyo; ArgentinaFil: Zunino Suarez, Alejandro Octavio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tandil. Instituto Superior de Ingenieria del Software; Argentin

    PFS: A Productivity Forecasting System for Desktop Computers to Improve Grid Applications Performance in Enterprise Desktop Grid

    Get PDF
    An Enterprise Desktop Grid (EDG) is a low cost platform that gathers desktop computers spread over different institutions. This platform uses desktop computers idle time to run Grid applications. We argue that computers in these environments have a predictable productivity that affects a Grid application execution time. In this paper, we propose a system called PFS for computer productivity forecasting that improves Grid applications performance. We simulated 157.500 applications and compared the performance achieved by our proposal against two recent strategies. Our experiments show that a Grid scheduler based on PFS runs applications faster than schedulers based on other selection strategies

    MOON: MapReduce On Opportunistic eNvironments

    Get PDF
    Abstract—MapReduce offers a flexible programming model for processing and generating large data sets on dedicated resources, where only a small fraction of such resources are every unavailable at any given time. In contrast, when MapReduce is run on volunteer computing systems, which opportunistically harness idle desktop computers via frameworks like Condor, it results in poor performance due to the volatility of the resources, in particular, the high rate of node unavailability. Specifically, the data and task replication scheme adopted by existing MapReduce implementations is woefully inadequate for resources with high unavailability. To address this, we propose MOON, short for MapReduce On Opportunistic eNvironments. MOON extends Hadoop, an open-source implementation of MapReduce, with adaptive task and data scheduling algorithms in order to offer reliable MapReduce services on a hybrid resource architecture, where volunteer computing systems are supplemented by a small set of dedicated nodes. The adaptive task and data scheduling algorithms in MOON distinguish between (1) different types of MapReduce data and (2) different types of node outages in order to strategically place tasks and data on both volatile and dedicated nodes. Our tests demonstrate that MOON can deliver a 3-fold performance improvement to Hadoop in volatile, volunteer computing environments

    Intermediate QoS Prototype for the EDGI Infrastructure

    Get PDF
    This document provides the first deliverable of EDGI JRA2. It is produced by the INRIA team, the SZTAKI team, the LAL/IN2P3 team and the University of Coimbra team. This document aims at describing achievements and results of JRA2 tasks "Advanced QoS Scheduler and Oracle" and "Support In Science Gateway". Hybrid Distributed Computing Infrastructures (DCIs) allow users to combine Grids, Desktop Grids, Clouds, etc. to obtain for their users large computing capabilities. The EDGI infrastructure belongs to this kind of DCIs. The document presents the SpeQuloS framework to provide quality of service (QoS) for application executed on the EDGI infrastructure. It also introduces EDGI QoS portal, an user-friendly and integrated access to QoS features for users of EDGI infrastructure. In this document, we first introduce new results from JRA2.1 task, which collected and analyzed batch execution on Desktop Grid. Then, we present the advanced Cloud Scheduling and Oracle strategies designed inside the SpeQuloS framework (task JRA2.2). We demonstrate efficiency of these strategies using performance evaluation carried out with simulations. Next, we introduce Credit System architecture and QoS user portal as part of the JRA2 Support In Science Gateway (task JRA2.3). Finally, we conclude and provide references to JRA2 production.Ce document fournit le premier livrable pour la tâche JRA2 du projet européen European Desktop Grid Initiative (FP7 EDGI). Il est produit par les équipes de l'INRIA, de SZTAKI, du LAL/IN2P3 et de l'Université de Coimbra. Ce document vise à décrire les réalisations et les résultats qui concernent la qualité de service pour l'infrastructure de grilles de PCs européenne EDGI

    Correlated Resource Models of Internet End Hosts

    Get PDF
    Understanding and modelling resources of Internet end hosts is essential for the design of desktop software and Internet-distributed applications. In this paper we develop a correlated resource model of Internet end hosts based on real trace data taken from the SETI@home project. This data covers a 5-year period with statistics for 2.7 million hosts. The resource model is based on statistical analysis of host computational power, memory, and storage as well as how these resources change over time and the correlations between them. We find that resources with few discrete values (core count, memory) are well modeled by exponential laws governing the change of relative resource quantities over time. Resources with a continuous range of values are well modeled with either correlated normal distributions (processor speed for integer operations and floating point operations) or log-normal distributions (available disk space). We validate and show the utility of the models by applying them to a resource allocation problem for Internet-distributed applications, and demonstrate their value over other models. We also make our trace data and tool for automatically generating realistic Internet end hosts publicly available

    Enhancing reliability with Latin Square redundancy on desktop grids.

    Get PDF
    Computational grids are some of the largest computer systems in existence today. Unfortunately they are also, in many cases, the least reliable. This research examines the use of redundancy with permutation as a method of improving reliability in computational grid applications. Three primary avenues are explored - development of a new redundancy model, the Replication and Permutation Paradigm (RPP) for computational grids, development of grid simulation software for testing RPP against other redundancy methods and, finally, running a program on a live grid using RPP. An important part of RPP involves distributing data and tasks across the grid in Latin Square fashion. Two theorems and subsequent proofs regarding Latin Squares are developed. The theorems describe the changing position of symbols between the rows of a standard Latin Square. When a symbol is missing because a column is removed the theorems provide a basis for determining the next row and column where the missing symbol can be found. Interesting in their own right, the theorems have implications for redundancy. In terms of the redundancy model, the theorems allow one to state the maximum makespan in the face of missing computational hosts when using Latin Square redundancy. The simulator software was developed and used to compare different data and task distribution schemes on a simulated grid. The software clearly showed the advantage of running RPP, which resulted in faster completion times in the face of computational host failures. The Latin Square method also fails gracefully in that jobs complete with massive node failure while increasing makespan. Finally an Inductive Logic Program (ILP) for pharmacophore search was executed, using a Latin Square redundancy methodology, on a Condor grid in the Dahlem Lab at the University of Louisville Speed School of Engineering. All jobs completed, even in the face of large numbers of randomly generated computational host failures

    A Toolkit for Simulation of Desktop Grid Environment

    Get PDF
    Peer to Peers, clusters and grids enable a combination of heterogeneous distributed recourses to resolve problems in different fields such as science, engineering and commerce. Organizations within the world wide grid environment network are offering geographically distributed resources which are administrated by schedulers and policies. Studying the resources behavior is time consuming due to their unique behavior and uniqueness. In this type of environment it is nearly impossible to prove the effectiveness of a scheduling algorithm. Hence the main objective of this study is to develop a desktop grid simulator toolkit for measuring and modeling scheduler algorithm performance. The selected methodology for the application development is based on prototyping methodology. The prototypes will be developed using JAVA language united with a MySQL database. Core functionality of the simulator are job generation, volunteer generation, simulating algorithms, generating graphical charts and generating reports. A simulator for desktop grid environment has been developed using Java as the implementation language due to its wide popularity. The final system has been developed after a successful delivery of two prototypes. Despite the implementation of the mentioned core functionalities of a desktop grid simulator, advanced features such as viewing real-time graphical charts, generating PDF reports of the simulation result and exporting the final result as CSV files has been also included among the other features

    Effective Scheduling of Grid Resources Using Failure Prediction

    Get PDF
    In large-scale grid environments, accurate failure prediction is critical to achieve effective resource allocation while assuring specified QoS levels, such as reliability. Traditional methods, such as statistical estimation techniques, can be considered to predict the reliability of resources. However, naive statistical methods often ignore critical characteristic behavior of the resources. In particular, periodic behaviors of grid resources are not captured well by statistical methods. In this paper, we present an alternative mechanism for failure prediction. In our approach, the periodic pattern of resource failures are determined and actively exploited for resource allocation with better QoS guarantees. The proposed scheme is evaluated under a realistic simulation environment of computational grids. The availability of computing resources are simulated according to real trace that was collected from our large-scale monitoring experiment on campus computers. Our evaluation results show that the proposed approach enables significantly higher resource scheduling effectiveness under a variety of workloads compared to baseline approaches
    corecore