Search CORE

6,366 research outputs found

A Bag-of-Tasks Scheduler Tolerant to Temporal Failures in Clouds

Author: Arantes Luciana
Drummond Lúcia Maria de A.
Sens Pierre
Teylo Luan
Publication venue
Publication date: 24/10/2018
Field of study

Cloud platforms have emerged as a prominent environment to execute high performance computing (HPC) applications providing on-demand resources as well as scalability. They usually offer different classes of Virtual Machines (VMs) which ensure different guarantees in terms of availability and volatility, provisioning the same resource through multiple pricing models. For instance, in Amazon EC2 cloud, the user pays per hour for on-demand VMs while spot VMs are unused instances available for lower price. Despite the monetary advantages, a spot VM can be terminated, stopped, or hibernated by EC2 at any moment. Using both hibernation-prone spot VMs (for cost sake) and on-demand VMs, we propose in this paper a static scheduling for HPC applications which are composed by independent tasks (bag-of-task) with deadline constraints. However, if a spot VM hibernates and it does not resume within a time which guarantees the application's deadline, a temporal failure takes place. Our scheduling, thus, aims at minimizing monetary costs of bag-of-tasks applications in EC2 cloud, respecting its deadline and avoiding temporal failures. To this end, our algorithm statically creates two scheduling maps: (i) the first one contains, for each task, its starting time and on which VM (i.e., an available spot or on-demand VM with the current lowest price) the task should execute; (ii) the second one contains, for each task allocated on a VM spot in the first map, its starting time and on which on-demand VM it should be executed to meet the application deadline in order to avoid temporal failures. The latter will be used whenever the hibernation period of a spot VM exceeds a time limit. Performance results from simulation with task execution traces, configuration of Amazon EC2 VM classes, and VMs market history confirms the effectiveness of our scheduling and that it tolerates temporal failures

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Learning Scheduling Algorithms for Data Processing Clusters

Author: Abadi Martín
Addanki Ravichandra
Dai Hanjun
Finn Chelsea
Ghodsi Ali
Gog Ionel
Grandl Robert
Greensmith Evan
Hindman Benjamin
Kingma Diederik P
Mao Hongzi
Mao Hongzi
Marcus Ryan
Mirhoseini Azalia
Mirhoseini Azalia
Pinto Lerrel
Schulman John
Spark Apache
Sutton S.
Weaver Lex
Zaharia Matei
Publication venue
Publication date: 21/08/2019
Field of study

Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms. Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this paper, we show that modern machine learning techniques can generate highly-efficient policies automatically. Decima uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time. Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem. To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods for dealing with continuous stochastic job arrivals. Our prototype integration with Spark on a 25-node cluster shows that Decima improves the average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to 2x improvement during periods of high cluster load

arXiv.org e-Print Archive

Crossref

DSpace@MIT

Collocation Games and Their Application to Distributed Resource Management

Author: Bestavros Azer
Londoño Jorge
Teng Shang-Hua
Publication venue: Boston University Computer Science Department
Publication date: 07/02/2009
Field of study

We introduce Collocation Games as the basis of a general framework for modeling, analyzing, and facilitating the interactions between the various stakeholders in distributed systems in general, and in cloud computing environments in particular. Cloud computing enables fixed-capacity (processing, communication, and storage) resources to be offered by infrastructure providers as commodities for sale at a fixed cost in an open marketplace to independent, rational parties (players) interested in setting up their own applications over the Internet. Virtualization technologies enable the partitioning of such fixed-capacity resources so as to allow each player to dynamically acquire appropriate fractions of the resources for unencumbered use. In such a paradigm, the resource management problem reduces to that of partitioning the entire set of applications (players) into subsets, each of which is assigned to fixed-capacity cloud resources. If the infrastructure and the various applications are under a single administrative domain, this partitioning reduces to an optimization problem whose objective is to minimize the overall deployment cost. In a marketplace, in which the infrastructure provider is interested in maximizing its own profit, and in which each player is interested in minimizing its own cost, it should be evident that a global optimization is precisely the wrong framework. Rather, in this paper we use a game-theoretic framework in which the assignment of players to fixed-capacity resources is the outcome of a strategic "Collocation Game". Although we show that determining the existence of an equilibrium for collocation games in general is NP-hard, we present a number of simplified, practically-motivated variants of the collocation game for which we establish convergence to a Nash Equilibrium, and for which we derive convergence and price of anarchy bounds. In addition to these analytical results, we present an experimental evaluation of implementations of some of these variants for cloud infrastructures consisting of a collection of multidimensional resources of homogeneous or heterogeneous capacities. Experimental results using trace-driven simulations and synthetically generated datasets corroborate our analytical results and also illustrate how collocation games offer a feasible distributed resource management alternative for autonomic/self-organizing systems, in which the adoption of a global optimization approach (centralized or distributed) would be neither practical nor justifiable.NSF (CCF-0820138, CSR-0720604, EFRI-0735974, CNS-0524477, CNS-052016, CCR-0635102); Universidad Pontificia Bolivariana; COLCIENCIAS–Instituto Colombiano para el Desarrollo de la Ciencia y la Tecnología "Francisco José de Caldas

Boston University Institutional Repository (OpenBU)

MorphoSys: efficient colocation of QoS-constrained workloads in the cloud

Author: Bestavros Azer
Ishakian Vatche
Publication venue: Computer Science Department, Boston University
Publication date: 25/01/2011
Field of study

In hosting environments such as IaaS clouds, desirable application performance is usually guaranteed through the use of Service Level Agreements (SLAs), which specify minimal fractions of resource capacities that must be allocated for unencumbered use for proper operation. Arbitrary colocation of applications with different SLAs on a single host may result in inefficient utilization of the host’s resources. In this paper, we propose that periodic resource allocation and consumption models -- often used to characterize real-time workloads -- be used for a more granular expression of SLAs. Our proposed SLA model has the salient feature that it exposes flexibilities that enable the infrastructure provider to safely transform SLAs from one form to another for the purpose of achieving more efficient colocation. Towards that goal, we present MORPHOSYS: a framework for a service that allows the manipulation of SLAs to enable efficient colocation of arbitrary workloads in a dynamic setting. We present results from extensive trace-driven simulations of colocated Video-on-Demand servers in a cloud setting. These results show that potentially-significant reduction in wasted resources (by as much as 60%) are possible using MORPHOSYS.National Science Foundation (0720604, 0735974, 0820138, 0952145, 1012798

Crossref

Boston University Institutional Repository (OpenBU)

Workflow Scheduling Techniques and Algorithms in IaaS Cloud: A Survey

Author: Chakravarthi K. Kalyana
Vijayakumar Vaidhehi
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/04/2018
Field of study

In the modern era, workflows are adopted as a powerful and attractive paradigm for expressing/solving a variety of applications like scientific, data intensive computing, and big data applications such as MapReduce and Hadoop. These complex applications are described using high-level representations in workflow methods. With the emerging model of cloud computing technology, scheduling in the cloud becomes the important research topic. Consequently, workflow scheduling problem has been studied extensively over the past few years, from homogeneous clusters, grids to the most recent paradigm, cloud computing. The challenges that need to be addressed lies in task-resource mapping, QoS requirements, resource provisioning, performance fluctuation, failure handling, resource scheduling, and data storage. This work focuses on the complete study of the resource provisioning and scheduling algorithms in cloud environment focusing on Infrastructure as a service (IaaS). We provided a comprehensive understanding of existing scheduling techniques and provided an insight into research challenges that will be a possible future direction to the researchers

IAES journal

Crossref

Institute of Advanced Engineering and Science

A heuristic approach for the allocation of resources in large-scale computing infrastructures

Author: Alrawahi
Amazon
Armbrust
Backhaus
Bapna
Brynjolfsson
Carr
Cripps
Dash
Davenport
Foster
Garey
Garg
Hey
Hooker
Lai
Lehmann
Lewis
Leyton-Brown
Leyton-Brown
M. Vaquero
Marshall
Milojičić
Mossmann
Neumann
Nissan
Nurmi
Pekeĉ
Pepple
Rardin
Reiss
Roberts
Schnizler
Shneidman
Siegle
Smidt
Sotomayor
Stösser
Stösser
Vries
Publication venue: 'Wiley'
Publication date: 01/01/2015
Field of study

An increasing number of enterprise applications are intensive in their consumption of IT, but are infrequently used. Consequently, organizations either host an oversized IT infrastructure or they are incapable of realizing the benefits of new applications. A solution to the challenge is provided by the large-scale computing infrastructures of Clouds and Grids which allow resources to be shared. A major challenge is the development of mechanisms that allow efficient sharing of IT resources. Market mechanisms are promising, but there is a lack of research in scalable market mechanisms. We extend the Multi-Attribute Combinatorial Exchange mechanism with greedy heuristics to address the scalability challenge. The evaluation shows a trade-off between efficiency and scalability. There is no statistical evidence for an influence on the incentive properties of the market mechanism. This is an encouraging result as theory predicts heuristics to ruin the mechanism’s incentive properties. Copyright © 2015 John Wiley & Sons, Ltd

OPUS Augsburg

Deakin Research Online

Crossref

Nottingham Trent Institutional Repository (IRep)

Research Repository

QoS-aware Scientific Application Scheduling Algorithm in Cloud Environment

Author: Mazandarani Ayda
Momeni Hossein
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 29/11/2013
Field of study

Many complex scientific applications are modeled in the form of workflows to carry out large-scale experiments. Because of complexity of scientific processes, scientific workflows need intensive computation and data requirements. Clouds make opportunity for scientific that need high performance computing infrastructure. So scientific can run their application on cloud by their desired QoS. We propose an algorithm that able scientific to select execute plan based on their preference QoS, like time and cost. Proposed algorithm ranks the tasks in workflow and then use UPFF function for select accurate resource, based on user’s QoS. We compared our proposed algorithm with the same work by several scenarios and results show proposed algorithm has better efficiency. Keywords Scientific application, Workflow scheduling, Cloud computin

International Institute for Science, Technology and Education (IISTE): E-Journals

Workflow scheduling for service oriented cloud computing

Author: Fida Adnan
Publication venue: 'University of Saskatchewan Library'
Publication date: 01/01/2008
Field of study

Service Orientation (SO) and grid computing are two computing paradigms that when put together using Internet technologies promise to provide a scalable yet flexible computing platform for a diverse set of distributed computing applications. This practice gives rise to the notion of a computing cloud that addresses some previous limitations of interoperability, resource sharing and utilization within distributed computing. In such a Service Oriented Computing Cloud (SOCC), applications are formed by composing a set of services together. In addition, hierarchical service layers are also possible where general purpose services at lower layers are composed to deliver more domain specific services at the higher layer. In general an SOCC is a horizontally scalable computing platform that offers its resources as services in a standardized fashion. Workflow based applications are a suitable target for SOCC where workflow tasks are executed via service calls within the cloud. One or more workflows can be deployed over an SOCC and their execution requires scheduling of services to workflow tasks as the task become ready following their interdependencies. In this thesis heuristics based scheduling policies are evaluated for scheduling workflows over a collection of services offered by the SOCC. Various execution scenarios and workflow characteristics are considered to understand the implication of the heuristic based workflow scheduling

eCommons@USASK

University of Saskatchewan Research Archive