5,047 research outputs found

    Learning Scheduling Algorithms for Data Processing Clusters

    Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms. Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this paper, we show that modern machine learning techniques can generate highly efficient policies automatically. Decima uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time. Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem. To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods for dealing with continuous stochastic job arrivals. Our prototype integration with Spark on a 25-node cluster shows that Decima improves the average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to 2x improvement during periods of high cluster load.

    Mechanism design for decentralized online machine scheduling

    Traditional optimization models assume a central decision maker who optimizes a global system performance measure. However, problem data is often distributed among several agents, and agents take autonomous decisions. This gives incentives for strategic behavior of agents, possibly leading to sub-optimal system performance. Furthermore, in dynamic environments, machines are locally dispersed and administratively independent. Examples are found both in business and engineering applications. We investigate such issues for a parallel machine scheduling model where jobs arrive online over time. Instead of centrally assigning jobs to machines, each machine implements a local sequencing rule and jobs decide for machines themselves. In this context, we introduce the concept of a myopic best response equilibrium, a concept weaker than the classical dominant strategy equilibrium, but appropriate for online problems. Our main result is a polynomial time, online mechanism that (assuming rational behavior of jobs) results in an equilibrium schedule that is 3.281-competitive with respect to the maximal social welfare. This is only slightly worse than state-of-the-art algorithms with central coordination.

    Games and Mechanism Design in Machine Scheduling – An Introduction

    In this paper, we survey different models, techniques, and some recent results to tackle machine scheduling problems within a distributed setting. In traditional optimization, a central authority is asked to solve a (computationally hard) optimization problem. In contrast, in distributed settings there are several agents, possibly equipped with private information that is not publicly known, and these agents need to interact in order to derive a solution to the problem. Usually the agents have their individual preferences, which induces them to behave strategically in order to manipulate the resulting solution. Nevertheless, one is often interested in the global performance of such systems. The analysis of such distributed settings requires techniques from classical Optimization, Game Theory, and Economic Theory. The paper therefore briefly introduces the most important of the underlying concepts, and gives a selection of typical research questions and recent results, focussing on applications to machine scheduling problems. This includes the study of the so-called price of anarchy for settings where the agents do not possess private information, as well as the design and analysis of (truthful) mechanisms in settings where the agents do possess private information.

    A common framework and taxonomy for multicriteria scheduling problems with Interfering and competing Jobs: Multi-agent scheduling problems

    Most classical scheduling research assumes that the objectives sought are common to all jobs to be scheduled. However, many real-life applications can be modeled by considering different sets of jobs, each one with its own objective(s), and an increasing number of papers addressing these problems has appeared over the last few years. Since so far the area lacks a unified view, the studied problems have received different names (such as interfering jobs, multi-agent scheduling, mixed-criteria, etc.), some authors do not seem to be aware of important contributions in related problems, and solution procedures are often developed without taking into account existing ones. Therefore, the topic is in need of a common framework that allows for a systematic recollection of existing contributions, as well as a clear definition of the main research avenues. In this paper we review multicriteria scheduling problems involving two or more sets of jobs and propose a unified framework providing a common definition, name and notation for these problems. Moreover, we systematically review and classify the existing contributions in terms of the complexity of the problems and the proposed solution procedures, discuss the main advances, and point out future research lines in the topic.

    Modeling and Analysis of Scheduling Problems Containing Renewable Energy Decisions

    With globally increasing energy demands, world citizens are facing one of society's most critical issues: protecting the environment. To reduce the emission of greenhouse gases (GHG), which are by-products of conventional energy resources, people are reducing the consumption of oil, gas, and coal collectively. Meanwhile, interest in renewable energy resources has grown in recent years. Renewable generators can be installed both on the power grid side and end-use customer side of power systems. Energy management in power systems with multiple microgrids containing renewable energy resources has been a focus of industry and researchers as of late. Further, on-site renewable energy provides great opportunities for manufacturing plants to reduce energy costs when faced with time-varying electricity prices. To efficiently utilize on-site renewable energy generation, production schedules and energy supply decisions need to be coordinated. As renewable energy resources like solar and wind energy typically fluctuate with weather variations, the inherent stochastic nature of renewable energy resources makes the decision making of utilizing renewable generation complex. In this dissertation, we study a power system with one main grid (arbiter) and multiple microgrids (agents). The microgrids (MGs) are equipped to control their local generation and demand in the presence of uncertain renewable generation and heterogeneous energy management settings. We propose an extension to the classical two-stage stochastic programming model to capture these interactions by modeling the arbiter's problem as the first-stage master problem and the agent decision problems as second-stage subproblems. To tackle this problem formulation, we propose a sequential sampling-based optimization algorithm that does not require a priori knowledge of probability distribution functions or selection of samples for renewable generation.
The subproblems capture the details of different energy management settings employed at the agent MGs to control heating, ventilation and air conditioning systems; home appliances; industrial production; plug-in electrical vehicles; and storage devices. Computational experiments conducted on the US western interconnect (WECC-240) data set illustrate that the proposed algorithm is scalable and our solutions are statistically verifiable. Our results also show that the proposed framework can be used as a systematic tool to gauge (a) the impact of energy management settings in efficiently utilizing renewable generation and (b) the role of flexible demands in reducing system costs. Next, we present a two-stage, multi-objective stochastic program for flow shops with sequence-dependent setups in order to meet production schedules while managing energy costs. The first stage provides optimal schedules to minimize the total completion time, while the second stage makes energy supply decisions to minimize energy costs under a time-of-use electricity pricing scheme. Power demand for production is met by on-site renewable generation, supply from the main grid, and an energy storage system. An ε-constraint algorithm integrated with an L-shaped method is proposed to analyze the problem. Sets of Pareto optimal solutions are provided for decision-makers and our results show that the energy cost of setup operations is relatively high such that it cannot be ignored. Further, using solar or wind energy can save significant energy costs with solar energy being the more viable option of the two for reducing costs. Finally, we extend the flow shop scheduling problem to a job shop environment under hour-ahead real-time electricity pricing schemes. The objectives of interest are to minimize total weighted completion time and energy costs simultaneously. 
Besides renewable generation, hour-ahead real-time electricity pricing is another source of uncertainty in this study as electricity prices are released to customers only hours in advance of consumption. A mathematical model is presented and an ε-constraint algorithm is used to tackle the bi-objective problem. Further, to improve computational efficiency and generate solutions in a practically acceptable amount of time, a hybrid multi-objective evolutionary algorithm based on the Non-dominated Sorting Genetic Algorithm II (NSGA-II) is developed. Five methods are developed to calculate chromosome fitness values. Computational tests show that the mathematical model and our proposed algorithm yield comparable solutions, while our algorithm produces them much more quickly. Using a single method (rather than five) to generate schedules can further reduce computational time without significantly degrading solution quality.
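
The ε-constraint idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the dissertation's model: it assumes a small, hypothetical list of already-evaluated candidate schedules, each scored by a pair (total weighted completion time, energy cost), and the function name `epsilon_constraint` and the sample values are invented for illustration. For each bound ε on energy cost, it picks the candidate with the smallest completion time among those within the bound, and sweeping ε traces out Pareto-optimal trade-offs.

```python
from typing import List, Tuple

# Hypothetical candidate schedules, each scored as
# (total weighted completion time, energy cost). Values are made up.
Candidate = Tuple[float, float]


def epsilon_constraint(candidates: List[Candidate],
                       epsilons: List[float]) -> List[Candidate]:
    """For each epsilon, minimize objective 1 subject to objective 2 <= epsilon."""
    front: List[Candidate] = []
    for eps in sorted(epsilons):
        # Keep only candidates whose energy cost respects the current bound.
        feasible = [c for c in candidates if c[1] <= eps]
        if not feasible:
            continue
        # Minimize completion time; break ties on energy cost so that
        # weakly dominated candidates are never selected.
        best = min(feasible, key=lambda c: (c[0], c[1]))
        if best not in front:
            front.append(best)
    return front


candidates = [(10.0, 9.0), (12.0, 6.0), (15.0, 4.0), (14.0, 7.0), (20.0, 4.0)]
front = epsilon_constraint(candidates, epsilons=[4.0, 6.0, 7.0, 9.0])
print(front)  # [(15.0, 4.0), (12.0, 6.0), (10.0, 9.0)]
```

In a real setting the inner minimization would be an optimization model solved once per ε rather than a scan over an enumerated list; the sweep over ε is the part this sketch is meant to show.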