Learning Scheduling Algorithms for Data Processing Clusters
Efficiently scheduling data processing jobs on distributed compute clusters
requires complex algorithms. Current systems, however, use simple generalized
heuristics and ignore workload characteristics, since developing and tuning a
scheduling policy for each workload is infeasible. In this paper, we show that
modern machine learning techniques can generate highly-efficient policies
automatically. Decima uses reinforcement learning (RL) and neural networks to
learn workload-specific scheduling algorithms without any human instruction
beyond a high-level objective such as minimizing average job completion time.
Off-the-shelf RL techniques, however, cannot handle the complexity and scale of
the scheduling problem. To build Decima, we had to develop new representations
for jobs' dependency graphs, design scalable RL models, and invent RL training
methods for dealing with continuous stochastic job arrivals. Our prototype
integration with Spark on a 25-node cluster shows that Decima improves the
average job completion time over hand-tuned scheduling heuristics by at least
21%, achieving up to 2x improvement during periods of high cluster load.
Minimizing the sum of flow times with batching and delivery in a supply chain
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The aim of this thesis is to study one of the classical scheduling objectives, that of minimizing the sum of flow times, in the context of a supply chain network. We consider the situation in which a supplier schedules a set of jobs for delivery in batches to several manufacturers, who in turn have to schedule and deliver jobs in batches to several customers. The individual problems from the viewpoints of the supplier and the manufacturers are considered separately. The decision problem faced by the supplier is that of minimizing the sum of flow time and delivery cost of a set of jobs to be processed on a single machine for delivery in batches to manufacturers. The problem from the viewpoint of the manufacturer is similar to the supplier's problem; the only difference is that the scheduling, batching and delivery decisions made by the supplier define a release date for each job, before which the manufacturer cannot start processing that job. A combined problem in the light of cooperation between the supplier and the manufacturer is also considered. The objective of the combined problem is to find the best scheduling, batching, and delivery decisions that benefit the entire system, including the supplier and the manufacturer. Structural properties of each problem are investigated and used to devise a branch and bound solution scheme. Computational experience shows significant improvements over existing algorithms and also shows that cooperation between a supplier and a manufacturer reduces the total system cost by up to 12.35%, while theoretically a reduction of up to 20% can be achieved for special cases.
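The supplier's objective described above (sum of flow times plus delivery cost) can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: all jobs are available at time zero, a batch is delivered as soon as its last job finishes, a job's flow time is the delivery time of its batch, and every delivery costs the same fixed amount. The function name `batch_cost` and its parameters are hypothetical. The sketch only evaluates a given schedule and is not the thesis's branch and bound scheme.

```python
from typing import List


def batch_cost(batches: List[List[float]], delivery_cost: float) -> float:
    """Evaluate one schedule on a single machine.

    `batches` lists the processing times of the jobs in each batch, in the
    order the machine processes them. Assumed model: a batch is delivered
    when its last job completes, and each job's flow time equals the
    delivery time of its batch.
    """
    clock = 0.0
    total = 0.0
    for batch in batches:
        clock += sum(batch)          # machine finishes the whole batch
        total += clock * len(batch)  # every job in the batch flows until delivery
        total += delivery_cost       # one fixed charge per delivery
    return total


# Three jobs with processing times 1, 2, 3 and a delivery cost of 5.
print(batch_cost([[1, 2], [3]], 5))      # two deliveries
print(batch_cost([[1, 2, 3]], 5))        # one delivery
print(batch_cost([[1], [2], [3]], 5))    # three deliveries
```

Under these assumptions, fewer batches save delivery charges but make early jobs wait for later ones, which is the trade-off the batching decision has to resolve.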
Scheduling of Batch Processors in Semiconductor Manufacturing – A Review
In this paper, a review of the scheduling of batch processors (SBP) in semiconductor manufacturing (SM) is presented. It classifies SBP in SM into 12 groups. The suggested classification scheme organizes the SBP in SM literature and summarizes the current research results for different problem types. The classification results are presented based on various distributions, and the various methodologies applied to SBP in SM are briefly highlighted. A comprehensive list of references is presented. It is hoped that this review will provide a source for other researchers and readers interested in SBP in SM research and help stimulate further interest. Singapore-MIT Alliance (SMA)
Online Primal-Dual For Non-linear Optimization with Applications to Speed Scaling
We reinterpret some online greedy algorithms for a class of nonlinear
"load-balancing" problems as solving a mathematical program online. For
example, we consider the problem of assigning jobs to (unrelated) machines to
minimize the sum of the alpha^{th}-powers of the loads plus assignment costs
(the online Generalized Assignment Problem); or choosing paths to connect
terminal pairs to minimize the alpha^{th}-powers of the edge loads (online
routing with speed-scalable routers). We give analyses of these online
algorithms using the dual of the primal program as a lower bound for the
optimal algorithm, much in the spirit of online primal-dual results for linear
problems.
We then observe that a wide class of uni-processor speed scaling problems
(with essentially arbitrary scheduling objectives) can be viewed as such load
balancing problems with linear assignment costs. This connection gives new
algorithms for problems that had resisted solutions using the dominant
potential function approaches used in the speed scaling literature, as well as
alternate, cleaner proofs for other known results.
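To make the load-balancing objective concrete, the following Python sketch implements a generic greedy rule for it: each arriving job goes to the machine where the increase in load^alpha plus the assignment cost is smallest. This is an illustration under assumptions, not necessarily the algorithm analyzed in the paper. The function name `greedy_assign` and the input format (per-machine processing times and assignment costs for every job) are hypothetical.

```python
from typing import List, Tuple


def greedy_assign(
    jobs: List[Tuple[List[float], List[float]]],
    num_machines: int,
    alpha: float,
) -> Tuple[List[int], List[float], float]:
    """Assign jobs online, one at a time, by smallest marginal cost.

    Each job is a pair (proc, cost): proc[i] is the load the job adds to
    machine i, and cost[i] is the cost of assigning it there.
    Returns the chosen machines, the final loads, and the objective
    sum_i load_i**alpha + total assignment cost.
    """
    loads = [0.0] * num_machines
    assignment = []
    assignment_cost = 0.0
    for proc, cost in jobs:
        def marginal(i: int) -> float:
            # Increase in the objective if this job goes to machine i.
            return (loads[i] + proc[i]) ** alpha - loads[i] ** alpha + cost[i]

        best = min(range(num_machines), key=marginal)
        loads[best] += proc[best]
        assignment_cost += cost[best]
        assignment.append(best)
    objective = sum(load ** alpha for load in loads) + assignment_cost
    return assignment, loads, objective


# Two machines, alpha = 2, three unit-size jobs; the last one is costly on machine 1.
jobs = [([1, 1], [0, 0]), ([1, 1], [0, 0]), ([1, 1], [0, 5])]
print(greedy_assign(jobs, num_machines=2, alpha=2))
```

Ties are broken toward the lowest machine index here, which is an arbitrary choice of this sketch.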
Speed Scaling for Energy Aware Processor Scheduling: Algorithms and Analysis
We present theoretical algorithmic research of processor scheduling in an energy aware environment using the mechanism of speed scaling. We have two main goals in mind. The first is the development of algorithms that allow more energy efficient utilization of resources. The second goal is to further our ability to reason abstractly about energy in computing devices by developing and understanding algorithmic models of energy management. In order to achieve these goals, we investigate three classic process scheduling problems in the setting of a speed scalable processor.
Integer stretch is one of the most obvious classical scheduling objectives that has yet to be considered in the speed scaling setting. For the objective of integer stretch plus energy, we give an online scheduling algorithm that, for any input, produces a schedule with integer stretch plus energy that is competitive with the integer stretch plus energy of any schedule that finishes all jobs.
Second, we consider the problem of finding the schedule, S, that minimizes some quality of service objective Q plus B times the energy used by the processor. This schedule, S, is the optimal energy trade-off schedule in the sense that: no schedule can have better quality of service given the current investment of energy used by S, and, an additional investment of one unit of energy is insufficient to improve the quality of service by more than B. When Q is fractional weighted flow, we show that the optimal energy trade-off schedule is unique and has a simple structure, thus making it easy to check the optimality of a schedule. We further show that the optimal energy trade-off schedule can be computed with a natural homotopic optimization algorithm.
Lastly, we consider the speed scaling problem where the quality of service objective is deadline feasibility and the power objective is temperature. In the case of batched jobs, we give a simple algorithm to compute the optimal schedule. For general instances, we give a new online algorithm and show that it has a competitive ratio that is an order of magnitude better than the best previously known for this problem.
An Improved Drift Theorem for Balanced Allocations
In the balanced allocations framework, there are jobs (balls) to be
allocated to servers (bins). The goal is to minimize the gap, the
difference between the maximum and the average load.
Peres, Talwar and Wieder (RSA 2015) used the hyperbolic cosine potential
function to analyze a large family of allocation processes including the
(1+beta)-process and graphical balanced allocations. The key ingredient was
to prove that the potential drops in every step, i.e., a drift inequality.
In this work we improve the drift inequality so that (i) it is asymptotically
tighter, (ii) it assumes weaker preconditions, (iii) it applies not only to
processes allocating to more than one bin in a single step and (iv) to
processes allocating a varying number of balls depending on the sampled bin.
Our applications include the processes of (RSA 2015), but also several new
processes, and we believe that our techniques may lead to further results in
future work. Comment: This paper refines and extends the content on the drift theorem and
applications in arXiv:2203.13902. It consists of 38 pages, 7 figures, 1 table.
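The gap defined above (maximum load minus average load) can be measured empirically with a short Python simulation. The sketch assumes the (1+beta)-process as introduced by Peres, Talwar and Wieder: each ball samples two bins uniformly at random and goes to the less loaded one with probability beta, and otherwise goes to a single uniformly random bin. The function name `simulate_gap` and the seeding scheme are hypothetical, and the simulation illustrates the quantity being bounded rather than the drift theorem itself.

```python
import random


def simulate_gap(n_bins: int, n_balls: int, beta: float, seed: int = 0) -> float:
    """Run the (1+beta)-process once and return max load minus average load."""
    rng = random.Random(seed)  # local generator so runs are reproducible
    loads = [0] * n_bins
    for _ in range(n_balls):
        chosen = rng.randrange(n_bins)
        if rng.random() < beta:
            # Two-choice step: sample a second bin and keep the lighter one.
            other = rng.randrange(n_bins)
            if loads[other] < loads[chosen]:
                chosen = other
        loads[chosen] += 1
    return max(loads) - n_balls / n_bins


# beta = 0 is the one-choice process, beta = 1 is the two-choice process.
for beta in (0.0, 0.5, 1.0):
    print(beta, simulate_gap(n_bins=100, n_balls=20000, beta=beta))
```

In runs of this kind the gap typically shrinks as beta grows, since more balls get the benefit of a second sample.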