Smooth Inequalities and Equilibrium Inefficiency in Scheduling Games
We study coordination mechanisms for Scheduling Games (with unrelated
machines). In these games, each job represents a player, who needs to choose a
machine for its execution and aims to complete as early as possible. Our goal
is to design scheduling policies that always admit a pure Nash equilibrium and
guarantee a small price of anarchy for the l_k-norm social cost --- an
objective that balances overall quality of service and fairness. We consider
policies with different amounts of knowledge about jobs: non-clairvoyant,
strongly-local and local. The analysis relies on the smoothness argument
together with adequate inequalities, called smooth inequalities. With this
unified framework, we are able to prove the following results.
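For reference, the standard smoothness machinery that such arguments build on
(stated here in its textbook form; the paper's smooth inequalities refine it,
so the exact shape below is only illustrative): with job completion times
C_j(s), the l_k-norm social cost of an outcome s is

  \mathrm{cost}(s) = \Big( \sum_j C_j(s)^k \Big)^{1/k},

and a game is (\lambda, \mu)-smooth if, for every pair of outcomes s and s^*,

  \sum_j c_j(s^*_j, s_{-j}) \le \lambda \cdot \mathrm{cost}(s^*) + \mu \cdot \mathrm{cost}(s),

which immediately bounds the price of anarchy by \lambda / (1 - \mu).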
First, we study the inefficiency, under l_k-norm social costs, of the
strongly-local policy SPT and the non-clairvoyant policy EQUI. We show that
the price of anarchy of the policy SPT is O(k). We also prove a lower bound of
Omega(k/log k) for all deterministic, non-preemptive, strongly-local,
non-waiting policies (non-waiting policies produce schedules without idle
times). Together, these results show that SPT is close to optimal within this
class with respect to l_k-norm social costs. Moreover, we prove that the
non-clairvoyant policy EQUI has price of anarchy O(2^k).
Second, we consider the makespan (l_infty-norm) social cost by relating it to
the l_k-norm functions. We revisit some local policies and give simpler,
unified proofs from the framework's point of view. As a highlight of the
approach, we derive a local policy, Balance, which guarantees a price of
anarchy of O(log m), making it the best currently known policy among the
anonymous local policies that always admit a pure Nash equilibrium.
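To make the SPT policy concrete, here is a minimal sketch (in Python; the job
data, the fixed assignment and all helper names are hypothetical, not from the
paper): each machine runs its assigned jobs in increasing processing-time
order, and the l_k-norm social cost is computed from the completion times.

def spt_completion_times(assignment, proc):
    """assignment[j]: machine chosen by job j; proc[i][j]: processing time
    of job j on machine i (unrelated machines). Returns completion times."""
    queues = {}
    for job, m in enumerate(assignment):
        queues.setdefault(m, []).append(job)
    completion = {}
    for m, jobs in queues.items():
        jobs.sort(key=lambda j: proc[m][j])  # SPT: shortest job first
        t = 0.0
        for j in jobs:
            t += proc[m][j]                  # non-preemptive, no idle time
            completion[j] = t
    return completion

def lk_social_cost(completion, k):
    return sum(c ** k for c in completion.values()) ** (1.0 / k)

# Example: 2 machines, 3 jobs with unrelated processing times.
proc = [[2.0, 1.0, 4.0], [3.0, 2.0, 1.0]]
times = spt_completion_times([0, 0, 1], proc)
print(times, lk_social_cost(times, k=2))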
Better Unrelated Machine Scheduling for Weighted Completion Time via Random Offsets from Non-Uniform Distributions
In this paper we consider the classic scheduling problem of minimizing total
weighted completion time on unrelated machines when jobs have release times,
i.e., R | r_{ij} | \sum_j w_j C_j in the three-field notation. For this
problem, a 2-approximation is known, based on a novel convex programming
relaxation (Skutella, J. ACM 2001). It has been a long-standing open problem
whether one can improve upon this 2-approximation (Open Problem 8 in J. of
Sched. 1999 by Schuurman and Woeginger). We answer this question in the
affirmative by giving a 1.8786-approximation. We achieve this via a
surprisingly simple linear program, combined with a novel rounding algorithm
and analysis. A key ingredient of our algorithm is the use of random offsets
sampled from non-uniform distributions.
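To give a flavor of the offset technique, here is a hedged sketch (the
exponential density, the LP output format and every name below are
illustrative assumptions, not the paper's actual construction): each job's
LP-derived tentative time is perturbed by a random offset drawn, via inverse
transform sampling, from a non-uniform density on [0, 1], and jobs are then
sequenced by their perturbed times.

import math
import random

def sample_offset(c=1.5):
    """Inverse-transform sample from the density f(x) = c*e^(c*x)/(e^c - 1)
    on [0, 1]; an illustrative non-uniform choice, not the paper's."""
    u = random.random()
    # CDF F(x) = (e^(c*x) - 1) / (e^c - 1), inverted below.
    return math.log(1.0 + u * (math.exp(c) - 1.0)) / c

def schedule_order(lp_time):
    """lp_time[j]: fractional completion time of job j from a hypothetical
    LP solution. Returns the jobs in order of offset-perturbed times."""
    perturbed = {j: t * sample_offset() for j, t in lp_time.items()}
    return sorted(perturbed, key=perturbed.get)

# Example with made-up LP values:
print(schedule_order({0: 3.2, 1: 1.1, 2: 2.4}))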
We also consider the preemptive version of the problem, i.e.,
R | r_{ij}, pmtn | \sum_j w_j C_j. We again use the idea of sampling offsets
from non-uniform distributions to give the first better-than-2 approximation
for this problem. This improvement also requires the use of a configuration
LP, with a variable for each of a job's complete schedules, along with a more
careful analysis. For both the non-preemptive and preemptive versions, we
break the approximation barrier of 2 for the first time.
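A generic configuration LP of this flavor (an illustrative formulation; the
paper's actual variables and constraints may differ) has a variable x_{j,S}
for every job j and complete schedule S of that job:

  \min \sum_{j,S} w_j \, C(S) \, x_{j,S}
  \quad \text{s.t.} \quad
  \sum_S x_{j,S} = 1 \;\; \forall j, \qquad
  \sum_j \sum_{S \ni (i,t)} x_{j,S} \le 1 \;\; \forall (i,t), \qquad
  x_{j,S} \ge 0,

where C(S) is the completion time of schedule S and (i, t) ranges over
machine-time slots.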
Revisiting Size-Based Scheduling with Estimated Job Sizes
We study size-based schedulers and focus on the impact of inaccurate job
size information on response time and fairness. Our intent is to revisit
previous results, which allude to performance degradation even for small
errors in job size estimates, thus limiting the applicability of size-based
schedulers.
We show that scheduling performance is tightly connected to workload
characteristics: in the absence of large skew in the job size distribution,
even extremely imprecise estimates suffice to outperform size-oblivious
disciplines; when job sizes are heavily skewed, however, known size-based
disciplines suffer.
In this context, we show -- for the first time -- the dichotomy between
over-estimation and under-estimation. The former is, in general, less
problematic than the latter, as its effects are localized to individual jobs;
under-estimation, instead, leads to severe problems that may affect a large
number of jobs.
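A minimal simulation sketch of this effect (all distributions, parameters and
the error model below are illustrative assumptions, not the paper's
experimental setup): a discrete-time server preemptively runs the job with the
smallest estimated remaining size, where the estimated remaining size is the
estimate minus the service received; an under-estimated job thus reaches an
estimate of zero and monopolizes the server while real work remains.

import random

def mean_response(jobs):
    """jobs: list of (arrival_time, true_size, estimated_size)."""
    t, done, resp = 0, 0, 0.0
    remaining = [s for _, s, _ in jobs]
    served = [0.0] * len(jobs)
    while done < len(jobs):
        active = [j for j, (a, _, _) in enumerate(jobs)
                  if a <= t and remaining[j] > 0]
        if active:
            # Shortest estimated remaining processing time first.
            j = min(active, key=lambda j: jobs[j][2] - served[j])
            remaining[j] -= 1
            served[j] += 1
            if remaining[j] <= 0:
                resp += t + 1 - jobs[j][0]
                done += 1
        t += 1
    return resp / len(jobs)

random.seed(1)
base = [(random.randint(0, 5000), max(1, int(random.paretovariate(1.5))))
        for _ in range(500)]
exact = [(a, s, float(s)) for a, s in base]
over = [(a, s, s * random.uniform(1.0, 5.0)) for a, s in base]
under = [(a, s, s * random.uniform(0.2, 1.0)) for a, s in base]
for name, js in [("exact", exact), ("over", over), ("under", under)]:
    print(name, mean_response(js))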
We present an approach to mitigate these problems: our technique requires no
complex modifications to the original scheduling policies and performs very
well. To support this claim, we provide a simulation-based evaluation that
covers an unprecedentedly large parameter space and takes into account a
variety of synthetic and real workloads.
As a consequence, we show that size-based scheduling is practical and
outperforms alternatives in a wide array of use cases, even in the presence
of inaccurate size information.
Backfilling with fairness and slack for parallel job scheduling
Parallel jobs have different runtimes and numbers of threads/processes, so scheduling parallel jobs involves a packing problem: if jobs are packed as tightly as possible, utilization improves; otherwise, some resources stay idle. The common solution for dealing with idle resources is backfilling, which schedules smaller jobs submitted later to execute earlier, as long as they do not postpone the first job (or, in the conservative variant, any previous job) in the waiting queue. Traditionally, backfilling fills idle resources first-fit, in submission order, which can miss better packings of jobs. Hence, we propose an algorithm that looks further ahead in the queue when doing so significantly improves utilization; a sketch of the baseline scheme appears below. Since looking ahead can be unfair to jobs ahead in the queue, we use a delay factor as a constraint to limit unfairness. Concretely, we propose a branch-and-bound algorithm that selects jobs for backfilling so as to keep utilization high while staying close to First-Come-First-Served (FCFS). We evaluate relative response time and utilization, and compare to other backfilling approaches. The selection of jobs for backfilling that optimizes for high utilization and low delay is implemented as an extension of the existing Scojo-PECT preemptive scheduler.
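For orientation, a minimal EASY-style backfilling sketch (illustrative only:
the paper replaces the first-fit scan below with a branch-and-bound selection
and adds a delay-factor fairness constraint; all data structures here are
hypothetical):

def backfill(queue, free_nodes, now, running):
    """queue: FCFS list of (job_id, nodes, est_runtime), whose head job
    cannot start now; running: list of (finish_time, nodes) of active jobs.
    Returns later jobs that may start now without delaying the head job."""
    if not queue:
        return []
    _, head_nodes, _ = queue[0]
    # Reservation for the head job: scan running jobs by finish time until
    # enough nodes have been released.
    avail, t_res = free_nodes, now
    for finish, n in sorted(running):
        if avail >= head_nodes:
            break
        avail, t_res = avail + n, finish
    spare_at_res = avail - head_nodes  # nodes the head job won't need then
    started = []
    for job_id, need, est in queue[1:]:  # first-fit over the waiting queue
        fits_now = need <= free_nodes
        ends_before_res = now + est <= t_res
        if fits_now and (ends_before_res or need <= spare_at_res):
            started.append(job_id)
            free_nodes -= need
            if not ends_before_res:
                spare_at_res -= need
    return started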
Scheduling for today’s computer systems: bridging theory and practice
Scheduling is a fundamental technique for improving performance in computer systems. From web servers
to routers to operating systems, how the bottleneck device is scheduled has an enormous impact on the performance of the system as a whole. Given the immense literature studying scheduling, it is easy to think that we already understand enough about scheduling. But modern computer system designs have highlighted a number of disconnects between traditional analytic results and the needs of system designers.
In particular, the idealized policies, metrics, and models used by analytic researchers do not match the policies, metrics, and scenarios that appear in real systems.
The goal of this thesis is to take a step towards modernizing the theory of scheduling in order to provide
results that apply to today’s computer systems, and thus ease the burden on system designers. To accomplish
this goal, we provide new results that help to bridge each of the disconnects mentioned above. We will move beyond the study of idealized policies by introducing a new analytic framework where the focus is on scheduling heuristics and techniques rather than individual policies. By moving beyond the study of individual policies, our results apply to the complex hybrid policies that are often used in practice. For example, our results enable designers to understand how the policies that favor small job sizes are affected by the fact that real systems only have estimates of job sizes. In addition, we move beyond the study of mean response time
and provide results characterizing the distribution of response time and the fairness of scheduling policies.
These results allow us to understand how scheduling affects QoS guarantees and whether favoring small job sizes results in large jobs being treated unfairly. Finally, we move beyond the simplified models traditionally used in scheduling research and provide results characterizing the effectiveness of scheduling in multiserver systems and when users are interactive. These results allow us to answer questions about how to design multiserver systems and how to choose a workload generator when evaluating new scheduling designs.
Preemptive Thread Block Scheduling with Online Structural Runtime Prediction for Concurrent GPGPU Kernels
Recent NVIDIA Graphics Processing Units (GPUs) can execute multiple kernels
concurrently. On these GPUs, the thread block scheduler (TBS) uses the FIFO
policy to schedule their thread blocks. We show that FIFO leaves performance to
chance, resulting in significant loss of performance and fairness. To improve
performance and fairness, we propose using the preemptive Shortest Remaining
Time First (SRTF) policy instead. Although SRTF requires an estimate of the
runtime of GPU kernels, we show that such an estimate can be obtained easily
using online profiling and by exploiting a simple observation about GPU
kernels' grid structure. Specifically, we propose a novel Structural Runtime
Predictor. Using a simple Staircase model of GPU kernel execution, we show that
the runtime of a kernel can be predicted by profiling only the first few thread
blocks. We evaluate an online predictor based on this model on benchmarks from
ERCBench, and find that it can estimate the actual runtime reasonably well
after the execution of only a single thread block. Next, we design a thread
block scheduler that is both concurrent kernel-aware and uses this predictor.
We implement the SRTF policy and evaluate it on two-program workloads from
ERCBench. SRTF improves system throughput (STP) by 1.18x and average
normalized turnaround time (ANTT) by 2.25x over FIFO. When compared
to MPMax, a state-of-the-art resource allocation policy for concurrent kernels,
SRTF improves STP by 1.16x and ANTT by 1.3x. To improve fairness, we also
propose SRTF/Adaptive which controls resource usage of concurrently executing
kernels to maximize fairness. SRTF/Adaptive improves STP by 1.12x, ANTT by
2.23x and Fairness by 2.95x compared to FIFO. Overall, our implementation of
SRTF achieves system throughput to within 12.64% of Shortest Job First (SJF, an
oracle optimal scheduling policy), bridging 49% of the gap between FIFO and
SJF.
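A back-of-the-envelope version of the Staircase prediction (an illustrative
sketch; the calibration details and the occupancy inputs below are
assumptions, not the paper's implementation): thread blocks execute in
roughly equal-duration waves of hardware-limited concurrency, so timing one
early thread block extrapolates to the whole grid.

import math

def predict_kernel_runtime(block_time, num_blocks, concurrent_blocks):
    """block_time: measured runtime of an early thread block (from online
    profiling); concurrent_blocks: blocks resident at once, e.g. occupancy
    per SM times the number of SMs. The grid then runs in 'waves'."""
    waves = math.ceil(num_blocks / concurrent_blocks)
    return waves * block_time

# Example: 1024 blocks, 128 resident at a time, 2.0 ms per block
# -> 8 waves -> predicted runtime of 16 ms.
print(predict_kernel_runtime(2.0, 1024, 128))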