Towards Optimality in Parallel Scheduling
To keep pace with Moore's law, chip designers have focused on increasing the
number of cores per chip rather than single core performance. In turn, modern
jobs are often designed to run on any number of cores. However, to effectively
leverage these multi-core chips, one must address the question of how many
cores to assign to each job. Given that jobs receive sublinear speedups from
additional cores, there is an obvious tradeoff: allocating more cores to an
individual job reduces the job's runtime, but in turn decreases the efficiency
of the overall system. We ask how the system should schedule jobs across cores
so as to minimize the mean response time over a stream of incoming jobs.
To answer this question, we develop an analytical model of jobs running on a
multi-core machine. We prove that EQUI, a policy which continuously divides
cores evenly across jobs, is optimal when all jobs follow a single speedup
curve and have exponentially distributed sizes. EQUI requires jobs to change
their level of parallelization while they run. Since this is not possible for
all workloads, we consider a class of "fixed-width" policies, which choose a
single level of parallelization, k, to use for all jobs. We prove that,
surprisingly, it is possible to achieve EQUI's performance without requiring
jobs to change their levels of parallelization by using the optimal fixed level
of parallelization, k*. We also show how to analytically derive the optimal k*
as a function of the system load, the speedup curve, and the job size
distribution.
In the case where jobs may follow different speedup curves, finding a good
scheduling policy is even more challenging. We find that policies like EQUI
which performed well in the case of a single speedup function now perform
poorly. We propose a very simple policy, GREEDY*, which performs near-optimally
when compared to the numerically-derived optimal policy.
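The difference between EQUI and a fixed-width policy, as described in the abstract above, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the speedup curve s(k) = k^p, the function names, the allowance for fractional core shares under EQUI, and the rule that jobs which do not fit under a fixed width simply wait.

```python
def speedup(k: float, p: float = 0.5) -> float:
    """Hypothetical sublinear speedup curve s(k) = k**p with 0 < p < 1."""
    return k ** p


def equi_allocation(num_cores: int, num_jobs: int) -> list[float]:
    """EQUI: divide the cores evenly across all jobs currently in the system.

    Assumes fractional core shares are allowed.
    """
    if num_jobs == 0:
        return []
    return [num_cores / num_jobs] * num_jobs


def fixed_width_allocation(num_cores: int, num_jobs: int, k: int) -> list[int]:
    """Fixed-width: every running job gets exactly k cores.

    Assumption: jobs that do not fit wait with 0 cores, in list order.
    """
    running = min(num_jobs, num_cores // k)
    return [k] * running + [0] * (num_jobs - running)


def total_service_rate(allocation, p: float = 0.5) -> float:
    """Sum of per-job speedups: the rate at which the system completes work."""
    return sum(speedup(a, p) for a in allocation if a > 0)


# With 8 cores and 4 jobs, EQUI gives each job 2 cores.
print(equi_allocation(8, 4))
# With 8 cores, 5 jobs and k = 2, four jobs run and one waits.
print(fixed_width_allocation(8, 5, 2))
# Sublinear speedup: two jobs on 4 cores each complete work faster overall
# than one job on all 8 cores.
print(total_service_rate([4, 4]), total_service_rate([8]))
```

The last line shows the tradeoff the abstract refers to: under a sublinear curve, concentrating cores on one job shortens that job's runtime but lowers the total service rate of the system.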
Integrating Job Parallelism in Real-Time Scheduling Theory
We investigate the global scheduling of sporadic, implicit deadline,
real-time task systems on multiprocessor platforms. We provide a task model
which integrates job parallelism. We prove that the time complexity of the
feasibility problem for these systems is linear in the number of
(sporadic) tasks for a fixed number of processors. We propose a scheduling
algorithm that is theoretically optimal (i.e., when preemptions and migrations
are neglected). Moreover, we provide an exact feasibility utilization bound.
Lastly, we propose a technique to limit the number of migrations and preemptions.
Gang FTP scheduling of periodic and parallel rigid real-time tasks
In this paper we consider the scheduling of periodic and parallel rigid
tasks. We provide (and prove correct) an exact schedulability test for Fixed
Task Priority (FTP) Gang scheduler sub-classes: Parallelism Monotonic, Idling,
Limited Gang, and Limited Slack Reclaiming. Additionally, we study the
predictability of our schedulers: we show that Gang FJP schedulers are not
predictable and we identify several sub-classes which are actually predictable.
Moreover, we extend the definition of rigid, moldable and malleable jobs to
recurrent tasks.
Scheduling Monotone Moldable Jobs in Linear Time
A moldable job is a job that can be executed on an arbitrary number of
processors, and whose processing time depends on the number of processors
allotted to it. A moldable job is monotone if its work doesn't decrease for an
increasing number of allotted processors. We consider the problem of scheduling
monotone moldable jobs to minimize the makespan.
We argue that for certain compact input encodings a polynomial algorithm has
a running time polynomial in n and log(m), where n is the number of jobs and m
is the number of machines. We describe how monotony of jobs can be used to
counteract the increased problem complexity that arises from compact encodings,
and give tight bounds on the approximability of the problem with compact
encoding: it is NP-hard to solve optimally, but admits a PTAS.
The main focus of this work is efficient approximation algorithms. We
describe different techniques to exploit the monotony of the jobs for better
running times, and present a (3/2+ε)-approximate algorithm whose
running time is polynomial in log(m) and 1/ε, and only linear in the
number n of jobs.
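The monotony condition defined in the abstract above (the work of a job does not decrease as more processors are allotted) can be checked with a short sketch. The function names and the explicit table of processing times are assumptions for illustration. Note that an explicit table has size m, so this is not one of the compact encodings the abstract discusses.

```python
def work(processing_times: list[float]) -> list[float]:
    """Work on p processors is p * t(p), where processing_times[p - 1] = t(p)."""
    return [p * t for p, t in enumerate(processing_times, start=1)]


def is_monotone(processing_times: list[float]) -> bool:
    """True if the work never decreases as the processor count increases."""
    w = work(processing_times)
    return all(a <= b for a, b in zip(w, w[1:]))


# Work is 12, 14, 15, 16: non-decreasing, so the job is monotone.
print(is_monotone([12.0, 7.0, 5.0, 4.0]))
# Work is 12, 10, 12: it drops at p = 2, so the job is not monotone.
print(is_monotone([12.0, 5.0, 4.0]))
```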
A Parallel Divide-and-Conquer based Evolutionary Algorithm for Large-scale Optimization
Large-scale optimization problems that involve thousands of decision
variables have extensively arisen from various industrial areas. As a powerful
optimization tool for many real-world applications, evolutionary algorithms
(EAs) fail to solve the emerging large-scale problems both effectively and
efficiently. In this paper, we propose a novel Divide-and-Conquer (DC) based EA
that can not only produce high-quality solutions by solving sub-problems
separately, but also fully utilize the power of parallel computing by solving
the sub-problems simultaneously. Existing DC-based EAs that were deemed to
enjoy the same advantages as the proposed algorithm are shown to be
practically incompatible with the parallel computing scheme unless some
trade-offs are made by compromising the solution quality.
Comment: 12 pages, 0 figures