Scheduling Parallel Jobs on a Network of Heterogeneous Platforms
We consider the problem of scheduling parallel jobs on a network of heterogeneous platforms. Given a set J of n jobs, where each job j ∈ J is described by a pair (p_j, q_j) with a processing time p_j and a number q_j of required processors, and a set of N heterogeneous platforms P_i with m_i processors each, the goal is to find a schedule for all jobs on the platforms that minimizes the maximum completion time. Unless P = NP, there is no approximation algorithm with absolute ratio better than 2 for the problem. We propose an approximation algorithm with absolute ratio 2, improving on the previously best known algorithms. This closes the gap between the lower bound of 2 and the best approximation ratio.
From Preemptive to Non-preemptive Scheduling Using Rejections
We study the classical problem of scheduling a set of independent jobs with release dates on a single machine. There is a huge literature on the preemptive version of the problem, where jobs can be interrupted at any moment. However, we focus here on the non-preemptive case, which is harder but more relevant in practice. For instance, jobs submitted to actual high performance platforms cannot be interrupted or migrated once they start executing (due to prohibitive management overhead). We target the minimization of the total stretch, where the stretch of a job is the total time it stays in the system (waiting time plus execution time) divided by its processing time. Stretch captures the quality of service of a job, and the minimum total stretch reflects the fairness between the jobs. So far, there have been only a few studies of this problem, especially for the non-preemptive case. Our approach is based on using the shortest remaining processing time (SRPT) policy, which is classical and efficient for the preemptive case, as a lower bound. We investigate the (offline) transformation of the SRPT schedule into a non-preemptive schedule subject to a recently introduced resource augmentation model, namely the rejection model, in which we are allowed to reject a small fraction of the jobs. Specifically, we propose a 2/ε-approximation algorithm for the total stretch minimization problem if we are allowed to reject an ε-fraction of the jobs, for any ε > 0. This result shows that the rejection model is more powerful than the other resource augmentation models studied in the literature, such as speed augmentation or machine augmentation, for which non-polynomial or non-scalable results are known. As a byproduct, we present an O(1)-approximation algorithm for the total flow-time minimization problem which also rejects at most an ε-fraction of the jobs.
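To make the stretch objective and the SRPT policy concrete, here is a minimal Python sketch that simulates preemptive SRPT on a single machine and reports the total stretch of the resulting schedule. It is only an illustration of the two definitions: the function name `srpt_total_stretch` and the `(release, processing)` job encoding are assumptions made for this example, and the sketch does not implement the paper's transformation to a non-preemptive schedule or its rejection rule.

```python
import heapq

def srpt_total_stretch(jobs):
    """Simulate preemptive SRPT on one machine.

    jobs: list of (release_date, processing_time) pairs, processing_time > 0.
    Returns (total_stretch, completion_times), where the stretch of a job is
    (completion - release) / processing_time.
    Hypothetical helper written for illustration only.
    """
    n = len(jobs)
    order = sorted(range(n), key=lambda j: jobs[j][0])  # jobs by release date
    ready = []                # min-heap of (remaining_time, job_index)
    completion = [0.0] * n
    t = 0.0                   # current time
    i = 0                     # next job (in release order) not yet released

    while i < n or ready:
        if not ready:
            # Machine is idle: jump to the next release date.
            t = max(t, jobs[order[i]][0])
        # Release every job that has arrived by time t.
        while i < n and jobs[order[i]][0] <= t:
            j = order[i]
            heapq.heappush(ready, (jobs[j][1], j))
            i += 1
        # Run the job with the shortest remaining processing time
        # until it finishes or the next job is released.
        remaining, j = heapq.heappop(ready)
        next_release = jobs[order[i]][0] if i < n else float("inf")
        run = min(remaining, next_release - t)
        t += run
        if run < remaining:
            heapq.heappush(ready, (remaining - run, j))  # preempted
        else:
            completion[j] = t

    total = sum((completion[j] - r) / p for j, (r, p) in enumerate(jobs))
    return total, completion


if __name__ == "__main__":
    total, completion = srpt_total_stretch([(0, 4), (1, 1), (2, 2)])
    print(total, completion)
```

In this small instance the long job released at time 0 is preempted by the two shorter jobs that arrive later, which is exactly the behaviour a non-preemptive schedule cannot reproduce.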
Decentralized List Scheduling
Classical list scheduling is a very popular and efficient technique for
scheduling jobs in parallel and distributed platforms. It is inherently
centralized. However, with the increasing number of processors, the cost for
managing a single centralized list becomes prohibitive. A suitable approach
to reduce the contention is to distribute the list among the computational
units: each processor has only a local view of the work to execute. Thus, the
scheduler is no longer greedy and standard performance guarantees are lost.
The objective of this work is to study the extra cost that must be paid when
the list is distributed among the computational units. We first present a
general methodology for computing the expected makespan based on the analysis
of an adequate potential function which represents the load imbalance between
the local lists. We obtain an equation on the evolution of the potential by
computing its expected decrease in one step of the schedule. Our main theorem
shows how to solve such equations to bound the makespan. Then, we apply this
method to several scheduling problems, namely, for unit independent tasks, for
weighted independent tasks and for tasks with precedence constraints. More
precisely, we prove that the time for scheduling a global workload W composed
of independent unit tasks on m processors is equal to W/m plus an additional
term proportional to log_2 W. We provide a lower bound which shows that this is
optimal up to a constant. This result is extended to the case of weighted
independent tasks. In the last setting, precedence task graphs, our analysis
leads to an improvement on the bound of Arora et al. We finally provide some
experiments using a simulator. The distribution of the makespan is shown to fit
existing probability laws. The additive term is shown by simulation to be
around 3 log_2 W, confirming the tightness of our analysis.
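The W/m plus logarithmic overhead behaviour can be explored with a small simulation. The Python sketch below is a toy model, not the simulator used in the paper: it assumes that all W unit tasks start on processor 0, that a busy processor executes one task per time step, and that an idle processor spends the step trying to steal half of the remaining tasks of a victim chosen uniformly at random. The function name `simulate_work_stealing` and these modelling choices are assumptions made for illustration.

```python
import math
import random

def simulate_work_stealing(total_work, m, seed=0):
    """Toy simulation of a distributed list of unit independent tasks.

    Assumptions (illustrative only): all tasks start on processor 0; in each
    step a busy processor runs one task, while a processor that was idle at
    the start of the step tries to steal half of a random victim's tasks.
    Steal requests within a step are served one after another.
    Returns the makespan in time steps.
    """
    rng = random.Random(seed)
    load = [0] * m
    load[0] = total_work
    steps = 0

    while sum(load) > 0:
        steps += 1
        # Processors with an empty local list at the start of the step.
        idle = [p for p in range(m) if load[p] == 0]
        # Busy processors each execute one unit task.
        for p in range(m):
            if load[p] > 0:
                load[p] -= 1
        # Idle processors each send one steal request.
        for p in idle:
            victim = rng.randrange(m)
            if victim != p and load[victim] > 0:
                stolen = load[victim] // 2
                load[victim] -= stolen
                load[p] += stolen

    return steps


if __name__ == "__main__":
    W, m = 100_000, 16
    makespan = simulate_work_stealing(W, m, seed=1)
    overhead = makespan - W / m
    print(f"makespan={makespan}, W/m={W / m:.1f}, "
          f"overhead={overhead:.1f}, log2(W)={math.log2(W):.1f}")
```

Running the script for several seeds and values of W gives a rough feel for how the additive term grows with log_2 W in this toy model; the constants it produces need not match those reported in the abstract, since the model differs from the authors' simulator.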
Communication models insights meet simulations
It is well known that taking communications into account while scheduling jobs on large-scale parallel computing platforms is a crucial issue. In modern hierarchical platforms, communication times differ greatly depending on whether they occur inside a cluster or between clusters. Thus, allocating jobs with locality constraints in mind is a key factor in achieving good performance. However, several theoretical results prove that imposing such constraints reduces the solution space and thus possibly degrades performance. In practice, such constraints simplify implementations and most often lead to better results. Our aim in this work is to bridge theoretical and practical intuitions, and to check the differences between constrained and unconstrained schedules (namely with respect to locality and node contiguity) through simulations. We have developed a generic tool, using SimGrid as the base simulator, that enables interactions with external batch schedulers in order to evaluate their scheduling policies. The results confirm that insights gained through theoretical models are ill-suited to current architectures and should be reevaluated.