
    Efficient Algorithms for Scheduling Moldable Tasks

    We study the problem of scheduling $n$ independent moldable tasks on $m$ processors, a problem that arises in large-scale parallel computations. When tasks are monotonic, the best known result is a $(\frac{3}{2}+\epsilon)$-approximation algorithm for makespan minimization, with a complexity linear in $n$ and polynomial in $\log m$ and $\frac{1}{\epsilon}$, where $\epsilon$ is arbitrarily small. We propose a new perspective on the existing speedup models: the speedup of a task $T_{j}$ is linear when the number $p$ of assigned processors is small (up to a threshold $\delta_{j}$), while it remains monotonic when $p$ ranges in $[\delta_{j}, k_{j}]$; the bound $k_{j}$ indicates an unacceptable overhead when parallelizing on too many processors. For a given integer $\delta \geq 5$, let $u = \left\lceil \sqrt{\delta} \right\rceil - 1$. In this paper, we propose a $\frac{1}{\theta(\delta)}(1+\epsilon)$-approximation algorithm for makespan minimization with a complexity $\mathcal{O}(n \log\frac{n}{\epsilon} \log m)$, where $\theta(\delta) = \frac{u+1}{u+2}\left(1 - \frac{k}{m}\right)$ (with $m \gg k$). As a by-product, we also propose a $\theta(\delta)$-approximation algorithm for throughput maximization with a common deadline, with a complexity $\mathcal{O}(n^{2} \log m)$.
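    As a rough illustration of the bound stated above, the sketch below (plain Python; the example values of delta, k, and m are made up and not taken from the paper) evaluates u = ⌈√ή⌉ − 1 and Ξ(ÎŽ) = (u+1)/(u+2)·(1 − k/m), and the resulting makespan guarantee (1/Ξ(ÎŽ))·(1+Δ) of the proposed algorithm.

```python
import math

def theta(delta: int, k: int, m: int) -> float:
    """Evaluate theta(delta) = (u+1)/(u+2) * (1 - k/m) as defined in the abstract,
    where u = ceil(sqrt(delta)) - 1, delta >= 5 is the linear-speedup threshold,
    and k is the processor bound of the speedup model (with m >> k)."""
    assert delta >= 5 and m > k, "the abstract assumes delta >= 5 and m >> k"
    u = math.ceil(math.sqrt(delta)) - 1
    return (u + 1) / (u + 2) * (1 - k / m)

# Illustrative values only: delta = 9 gives u = 2, so with k = 16 and m = 1024
# the algorithm's makespan guarantee is (1/theta)(1 + eps) ~ 1.35 * (1 + eps).
print(theta(9, 16, 1024))          # ~0.7383
print(1 / theta(9, 16, 1024))      # ~1.3545
```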

    Scheduling Monotone Moldable Jobs in Linear Time

    A moldable job is a job that can be executed on an arbitrary number of processors, and whose processing time depends on the number of processors allotted to it. A moldable job is monotone if its work does not decrease for an increasing number of allotted processors. We consider the problem of scheduling monotone moldable jobs to minimize the makespan. We argue that for certain compact input encodings a polynomial algorithm has a running time polynomial in n and log(m), where n is the number of jobs and m is the number of machines. We describe how the monotony of jobs can be used to counteract the increased problem complexity that arises from compact encodings, and give tight bounds on the approximability of the problem with compact encoding: it is NP-hard to solve optimally, but admits a PTAS. The main focus of this work is efficient approximation algorithms. We describe different techniques to exploit the monotony of the jobs for better running times, and present a (3/2 + \epsilon)-approximation algorithm whose running time is polynomial in log(m) and 1/\epsilon, and only linear in the number n of jobs.
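    A minimal sketch of the monotony condition described above, assuming the job is given as a table of processing times t(p) for p = 1, ..., m (the function name and input format are illustrative, not from the paper): the work p * t(p) must not decrease as the number of allotted processors grows.

```python
def is_monotone(times: list) -> bool:
    """Check the monotony condition from the abstract.

    times[p-1] is the processing time t(p) of the job on p processors.
    The job is monotone if its work p * t(p) does not decrease when the
    number of allotted processors p increases.
    """
    works = [p * t for p, t in enumerate(times, start=1)]
    return all(w_next >= w for w, w_next in zip(works, works[1:]))

# Made-up processing-time tables for illustration:
print(is_monotone([100.0, 60.0, 45.0, 40.0]))   # works 100, 120, 135, 160 -> True
print(is_monotone([100.0, 40.0, 30.0, 20.0]))   # work drops from 100 to 80 -> False
```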

    Malleable Scheduling Beyond Identical Machines

    In malleable job scheduling, jobs can be executed simultaneously on multiple machines with the processing time depending on the number of allocated machines. Jobs are required to be executed non-preemptively and in unison, in the sense that they occupy, during their execution, the same time interval over all the machines of the allocated set. In this work, we study generalizations of malleable job scheduling inspired by standard scheduling on unrelated machines. Specifically, we introduce a general model of malleable job scheduling, where each machine has a (possibly different) speed for each job, and the processing time of a job j on a set of allocated machines S depends on the total speed of S for j. For machines with unrelated speeds, we show that the optimal makespan cannot be approximated within a factor less than e/(e-1), unless P = NP. On the positive side, we present polynomial-time algorithms with approximation ratios 2e/(e-1) for machines with unrelated speeds, 3 for machines with uniform speeds, and 7/3 for restricted assignments on identical machines. Our algorithms are based on deterministic LP rounding and result in sparse schedules, in the sense that each machine shares at most one job with other machines. We also prove lower bounds on the integrality gap of 1+phi for unrelated speeds (phi is the golden ratio) and 2 for uniform speeds and restricted assignments. To indicate the generality of our approach, we show that it also yields constant-factor approximation algorithms (i) for minimizing the sum of weighted completion times, and (ii) for a variant where the effective speed of a set of allocated machines is determined by the L_p norm of their speeds.
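    One natural way to instantiate the model sketched above, shown only for illustration (the abstract states that the processing time depends on the total speed of the allocated set, but the concrete formula and names below are assumptions): divide a job's processing requirement by the aggregate speed, where the aggregate is either the plain sum or, for the variant mentioned at the end, an L_p norm of the per-machine speeds.

```python
def processing_time(work: float, speeds: list, p_norm: float = 1.0) -> float:
    """Illustrative malleable processing time: the job's work divided by the
    effective speed of the allocated machine set. speeds[i] is the speed of
    allocated machine i for this job; p_norm = 1 gives the plain total speed,
    larger p_norm gives the L_p-norm variant mentioned in the abstract."""
    effective_speed = sum(s ** p_norm for s in speeds) ** (1.0 / p_norm)
    return work / effective_speed

# Hypothetical job of 100 units of work on three machines with unrelated speeds.
print(processing_time(100.0, [1.0, 2.0, 4.0]))            # total speed 7   -> ~14.29
print(processing_time(100.0, [1.0, 2.0, 4.0], p_norm=2))  # L2 norm ~4.58   -> ~21.82
```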

    Multi-Resource List Scheduling of Moldable Parallel Jobs under Precedence Constraints

    The scheduling literature has traditionally focused on a single type of resource (e.g., computing nodes). However, scientific applications in modern High-Performance Computing (HPC) systems process large amounts of data, hence have diverse requirements on different types of resources (e.g., cores, cache, memory, I/O). All of these resources could potentially be exploited by the runtime scheduler to improve the application performance. In this paper, we study multi-resource scheduling to minimize the makespan of computational workflows comprised of parallel jobs subject to precedence constraints. The jobs are assumed to be moldable, allowing the scheduler to flexibly select a variable set of resources before execution. We propose a multi-resource, list-based scheduling algorithm, and prove that, on a system with $d$ types of schedulable resources, our algorithm achieves an approximation ratio of $1.619d + 2.545\sqrt{d} + 1$ for any $d$, and a ratio of $d + O(\sqrt[3]{d^{2}})$ for large $d$. We also present improved results for independent jobs and for jobs with special precedence constraints (e.g., series-parallel graphs and trees). Finally, we prove a lower bound of $d$ on the approximation ratio of any list scheduling scheme with local priority considerations. To the best of our knowledge, these are the first approximation results for moldable workflows with multiple resource requirements.
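    Since the paper's algorithm is list-based, a bare-bones sketch of multi-resource list scheduling may help fix ideas. The skeleton below is an illustration only: the FIFO priority rule, the assumption that each job's d-dimensional demand has already been fixed by a separate allocation phase, and all names are assumptions of this sketch, not the paper's algorithm. It starts any ready job whose demand fits into the currently free resources and advances time to the next completion.

```python
import heapq

def list_schedule(jobs, total, preds):
    """Greedy multi-resource list scheduler (illustrative skeleton).

    jobs[j]  = (runtime, demand), demand being a length-d tuple of resource needs.
    total    = length-d tuple with the total amount of each resource type.
    preds[j] = set of jobs that must finish before job j may start.
    Returns the makespan of the greedy schedule.
    """
    n = len(jobs)
    assert all(d <= t for _, demand in jobs for d, t in zip(demand, total)), \
        "each job's demand must fit within the total resources"
    unfinished_preds = [len(preds[j]) for j in range(n)]
    ready = [j for j in range(n) if unfinished_preds[j] == 0]
    free = list(total)
    running = []                       # min-heap of (finish_time, job index)
    now = 0.0

    while ready or running:
        # Start every ready job whose d-dimensional demand fits (FIFO priority).
        started = []
        for j in ready:
            runtime, demand = jobs[j]
            if all(demand[r] <= free[r] for r in range(len(total))):
                for r in range(len(total)):
                    free[r] -= demand[r]
                heapq.heappush(running, (now + runtime, j))
                started.append(j)
        ready = [j for j in ready if j not in started]
        # Advance time to the next completion, release its resources, update successors.
        finish, j = heapq.heappop(running)
        now = finish
        for r in range(len(total)):
            free[r] += jobs[j][1][r]
        for k in range(n):
            if j in preds[k]:
                unfinished_preds[k] -= 1
                if unfinished_preds[k] == 0:
                    ready.append(k)
    return now

# Three hypothetical jobs on d = 2 resource types (e.g., cores and memory units);
# job 1 depends on job 0. Demands and runtimes are made up for illustration.
jobs = [(4.0, (2, 1)), (3.0, (2, 2)), (2.0, (1, 1))]
print(list_schedule(jobs, total=(3, 2), preds=[set(), {0}, set()]))   # -> 7.0
```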

    Closing the Gap for Pseudo-Polynomial Strip Packing

    Two-dimensional packing problems are a fundamental class of optimization problems, and Strip Packing is one of the most natural and famous among them. Indeed, it can be defined in a single sentence: given a set of rectangular axis-parallel items and a strip with bounded width and infinite height, the objective is to find a packing of the items into the strip minimizing the packing height. We speak of pseudo-polynomial Strip Packing if we consider algorithms with pseudo-polynomial running time with respect to the width of the strip. It is known that there is no pseudo-polynomial time algorithm for Strip Packing with a ratio better than 5/4 unless P = NP. The best algorithm so far has a ratio of 4/3 + epsilon. In this paper, we close the gap between the inapproximability result and the best known algorithms by presenting an algorithm with approximation ratio 5/4 + epsilon. The algorithm relies on a new structural result, which is the main accomplishment of this paper. It states that each optimal solution can be transformed, with bounded loss in the objective, such that it has one of a polynomial number of different forms, thus making the problem tractable by standard techniques, i.e., dynamic programming. To show the conceptual strength of the approach, we extend our result to other problems as well, e.g., Strip Packing with 90-degree rotations and Contiguous Moldable Task Scheduling, and present algorithms with approximation ratio 5/4 + epsilon for these problems as well.
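    For readers unfamiliar with the problem, the toy heuristic below illustrates what a strip packing looks like. It is only the classical Next-Fit Decreasing-Height shelf heuristic from the textbook literature, shown to make the problem concrete; it is not the 5/4 + epsilon algorithm of the paper and is not optimal in general.

```python
def nfdh_height(items, strip_width):
    """Next-Fit Decreasing-Height shelf packing (classical textbook heuristic,
    shown only to illustrate the Strip Packing problem).

    items is a list of (width, height) rectangles with width <= strip_width.
    Returns the total height of the shelf packing."""
    items = sorted(items, key=lambda wh: wh[1], reverse=True)   # sort by decreasing height
    shelves = []                     # each shelf is [shelf_height, used_width]
    for w, h in items:
        if shelves and shelves[-1][1] + w <= strip_width:
            shelves[-1][1] += w      # the item fits on the current shelf
        else:
            shelves.append([h, w])   # open a new shelf; its height is this item's height
    return sum(height for height, _ in shelves)

# Illustrative instance: five rectangles packed into a strip of width 10.
print(nfdh_height([(4, 5), (5, 4), (6, 3), (3, 3), (4, 2)], strip_width=10))   # -> 10
```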

    Ordonnancement avec tolérance aux pannes pour des tùches parallÚles à nombre de processeurs programmable

    We study the resilient scheduling of moldable parallel jobs on high-performance computing (HPC) platforms. Moldable jobs allow for choosing a processor allocation before execution, and their execution time obeys various speedup models. The objective is to minimize the overall completion time of the jobs, or the makespan, when jobs can fail due to silent errors and hence may need to be re-executed after each failure until successful completion. Our work generalizes the classical scheduling framework for failure-free jobs. To cope with silent errors, we introduce two resilient scheduling algorithms, LPA-List and Batch-List, both of which use the List strategy to schedule the jobs. Without knowing a priori how many times each job will fail, LPA-List relies on a local strategy to allocate processors to the jobs, while Batch-List schedules the jobs in batches and allows only a restricted number of failures per job in each batch. We prove new approximation ratios for the two algorithms under several prominent speedup models (e.g., roofline, communication, Amdahl, power, monotonic, and a mixed model). An extensive set of simulations is conducted to evaluate different variants of the two algorithms, and the results show that they consistently outperform some baseline heuristics. Overall, our best algorithm is within a factor of 1.6 of a lower bound on average over the entire set of experiments, and within a factor of 4.2 in the worst case.

    This report studies the resilient scheduling of tasks on high-performance computing platforms. In the problem studied, the (constant) number of processors executing each task can be chosen before execution, which determines the execution time of the task according to various speedup models. We describe algorithms whose objective is to minimize the total execution time, given that tasks may fail and must be re-executed after each error. This problem is therefore a generalization of the classical setting in which all tasks are known a priori and never fail. We describe a priority-list scheduling algorithm and prove new approximation bounds for several classical speedup models (roofline, communication, Amdahl, power, monotonic, and a model mixing these). We also describe a batch scheduling algorithm, within which tasks may fail a limited number of times per batch, and prove new approximation bounds for arbitrary speedup models. Finally, we run experiments on a comprehensive set of instances to compare the performance of different variants of our algorithms, which are significantly better than the usual simple heuristics. Our best heuristic is on average within a factor of 1.6 of a lower bound on the optimal solution, and within a factor of 4.2 in the worst case.
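    The speedup models named above have standard forms in this line of work. The sketch below writes them out under common parameterizations (w is the sequential work; the exact definitions and parameter names used in the report may differ), giving the execution time t(p) of a job on p processors.

```python
def roofline(w, p, p_bar):
    """Roofline model: perfect speedup up to a bounded degree of parallelism p_bar."""
    return w / min(p, p_bar)

def communication(w, p, c):
    """Communication model: parallel work plus a per-processor communication overhead c."""
    return w / p + (p - 1) * c

def amdahl(w, p, seq_fraction):
    """Amdahl model: a fixed sequential fraction limits the achievable speedup."""
    return w * (seq_fraction + (1 - seq_fraction) / p)

def power(w, p, alpha):
    """Power model: sublinear speedup p**alpha with 0 < alpha < 1."""
    return w / p ** alpha

# Hypothetical job of w = 100 on p = 16 processors (parameter values are illustrative).
print(roofline(100, 16, p_bar=8), communication(100, 16, c=0.5),
      amdahl(100, 16, seq_fraction=0.1), power(100, 16, alpha=0.8))
```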

    04231 Abstracts Collection -- Scheduling in Computer and Manufacturing Systems

    From 31 May to 4 June 2004, the Dagstuhl Seminar 04231 "Scheduling in Computer and Manufacturing Systems" was held at the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar, as well as abstracts of seminar results and ideas, are collected in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided where available.

    Theory and Engineering of Scheduling Parallel Jobs

    Scheduling is essential for the efficient utilization of modern parallel computing systems. In this thesis, four main research areas of scheduling are investigated: the interplay and distribution of decision makers, efficient schedule computation, efficient scheduling for the memory hierarchy, and energy efficiency. The main result is a provably fast and efficient scheduling algorithm for malleable jobs. Experiments show the importance and the potential of scheduling that takes the memory hierarchy into account.

    Scheduling Moldable BSP Tasks

    Our main goal in this paper is to study the scheduling of parallel BSP tasks on clusters of computers. We focus our attention on a special characteristic of BSP tasks: they can run on fewer processors than originally requested, but under a particular cost model. We discuss the problem of scheduling a batch of BSP tasks on a fixed number of computers. The objective is to minimize the completion time of the last task (makespan). We show that the problem is difficult and present approximation algorithms and heuristics. We conclude the paper by presenting the results of extensive simulations under different workloads.
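    For context, BSP programs proceed in supersteps whose cost is usually estimated with Valiant's standard formula. The sketch below shows that textbook cost (the parameters g and L are the machine's bandwidth and barrier-latency constants); it is background only and not necessarily the exact moldable cost model used in the paper.

```python
def superstep_cost(w_max, h_max, g, L):
    """Standard BSP superstep cost (Valiant's model): maximum local work, plus an
    h-relation exchanged at bandwidth parameter g, plus the barrier latency L.
    Textbook formula, not necessarily the paper's exact moldable cost model."""
    return w_max + g * h_max + L

# Illustrative superstep: 1e6 units of local work, 1e4 words exchanged, g = 4, L = 1e5.
print(superstep_cost(1e6, 1e4, g=4.0, L=1e5))
```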

    An Empirical Evaluation of Multi-Resource Scheduling for Moldable Workflows

    Resource scheduling plays a vital role in High-Performance Computing (HPC) systems. However, most scheduling research in HPC has focused on only a single type of resource (e.g., computing cores or I/O resources). With the advancement in hardware architectures and the increase in data-intensive HPC applications, there is a need to simultaneously embrace a diverse set of resources (e.g., computing cores, cache, memory, I/O, and network resources) in the design of run-time schedulers for improving the overall application performance. This thesis performs an empirical evaluation of a recently proposed multi-resource scheduling algorithm for minimizing the overall completion time (or makespan) of computational workflows comprised of moldable parallel jobs. Moldable parallel jobs allow the scheduler to select the resource allocations at launch time and thus can adapt to the available system resources (as compared to rigid jobs) while staying easy to design and implement (as compared to malleable jobs). The algorithm was proven to have a worst-case approximation ratio that grows linearly with the number of resource types for moldable workflows. In this thesis, a comprehensive set of simulations is conducted to empirically evaluate the performance of the algorithm using synthetic workflows generated by DAGGEN and moldable jobs that exhibit different speedup profiles. The results show that the algorithm fares better than the theoretical bound predicts, and it consistently outperforms two baseline heuristics under a variety of parameter settings, illustrating its robust practical performance
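    Empirical studies of this kind typically gauge makespan quality against a simple lower bound. The sketch below shows one common bound for moldable workflows, the maximum of a per-resource area bound and a critical-path bound; it is an assumption for illustration (the thesis may rely on a refined bound), and all names and inputs are hypothetical.

```python
def makespan_lower_bound(t_min, area_min, total, preds):
    """A common makespan lower bound for moldable workflows (illustrative).

    t_min[j]       : fastest possible runtime of job j over all allocations
    area_min[j][r] : smallest possible (usage of resource r) x (runtime) of job j
    total[r]       : available amount of resource type r
    preds[j]       : predecessors of job j; jobs are assumed indexed in topological order
    """
    n, d = len(t_min), len(total)
    # Area bound: resource r cannot process more than total[r] units per unit of time.
    area_bound = max(sum(area_min[j][r] for j in range(n)) / total[r] for r in range(d))
    # Critical-path bound: dependent jobs cannot overlap, even at their fastest allocation.
    longest = [0.0] * n
    for j in range(n):
        longest[j] = t_min[j] + max((longest[i] for i in preds[j]), default=0.0)
    return max(area_bound, max(longest))

# Three hypothetical jobs on two resource types; job 2 depends on jobs 0 and 1.
print(makespan_lower_bound(t_min=[2.0, 3.0, 4.0],
                           area_min=[(4.0, 2.0), (6.0, 3.0), (8.0, 4.0)],
                           total=(4.0, 4.0),
                           preds=[set(), set(), {0, 1}]))   # -> 7.0
```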
    • 

    corecore