6 research outputs found

    Scheduling on Hybrid Platforms: Improved Approximability Window

    Modern platforms use accelerators alongside standard processing units to reduce the running time of specific operations, such as matrix operations, and to improve overall performance. Scheduling on such hybrid platforms is a challenging problem, since the algorithms used for homogeneous resources do not adapt well. In this paper we consider the problem of scheduling a set of tasks subject to precedence constraints on hybrid platforms composed of two types of processing units. We propose a (3+2√2)-approximation algorithm and a conditional lower bound of 3 on the approximation ratio. These results improve upon the 6-approximation algorithm proposed by Kedad-Sidhoum et al. as well as the lower bound of 2 due to Svensson for identical machines. Our algorithm is inspired by the former one and distinguishes an allocation phase and a scheduling phase. However, we propose a different allocation procedure which, although less efficient for the allocation sub-problem, leads to an improved approximation ratio for the whole scheduling problem. This approximation ratio actually decreases as the numbers of processing units of each type get closer, and matches the conditional lower bound when they are equal.
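The two-phase allocate-then-schedule decomposition described in the abstract can be illustrated with a toy sketch: allocate each task to the processor type on which it runs faster, then list-schedule greedily under the precedence constraints. This is an assumed simplification for illustration only; the paper's allocation procedure (which yields the (3+2√2) guarantee) is different, and the data model and names below are hypothetical.

```python
import heapq

def schedule(tasks, preds, m_cpu, m_gpu):
    """tasks: {name: (cpu_time, gpu_time)}; preds: {name: set of predecessors}.
    Returns the makespan of a greedy list schedule (assumes m_cpu, m_gpu >= 1)."""
    # Phase 1 (allocation): put each task on the type where it runs faster.
    # The paper uses a more careful allocation; this naive rule is illustrative.
    on_gpu = {t: g < c for t, (c, g) in tasks.items()}
    # Phase 2 (scheduling): event-driven greedy list scheduling.
    indeg = {t: len(preds[t]) for t in tasks}
    succs = {t: set() for t in tasks}
    for t, ps in preds.items():
        for p in ps:
            succs[p].add(t)
    ready = [t for t in tasks if indeg[t] == 0]
    free = {False: m_cpu, True: m_gpu}   # idle units per type
    running = []                         # min-heap of (finish_time, task)
    time = 0.0
    while ready or running:
        # start every ready task whose processor type has an idle unit
        started = True
        while started:
            started = False
            for t in list(ready):
                typ = on_gpu[t]
                if free[typ] > 0:
                    free[typ] -= 1
                    dur = tasks[t][1] if typ else tasks[t][0]
                    heapq.heappush(running, (time + dur, t))
                    ready.remove(t)
                    started = True
        # advance to the next task completion
        time, done = heapq.heappop(running)
        free[on_gpu[done]] += 1
        for s in succs[done]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return time
```

For a chain a → b where a runs faster on the accelerator, the simulation schedules a on the accelerator and b on a standard unit in sequence.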

    Scheduling on Unrelated Machines Under Tree-Like Precedence Constraints

    Abstract. We present polylogarithmic approximations for the R|prec|C_max and R|prec|∑ w_j C_j problems, when the precedence constraints…

    Formalization and Solution of Optimal Work-Scheduling Problems with Machines of Different Productivity

    Master’s thesis: 107 pages, 12 figures, 10 tables, 7 appendices, 82 references. Relevance. Scheduling tasks for a team is an important process in many fields, such as software development. Nowadays the iterative approach to work is gaining more and more recognition in different spheres.
Scrum is one of the most widely used agile approaches today. The main idea of Scrum is to split the work into iterations, i.e., time spans of fixed length, which in Scrum are called Sprints. For each iteration it is necessary to choose a subset of tasks for the team such that the result of the work has the greatest possible value for the product. But discussing tasks and deciding which of them to include in the current iteration, taking into account the different productivity and experience of team members, is a complicated process that takes a lot of time. That is why studying the optimal scheduling problem, whose formal statement leads to hard optimization problems, is relevant. This in turn requires the development of approximate algorithms for scheduling tasks among performers with different productivity so as to maximize the total value of the completed work. Taking into account the terminology of scheduling theory and the specifics of the problem, we use the terms “performer” and “machine” interchangeably. Purpose and objectives of the study. To increase the effectiveness of task completion by several unrelated performers (machines) by reducing the time spent on planning the work. To achieve this purpose, the following tasks must be completed: - review the known results for the problem under consideration; - formalize the optimal scheduling problem for unrelated machines; - develop approximate algorithms for solving the problem; - develop a software implementation of the algorithms and models; - analyze the results. The object of study is the scheduling process for unrelated machines with different productivity. The subject of study is scheduling methods for unrelated machines with different productivity. Scientific novelty of the results.
Formalization of the Sprint planning problem as a scheduling problem for machines with different productivity is performed; an approach to solving this problem based on splitting it into two subproblems is suggested; a greedy algorithm, local search algorithms, and a neighborhood generation procedure for the second subproblem are developed. Publications. The materials were published in the international journal “Naukoviy ohlyad”, No. 3, 2018 [1, 2]. Connection of the thesis with scientific programs, plans, topics. The thesis was written at the branch of the Department of Computer-aided Management and Data Processing Systems of the National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute” at the V. M. Glushkov Institute of Cybernetics of the National Academy of Sciences of Ukraine under the topic “To develop a mathematical apparatus focused on the creation of intelligent information technologies for solving combinatorial optimization and information security problems” (topic index ВФ.180.11).
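A greedy initial-solution construction of the kind the abstract mentions can be sketched as follows. This is a hypothetical illustration, not the thesis's actual algorithm: the value/effort model, the value-density ordering, and all names are assumptions.

```python
def plan_sprint(tasks, speeds, capacity):
    """tasks: list of (value, base_effort); speeds: productivity factor per
    performer; capacity: time budget per performer. Greedily assign tasks,
    in decreasing value density, to the performer who finishes them soonest.
    Returns (total value, list of (value, effort, performer_index))."""
    load = [0.0] * len(speeds)
    chosen, total_value = [], 0.0
    for value, effort in sorted(tasks, key=lambda t: t[0] / t[1], reverse=True):
        # on performer i, this task takes effort / speeds[i] time units
        best = min(range(len(speeds)),
                   key=lambda i: load[i] + effort / speeds[i])
        if load[best] + effort / speeds[best] <= capacity:
            load[best] += effort / speeds[best]
            chosen.append((value, effort, best))
            total_value += value
    return total_value, chosen
```

A local search, as in the thesis, would then perturb such an initial assignment (e.g., move or swap tasks between performers) and keep improving neighbors.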

    Scheduling techniques to improve the worst-case execution time of real-time parallel applications on heterogeneous platforms

    The key to providing high-performance and energy-efficient execution for hard real-time applications is the time-predictable and efficient usage of heterogeneous multiprocessors. However, schedulability analysis of parallel applications executed on unrelated heterogeneous multiprocessors is challenging and has not been investigated adequately by earlier works. The unrelated model is suitable for representing many of the multiprocessor platforms available today because a task (i.e., sequential code) may exhibit a different worst-case execution time (WCET) on each type of processor on an unrelated heterogeneous multiprocessor platform. A parallel application can be realistically modeled as a directed acyclic graph (DAG), where the nodes are sequential tasks and the edges are dependencies among the tasks. This thesis considers a sporadic DAG model, which is used broadly to analyze and verify the real-time requirements of parallel applications. A global work-conserving scheduler can efficiently utilize an unrelated platform by executing the tasks of a DAG on different processor types. However, it is challenging to compute an upper bound on the worst-case schedule length of the DAG, called the makespan, which is used to verify whether the deadline of a DAG is met. There are two main challenges. First, because of the heterogeneity of the processors, the WCET of each task of the DAG depends on which processor the task executes on at runtime. Second, timing anomalies are the main obstacle to computing the makespan even in the simpler case when all the processors are of the same type, i.e., homogeneous multiprocessors. To that end, this thesis addresses the following problem: how can multiple sporadic DAGs be scheduled on unrelated multiprocessors such that all the DAGs meet their deadlines?
Initially, the thesis focuses on homogeneous multiprocessors, a special case of unrelated multiprocessors, to understand and tackle the main challenge of timing anomalies. A novel timing-anomaly-free scheduler is proposed which can be used to compute the makespan of a DAG simply by simulating the execution of the tasks under this scheduler. A set of representative task-based parallel OpenMP applications from the BOTS benchmark suite is modeled as DAGs to investigate the timing behavior of real-world applications, and a simulation framework is developed to evaluate the proposed method. The thesis then targets unrelated multiprocessors and proposes a global scheduler to execute the tasks of a single DAG on an unrelated multiprocessor platform. Based on the proposed scheduler, methods to compute the makespan of a single DAG are introduced. A set of representative parallel applications from the BOTS benchmark suite is modeled as DAGs that execute on unrelated multiprocessors, and synthetic DAGs are generated to examine additional structures of parallel applications and various platform capabilities. A simulation framework that simulates the execution of the tasks of a DAG on an unrelated multiprocessor platform is introduced to assess the effectiveness of the proposed makespan computations. Finally, based on the makespan computation of a single DAG, this thesis presents the design and schedulability analysis of global and federated scheduling of sporadic DAGs that execute on unrelated multiprocessors.
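The thesis develops tighter, anomaly-free makespan bounds, but the classic Graham-style bound for a DAG on m identical processors gives the flavor: any greedy (work-conserving) schedule finishes within W/m + L, where W is the total work and L the critical-path length. A minimal sketch of that bound, as an assumed illustration rather than the thesis's method:

```python
def graham_bound(wcet, preds, m):
    """Upper bound on the makespan of a greedy schedule of a DAG on m
    identical processors. wcet: {task: WCET}; preds: {task: set of preds}."""
    memo = {}

    def longest(t):
        # longest WCET-weighted path ending at task t (critical path)
        if t not in memo:
            memo[t] = wcet[t] + max((longest(p) for p in preds[t]), default=0.0)
        return memo[t]

    W = sum(wcet.values())              # total work
    L = max(longest(t) for t in wcet)   # span / critical path
    return W / m + L
```

On unrelated processors the per-task WCET depends on the processor type, which is exactly why the thesis's makespan computations are more involved than this.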

    Algorithms incorporating concurrency and caching

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 189-203). This thesis describes provably good algorithms for modern large-scale computer systems, including today's multicores. Designing efficient algorithms for these systems involves overcoming many challenges, including concurrency (dealing with parallel accesses to the same data) and caching (achieving good memory performance). This thesis includes two parallel algorithms that focus on testing for atomicity violations in a parallel fork-join program. These algorithms augment a parallel program with a data structure that answers queries about the program's structure on the fly. Specifically, one data structure, called SP-ordered-bags, maintains the series-parallel relationships among threads, which is vital for uncovering race conditions (bugs) in the program. Another data structure, called XConflict, aids in detecting conflicts in a transactional-memory system with nested parallel transactions. For a program with work T₁ and span T∞, maintaining either data structure adds an overhead of PT∞ to the running time of the parallel program when executed on P processors using an efficient scheduler, yielding a total runtime of O(T₁/P + PT∞). For each of these data structures, queries can be answered in O(1) time. This thesis also introduces the compressed sparse blocks (CSB) storage format for sparse matrices, which allows both Ax and Aᵀx to be computed efficiently in parallel, where A is an n × n sparse matrix with nnz > n nonzeros and x is a dense n-vector. The parallel multiplication algorithm uses Θ(nnz) work and … span, yielding a parallelism of …, which is amply high for virtually any large matrix. Also addressing concurrency, this thesis considers two scheduling problems.
The first scheduling problem, motivated by transactional memory, considers randomized backoff when jobs have different lengths. I give an analysis showing that binary exponential backoff achieves makespan … with high probability, where V is the total length of all n contending jobs. This bound is significantly larger than when jobs are all the same size. A variant of exponential backoff, however, achieves makespan … with high probability. I also present the size-hashed backoff protocol, specifically designed for jobs having different lengths, that achieves makespan … with high probability. The second scheduling problem considers scheduling n unit-length jobs on m unrelated machines, where each job may fail probabilistically. Specifically, an input consists of a set of n jobs, a directed acyclic graph G describing the precedence constraints among jobs, and a failure probability q_ij for each job j and machine i. The goal is to find a schedule that minimizes the expected makespan. I give an O(log log(min{m, n}))-approximation for the case of independent jobs (when there are no precedence constraints) and an O(log(n + m) log log(min{m, n}))-approximation algorithm when precedence constraints form disjoint chains. This chain algorithm can be extended into one that supports precedence constraints that are trees, which worsens the approximation by another log(n) factor. To address caching, this thesis includes several new variants of cache-oblivious dynamic dictionaries. A cache-oblivious dictionary fills the same niche as a classic B-tree, but it does so without tuning for particular memory parameters. Thus, cache-oblivious dictionaries optimize for all levels of a multilevel hierarchy and are more portable than traditional B-trees. I describe how to add concurrency to several previously existing cache-oblivious dictionaries.
I also describe two new data structures that achieve significantly cheaper insertions with a small overhead on searches. The cache-oblivious lookahead array (COLA) supports insertions/deletions and searches in O((1/B) log N) and O(log N) memory transfers, respectively, where B is the block size, M is the memory size, and N is the number of elements in the data structure. The xDict supports these operations in O((1/(εB^(1−ε))) log_B(N/M)) and O((1/ε) log_B(N/M)) memory transfers, respectively, where 0 < ε < 1 is a tunable parameter. Also on caching, this thesis answers the question: what is the worst possible page-replacement strategy? The goal of this whimsical chapter is to devise an online strategy that achieves the highest possible fraction of page faults / cache misses as compared to the worst offline strategy. I show that there is no deterministic strategy that is competitive with the worst offline strategy. I also give a randomized strategy based on the most-recently-used heuristic and show that it is the worst possible page-replacement policy. On a more serious note, I also show that direct mapping is, in some sense, a worst possible page-replacement policy. Finally, this thesis includes a new algorithm, following a new approach, for the problem of maintaining a topological ordering of a DAG as edges are dynamically inserted. The main result included here is an O(n² log n) algorithm for maintaining a topological ordering in the presence of up to m < n(n − 1)/2 edge insertions. In contrast, the previously best algorithm has a total running time of O(min{m^(3/2), n^(5/2)}). Although these algorithms are not parallel and do not exhibit particularly good locality, some of the data structural techniques employed in my solution are similar to others in this thesis. by Jeremy T. Fineman. Ph.D.
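The COLA's carry-style merge can be sketched in a few lines. This toy version is an assumed simplification (no fractional cascading and no deamortization, so searches here cost O(log² N) block transfers rather than the O(log N) of the real structure): it keeps sorted arrays of doubling size and, on insert, merges a full level into the carry and propagates upward, like binary addition.

```python
import bisect

class SimpleCOLA:
    """Toy cache-oblivious lookahead array: level k holds 0 or 2**k sorted keys."""

    def __init__(self):
        self.levels = []

    def insert(self, key):
        carry = [key]
        for k in range(len(self.levels)):
            if not self.levels[k]:
                self.levels[k] = carry      # empty slot absorbs the carry
                return
            # full level: merge it into the carry and keep propagating
            carry = sorted(self.levels[k] + carry)
            self.levels[k] = []
        self.levels.append(carry)           # grow a new, larger level

    def search(self, key):
        # binary search every non-empty level (fractional cascading,
        # omitted here, is what removes one log factor in the real COLA)
        for level in self.levels:
            i = bisect.bisect_left(level, key)
            if i < len(level) and level[i] == key:
                return True
        return False
```

Amortized, each key is merged once per level, giving O((1/B) log N) memory transfers per insertion when each merge streams sequentially through its arrays.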