27,673 research outputs found

Co-Scheduling Algorithms for High-Throughput Workload Execution

Author: Aupy Guillaume
Benoit Anne
Raghavan Padma
Robert Yves
Shantharam Manu
Publication date: 29/04/2013

This paper investigates co-scheduling algorithms for processing a set of parallel applications. Instead of executing each application one by one, using a maximum degree of parallelism for each of them, we aim at scheduling several applications concurrently. We partition the original application set into a series of packs, which are executed one by one. A pack comprises several applications, each of them with an assigned number of processors, with the constraint that the total number of processors assigned within a pack does not exceed the maximum number of available processors. The objective is to determine a partition into packs, and an assignment of processors to applications, that minimize the sum of the execution times of the packs. We thoroughly study the complexity of this optimization problem, and propose several heuristics that exhibit very good performance on a variety of workloads, whose application execution times model profiles of parallel scientific codes. We show that co-scheduling leads to faster workload completion time and to faster response times on average (hence increasing system throughput and saving energy), for significant benefits over traditional scheduling from both the user and system perspectives.
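The objective described in this abstract can be illustrated with a minimal sketch. It assumes that the execution time of a pack is the longest execution time among its applications, and it uses an Amdahl-style execution-time profile purely for illustration. The names `pack_time`, `schedule_cost`, and `amdahl` are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple

# One application placed in a pack: (execution-time function, processors assigned).
Assignment = Tuple[Callable[[int], float], int]


def pack_time(pack: List[Assignment], max_procs: int) -> float:
    """Time of one non-empty pack: its slowest application (assumption)."""
    used = sum(procs for _, procs in pack)
    if used > max_procs:
        raise ValueError("pack uses more processors than are available")
    return max(time_fn(procs) for time_fn, procs in pack)


def schedule_cost(packs: List[List[Assignment]], max_procs: int) -> float:
    """Packs run one after another, so the cost is the sum of pack times."""
    return sum(pack_time(pack, max_procs) for pack in packs)


def amdahl(work: float, seq_fraction: float) -> Callable[[int], float]:
    """Illustrative execution-time profile (an assumption, not the paper's model)."""
    def time_fn(procs: int) -> float:
        return work * (seq_fraction + (1.0 - seq_fraction) / procs)
    return time_fn


a = amdahl(8.0, 0.5)
b = amdahl(8.0, 0.5)
one_by_one = schedule_cost([[(a, 4)], [(b, 4)]], max_procs=4)   # 5.0 + 5.0
co_scheduled = schedule_cost([[(a, 2), (b, 2)]], max_procs=4)   # max(6.0, 6.0)
print(one_by_one, co_scheduled)  # 10.0 6.0
```

With these made-up profiles, sharing the four processors between the two applications finishes the workload sooner than running each one alone at full width, which is the effect the abstract attributes to co-scheduling.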

arXiv.org e-Print Archive

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

Efficient Parallelization of Short-Range Molecular Dynamics Simulations on Many-Core Systems

Author: Meyer R.
Publication venue: 'American Physical Society (APS)'
Publication date: 01/11/2013

This article introduces a highly parallel algorithm for molecular dynamics simulations with short-range forces on single-node multi- and many-core systems. The algorithm is designed to achieve high parallel speedups for strongly inhomogeneous systems like nanodevices or nanostructured materials. In the proposed scheme the calculation of the forces and the generation of neighbor lists are divided into small tasks. The tasks are then executed by a thread pool according to a dependent task schedule. This schedule is constructed in such a way that a particle is never accessed by two threads at the same time. Benchmark simulations on a typical 12-core machine show that the described algorithm achieves excellent parallel efficiencies above 80% for different kinds of systems and all numbers of cores. For inhomogeneous systems the speedups are strongly superior to those obtained with spatial decomposition. Further benchmarks were performed on an Intel Xeon Phi coprocessor. These simulations demonstrate that the algorithm scales well to large numbers of cores. Comment: 12 pages, 8 figures
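The idea of running small tasks on a thread pool so that no particle is touched by two threads at once can be sketched as follows. This is a simplified stand-in, not the article's dependent task schedule: it assumes each task is described by the set of cells it reads and writes, and it greedily groups tasks into rounds whose cell sets are pairwise disjoint. The names `conflict_free_rounds` and `run_rounds` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Set


def conflict_free_rounds(tasks: List[Set[int]]) -> List[List[int]]:
    """Greedily group task indices so tasks in one round share no cell."""
    rounds = []  # each entry: (task indices, cells already claimed in the round)
    for idx, cells in enumerate(tasks):
        for members, busy in rounds:
            if busy.isdisjoint(cells):
                members.append(idx)
                busy.update(cells)
                break
        else:
            # No existing round can take this task without a conflict.
            rounds.append(([idx], set(cells)))
    return [members for members, _ in rounds]


def run_rounds(tasks: List[Set[int]], rounds: List[List[int]],
               work: Callable[[int, Set[int]], None], workers: int = 4) -> None:
    """Run each round on a thread pool; a round ends before the next starts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for members in rounds:
            # list() drains the iterator, which waits for the whole round.
            list(pool.map(lambda i: work(i, tasks[i]), members))


# Pair tasks between neighbouring cells of a 4-cell chain.
tasks = [{0, 1}, {1, 2}, {2, 3}]
rounds = conflict_free_rounds(tasks)
print(rounds)  # [[0, 2], [1]]

touched = [0, 0, 0, 0]

def work(_idx: int, cells: Set[int]) -> None:
    for c in cells:
        touched[c] += 1  # safe: no other task in this round owns cell c

run_rounds(tasks, rounds, work)
```

A round barrier is coarser than a dependency-driven schedule, since a fast thread must wait for the slowest task in its round; it is used here only because it makes the exclusivity property easy to see.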

arXiv.org e-Print Archive

LU|ZONE|UL

A C-DAG task model for scheduling complex real-time tasks on heterogeneous platforms: preemption matters

Author: Bertogna Marko
Capodieci Nicola
Cavicchioli Roberto
Lipari Giuseppe
Zahaf Houssam-Eddine
Publication date: 07/01/2019

Recent commercial hardware platforms for embedded real-time systems feature heterogeneous processing units and computing accelerators on the same System-on-Chip. When designing a complex real-time application for such architectures, the designer needs to make a number of difficult choices: on which processor should a certain task be implemented? Should a component be implemented in parallel or sequentially? These choices may have a great impact on feasibility, as differences in the processors' internal architectures affect the tasks' execution time and preemption cost. To help the designer explore the wide space of design choices and tune the scheduling parameters, in this paper we propose a novel real-time application model, called C-DAG, specifically conceived for heterogeneous platforms. A C-DAG allows specifying alternative implementations of the same component of an application for different processing engines, to be selected off-line, as well as conditional branches that model if-then-else statements, to be selected at run-time. We also propose a schedulability analysis for the C-DAG model and a heuristic allocation algorithm so that all deadlines are respected. Our analysis takes into account the cost of preempting a task, which can be non-negligible on certain processors. We demonstrate the effectiveness of our approach on a large set of synthetic experiments by comparing with state-of-the-art algorithms in the literature.
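The two kinds of choice described in this abstract, alternatives fixed off-line and conditional branches resolved at run-time, can be sketched as a small data structure. This is an illustration only: it uses a nested series composition instead of a general DAG, it ignores preemption cost, and every class and function name (`Task`, `Alternative`, `Conditional`, `Sequence`, `worst_case_work`) is hypothetical and not the paper's notation.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Task:
    name: str
    engine: str   # processing engine the code targets, e.g. "cpu" or "gpu"
    wcet: float   # worst-case execution time on that engine


@dataclass
class Alternative:
    options: List["Node"]    # exactly one option is selected off-line


@dataclass
class Conditional:
    branches: List["Node"]   # exactly one branch is taken at run-time


@dataclass
class Sequence:
    steps: List["Node"]      # steps execute one after another


Node = Union[Task, Alternative, Conditional, Sequence]


def worst_case_work(node: Node, select: Callable[[Alternative], int]) -> float:
    """Total worst-case work once every Alternative is fixed by `select`."""
    if isinstance(node, Task):
        return node.wcet
    if isinstance(node, Sequence):
        return sum(worst_case_work(step, select) for step in node.steps)
    if isinstance(node, Conditional):
        # The branch is unknown before run-time, so assume the costliest one.
        return max(worst_case_work(b, select) for b in node.branches)
    # Alternative: the designer's off-line choice picks a single option.
    return worst_case_work(node.options[select(node)], select)


app = Sequence([
    Task("read", "cpu", 2.0),
    Alternative([Task("filter_cpu", "cpu", 8.0), Task("filter_gpu", "gpu", 3.0)]),
    Conditional([Task("fast_path", "cpu", 1.0), Task("slow_path", "cpu", 4.0)]),
])

print(worst_case_work(app, lambda alt: 0))  # 14.0, CPU filter selected
print(worst_case_work(app, lambda alt: 1))  # 9.0, GPU filter selected
```

The asymmetry between the two node types is the point of the sketch: an alternative contributes only the option chosen at design time, whereas a conditional must be charged for its most expensive branch because the choice is not known until the task runs.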

arXiv.org e-Print Archive

HAL Descartes

Integrating Job Parallelism in Real-Time Scheduling Theory

Author: Sébastien Collette
Liliana Cucu
Joël Goossens
Publication date: 01/01/2008

We investigate the global scheduling of sporadic, implicit-deadline, real-time task systems on multiprocessor platforms. We provide a task model which integrates job parallelism. We prove that the time complexity of the feasibility problem of these systems is linear in the number of (sporadic) tasks for a fixed number of processors. We propose a theoretically optimal scheduling algorithm (i.e., with preemptions and migrations neglected). Moreover, we provide an exact feasibility utilization bound. Lastly, we propose a technique to limit the number of migrations and preemptions.

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

DI-fusion