Advanced list scheduling heuristic for task scheduling with communication contention for parallel embedded systems

Author: Cousin Jean-Gabriel
Mu Pengcheng
Nezan Jean François
Raulet Mickael
Publication venue: Springer Science and Business Media LLC
Publication date: 22/10/2010

Modern embedded systems tend to use multiple cores or processors to run parallel applications. This paper addresses task scheduling with communication contention for parallel embedded systems and proposes three advanced techniques to improve the list scheduling heuristic. First, five groups of node levels (two existing groups and three new groups) are used as node priorities to generate node lists. Second, the critical child technique improves the selection of a processor in the scheduling process. Finally, the communication delay technique enlarges the idle time intervals on communication links. We also propose an advanced dynamic list scheduling heuristic that combines the three techniques. Experimental results show that the combined advanced dynamic heuristic shortens the schedule length for most of the randomly generated DAGs in the cases of medium and high communication. Our method accelerates an application by up to 80% in the case of high communication and can also reduce the use of hardware resources.
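To make the notion of idle time intervals on a communication link concrete, here is a minimal sketch of how a contention-aware scheduler might place one message on a link that already carries other transfers. It is not the paper's communication delay technique; the function name `earliest_slot` and the representation of a link as a list of `(start, end)` busy intervals are assumptions made for illustration.

```python
def earliest_slot(busy, ready, duration):
    """Return the earliest start time >= ready at which a transfer of the
    given duration fits into an idle interval of the link.

    busy: list of (start, end) intervals already reserved on the link
          (assumed non-overlapping).
    """
    start = ready
    for b_start, b_end in sorted(busy):
        # The transfer fits entirely before this busy interval.
        if start + duration <= b_start:
            break
        # Otherwise it cannot start before this busy interval ends.
        start = max(start, b_end)
    return start


# Hypothetical link with two reserved transfers.
link = [(0, 3), (5, 9)]
print(earliest_slot(link, ready=1, duration=2))  # fits in the gap [3, 5)
print(earliest_slot(link, ready=1, duration=3))  # gap too small, waits until 9
```

A wider idle interval lets longer transfers start earlier, which is why a technique that enlarges these intervals can shorten the overall schedule.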

A List Scheduling Heuristic with New Node Priorities and Critical Child Technique for Task Scheduling with Communication Contention

Author: E Lee
EA Lee
G Sih
H Kasahara
JJ Hwang
MR Garey
MY Wu
O Sinnen
S Sriram
S Stuijk
T Yang
TL Adam
V Sarkar
X Tang
YK Kwok
Publication venue: Springer Netherlands
Publication date: 01/01/2011

Task scheduling is becoming an important aspect of parallel programming for modern embedded systems. In this chapter, the application to be scheduled is modeled as a Directed Acyclic Graph (DAG), and the target architecture is a parallel embedded system composed of multiple processors interconnected by buses and/or switches. This chapter presents new list scheduling heuristics with communication contention. We define new node priorities (top level and bottom level) to sort nodes, and propose an advanced technique named critical child to select a processor to execute a node. Experimental results show that the proposed method is effective at reducing the schedule length, and the runtime performance is greatly improved in the cases of medium and high communication. Since the communication cost is increasing from medium to high in modern applications like digital communication and video compression, the proposed method is well adapted for scheduling these applications over parallel embedded systems.
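For readers unfamiliar with node levels, the following sketch computes the standard textbook top level and bottom level of each node in a weighted DAG and derives a priority list from them. These are the classical definitions, not the new priorities the chapter proposes; the function name `node_levels` and the example graph are hypothetical.

```python
from collections import defaultdict


def node_levels(weights, edges):
    """Compute top and bottom levels of a weighted DAG.

    weights: {node: computation cost}
    edges:   {(u, v): communication cost of edge u -> v}
    """
    succ, pred = defaultdict(list), defaultdict(list)
    indeg = {n: 0 for n in weights}
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        indeg[v] += 1

    # Topological order (Kahn's algorithm).
    order = []
    ready = [n for n in weights if indeg[n] == 0]
    while ready:
        n = ready.pop()
        order.append(n)
        for s in succ[n]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    if len(order) != len(weights):
        raise ValueError("graph contains a cycle")

    # Top level: longest path from an entry node to n, excluding n's own cost.
    top = {}
    for n in order:
        top[n] = max((top[p] + weights[p] + edges[(p, n)] for p in pred[n]),
                     default=0)

    # Bottom level: longest path from n to an exit node, including n's cost.
    bottom = {}
    for n in reversed(order):
        bottom[n] = weights[n] + max((edges[(n, s)] + bottom[s] for s in succ[n]),
                                     default=0)
    return top, bottom


# Hypothetical four-node DAG.
w = {"a": 2, "b": 3, "c": 1, "d": 2}
e = {("a", "b"): 1, ("a", "c"): 4, ("b", "d"): 2, ("c", "d"): 1}
top, bottom = node_levels(w, e)
# One common static priority: schedule nodes by decreasing bottom level.
print(sorted(w, key=lambda n: -bottom[n]))
```

Sorting by decreasing bottom level always yields a valid topological order when node costs are positive, which is one reason it is a common choice for generating node lists.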

Link contention-constrained scheduling and mapping of tasks and messages to a network of heterogeneous processors

Author: Ahmad I
Kwok YK
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/1999

In this paper, we consider the problem of scheduling and mapping precedence-constrained tasks to a network of heterogeneous processors. In such systems, processors are usually physically distributed, implying that the communication cost is considerably higher than in tightly coupled multiprocessors. Therefore, scheduling and mapping algorithms for such systems must schedule the tasks as well as the communication traffic by treating both the processors and communication links as important resources. We propose an algorithm that achieves these objectives and adapts its task scheduling and mapping decisions according to the given network topology. Just like tasks, messages are also scheduled and mapped to suitable links during the minimization of the finish times of tasks. Heterogeneity of processors is exploited by scheduling critical tasks to the fastest processors. Our extensive experimental study has demonstrated that the proposed algorithm is efficient, robust, and yields consistent performance over a wide range of scheduling parameters.
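As a rough illustration of finish-time minimization on heterogeneous processors, the sketch below picks the processor on which a single task would finish earliest. It is a simplified stand-in, not the algorithm of the paper: it ignores link scheduling and network topology, and the names `best_processor`, `proc_free`, and `speed` are assumptions.

```python
def best_processor(work, data_ready, proc_free, speed):
    """Choose the processor that gives the earliest finish time for a task.

    work:       amount of computation in the task
    data_ready: time at which all input messages have arrived
    proc_free:  {processor: time at which it becomes idle}
    speed:      {processor: work units executed per time unit}
    """
    def finish(p):
        # The task starts once both the processor and its input data are ready.
        return max(proc_free[p], data_ready) + work / speed[p]

    chosen = min(proc_free, key=finish)
    return chosen, finish(chosen)


# Hypothetical two-processor system: p1 is busy longer but twice as fast.
print(best_processor(work=10, data_ready=2,
                     proc_free={"p0": 0, "p1": 4},
                     speed={"p0": 1.0, "p1": 2.0}))
```

In a full contention-aware scheduler, `data_ready` would itself depend on the chosen processor, because the incoming messages must first be placed on the links leading to it.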

HKU Scholars Hub

CASCH: a tool for computer-aided scheduling

Author: Ahmad I
Kwok YK
Shu W
Wu MY
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2000

A software tool called Computer-Aided Scheduling (CASCH) for parallel processing on distributed-memory multiprocessors in a complete parallel programming environment is presented. A compiler automatically converts sequential applications into parallel codes to perform program parallelization. The parallel code that executes on a target machine is optimized by CASCH through proper scheduling and mapping.

HKU Scholars Hub

Using utilization profiles in allocation and partitioning for multiprocessor systems

Author: Evans John
Publication venue: University of Utah
Publication date: 01/01/1992

The problems of multiprocessor partitioning and program allocation are interdependent and critical to the performance of multiprocessor systems. Minimizing resource partitions for parallel programs on partitionable multiprocessors facilitates greater processor utilization and throughput. The processing resource requirements of parallel programs vary during program execution and are allocation dependent. Optimal resource utilization requires that resource requirements be modeled as variable over time. This paper investigates the use of program profiles in allocating programs and partitioning multiprocessor systems. An allocation method is discussed. The goals of this method are to minimize program execution time, minimize the total number of processors used, characterize variation in processor requirements over the lifetime of a program to accurately predict the impact on run time of the number of processors available at any point in time, and to minimize fluctuations in processor requirements to facilitate efficient sharing of processors between partitions on a partitionable multiprocessor. An application to program partitioning is discussed that improves partition run times compared to other methods.

The University of Utah: J. Willard Marriott Digital Library

Parallelizing with BDSC, a resource-constrained scheduling algorithm for shared and distributed memory systems

Author: Ancourt Corinne
Jouvelot Pierre
Khaldi Dounia
Publication venue: Elsevier BV
Publication date: 01/01/2015

We introduce a new parallelization framework for scientific computing based on BDSC, an efficient automatic scheduling algorithm for parallel programs in the presence of resource constraints on the number of processors and their local memory size. BDSC extends Yang and Gerasoulis's Dominant Sequence Clustering (DSC) algorithm; it uses sophisticated cost models and addresses both shared and distributed parallel memory architectures. We describe BDSC, its integration within the PIPS compiler infrastructure and its application to the parallelization of four well-known scientific applications: Harris, ABF, equake and IS. Our experiments suggest that BDSC's focus on efficient resource management leads to significant parallelization speedups on both shared and distributed memory systems, improving upon DSC results, as shown by the comparison of the sequential and parallelized versions of these four applications running on both OpenMP and MPI frameworks.

Crossref

HAL Descartes

HAL-MINES ParisTech