Search CORE

290 research outputs found

Advanced list scheduling heuristic for task scheduling with communication contention for parallel embedded systems

Author: Cousin Jean-Gabriel
Mu Pengcheng
Nezan Jean François
Raulet Mickael
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2010
Field of study

WOSInternational audienceModern embedded systems tend to use multiple cores or processors for processing parallel applications. This paper indeed aims at task scheduling with communication contention for parallel embedded systems and proposes three advanced techniques to improve the list scheduling heuristic. Five groups of node levels (two existing groups and three new groups) are firstly used as node priorities to generate node lists. Then the critical child technique improves the selection of a processor in the scheduling process. Finally, the communication delay technique enlarges the idle time intervals on communication links. We also propose an advanced dynamic list scheduling heuristic by combining the three techniques. Experimental results show that the combined advanced dynamic heuristic is efficient to shorten the schedule length for most of the randomly generated DAGs in the cases of medium and high communication. Our method accelerates an application up to 80% in the case of high communication and can also reduce the use of hardware resources

HAL-CentraleSupelec

HAL-Rennes 1

A List Scheduling Heuristic with New Node Priorities and Critical Child Technique for Task Scheduling with Communication Contention

Author: E Lee
EA Lee
G Sih
H Kasahara
JJ Hwang
MR Garey
MY Wu
O Sinnen
O Sinnen
O Sinnen
S Sriram
S Stuijk
T Yang
TL Adam
V Sarkar
X Tang
YK Kwok
YK Kwok
YK Kwok
Publication venue: Springer Netherlands
Publication date: 01/01/2011
Field of study

Task scheduling is becoming an important aspect for parallel programming of modern embedded systems. In this chapter, the application to be scheduled is modeled as a Directed Acyclic Graph (DAG), and the architecture targets parallel embedded systems composed of multiple processors interconnected by buses and/or switches. This chapter presents new list scheduling heuristics with communication contention. Furthermore, we define new node priorities (top level and bottom level) to sort nodes, and propose an advanced technique named critical child to select a processor to execute a node. Experimental results show that the proposed method is effective to reduce the schedule length, and the runtime performance is greatly improved in the cases of medium and high communication. Since the communication cost is increasing from medium to high in modern applications like digital communication and video compression, the proposed method is well-adapted for scheduling these applications over parallel embedded systems

HAL-CentraleSupelec

Crossref

HAL-Rennes 1

Workflow Scheduling Techniques and Algorithms in IaaS Cloud: A Survey

Author: Chakravarthi K. Kalyana
Vijayakumar Vaidhehi
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/04/2018
Field of study

In the modern era, workflows are adopted as a powerful and attractive paradigm for expressing/solving a variety of applications like scientific, data intensive computing, and big data applications such as MapReduce and Hadoop. These complex applications are described using high-level representations in workflow methods. With the emerging model of cloud computing technology, scheduling in the cloud becomes the important research topic. Consequently, workflow scheduling problem has been studied extensively over the past few years, from homogeneous clusters, grids to the most recent paradigm, cloud computing. The challenges that need to be addressed lies in task-resource mapping, QoS requirements, resource provisioning, performance fluctuation, failure handling, resource scheduling, and data storage. This work focuses on the complete study of the resource provisioning and scheduling algorithms in cloud environment focusing on Infrastructure as a service (IaaS). We provided a comprehensive understanding of existing scheduling techniques and provided an insight into research challenges that will be a possible future direction to the researchers

IAES journal

Crossref

Institute of Advanced Engineering and Science

Critical Path Scheduling Parallel Programs on an Unbounded Number of Processors

Author: Butelle Franck
Hakem Mourad
Publication venue: World Scientific Publishing House Ltd
Publication date: 01/01/2006
Field of study

International audienceIn this paper we present an efficient algorithm for compile-time scheduling and clustering of parallel programs onto parallel processing systems with distributed memory, which is called The Dynamic Critical Path Scheduling DCPS. The DCPS is superior to several other algorithms from the literature in terms of computational complexity, processors consumption and solution quality. DCPS has a time complexity of O (e + v\log v), as opposed to DSC algorithm O((e + v)\log v) which is the best known algorithm. Experimental results demonstrate the superiority of DCPS over the DSC algorithm

HAL-Paris 13

A Task Scheduling Method after Clustering for Data Intensive Jobs in Heterogeneous Distributed Systems

Author
Publication venue: College of Technology,Dai-Ichi University
Publication date: 01/03/2016
Field of study

Kagoshima Academic Repository Network (鹿児島県学術共同リポジトリ)

Scheduling techniques to improve the worst-case execution time of real-time parallel applications on heterogeneous platforms

Author: Voudouris Petros
Publication venue
Publication date: 01/01/2021
Field of study

The key to providing high performance and energy-efficient execution for hard real-time applications is the time predictable and efficient usage of heterogeneous multiprocessors. However, schedulability analysis of parallel applications executed on unrelated heterogeneous multiprocessors is challenging and has not been investigated adequately by earlier works. The unrelated model is suitable to represent many of the multiprocessor platforms available today because a task (i.e., sequential code) may exhibit a different work-case-execution-time (WCET) on each type of processor on an unrelated heterogeneous multiprocessors platform. A parallel application can be realistically modeled as a directed acyclic graph (DAG), where the nodes are sequential tasks and the edges are dependencies among the tasks. This thesis considers a sporadic DAG model which is used broadly to analyze and verify the real-time requirements of parallel applications. A global work-conserving scheduler can efficiently utilize an unrelated platform by executing the tasks of a DAG on different processor types. However, it is challenging to compute an upper bound on the worst-case schedule length of the DAG, called makespan, which is used to verify whether the deadline of a DAG is met or not. There are two main challenges. First, because of the heterogeneity of the processors, the WCET for each task of the DAG depends on which processor the task is executing on during actual runtime. Second, timing anomalies are the main obstacle to compute the makespan even for the simpler case when all the processors are of the same type, i.e., homogeneous multiprocessors. To that end, this thesis addresses the following problem: How we can schedule multiple sporadic DAGs on unrelated multiprocessors such that all the DAGs meet their deadlines. Initially, the thesis focuses on homogeneous multiprocessors that is a special case of unrelated multiprocessors to understand and tackle the main challenge of timing anomalies. A novel timing-anomaly-free scheduler is proposed which can be used to compute the makespan of a DAG just by simulating the execution of the tasks based on this proposed scheduler. A set of representative task-based parallel OpenMP applications from the BOTS benchmark suite are modeled as DAGs to investigate the timing behavior of real-world applications. A simulation framework is developed to evaluate the proposed method. Furthermore, the thesis targets unrelated multiprocessors and proposes a global scheduler to execute the tasks of a single DAG to an unrelated multiprocessors platform. Based on the proposed scheduler, methods to compute the makespan of a single DAG are introduced. A set of representative parallel applications from the BOTS benchmark suite are modeled as DAGs that execute on unrelated multiprocessors. Furthermore, synthetic DAGs are generated to examine additional structures of parallel applications and various platform capabilities. A simulation framework that simulates the execution of the tasks of a DAG on an unrelated multiprocessor platform is introduced to assess the effectiveness of the proposed makespan computations. Finally, based on the makespan computation of a single DAG this thesis presents the design and schedulability analysis of global and federated scheduling of sporadic DAGs that execute on unrelated multiprocessors

Chalmers Research

Optimal processor assignment for pipeline computations

Author: Choudhury Alok N.
Narahari Bhagirath
Nicol David M.
Simha Rahul
Publication venue
Publication date: 01/01/1991
Field of study

The availability of large scale multitasked parallel architectures introduces the following processor assignment problem for pipelined computations. Given a set of tasks and their precedence constraints, along with their experimentally determined individual responses times for different processor sizes, find an assignment of processor to tasks. Two objectives are of interest: minimal response given a throughput requirement, and maximal throughput given a response time requirement. These assignment problems differ considerably from the classical mapping problem in which several tasks share a processor; instead, it is assumed that a large number of processors are to be assigned to a relatively small number of tasks. Efficient assignment algorithms were developed for different classes of task structures. For a p processor system and a series parallel precedence graph with n constituent tasks, an O(np2) algorithm is provided that finds the optimal assignment for the response time optimization problem; it was found that the assignment optimizing the constrained throughput in O(np2log p) time. Special cases of linear, independent, and tree graphs are also considered

NASA Technical Reports Server

Syracuse University Research Facility and Collaborative Environment

Mapping and Scheduling of Directed Acyclic Graphs on An FPFA Tile

Author: Guo Y.
Smit G.J.M.
Publication venue: STW Technology Foundation
Publication date: 01/01/2002
Field of study

An architecture for a hand-held multimedia device requires components that are energy-efficient, flexible, and provide high performance. In the CHAMELEON [4] project we develop a coarse grained reconfigurable device for DSP-like algorithms, the so-called Field Programmable Function Array (FPFA). The FPFA devices are reminiscent to FPGAs, but with a matrix of Processing Parts (PP) instead of CLBs. The design of the FPFA focuses on: (1) Keeping each PP small to maximize the number of PPs that can fit on a chip; (2) providing sufficient flexibility; (3) Low energy consumption; (4) Exploiting the maximum amount of parallelism; (5) A strong support tool for FPFA-based applications. The challenge in providing compiler support for the FPFA-based design stems from the flexibility of the FPFA structure. If we do not use the characteristics of the FPFA structure properly, the advantages of an FPFA may become its disadvantages. The GECKO1project focuses on this problem. In this paper, we present a mapping and scheduling scheme for applications running on one FPFA tile. Applications are written in C and C code is translated to a Directed Acyclic Graphs (DAG) [4]. This scheme can map a DAG directly onto the reconfigurable PPs of an FPFA tile. It tries to achieve low power consumption by exploiting locality of reference and high performance by exploiting maximum parallelism

University of Twente Research Information

Adaptive Energy-aware Scheduling of Dynamic Event Analytics across Edge and Cloud Resources

Author: Ghosh Rajrup
Komma Siva Prakash Reddy
Simmhan Yogesh
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/01/2018
Field of study

The growing deployment of sensors as part of Internet of Things (IoT) is generating thousands of event streams. Complex Event Processing (CEP) queries offer a useful paradigm for rapid decision-making over such data sources. While often centralized in the Cloud, the deployment of capable edge devices on the field motivates the need for cooperative event analytics that span Edge and Cloud computing. Here, we identify a novel problem of query placement on edge and Cloud resources for dynamically arriving and departing analytic dataflows. We define this as an optimization problem to minimize the total makespan for all event analytics, while meeting energy and compute constraints of the resources. We propose 4 adaptive heuristics and 3 rebalancing strategies for such dynamic dataflows, and validate them using detailed simulations for 100 - 1000 edge devices and VMs. The results show that our heuristics offer O(seconds) planning time, give a valid and high quality solution in all cases, and reduce the number of query migrations. Furthermore, rebalance strategies when applied in these heuristics have significantly reduced the makespan by around 20 - 25%.Comment: 11 pages, 7 figure

arXiv.org e-Print Archive

Crossref