32,781 research outputs found
Link contention-constrained scheduling and mapping of tasks and messages to a network of heterogeneous processors
In this paper, we consider the problem of scheduling and mapping precedence-constrained tasks to a network of heterogeneous processors. In such systems, processors are usually physically distributed, implying that the communication cost is considerably higher than in tightly coupled multiprocessors. Therefore, scheduling and mapping algorithms for such systems must schedule the tasks as well as the communication traffic by treating both the processors and the communication links as important resources. We propose an algorithm that achieves these objectives and adapts its task scheduling and mapping decisions to the given network topology. Just like tasks, messages are also scheduled and mapped to suitable links while the finish times of tasks are minimized. Heterogeneity of processors is exploited by scheduling critical tasks to the fastest processors. Our extensive experimental study has demonstrated that the proposed algorithm is efficient, robust, and yields consistent performance over a wide range of scheduling parameters.
On exploiting task duplication in parallel program scheduling
One of the main obstacles in obtaining high performance from message-passing multicomputer systems is the inevitable communication overhead which is incurred when tasks executing on different processors exchange data. Given a task graph, duplication-based scheduling can mitigate this overhead by allocating some of the tasks redundantly on more than one processor. In this paper, we focus on the problem of using duplication in static scheduling of task graphs on parallel and distributed systems. We discuss five previously proposed algorithms and examine their merits and demerits. We describe some of the essential principles for exploiting duplication in a more useful manner and, based on these principles, propose an algorithm which outperforms the previous algorithms. The proposed algorithm generates optimal solutions for a number of task graphs. The algorithm assumes an unbounded number of processors. For scheduling on a bounded number of processors, we propose a second algorithm which controls the degree of duplication according to the number of available processors. The proposed algorithms are analytically and experimentally evaluated and are also compared with the previous algorithms. © 1998 IEEE.
Performance driven distributed scheduling of parallel hybrid computations
Exascale computing is fast becoming a mainstream research area. In order to realize exascale performance, it is necessary to have efficient scheduling of large parallel computations with scalable performance on a large number of cores/processors. The scheduler needs to execute in a purely distributed and online fashion, should follow the affinity inherent in the computation, and must have low time and message complexity. Further, it should also avoid physical deadlocks due to bounded resources, including space/memory per core. Simultaneous consideration of these factors makes affinity-driven distributed scheduling particularly challenging. We attempt to address this challenge for hybrid parallel computations, which contain tasks that have pre-specified affinity to a place and also tasks that can be mapped to any place in the system. Specifically, we address two scheduling problems of the type Pm|Mj,prec|Cmax. This paper presents online distributed scheduling algorithms for hybrid parallel computations assuming both unconstrained and bounded space per place. We also present the time and message complexity for distributed scheduling of hybrid computations. To the best of our knowledge, this is the first time that distributed scheduling algorithms for hybrid parallel computations have been presented and analyzed for time and message bounds under both unconstrained space and bounded space.
List Scheduling: The Price of Distribution
Classical list scheduling is a very popular and efficient technique for scheduling jobs in parallel and distributed platforms. It is inherently centralized. However, with the increasing number of processors in new parallel platforms, the cost of managing a single centralized list becomes prohibitive. A suitable approach to reduce the contention is to distribute the list among the computational units. Thus each processor has only a local view of the work to execute. The objective of this work is to study the extra cost that must be paid when the list is distributed among the computational units. We present a general methodology for computing the expected makespan based on the analysis of an adequate potential function which represents the load imbalance between the local lists. It is applied to several scheduling problems, namely, for arbitrary divisible load, for unit independent tasks, for weighted independent tasks and for tasks with dependencies. It is presented in detail for the simplest case of divisible load, and then extended to the other cases.
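As an illustration of the centralized technique this abstract starts from, the following is a minimal sketch of greedy list scheduling for independent tasks on identical processors. It is not taken from the paper: the function name `list_schedule` and its parameters are hypothetical, and the restriction to independent tasks is a simplifying assumption.

```python
import heapq


def list_schedule(durations, m):
    """Greedy list scheduling of independent tasks on m identical processors.

    Tasks are taken in list order; each one goes to the processor that
    becomes free earliest. Returns (makespan, assignment), where
    assignment[i] is the (processor, start_time) pair of task i.
    """
    # Min-heap of (time at which the processor becomes free, processor id).
    free_at = [(0, p) for p in range(m)]
    heapq.heapify(free_at)

    assignment = []
    for d in durations:
        start, p = heapq.heappop(free_at)      # earliest-free processor
        assignment.append((p, start))
        heapq.heappush(free_at, (start + d, p))

    # The makespan is the latest finish time over all processors.
    makespan = max(t for t, _ in free_at)
    return makespan, assignment
```

The single heap plays the role of the centralized list: every scheduling decision goes through it, which is the contention point the abstract refers to when the number of processors grows.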
Planning and Resource Management in an Intelligent Automated Power Management System
Power system management is the process of guiding a power system towards the objective of continuous supply of electrical power to a set of loads. Spacecraft power system management requires planning and scheduling, since electrical power is a scarce resource in space. The automation of power system management for future spacecraft has been recognized as an important R&D goal. Several automation technologies have emerged, including the use of expert systems for automating human problem-solving capabilities, such as rule-based expert systems for fault diagnosis and load scheduling. It is questionable whether current-generation expert system technology is applicable to power system management in space. The objective of ADEPTS (ADvanced Electrical Power management Techniques for Space systems) is to study new techniques for power management automation. These techniques involve integrating current expert system technology with that of parallel and distributed computing, as well as a distributed, object-oriented approach to software design. The focus of the current study is the integration of new procedures for automatically planning and scheduling loads with procedures for performing fault diagnosis and control. The objective is the concurrent execution of both sets of tasks on separate transputer processors, thus adding parallelism to the overall management process.
Decentralized List Scheduling
Classical list scheduling is a very popular and efficient technique for
scheduling jobs in parallel and distributed platforms. It is inherently
centralized. However, with the increasing number of processors, the cost of
managing a single centralized list becomes prohibitive. A suitable approach
to reduce the contention is to distribute the list among the computational
units: each processor has only a local view of the work to execute. Thus, the
scheduler is no longer greedy and standard performance guarantees are lost.
The objective of this work is to study the extra cost that must be paid when
the list is distributed among the computational units. We first present a
general methodology for computing the expected makespan based on the analysis
of an adequate potential function which represents the load imbalance between
the local lists. We obtain an equation on the evolution of the potential by
computing its expected decrease in one step of the schedule. Our main theorem
shows how to solve such equations to bound the makespan. Then, we apply this
method to several scheduling problems, namely, for unit independent tasks, for
weighted independent tasks and for tasks with precedence constraints. More
precisely, we prove that the time for scheduling a global workload W composed
of independent unit tasks on m processors is equal to W/m plus an additional
term proportional to log_2 W. We provide a lower bound which shows that this is
optimal up to a constant. This result is extended to the case of weighted
independent tasks. In the last setting, precedence task graphs, our analysis
leads to an improvement on the bound of Arora et al. We finally provide some
experiments using a simulator. The distribution of the makespan is shown to fit
existing probability laws. The additive term is shown by simulation to be
around 3 log_2 W, confirming the tightness of our analysis.
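The unit-task setting described in this abstract can be explored with a small simulation. The sketch below is an assumption-laden illustration, not the authors' simulator: the abstract does not specify the steal policy, so the code assumes that all W tasks start on processor 0, that an idle processor picks a uniformly random victim, and that it takes half of the victim's remaining tasks. The name `simulate_work_stealing` and its parameters are hypothetical.

```python
import random


def simulate_work_stealing(W, m, seed=0):
    """Simulate decentralized scheduling of W unit tasks on m processors.

    Assumed model (not from the source): all tasks start on processor 0.
    In each time step, every processor with work executes one task, and
    every processor that was idle at the start of the step tries to steal
    half of the remaining tasks of a uniformly random victim.
    Returns the makespan in time steps.
    """
    rng = random.Random(seed)
    queues = [0] * m
    queues[0] = W
    steps = 0

    while sum(queues) > 0:
        steps += 1
        # Processors that have nothing to do at the start of this step.
        idle = [p for p in range(m) if queues[p] == 0]

        # Busy processors each execute one unit task.
        for p in range(m):
            if queues[p] > 0:
                queues[p] -= 1

        # Idle processors attempt one steal each, in random order.
        rng.shuffle(idle)
        for thief in idle:
            victim = rng.randrange(m)
            if victim == thief:
                continue  # a self-steal is a failed attempt
            stolen = queues[victim] // 2
            queues[victim] -= stolen
            queues[thief] += stolen

    return steps
```

Comparing `simulate_work_stealing(W, m) - W / m` against `math.log2(W)` over several seeds gives a rough feel for the additive term the abstract discusses; the constants observed depend on the assumed steal policy and need not match the reported value.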