Piece-wise scheduling of composite task graphs onto distributed memory parallel computers
Heuristics for static scheduling of task graphs using list scheduling techniques have continued to improve by adding real-world factors such as processor speed, network transmission speed, interconnection topology, and link contention considerations to the basic task graph model. Yet, the resulting schedules do not fully model program loops and branches, startup costs for both process creation and message initiation, and a number of interesting parallel processing patterns such as meshes, trees, and supervisor/workers. In fact, improvements in the schedule may be obtained when the task graph is regular, as when it contains repeated or replicated tasks, divide-and-conquer patterns of communication, or a mesh-structured pattern of computation. In this paper we describe a limited approach to scheduling composite task graphs that considers process and message startup costs, and three regular patterns: replicated, tree, and mesh. The approach is to model programs with such regular patterns as a composite task graph, where each regular structure is a decomposable sub-task node in the task graph. Then, we compute an optimal schedule for each sub-task graph, piece the sub-tasks together, and perform an ordinary static scheduling heuristic on the pieces to produce an overall schedule. We define a composite task graph as a hierarchical task graph containing regular-structured sub-task graphs as components. At the top level of this hierarchy, each graph node represents either a simple task or a hierarchically decomposable sub-task graph. We propose a piece-wise scheduling algorithm that simply allocates processors to sub-task graphs according to closed-form expressions that determine the optimal number of processors, and then uses a list scheduling algorithm to schedule the flattened graph onto these processors.
We do not address the pressing problem of loops and branches in the task graph representation, but we speculate that the technique of piece-wise scheduling introduced here can be adapted to a hybrid form of scheduling that may accommodate branches and loops. Piece-wise scheduling is not guaranteed to yield the best global schedule. Rather, it pieces together locally optimum sub-schedules. Finding globally optimum schedules for composite task graphs remains an open problem. We present a heuristic approach that has been experimentally used to schedule small parallel programs with encouraging results. More empirical evidence is needed to determine the usefulness of this technique, but early indications are encouraging.
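The final step named in the abstract, list scheduling of a flattened task graph onto a fixed number of processors, can be illustrated with a minimal sketch. This is a generic greedy list scheduler, not the authors' piece-wise algorithm: the closed-form processor-count expressions are not given in the abstract and are not reproduced here, and startup and communication costs are ignored. The function names, the longest-path priority rule, and the small divide-and-conquer example graph are all assumptions chosen for illustration.

```python
from collections import defaultdict


def list_schedule(durations, deps, num_procs):
    """Greedy list scheduling (illustrative sketch, not the paper's method).

    durations: {task: positive run time}
    deps:      {task: [predecessor tasks]}
    Returns {task: (processor, start, finish)}.
    """
    succs = defaultdict(list)
    for task, preds in deps.items():
        for pred in preds:
            succs[pred].append(task)

    # Priority: length of the longest path from a task to the end of the graph.
    level = {}

    def bottom_level(task):
        if task not in level:
            tail = max((bottom_level(s) for s in succs[task]), default=0)
            level[task] = durations[task] + tail
        return level[task]

    # With positive durations, descending bottom level is a topological order.
    order = sorted(durations, key=lambda t: (-bottom_level(t), t))

    proc_free = [0] * num_procs  # time at which each processor becomes idle
    schedule = {}
    for task in order:
        ready = max((schedule[p][2] for p in deps.get(task, [])), default=0)
        # Pick the processor on which the task can start earliest.
        proc = min(range(num_procs), key=lambda i: (max(ready, proc_free[i]), i))
        start = max(ready, proc_free[proc])
        finish = start + durations[task]
        schedule[task] = (proc, start, finish)
        proc_free[proc] = finish
    return schedule


def makespan(schedule):
    """Completion time of the last task."""
    return max(finish for _, _, finish in schedule.values())


# Hypothetical divide-and-conquer graph: one split, four workers, one merge.
durations = {"split": 1, "w1": 4, "w2": 4, "w3": 4, "w4": 4, "merge": 1}
deps = {"w1": ["split"], "w2": ["split"], "w3": ["split"], "w4": ["split"],
        "merge": ["w1", "w2", "w3", "w4"]}

for procs in (1, 2, 4):
    print(procs, makespan(list_schedule(durations, deps, procs)))
```

In this toy graph the schedule length falls as processors are added up to the width of the tree, which is the kind of relationship a closed-form processor-count expression for a regular pattern would capture.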
The work/exchange model: a generalized approach to dynamic load balancing
A crucial concern in software development is reducing program execution time. Parallel processing is often used to meet this goal. However, parallel processing efforts can lead to many pitfalls and problems. One such problem is to distribute the workload among processors in such a way that minimum execution time is obtained. The common approach is to use a load balancer to distribute equal or nearly equal quantities of workload on each processor. Unfortunately, this approach relies on a naive definition of load imbalance and often fails to achieve the desired goal. A more sophisticated definition should account for the effects of additional factors including communication delay costs, network contention, and architectural issues. Consideration of additional factors led us to the realization that optimal load distribution does not always result from equal load distribution. In this dissertation, we tackle the difficult problem of defining load imbalance. This is accomplished through the development of a parallel program model called the Generalized Work/Exchange Model. Associated with the model are equations for a restricted set of deterministically balanced programs that characterize idle time, elapsed time, and potential speedup. With the aid of the model, several common myths about load imbalance are exposed. A useful application called a load balancer enhancer is also presented which is applicable to the more general, quasi-static load unbalanced program.
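The claim that an equal split is not always the optimal one can be seen in a toy calculation. The sketch below is not the Generalized Work/Exchange Model, whose equations do not appear in the abstract; it assumes a hypothetical two-processor setup in which all work starts on one processor, each unit shipped to the other processor incurs a fixed transfer cost, and the receiving processor begins computing only after its units arrive. All names and cost values are illustrative assumptions.

```python
def elapsed_time(total_units, shipped, compute_cost=1.0, transfer_cost=1.0):
    """Elapsed time under a toy two-processor model (an assumption, see above).

    The local processor computes the units it keeps; the remote processor
    must first receive its units and then compute them.
    """
    local = (total_units - shipped) * compute_cost
    remote = shipped * (transfer_cost + compute_cost)
    return max(local, remote)


def best_split(total_units, compute_cost=1.0, transfer_cost=1.0):
    """Number of units to ship that minimizes elapsed time (brute force)."""
    return min(
        range(total_units + 1),
        key=lambda k: elapsed_time(total_units, k, compute_cost, transfer_cost),
    )


total = 12
equal = total // 2
best = best_split(total)
print("equal split:", equal, "units shipped, elapsed", elapsed_time(total, equal))
print("best split: ", best, "units shipped, elapsed", elapsed_time(total, best))
```

Under these assumed costs, shipping half of the work leaves the local processor idle while the remote one is still receiving and computing, so a smaller shipment finishes sooner. When the transfer cost is set to zero, the equal split becomes optimal again.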