Scheduling non-uniform parallel loops on MIMD computers
Graduation date: 1994

Parallel loops are one of the main sources of parallelism in scientific applications, and many parallel loops do not have a uniform iteration execution time. To achieve good performance for such applications on a parallel computer, the iterations of a parallel loop have to be assigned to processors so that each processor receives roughly the same amount of work in terms of execution time. A parallel computer with a large number of processors tends to have distributed memory, so running a parallel loop on such a machine also requires data distribution to be considered. This research investigates the scheduling of non-uniform parallel loops on both shared-memory and distributed-memory parallel computers.
We present Safe Self-Scheduling (SSS), a new scheduling scheme that combines the advantages of both static and dynamic scheduling schemes. SSS has two phases: a static scheduling phase and a dynamic self-scheduling phase that together reduce the scheduling overhead while achieving a well-balanced workload. The techniques introduced in SSS can be used by other self-scheduling schemes. The static scheduling phase further improves performance by maintaining a high cache hit ratio, a result of the increased affinity of iterations to processors. SSS is also very well suited to distributed-memory machines.
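The two-phase idea can be illustrated with a small sketch. This is not the published SSS algorithm itself; the 50% static share, the chunk size, and the thread-based workers are illustrative assumptions, and Python threads stand in for processors.

```python
import threading
from queue import Queue, Empty

def two_phase_schedule(num_iters, num_procs, static_fraction=0.5, chunk=4):
    """Illustrative two-phase loop scheduler: a static block per
    processor, then dynamic self-scheduling of the remainder."""
    # Phase 1: statically assign a fixed share of iterations to each
    # processor, preserving iteration-to-processor affinity.
    static_count = int(num_iters * static_fraction)
    per_proc = static_count // num_procs
    static_blocks = [list(range(p * per_proc, (p + 1) * per_proc))
                     for p in range(num_procs)]
    # Phase 2: the remaining iterations go into a shared work queue in
    # small chunks; idle processors grab chunks to balance the load.
    remaining = list(range(per_proc * num_procs, num_iters))
    work_queue = Queue()
    for i in range(0, len(remaining), chunk):
        work_queue.put(remaining[i:i + chunk])

    executed = [[] for _ in range(num_procs)]

    def worker(pid):
        # Run the statically assigned block first (good locality), then
        # pull chunks from the shared queue until it is empty.
        for it in static_blocks[pid]:
            executed[pid].append(it)
        while True:
            try:
                chunk_iters = work_queue.get_nowait()
            except Empty:
                break
            executed[pid].extend(chunk_iters)

    threads = [threading.Thread(target=worker, args=(p,))
               for p in range(num_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed
```

The static phase fixes most of the assignment up front, so the shared queue (the only synchronization point) is touched far less often than under a purely dynamic scheme.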
We introduce methods to duplicate data on a number of processors. These methods eliminate data movement during computation and improve the scalability of the problem size. We describe a systematic approach to implementing a given self-scheduling scheme on a distributed-memory machine. We also present a multilevel scheduling scheme that self-schedules parallel loops on a distributed-memory machine with a large number of processors, eliminating the bottleneck of a central scheduler.
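The multilevel idea can be sketched as a two-level hierarchy: a central scheduler hands out large chunks to per-cluster schedulers, and each cluster scheduler deals small chunks to its own processors, so individual processors never contact the root. The chunk sizes and the round-robin policies below are illustrative assumptions, not the scheme from the dissertation.

```python
def multilevel_schedule(num_iters, num_clusters, procs_per_cluster,
                        top_chunk=20, local_chunk=2):
    """Illustrative two-level scheduler: returns a map from each
    iteration to the (cluster, processor) that would execute it."""
    # Level 1: the central scheduler deals large chunks to the
    # cluster-level schedulers, round-robin.
    cluster_pools = [[] for _ in range(num_clusters)]
    next_iter, c = 0, 0
    while next_iter < num_iters:
        chunk = range(next_iter, min(next_iter + top_chunk, num_iters))
        cluster_pools[c].extend(chunk)
        next_iter += top_chunk
        c = (c + 1) % num_clusters
    # Level 2: each cluster scheduler deals small chunks to its own
    # processors, again round-robin, without touching the root.
    assignment = {}
    for cid, pool in enumerate(cluster_pools):
        p = 0
        for i in range(0, len(pool), local_chunk):
            for it in pool[i:i + local_chunk]:
                assignment[it] = (cid, p)
            p = (p + 1) % procs_per_cluster
    return assignment
```

Because only the cluster schedulers contend for the central scheduler, the contention at the root grows with the number of clusters rather than the number of processors.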
We propose a method that uses abstractions to automate both self-scheduling methods and data distribution methods in parallel programming environments. The abstractions are tested in CHARM, a real parallel programming environment. Methods are also developed to tolerate processor faults caused both by physical failure and by reassignment of processors by the operating system during the execution of a parallel loop.
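One simple recovery policy that fits this setting is to track which chunks are outstanding on each processor and re-queue the unfinished ones when a processor fails or is reassigned; since loop iterations are idempotent units of work, they can safely be run again elsewhere. The helper below is a hypothetical sketch of that bookkeeping, not the method developed in the dissertation.

```python
def recover_chunks(outstanding, finished, failed_procs):
    """Return the chunk ids that must be re-queued: those assigned to a
    failed (or reassigned) processor that were never reported finished.

    outstanding  -- dict: processor id -> list of chunk ids handed to it
    finished     -- set of chunk ids whose completion was acknowledged
    failed_procs -- set of processor ids that have been lost
    """
    requeue = []
    for proc, chunks in outstanding.items():
        if proc in failed_procs:
            # Completed chunks stay completed; only unacknowledged work
            # from the lost processor is rescheduled.
            requeue.extend(c for c in chunks if c not in finished)
    return requeue
```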
We tested these techniques using simulations and real applications, and good results were obtained on both shared-memory and distributed-memory parallel computers.