    Exploiting partial replication in unbalanced parallel loop scheduling on multicomputer.

    We consider the problem of scheduling parallel loops whose iterations operate on large array data structures and are characterized by highly varying execution times (unbalanced or non-uniform parallel loops). A general parallel loop implementation template for message-passing distributed-memory multiprocessors (multicomputers) is presented. Assuming that it is impossible to statically determine the distribution of the computational load on the data accessed, the template exploits a hybrid scheduling strategy. The data are partially replicated in the processors' local memories, and iterations are statically scheduled until the first load imbalances are detected. At this point, an effective dynamic scheduling technique is adopted to move iterations among nodes holding the same data. Most of the communications needed to implement dynamic load balancing are overlapped with computations, as a very effective prefetching policy is adopted. The template scales very well, since knowing where data are..
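The hybrid strategy described in the abstract can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the authors' template: the function name `hybrid_schedule`, the ring-shaped replication pattern, the block distribution, and the rule "an idle node takes work from the replicated neighbour with the most iterations left" are all hypothetical choices made for illustration. Message passing and the prefetching policy are not modelled at all.

```python
import heapq


def hybrid_schedule(costs, num_nodes, replication=1):
    """Simulate static-then-dynamic scheduling of one unbalanced parallel loop.

    costs[i] is the (hypothetical) execution time of iteration i; the
    scheduler never inspects it in advance.
    Assumption: node p statically owns a contiguous block of iterations and
    also holds replicas of the data of the next `replication` nodes in ring
    order, so it may only take over iterations from those nodes.
    Returns (makespan, executed_by), where executed_by[i] is the node that
    ran iteration i.
    """
    n = len(costs)
    # Static phase: block distribution of the iteration indices.
    bounds = [n * p // num_nodes for p in range(num_nodes + 1)]
    queues = [list(range(bounds[p], bounds[p + 1])) for p in range(num_nodes)]
    executed_by = [None] * n

    # Event queue of (time at which the node becomes free, node id).
    events = [(0.0, p) for p in range(num_nodes)]
    heapq.heapify(events)
    makespan = 0.0

    while events:
        now, p = heapq.heappop(events)
        makespan = max(makespan, now)
        if queues[p]:
            # Own work first, taken from the front of the local block.
            i = queues[p].pop(0)
        else:
            # Dynamic phase: node p is idle, so an imbalance has appeared.
            # It may only help nodes whose data it already holds.
            donors = [(p + k) % num_nodes for k in range(1, replication + 1)]
            donor = max(donors, key=lambda q: len(queues[q]), default=None)
            if donor is None or not queues[donor]:
                continue  # nothing reachable is left: node p retires
            # Take from the back of the donor's block.
            i = queues[donor].pop()
        executed_by[i] = p
        heapq.heappush(events, (now + costs[i], p))

    return makespan, executed_by
```

With the illustrative costs `[8, 8, 8, 8, 1, 1, 1, 1]` on two nodes, `replication=0` (purely static scheduling) gives a makespan of 32, while `replication=1` lets the lightly loaded node take over two of the expensive iterations and brings the makespan down to 20. Restricting the donors to nodes whose data is already replicated locally is what keeps the dynamic phase from having to move array data in this sketch.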