
    Efficient execution of homogeneous tasks with unequal run times on the Connection Machine

    Many scientific applications require the execution of a large number of identical tasks, each on a different data set. Such applications can easily benefit from the power of SIMD architectures (e.g. the Connection Machine) by having the array of processing elements (PEs) execute the task in parallel on the different data sets. It is often the case, however, that the task to be performed involves the repeated application of the same sequence of steps, a body, a number of times that depends on the input or computed data. If the usual task-level synchronization is used, the utilization of the array of PEs degrades substantially. In this paper, we propose a body-level synchronization scheme that boosts the utilization of the array of PEs while keeping the required overhead to a minimum. We mathematically analyze the proposed technique and show how to optimize its performance for a given application. Our technique is particularly efficient when the number of tasks to be executed is much larger than the number of physical PEs available.
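
The difference between the two synchronization styles can be illustrated with a toy simulation. The sketch below is not the scheme analyzed in the paper; it is a simplified model under stated assumptions: every task needs at least one body iteration, a PE that finishes picks up the next pending task right after a body step, and `reload_overhead` is a hypothetical cost (in body-step units) charged whenever at least one PE reloads. All function and parameter names are invented for illustration.

```python
import random


def task_level_steps(iters, num_pes):
    """Task-level sync: each batch of num_pes tasks runs until its slowest task ends."""
    steps = 0
    for start in range(0, len(iters), num_pes):
        steps += max(iters[start:start + num_pes])
    return steps


def body_level_steps(iters, num_pes, reload_overhead=0.0):
    """Body-level sync: after every body step, finished PEs fetch a pending task.

    Assumes every entry of iters is >= 1. reload_overhead is a hypothetical
    cost added once per step in which at least one PE loads a new task.
    """
    pending = list(reversed(iters))  # pop() then yields tasks in original order
    remaining = [pending.pop() if pending else 0 for _ in range(num_pes)]
    steps = 0.0
    while any(r > 0 for r in remaining):
        # One SIMD body step: every busy PE advances by one iteration.
        remaining = [r - 1 if r > 0 else 0 for r in remaining]
        steps += 1
        reloaded = False
        for pe in range(num_pes):
            if remaining[pe] == 0 and pending:
                remaining[pe] = pending.pop()
                reloaded = True
        if reloaded:
            steps += reload_overhead
    return steps


def utilization(iters, num_pes, steps):
    """Fraction of PE-steps spent on useful body iterations."""
    return sum(iters) / (num_pes * steps)


if __name__ == "__main__":
    random.seed(0)
    num_pes = 64
    # Many more tasks than PEs, with widely varying iteration counts.
    iters = [random.randint(1, 100) for _ in range(64 * 50)]
    for name, steps in [
        ("task-level", task_level_steps(iters, num_pes)),
        ("body-level", body_level_steps(iters, num_pes, reload_overhead=0.1)),
    ]:
        print(f"{name}: steps={steps:.1f}, "
              f"utilization={utilization(iters, num_pes, steps):.2f}")
```

In this model the task-level variant pays for the slowest task in every batch, while the body-level variant keeps PEs busy as long as tasks remain, at the price of the reload overhead. How that overhead should be traded against utilization is the kind of question the abstract says the paper treats analytically.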