21,410 research outputs found

    A multistage linear array assignment problem

    Get PDF
    The implementation of certain algorithms on parallel processing computing architectures can involve partitioning contiguous elements into a fixed number of groups, each of which is to be handled by a single processor. It is desired to find an assignment of elements to processors that minimizes the sum of the maximum workloads experienced at each stage. This problem can be viewed as a multi-objective network optimization problem. Polynomially-bounded algorithms are developed for the case of two stages, whereas the associated decision problem (for an arbitrary number of stages) is shown to be NP-complete. Heuristic procedures are therefore proposed and analyzed for the general problem. Computational experience with one of the exact problems, incorporating certain pruning rules, is presented with one of the exact problems. Empirical results also demonstrate that one of the heuristic procedures is especially effective in practice

    Performance tradeoffs in static and dynamic load balancing strategies

    Get PDF
    The problem of uniformly distributing the load of a parallel program over a multiprocessor system was considered. A program was analyzed whose structure permits the computation of the optimal static solution. Then four strategies for load balancing were described and their performance compared. The strategies are: (1) the optimal static assignment algorithm which is guaranteed to yield the best static solution, (2) the static binary dissection method which is very fast but sub-optimal, (3) the greedy algorithm, a static fully polynomial time approximation scheme, which estimates the optimal solution to arbitrary accuracy, and (4) the predictive dynamic load balancing heuristic which uses information on the precedence relationships within the program and outperforms any of the static methods. It is also shown that the overhead incurred by the dynamic heuristic is reduced considerably if it is started off with a static assignment provided by either of the other three strategies

    Optimal processor assignment for pipeline computations

    Get PDF
    The availability of large scale multitasked parallel architectures introduces the following processor assignment problem for pipelined computations. Given a set of tasks and their precedence constraints, along with their experimentally determined individual responses times for different processor sizes, find an assignment of processor to tasks. Two objectives are of interest: minimal response given a throughput requirement, and maximal throughput given a response time requirement. These assignment problems differ considerably from the classical mapping problem in which several tasks share a processor; instead, it is assumed that a large number of processors are to be assigned to a relatively small number of tasks. Efficient assignment algorithms were developed for different classes of task structures. For a p processor system and a series parallel precedence graph with n constituent tasks, an O(np2) algorithm is provided that finds the optimal assignment for the response time optimization problem; it was found that the assignment optimizing the constrained throughput in O(np2log p) time. Special cases of linear, independent, and tree graphs are also considered

    Rectilinear partitioning of irregular data parallel computations

    Get PDF
    New mapping algorithms for domain oriented data-parallel computations, where the workload is distributed irregularly throughout the domain, but exhibits localized communication patterns are described. Researchers consider the problem of partitioning the domain for parallel processing in such a way that the workload on the most heavily loaded processor is minimized, subject to the constraint that the partition be perfectly rectilinear. Rectilinear partitions are useful on architectures that have a fast local mesh network. Discussed here is an improved algorithm for finding the optimal partitioning in one dimension, new algorithms for partitioning in two dimensions, and optimal partitioning in three dimensions. The application of these algorithms to real problems are discussed

    Approximate algorithms for partitioning and assignment problems

    Get PDF
    The problem of optimally assigning the modules of a parallel/pipelined program over the processors of a multiple computer system under certain restrictions on the interconnection structure of the program as well as the multiple computer system was considered. For a variety of such programs it is possible to find linear time if a partition of the program exists in which the load on any processor is within a certain bound. This method, when combined with a binary search over a finite range, provides an approximate solution to the partitioning problem. The specific problems considered were: a chain structured parallel program over a chain-like computer system, multiple chain-like programs over a host-satellite system, and a tree structured parallel program over a host-satellite system. For a problem with m modules and n processors, the complexity of the algorithm is no worse than O(mnlog(W sub T/epsilon)), where W sub T is the cost of assigning all modules to one processor and epsilon the desired accuracy

    Neural-network dedicated processor for solving competitive assignment problems

    Get PDF
    A neural-network processor for solving first-order competitive assignment problems consists of a matrix of N x M processing units, each of which corresponds to the pairing of a first number of elements of (R sub i) with a second number of elements (C sub j), wherein limits of the first number are programmed in row control superneurons, and limits of the second number are programmed in column superneurons as MIN and MAX values. The cost (weight) W sub ij of the pairings is programmed separately into each PU. For each row and column of PU's, a dedicated constraint superneuron insures that the number of active neurons within the associated row or column fall within a specified range. Annealing is provided by gradually increasing the PU gain for each row and column or increasing positive feedback to each PU, the latter being effective to increase hysteresis of each PU or by combining both of these techniques
    • …
    corecore