208 research outputs found

    Principles for problem aggregation and assignment in medium scale multiprocessors

    One of the most important issues in parallel processing is the mapping of workload to processors. This paper considers a large class of problems having a high degree of potential fine-grained parallelism, and execution requirements that are either not predictable, or are too costly to predict. The main issues in mapping such a problem onto medium-scale multiprocessors are those of aggregation and assignment. We study a method of parameterized aggregation that makes few assumptions about the workload. The mapping of aggregate units of work onto processors is uniform, and exploits locality of workload intensity to balance the unknown workload. In general, a finer aggregate granularity leads to a better balance at the price of increased communication/synchronization costs; the aggregation parameters can be adjusted to find a reasonable granularity. The effectiveness of this scheme is demonstrated on three model problems: an adaptive one-dimensional fluid dynamics problem with message passing, a sparse triangular linear system solver on both a shared-memory and a message-passing machine, and a two-dimensional time-driven battlefield simulation employing message passing. Using the model problems, we study the tradeoffs between a balanced workload and communication/synchronization costs. Finally, an analytical model is used to explain why the method balances workload and minimizes the variance in system behavior.
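The granularity tradeoff described in this abstract can be illustrated with a small sketch. It is not the paper's scheme: it assumes a round-robin ("wrapped") dealing of fixed-size blocks as one possible uniform mapping, a made-up one-dimensional workload with a localized hot spot, and the number of block boundaries as a crude stand-in for communication/synchronization cost. All names (`wrapped_assignment`, `imbalance`, `boundary_count`) are hypothetical.

```python
import math


def wrapped_assignment(work, num_procs, block_size):
    """Aggregate consecutive work units into blocks and deal the blocks
    round-robin to processors. Returns the total load per processor."""
    loads = [0.0] * num_procs
    for start in range(0, len(work), block_size):
        owner = (start // block_size) % num_procs
        loads[owner] += sum(work[start:start + block_size])
    return loads


def imbalance(loads):
    """Ratio of the heaviest processor load to the mean load (1.0 is perfect)."""
    return max(loads) / (sum(loads) / len(loads))


def boundary_count(n_units, block_size):
    """Number of boundaries between adjacent blocks; an assumed proxy for
    communication cost, since neighbouring blocks go to different processors."""
    return math.ceil(n_units / block_size) - 1


# Assumed workload: uniform background plus a localized region of high intensity.
n_units = 1024
work = [1.0 + 50.0 * math.exp(-((i - 300) / 40.0) ** 2) for i in range(n_units)]
num_procs = 8

# Map block size -> (imbalance, boundary count), from coarse to fine.
results = {}
for block_size in (128, 32, 8):
    loads = wrapped_assignment(work, num_procs, block_size)
    results[block_size] = (imbalance(loads), boundary_count(n_units, block_size))

for block_size, (imb, cuts) in results.items():
    print(f"block={block_size:4d}  imbalance={imb:.2f}  boundaries={cuts}")
```

Under these assumptions, shrinking the block size spreads the hot spot over more processors, so the imbalance ratio falls, while the boundary count rises.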

    Analysis of Various Decentralized Load Balancing Techniques with Node Duplication

    Experience in parallel computing is an increasingly necessary skill for today’s upcoming computer scientists, as processors are hitting a serial execution performance barrier and turning to parallel execution for continued gains. Uniprocessor systems have reached their speed limit, leaving very little scope for further improvement. Multiprocessor systems, which have more than one processor, are used to address this problem. They improve system speed but face problems of their own, such as data dependency, control dependency, resource dependency, and improper load balancing. This paper therefore presents a detailed analysis of various decentralized load balancing techniques with node duplication to reduce execution time.

    Parallel Computers and Complex Systems

    We present an overview of the state of the art and future trends in high performance parallel and distributed computing, and discuss techniques for using such computers in the simulation of complex problems in computational science. The use of high performance parallel computers can help improve our understanding of complex systems, and the converse is also true: we can apply techniques used for the study of complex systems to improve our understanding of parallel computing. We consider parallel computing as the mapping of one complex system (typically a model of the world) into another complex system (the parallel computer). We study static, dynamic, spatial and temporal properties of both the complex systems and the map between them. The result is a better understanding of which computer architectures are good for which problems, and of software structure, automatic partitioning of data, and the performance of parallel machines.

    Semi-Distributed Load Balancing for Massively Parallel Multicomputer Systems

    This paper presents a semi-distributed approach to load balancing in large parallel and distributed systems, which differs from the conventional centralized and fully distributed approaches. The proposed strategy uses a two-level hierarchical control by partitioning the interconnection structure of a distributed or multiprocessor system into independent symmetric regions (spheres) centered at some control points. The central points, called schedulers, optimally schedule tasks within their spheres and maintain state information with low overhead. We consider interconnection structures belonging to a number of families of distance transitive graphs for evaluation and, using their algebraic characteristics, show that identification of spheres and their scheduling points is, in general, an NP-complete problem. An efficient solution for this problem is presented by making exclusive use of a combinatorial structure known as the Hadamard matrix. Performance of the proposed strategy has been evaluated and compared with an efficient fully distributed strategy through an extensive simulation study. In addition to yielding high performance in terms of response time and better resource utilization, the proposed strategy incurs less overhead in terms of control messages. It is also shown to be less sensitive to the communication delay of the underlying network.

    LU Factorization of Sparse, Unsymmetric Jacobian Matrices on Multicomputers: Experience, Strategies, Performance

    Efficient sparse linear algebra cannot be achieved as a straightforward extension of the dense case, even for concurrent implementations. This paper details a new, general-purpose unsymmetric sparse LU factorization code built on the philosophy of Harwell’s MA28, with variations. We apply this code in the framework of Jacobian-matrix factorizations, arising from Newton iterations in the solution of nonlinear systems of equations. Serious attention has been paid to the data-structure requirements, complexity issues and communication features of the algorithm. Key results include reduced-communication pivoting for both the “analyze” A-mode and repeated B-mode factorizations, and effective general-purpose data distributions that can be used incrementally to trade off process-column load balance in factorization against triangular solve performance. Future planned efforts are cited in conclusion.

    Parallel rendering algorithms for distributed-memory multicomputers

    Ankara: Department of Computer Engineering and Information Science and the Institute of Engineering and Science of Bilkent University, 1997. Thesis (Ph.D.) -- Bilkent University, 1997. Includes bibliographical references (leaves 166-176). Kurç, Tahsin Mertefe. Ph.D.

    The Use of Parallel Processing in VLSI Computer-Aided Design Application

    Coordinated Science Laboratory was formerly known as Control Systems Laboratory. Semiconductor Research Corporation / 87-DP-10