A mapping strategy for MIMD computers
In this paper, a heuristic mapping approach which maps parallel programs, described by precedence graphs, to MIMD architectures, described by system graphs, is presented. The complete execution time of a parallel program is used as a measure, and the concept of critical edges is utilized as the heuristic to guide the search for a better initial assignment and subsequent refinement. An important feature is the use of a termination condition of the refinement process. This is based on deriving a lower bound on the total execution time of the mapped program. When this has been reached, no further refinement steps are necessary. The algorithms have been implemented and applied to the mapping of random problem graphs to various system topologies, including hypercubes, meshes, and random graphs. The results show reductions in execution times of the mapped programs of up to 77 percent over random mapping
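The evaluation loop this abstract describes (score a mapping by its complete execution time, and stop refining once a lower bound is reached) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the cost model (message cost multiplied by hop count, one task at a time per processor), the bound (critical path with free communication), and all function names are hypothetical.

```python
def topo_order(tasks, edges):
    """Return the tasks in a precedence-respecting order (Kahn's algorithm)."""
    indeg = {t: 0 for t in tasks}
    for (_, v) in edges:
        indeg[v] += 1
    ready = [t for t in tasks if indeg[t] == 0]
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for (a, b) in edges:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    ready.append(b)
    return order


def execution_time(tasks, edges, mapping, dist):
    """Completion time of a mapped program under the assumed cost model.

    tasks:   {task: duration}
    edges:   {(u, v): message_cost} in the precedence graph
    mapping: {task: processor}
    dist:    {(p, q): hops} in the system graph
    """
    finish = {}
    proc_free = {}  # time at which each processor becomes idle
    for t in topo_order(tasks, edges):
        p = mapping[t]
        start = proc_free.get(p, 0)
        for (u, v), cost in edges.items():
            if v == t:
                # Assumption: messages between co-located tasks are free.
                hops = 0 if mapping[u] == p else dist[(mapping[u], p)]
                start = max(start, finish[u] + cost * hops)
        finish[t] = start + tasks[t]
        proc_free[p] = finish[t]
    return max(finish.values())


def lower_bound(tasks, edges):
    """Critical-path length with all communication treated as free."""
    longest = {}
    for t in topo_order(tasks, edges):
        preds = [longest[u] for (u, v) in edges if v == t]
        longest[t] = tasks[t] + max(preds, default=0)
    return max(longest.values())
```

A refinement loop built on this sketch would keep swapping task assignments while `execution_time(...)` exceeds `lower_bound(...)` and stop as soon as the two are equal, since no mapping can do better under the model.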
Architecture independent environment for developing engineering software on MIMD computers
Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than serial computers. Developing large scale software for a variety of MIMD computers is difficult and expensive. There is a need to provide tools that facilitate programming these machines. First, the issues that must be considered to develop those tools are examined. The two main areas of concern were architecture independence and data management. Architecture independent software facilitates software portability and improves the longevity and utility of the software product. It provides some form of insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems. It must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture independent software; identifying and exploiting concurrency within the application program; data coherence; engineering data base and memory management
System software for the finite element machine
The Finite Element Machine is an experimental parallel computer developed at Langley Research Center to investigate the application of concurrent processing to structural engineering analysis. This report describes system-level software which has been developed to facilitate use of the machine by applications researchers. The overall software design is outlined, and several important parallel processing issues are discussed in detail, including processor management, communication, synchronization, and input/output. Based on experience using the system, the hardware architecture and software design are critiqued, and areas for further work are suggested
Highly parallel computation
Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines, and current research focuses on which architectures are most effective. The architectures designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed.
Design, development and use of the finite element machine
Some of the considerations that went into the design of the Finite Element Machine, an asynchronous parallel research computer, are described. The present status of the system is also discussed, along with some indication of the type of results that were obtained.
Massive parallelism in the future of science
Massive parallelism appears in three domains of action of concern to scientists, where it produces collective action that is not possible from any individual agent's behavior. In the domain of data parallelism, computers comprising very large numbers of processing agents, one for each data item in the result, will be designed. These agents collectively can solve problems thousands of times faster than current supercomputers. In the domain of distributed parallelism, computations comprising large numbers of resources attached to the world network will be designed. The network will support computations far beyond the power of any one machine. In the domain of people parallelism, collaborations will be designed among large groups of scientists around the world who participate in projects that endure well past the sojourns of individuals within them. Computing and telecommunications technology will support the large, long projects that will characterize big science by the turn of the century. Scientists must become masters in these three domains during the coming decade.
FFT for the APE Parallel Computer
We present a parallel FFT algorithm for SIMD systems following the 'Transpose Algorithm' approach. The method is based on the assignment of the data field onto a 1-dimensional ring of systolic cells. The systolic array can be universally mapped onto any parallel system. In particular for systems with next-neighbour connectivity our method has the potential to improve the efficiency of matrix transposition by use of hyper-systolic communication. We have realized a scalable parallel FFT on the APE100/Quadrics massively parallel computer, where our implementation is part of a 2-dimensional hydrodynamics code for turbulence studies. A possible generalization to 4-dimensional FFT is presented, having in mind QCD applications.
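The general transpose approach to a 2-dimensional FFT can be illustrated with the sketch below, which runs sequentially: each row stands in for the data held by one processor, and `transpose` stands in for the global communication step. It does not reproduce the paper's systolic-ring layout or its hyper-systolic communication; the function names and the recursive radix-2 kernel are assumptions made for illustration, and row and column lengths must be powers of two.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def transpose(rows):
    """Swap rows and columns (the communication step on a real machine)."""
    return [list(col) for col in zip(*rows)]


def fft2d_transpose(grid):
    """2-D FFT as: local row FFTs, transpose, local row FFTs, transpose back."""
    rows = [fft1d(r) for r in grid]       # local work on each row
    cols = transpose(rows)                # columns become locally held rows
    cols = [fft1d(c) for c in cols]       # local work along the other axis
    return transpose(cols)                # restore the original layout
```

The point of the structure is that every FFT is purely local, so the only inter-processor traffic is the transposition, which is the step the abstract proposes to accelerate.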
A sweep algorithm for massively parallel simulation of circuit-switched networks
A new massively parallel algorithm is presented for simulating large asymmetric circuit-switched networks, controlled by a randomized-routing policy that includes trunk-reservation. A single instruction multiple data (SIMD) implementation is described, and corresponding experiments on a 16384-processor MasPar parallel computer are reported. A multiple instruction multiple data (MIMD) implementation is also described, and corresponding experiments on an Intel iPSC/860 parallel computer, using 16 processors, are reported. By exploiting parallelism, our algorithm increases the possible execution rate of such complex simulations by as much as an order of magnitude.
Divide-and-conquer algorithms for multiprocessors
During the past decade there has been a tremendous surge in understanding the nature of parallel computation. A number of parallel computers are commercially available. However, there are some problems in developing application programs on these computers. This dissertation considers various issues involved in implementing parallel algorithms on Multiple Instruction Multiple Data (MIMD) machines with a bounded number of processors. Strategies for implementing divide-and-conquer algorithms on MIMD machines are proposed. Results linking time complexity, communication complexity and the complexity of divide-and-combine functions of divide-and-conquer algorithms are analyzed. An efficient criterion for partitioning a parallel program is proposed, and a method for obtaining a closed form expression for time complexity of a parallel program in terms of problem size and number of processors is developed.
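To make the idea of a closed form in terms of problem size and processor count concrete, the sketch below uses a simplified model that is an assumption of this example, not the dissertation's result: a binary divide-and-conquer with a linear combine step, no communication cost, and n and p both powers of two with p <= n. While more than one processor is available the two halves run in parallel; once a subproblem is down to one processor it runs sequentially.

```python
import math


def dc_time(n, p, combine=lambda m: m):
    """Step count of a binary divide-and-conquer on p processors.

    With p > 1 the halves run concurrently, so only one half is charged.
    With p == 1 the recursion is sequential and both halves are charged.
    """
    if n <= 1:
        return 0
    if p > 1:
        return dc_time(n // 2, p // 2, combine) + combine(n)
    return 2 * dc_time(n // 2, 1, combine) + combine(n)


def closed_form(n, p):
    """Closed form of dc_time for combine(m) = m.

    T(n, p) = (n/p) * log2(n/p) + 2n * (1 - 1/p),
    valid for n and p powers of two with p <= n.
    """
    return (n // p) * int(math.log2(n // p)) + 2 * n * (p - 1) // p
```

Under this model the first term is the sequential work left on each processor and the second is the cost of the combine steps along the parallel levels, which does not shrink as p grows; that is why adding processors gives diminishing returns here.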