12 research outputs found
A Practical Hierarchial Model of Parallel Computation: The Model
We introduce a model of parallel computation that retains the ideal properties of the PRAM by using it as a sub-model, while simultaneously being more reflective of realistic parallel architectures by accounting for and providing abstract control over communication and synchronization costs. The Hierarchical PRAM (H-PRAM) model controls conceptual complexity in the face of asynchrony in two ways. First, by providing the simplifying assumption of synchronization to the design of algorithms, but allowing the algorithms to work asynchronously with each other; and organizing this control asynchrony via an implicit hierarchy relation. Second, by allowing the restriction of communication asynchrony in order to obtain determinate algorithms (thus greatly simplifying proofs of correctness). It is shown that the model is reflective of a variety of existing and proposed parallel architectures, particularly ones that can support massive parallelism. Relationships to programming languages are discussed. Since the PRAM is a sub-model, we can use PRAM algorithms as sub-algorithms in algorithms for the H-PRAM; thus results that have been established with respect to the PRAM are potentially transferable to this new model. The H-PRAM can be used as a flexible tool to investigate general degrees of locality (“neighborhoods of activity) in problems, considering communication and synchronization simultaneously. This gives the potential of obtaining algorithms that map more efficiently to architectures, and of increasing the number of processors that can efficiently be used on a problem (in comparison to a PRAM that charges for communication and synchronization). The model presents a framework in which to study the extent that general locality can be exploited in parallel computing. A companion paper demonstrates the usage of the H-PRAM via the design and analysis of various algorithms for computing the complete binary tree and the FFT/butterfly graph
Techniques for power system simulation using multiple processors
The thesis describes development work which was undertaken to improve the speed of a real-time power system simulator used for the development and testing of control schemes. The solution of large, highly sparse matrices was targeted because this is the most time-consuming part of the current simulator. Major improvements in the speed of the matrix ordering phase of the solution were achieved through the development of a new ordering strategy. This was thoroughly investigated, and is shown to provide important additional improvements compared to standard ordering methods, in reducing path length and minimising potential pipeline stalls. Alterations were made to the remainder of the solution process which provided more flexibility in scheduling calculations. This was used to dramatically ease the run-time generation of efficient code, dedicated to the solution of one matrix structure, and also to reduce memory requirements. A survey of the available microprocessors was performed, which concluded that a special-purpose design could best implement the code generated at run-time, and a design was produced using a microprogrammable floating-point processor, which matched the code produced by the earlier work. A method of splitting the matrix solution onto parallel processors was investigated, and two methods of producing network splits were developed and their results compared. The best results from each method were found to agree well, with a predicted three-fold speed-up for the matrix solution of the C.E.G.B. transmission system from the use of six processors. This gain will increase for the whole simulator. A parallel processing topology of the partitioned network and produce the necessary structures for the remainder of the solution process
Queueing networks: solutions and applications
During the pasttwo decades queueing network models have proven to be a versatile tool for computer system and computer communication system performance evaluation. This chapter provides a survey of th field with a particular emphasis on applications. We start with a brief historical retrospective which also servesto introduce the majr issues and application areas. Formal results for product form queuenig networks are reviewed with particular emphasis on the implications for computer systems modeling. Computation algorithms, sensitivity analysis and optimization techniques are among the topics covered. Many of the important applicationsof queueing networks are not amenableto exact analysis and an (often confusing) array of approximation methods have been developed over the years. A taxonomy of approximation methods is given and used as the basis for for surveing the major approximation methods that have been studied. The application of queueing network to a number of areas is surveyed, including computer system cpacity planning, packet switching networks, parallel processing, database systems and availability modeling.Durante as últimas duas décadas modelos de redes de filas provaram ser uma ferramenta versátil para avaliação de desempenho de sistemas de computação e sistemas de comunicação. Este capítulo faz um apanhado geral da área, com ênfase em aplicações. Começamos com uma breve retrospectiva histórica que serve também para introduzir os pontos mais importantes e as áreas de aplicação. Resultados formais para redes de filas em forma de produto são revisados com ênfase na modelagem de sistemas de computação. Algoritmos de computação, análise de sensibilidade e técnicas de otimização estão entre os tópicos revistos. Muitas dentre importantes aplicações de redes de filas não são tratáveis por análise exata e uma série (frequentemente confusa) de métodos de aproximação tem sido desenvolvida. Uma taxonomia de métodos de aproximação é dada e usada como base para revisão dos mais importantes métodos de aproximação propostos. Uma revisão das aplicações de redes de filas em um número de áreas é feita, incluindo planejamento de capacidade de sistemas de computação, redes de comunicação por chaveamento de pacotes, processamento paralelo, sistemas de bancos de dados e modelagem de confiabilidade
The connection machine
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1988.Bibliography: leaves 134-157.by William Daniel Hillis.Ph.D
Distributed Operating Systems
Distributed operating systems have many aspects in common with centralized ones, but they also differ in certain ways. This paper is intended as an introduction to distributed operating systems, and especially to current university research about them. After a discussion of what constitutes a distributed operating system and how it is distinguished from a computer network, various key design issues are discussed. Then several examples of current research projects are examined in some detail, namely, the Cambridge Distributed Computing System, Amoeba, V, and Eden. © 1985, ACM. All rights reserved
A characterization of parallel systems
technical reporta taxonomy for parallel processing systems is presented which has some advantages over previous taxonomies. The taxonomy characterizes parallel processing systems using four parameters: topology, communication, granularity, and operation. These parameters and used repetitively in a hierarchical fashion to produce a taxonomic structure which is extensible to the level of detail desired. Topology describes the structure of the priniciple interconnections. Communication describes the flow of data and programs through the system. Granularity describes the size of the largest repeated element, or grain. Operation describes the important functional properties of each grain, especially the ratio of storage to logic circuitry. Granularity and topology are structural parameters, while operation and communication are functional parameters which describe the behavior of the system components. A final section of this paper includes examples of the application of the taxonomy to several parallel processing systems
A computing structure for data acquisition in high energy physics
A review of the development of parallel computing ispresented, followed by a summary of currently recognised typesof parallel computer and a brief summary of some applicationsof parallel computing in the field of high energy physics.The computing requirement at the data acquisition stageof a particular set of high energy physics experiments isdetailed, with reference to the computing system currently inuse. The requirement for a parallel processor to process thedata from these experiments is established and a possiblecomputing structure put forward.The topology proposed consists of a set of rings ofprocessors stacked to give a cylindrical arrangement, ananalytical approach is used to verify the suitability andextensibility of the suggested scheme. Using simulationresults the behaviour of rings and cylinders of processorsusing different algorithms for the movement of data within thesystem and different patterns of data input is presented anddiscussed.Practical hardware and software details for processingequipment capable of supporting such a structure as presentedhere is given, various algorithms for use with this equipment,e. g. program distribution, are developed and the software forthe implementation of the cylindrical structure is presented.Appendices of constructional information and all programlistings are included