58 research outputs found
Efficient computation of aerodynamic influence coefficients for aeroelastic analysis on a transputer network
Aeroelastic analysis is multi-disciplinary and computationally expensive. Hence, it can greatly benefit from parallel processing. As part of an effort to develop an aeroelastic capability on a distributed memory transputer network, a parallel algorithm for the computation of aerodynamic influence coefficients is implemented on a network of 32 transputers. The aerodynamic influence coefficients are calculated using a 3-D unsteady aerodynamic model and a parallel discretization. Efficiencies up to 85 percent were demonstrated using 32 processors. The effect of subtask ordering, problem size, and network topology are presented. A comparison to results on a shared memory computer indicates that higher speedup is achieved on the distributed memory system
Circuit simulation using distributed waveform relaxation techniques
Simulation plays an important role in the design of integrated circuits. Due to high costs and large delays involved in their fabrication, simulation is commonly used to verify functionality and to predict performance before fabrication. This thesis describes analysis, implementation and performance evaluation of a distributed memory parallel waveform relaxation technique for the electrical circuit simulation of MOS VLSI circuits. The waveform relaxation technique exhibits inherent parallelism due to the partitioning of a circuit into a number of sub-circuits. These subcircuits can be concurrently simulated on parallel processors. Different forms of parallelism in the direct method and the waveform relaxation technique are studied. An analysis of single queue and distributed queue approaches to implement parallel waveform relaxation on distributed memory machines is performed and their performance implications are studied. The distributed queue approach selected for exploiting the coarse grain parallelism across sub-circuits is described. Parallel waveform relaxation programs based on Gauss-Seidel and Gauss-Jacobi techniques are implemented using a network of eight Transputers. Static and dynamic load balancing strategies are studied. A dynamic load balancing algorithm is developed and implemented. Results of parallel implementation are analyzed to identify sources of bottlenecks. This thesis has demonstrated the applicability of a low cost distributed memory multi-computer system for simulation of MOS VLSI circuits. Speed-up measurements prove that a five times improvement in the speed of calculations can be achieved using a full window parallel Gauss-Jacobi waveform relaxation algorithm. Analysis of overheads shows that load imbalance is the major source of overhead and that the fraction of the computation which must be performed sequentially is very low. Communication overhead depends on the nature of the parallel architecture and the design of communication mechanisms. The run-time environment (parallel processing framework) developed in this research exploits features of the Transputer architecture to reduce the effect of the communication overhead by effectively overlapping computation with communications, and running communications processes at a higher priority. This research will contribute to the development of low cost, high performance workstations for computer-aided design and analysis of VLSI circuits
Implementing tuple space on transputer meshes
Research Report submitted to the Faculty of Science, University of the Witwatersrand,
Johannesburg, towards a partial fulfilment of the requirements for the degree
of Master of Science
Johannesburg 1991This report describes and evaluates an implementation of the Linda tuple space abstraction
on Transputer networks. There is evidence that suggests a need for a new
programming methodology to support Transputer-based applications, and Linda, as
an attractive and elegant alternative to existing methodologies, has great potential
for this role. The research focuses on the implementation of a particular tuple space
model, intermediate uniform distribution, on Transputer meshes. The objective of
the research is to ascertain the extent of the communication overheads inherent in
the implementation and hence evaluate the feasibility of the approach. The overheads
are measured relative to message passing performance on native Transputer
networks, and are shown to be significant. It is concluded that although the specific
tuple space model is not ideally suited to Transputer-based systems and the implementation,
as it stands, is too inefficient to be of practical use, the approach requires
further exploration in order to exhaust its full research potential.MT201
Recommended from our members
A graph theoretic approach to transputer network design for computer vision
The work in this thesis is concerned with parallel architectures based on the Inmos transputer-type processors and parallelisation of some computer vision tasks chosen from low to high level.
The transputer is a microprocessor with a micro-programmed scheduler and four serial communication links. It directly supports parallel processing since several transputers can be connected through their links to co-operate on solving a problem. Also several processes can be run on the same transputer. A major issue in parallel processing is the communication overhead introduced by parallelising a given task. This overhead is not present in sequential processing and must be curbed if the implementation of a task on a parallel machine is to be successful. The interconnection network underlying the architecture of a parallel computer is therefore of the utmost importance.
Computer Vision consists of a hierarchy of tasks ranging from low-level operations dealing with large amounts of relatively simple data to high level operations handling increasingly complex structures. In this work a novel edge detector based on adaptive filtering and an edge detector operating on colour images are presented and implemented on a number of transputers. These parallel implementations together with implementations of vector quantisation, Fourier descriptors for shape discrimination, the Hough transform and the Maximum clique algorithm, offer a notable performance increase when compared with sequential implementations. However, every algorithm required the design of a specific network of transputers to take advantage of the parallelism and data dependencies inherent in each.
Consequently, attention is focused on the topology of interconnection networks. In particular, the communication requirements of computer vision algorithms as identified by the various computer vision tasks are analysed. These requirements together with graph theoretical considerations are then used to suggest a topology for large transputer networks. The latter is based on sub-graphs, with proven performance when used to implement interconnection networks, combined to form an architecture with improved performance. This architecture consists of a fixed structure supplemented with a dynamically reconfigured network. After describing this topology, a routing algorithm that conveys messages along shortest paths in the network is given and implemented. And finally, some practical issues in the use of transputers are considered and solutions proposed
Non-perturbative field theories
SIGLEAvailable from British Library Document Supply Centre- DSC:D85083 / BLDSC - British Library Document Supply CentreGBUnited Kingdo
Effective interprocess communication (IPC) in a real-time transputer network
The thesis describes the design and implementation of an interprocess communication (IPC)
mechanism within a real-time distributed operating system kernel (RT-DOS) which is
designed for a transputer-based network. The requirements of real-time operating systems
are examined and existing design and implementation strategies are described. Particular
attention is paid to one of the object-oriented techniques although it is concluded that these
techniques are not feasible for the chosen implementation platform. Studies of a number of
existing operating systems are reported. The choices for various aspects of operating system
design and their influence on the IPC mechanism to be used are elucidated. The actual design
choices are related to the real-time requirements and the implementation that has been
adopted is described. [Continues.
Parallel process placement
This thesis investigates methods of automatic allocation of processes to available processors in a given network configuration. The research described covers the investigation of various algorithms for optimal process allocation. Among those researched were an algorithm which used a branch and bound technique, an algorithm based on graph theory, and an heuristic algorithm involving cluster analysis. These have been implemented and tested in conjunction with the gathering of performance statistics during program execution, for use in improving subsequent allocations. The system has been implemented on a network of loosely-coupled microcomputers using multi-port serial communication links to simulate a transputer network. The concurrent programming language occam has been implemented, replacing the explicit process allocation constructs with an automatic placement algorithm. This enables the source code to be completely separated from hardware consideration
- …