77 research outputs found

    Towards a portable occam

    Occam is designed for concurrent programming on a network of transputers. Allocation and partitioning of the program are specified within the source code, binding the program to a specific network. An alternative approach is proposed which completely separates the source code from hardware considerations. Static allocation is performed as a separate phase and should, ideally, be automatic but at present is manual. Complete hardware abstraction requires that non-local, shared communication be provided for, introducing an efficiency overhead which can be minimised by the allocation. The proposal was implemented on a network of IBM PCs, modelled on a transputer network, and implementation issues are discussed.

    Simulation and analysis of adaptive routing and flow control in wide area communication networks

    This thesis presents the development of new simulation and analytic models for the performance analysis of wide area communication networks. The models are used to analyse adaptive routing and flow control in fully connected circuit switched and sparsely connected packet switched networks. In particular, the performance of routing algorithms derived from the L_R-I linear learning automata model is assessed for both types of network. A novel architecture using the INMOS Transputer is constructed for simulation of both circuit and packet switched networks in a loosely coupled multi-microprocessor environment. The network topology is mapped onto an identically configured array of processing centres to overcome the processing bottleneck of conventional Von Neumann architecture machines. Previous analytic work on circuit switched networks is extended to include both asymmetrical networks and adaptive routing policies. In the analysis of packet switched networks, analytic models of adaptive routing and flow control are integrated to produce a powerful environment for performance analysis. The work concludes that routing algorithms based on linear learning automata have significant potential in both fully connected circuit switched networks and sparsely connected packet switched networks.

    Aspects of parallel processing and control engineering

    The concept of parallel processing is not a new one, but the application of it to control engineering tasks is a relatively recent development, made possible by contemporary hardware and software innovation. It has long been accepted that, if properly orchestrated, several processors/CPUs when combined can form a powerful processing entity. What prevented this from being implemented in commercial systems was the adequacy of the microprocessor for most tasks; hence the expense of a multi-processor system was not justified. With the advent of high demand systems, such as highly fault tolerant flight controllers and fast robotic controllers, parallel processing became a viable option. Nonetheless, the software interfacing of control laws onto parallel systems has remained somewhat of an impasse. There are no software compilers at present which allow a programmer to specify a control law in pure mathematical terminology and then decompose it into a flow diagram of concurrent processes which may then be implemented on, say, a target Transputer system. There are several parallel programming languages with which a programmer can generate parallel processes but, generally, in order to realise a control algorithm in parallel the programmer must have intimate knowledge of the algorithm. Therefore, efficiency is based on the ability of the programmer to recognise inherent parallelism. Some attempts are being made to create intelligent partition and scheduling compilers but this usually means significant extra overheads on the multiprocessor system. In the absence of an automated technique, control algorithms must be decomposed by inspection. The research presented in this thesis is founded upon the application of both parallel and pipelining techniques to particular control strategies.
Parallelism is tackled objectively and, by creating a tailored terminology, it is defined mathematically; consequently, related concepts, such as bounded parallelism and algorithm speedup, are also quantified in a numerical sense. A pipelined explicit Self Tuning Regulator (STR) controller is developed and tested on systems of different order. Under the governance of the parallelism terminology, the effectiveness of the parallel STR is evaluated and numerically quantified in terms of relevant performance indices. A parallel simulator is presented for the Puma 560 robotic manipulator. By exploiting parallelism and pipelinability in the robot model, a significant increase in execution speed is achieved over the sequential model. The use of Transputers is examined and graphical results obtained for several performance indices, including speedup, processor efficiency and bounded parallelism. By the same analytical technique, a parallel computed torque feedforward controller incorporating proportional derivative feedback control for the Puma 560 manipulator is developed and appraised. The performance of a Transputer system in hosting the controller is graphically analysed and, as in the case of the parallel simulator, the more important performance indices are examined under both optimal conditions and conditions of varying hardware constraints.

    Hardware and software aspects of parallel computing

    Part 1 (Chapters 2, 3 and 4) is concerned with the development of hardware for multiprocessor systems. Some of the concepts used in digital hardware design are introduced in Chapter 2. These include the fundamentals of digital electronics such as logic gates and flip-flops as well as the more complicated topics of ROM and programmable logic. It is often desirable to change the network topology of a multiprocessor machine to suit a particular application. The third chapter describes a circuit switching scheme that allows the user to alter the network topology prior to computation. To achieve this, crossbar switches are connected to the nodes, and the host processor (a PC) programs the crossbar switches to make the desired connections between the nodes. The hardware and software required for this system are described in detail. Whilst this design allows the topology of a multiprocessor system to be altered prior to computation, the topology is still fixed during program run-time. Chapter 4 presents a system that allows the topology to be altered during run-time. The nodes send connection requests to a control processor which programs a crossbar switch connected to the nodes. This system allows every node in a parallel computer to communicate directly with every other node. The hardware interface between the nodes and the control processor is discussed in detail, and the software on the control processor is also described. Part 2 (Chapters 5 and 6) of this thesis is concerned with the parallelisation of a large molecular mechanics program. Chapter 5 describes the fundamentals of molecular mechanics such as the steric energy equation and its components, force field parameterisation and energy minimisation. The implementation of a novel programming (COMFORT) and hardware (the BB08) environment into a parallel molecular mechanics (MM) program is presented in Chapter 6.
The structure of the sequential version of the MM program is detailed, before discussing the implementation of the parallel version using COMFORT and the BB08.

    Reducing Communication Delay Variability for a Group of Robots

    A novel architecture is presented for reducing communication delay variability for a group of robots. This architecture relies on three components: a microprocessor architecture that allows deterministic real-time tasks; an event-based communication protocol in which nodes transmit in a TDMA fashion, without the need for global clock synchronization techniques; and a novel communication scheme that enables deterministic communications by allowing senders to transmit without regard for the state of the medium or coordination with other senders, while receivers can tease apart messages sent simultaneously with a high probability of success. Compared to others, this approach allows simultaneous communications without regard for the state of the transmission medium, allows deterministic communications, and enables ordered communications that can be applied in a team of robots. Simulations and experimental results are also included.

    Design and resource management of reconfigurable multiprocessors for data-parallel applications

    FPGA (Field-Programmable Gate Array)-based custom reconfigurable computing machines have established themselves as low-cost and low-risk alternatives to ASIC (Application-Specific Integrated Circuit) implementations and general-purpose microprocessors in accelerating a wide range of computation-intensive applications. Most often they are Application Specific Programmable Circuits (ASPCs), which are developer programmable instead of user programmable. The major disadvantages of ASPCs are minimal programmability, and significant time and energy overheads caused by required hardware reconfiguration when the problem size exceeds the available reconfigurable resources; these problems are expected to become more serious with increases in the FPGA chip size. On the other hand, dominant high-performance computing systems, such as PC clusters and SMPs (Symmetric Multiprocessors), suffer from high communication latencies and/or scalability problems. This research introduces low-cost, user-programmable and reconfigurable MultiProcessor-on-a-Programmable-Chip (MPoPC) systems for high-performance, low-cost computing. It also proposes a relevant resource management framework that deals with performance, power consumption and energy issues. These semi-customized systems significantly reduce run-time device reconfiguration by employing user-programmable processing elements that are reusable for different tasks in large, complex applications. For the sake of illustration, two different types of MPoPCs with hardware FPUs (floating-point units) are designed and implemented for credible performance evaluation and modeling: the coarse-grain MIMD (Multiple-Instruction, Multiple-Data) CG-MPoPC machine based on a processor IP (Intellectual Property) core and the mixed-mode (MIMD, SIMD or M-SIMD) variant-grain HERA (HEterogeneous Reconfigurable Architecture) machine.
In addition to alleviating the above difficulties, MPoPCs can offer several performance and energy advantages to our data-parallel applications when compared to ASPCs; they are simpler and more scalable, and require less verification time and cost. Various common computation-intensive benchmark algorithms, such as matrix-matrix multiplication (MMM) and LU factorization, are studied and their parallel solutions are shown for the two MPoPCs. The performance is evaluated with large sparse real-world matrices primarily from power engineering. We expect even further performance gains on MPoPCs in the near future by employing ever-improving FPGAs. The innovative nature of this work has the potential to guide research in this emerging field of high-performance, low-cost reconfigurable computing. The largest advantage of reconfigurable logic lies in its large degree of hardware customization and reconfiguration, which allows reusing the resources to match the computation and communication needs of applications. Therefore, a major effort in the presented design methodology for mixed-mode MPoPCs, like HERA, is devoted to effective resource management. A two-phase approach is applied. A mixed-mode weighted Task Flow Graph (w-TFG) is first constructed for any given application, where tasks are classified according to their most appropriate computing mode (e.g., SIMD or MIMD). At compile time, an architecture is customized and synthesized for the TFG using an Integer Linear Programming (ILP) formulation and a parameterized hardware component library. Various run-time scheduling schemes with different performance-energy objectives are proposed. A system-level energy model for HERA, which is based on low-level implementation data and run-time statistics, is proposed to guide performance-energy trade-off decisions. A parallel power flow analysis technique based on Newton's method is proposed and employed to verify the methodology.