Learning to Plan Near-Optimal Collision-Free Paths
A new approach to finding near-optimal collision-free paths is presented. The path planner is an implementation of the adaptive error back-propagation algorithm, which learns to plan "good", if not optimal, collision-free paths from human-supervised training samples.
Path planning is formulated as a classification problem in which class labels are uniquely mapped onto the set of maneuverable actions of a robot or vehicle. A multi-scale representational scheme maps physical problem domains onto an arbitrarily chosen fixed-size input layer of an error back-propagation network. The mapping not only reduces the size of the computation domain but also ensures applicability of a trained network over a wide range of problem sizes. Parallel implementation of the neural network path planner on hypercubes or Transputers, based on Parasoft EXPRESS, is simple and efficient. Simulation results for binary terrain navigation indicate that the planner performs effectively in unknown environments in the test cases.
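As a rough illustration of the formulation above (not the authors' network), the sketch below trains a tiny two-layer back-propagation classifier that maps a fixed-size occupancy window to one of four maneuver actions. The window size, action set, encoding, and learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
WINDOW, HIDDEN, ACTIONS = 25, 16, 4   # assumed: 5x5 input window, 4 moves

W1 = rng.normal(0.0, 0.1, (WINDOW, HIDDEN))
W2 = rng.normal(0.0, 0.1, (HIDDEN, ACTIONS))

def forward(x):
    """Two-layer network: tanh hidden layer, softmax over actions."""
    h = np.tanh(x @ W1)
    z = h @ W2
    e = np.exp(z - z.max())
    return h, e / e.sum()

def train_step(x, action, lr=0.1):
    """One back-propagation update on a (window, supervised action) pair."""
    global W1, W2
    h, p = forward(x)
    g = p.copy()
    g[action] -= 1.0                      # softmax cross-entropy gradient
    grad_h = (W2 @ g) * (1.0 - h * h)     # back-propagate through tanh
    W2 -= lr * np.outer(h, g)
    W1 -= lr * np.outer(x, grad_h)

# toy training pair: obstacle cell set, human teacher supervises action 0
window = np.zeros(WINDOW); window[2] = 1.0   # assumed obstacle encoding
for _ in range(50):
    train_step(window, action=0)             # action 0 = "turn left" (assumed)
print(forward(window)[1].argmax())           # -> 0
```

Because the input layer is a fixed size, the same trained weights apply at every scale of the multi-scale representation, which is what makes the scheme reusable across problem sizes.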
Solving large scale linear programming
The interior point method (IPM) is now well established as a competitive technique for solving very large scale linear programming problems. The leading variant is the primal-dual predictor-corrector algorithm due to Mehrotra. The main computational steps of this algorithm are the repeated calculation and solution of a large sparse positive definite system of equations.
We describe an implementation of the predictor-corrector IPM algorithm on MasPar, a massively parallel SIMD computer. At the heart of the implementation is a parallel Cholesky factorization algorithm for sparse matrices. Our implementation uses a new scheme for mapping the matrix onto the processor grid of the MasPar, which results in a more efficient Cholesky factorization than previously suggested schemes.
The IPM implementation uses the parallel unit of the MasPar to speed up the factorization and other computationally intensive parts of the IPM. An important part of this implementation is the judicious division of data and computation between the front-end computer, which runs the main IPM algorithm, and the parallel unit. Performance …
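For concreteness, here is a minimal dense, serial sketch of that core step, scaled down from the sparse parallel setting: each IPM iteration solves the normal equations A D Aᵀ dy = r with a Cholesky factorization, where D = diag(x/s) in the standard primal-dual formulation. This is an illustration of the linear algebra, not the MasPar code.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def normal_equations_step(A, x, s, r):
    """Solve (A diag(x/s) A^T) dy = r via Cholesky factorization."""
    D = x / s                 # positive at an interior point (x, s > 0)
    M = (A * D) @ A.T         # A D A^T: symmetric positive definite
    c, low = cho_factor(M)    # the repeated, expensive step in the IPM
    return cho_solve((c, low), r)

# toy usage with assumed dimensions (3 constraints, 6 variables)
rng = np.random.default_rng(1)
A = rng.normal(size=(3, 6))
x = rng.uniform(0.5, 2.0, 6)
s = rng.uniform(0.5, 2.0, 6)
r = rng.normal(size=3)
dy = normal_equations_step(A, x, s, r)
print(np.allclose((A * (x / s)) @ A.T @ dy, r))  # -> True
```

In the predictor-corrector scheme the same factorization is reused for both the predictor and corrector right-hand sides, which is why the factorization dominates the cost and is the natural target for parallelisation.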
Overview of Swallow --- A Scalable 480-core System for Investigating the Performance and Energy Efficiency of Many-core Applications and Operating Systems
We present Swallow, a scalable many-core architecture, with a current configuration of 480 32-bit processors.
Swallow is an open-source architecture, designed from the ground up to deliver scalable increases in usable computational power, allowing experimentation with many-core applications and the operating systems that support them.
Scalability is enabled by a tile-able system with a low-latency interconnect, featuring an attractive communication-to-computation ratio and a distributed memory configuration.
We analyse the energy, computational, and communication performance of Swallow. The system provides 240 GIPS, with each core consuming 71--193 mW depending on workload. Power consumption per instruction is lower than that of almost all systems of comparable scale.
We also show how a distributed operating system (nOS) allows the easy creation of scalable software to exploit Swallow's potential. Finally, we present two case studies: modelling neurons and the overlay of shared memory on a distributed memory system.
Comment: An open-source release of the Swallow system design and code will follow, and references to these will be added at a later date.
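A quick back-of-envelope check of the quoted figures (an illustrative calculation, not taken from the paper):

```python
# 480 cores at 71-193 mW each, 240 GIPS system-wide (figures from the abstract)
cores = 480
gips = 240e9                      # instructions per second, whole system
for mw in (71, 193):              # per-core power bounds, workload-dependent
    total_w = cores * mw * 1e-3
    print(f"{mw} mW/core -> {total_w:.0f} W total, "
          f"{total_w / gips * 1e9:.2f} nJ per instruction")
# -> roughly 0.14-0.39 nJ per instruction under these assumptions
```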
Analyzing communication flow and process placement in Linda programs on transputers
With the evolution of parallel and distributed systems, users from diverse disciplines have looked to these systems as a solution to their ever-increasing needs for computer processing resources. Because parallel processing systems currently require a high level of expertise to program, many researchers are investing effort in developing programming approaches that hide some of the difficulties of parallel programming from users. Linda is one such parallel paradigm: it is intuitive to use and provides a high level of decoupling between the distributable components of parallel programs. In Linda, efficiency becomes a concern of the implementation rather than of the programmer. There is a substantial overhead in implementing Linda, an inherently shared memory model, on a distributed system. This thesis describes a compile-time analysis of tuple space interactions that reduces run-time matching costs and permits the distribution of the tuple space data. A language-independent module that partitions the tuple space data and suggests appropriate storage schemes for the partitions, so as to optimise Linda operations, is presented. The thesis also discusses hiding the network topology from the user by automatically allocating Linda processes and tuple space partitions to nodes in the network of transputers. This is done using a fast placement algorithm developed for Linda.
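To make the model concrete, here is a toy single-process sketch of Linda's tuple-space primitives (the out, in, and rd operations whose matching costs and distribution the thesis analyses). Blocking semantics and distribution are deliberately elided, and all names are illustrative.

```python
class TupleSpace:
    """Toy associative tuple space; None fields act as formals (wildcards)."""
    def __init__(self):
        self.tuples = []

    def out(self, *tup):
        """Deposit a tuple into the space."""
        self.tuples.append(tup)

    def _match(self, pattern, tup):
        # a formal (None) matches anything; actuals must be equal
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup))

    def rd(self, *pattern):
        """Read a matching tuple without removing it (non-blocking toy)."""
        return next((t for t in self.tuples if self._match(pattern, t)), None)

    def in_(self, *pattern):
        """Withdraw a matching tuple ('in' is a Python keyword, hence in_)."""
        t = self.rd(*pattern)
        if t is not None:
            self.tuples.remove(t)
        return t

ts = TupleSpace()
ts.out("task", 7)
print(ts.in_("task", None))   # -> ('task', 7)
```

The linear scan in rd is exactly the run-time matching cost that compile-time analysis of tuple-space interactions aims to reduce, by partitioning tuples into groups that can be stored and searched more efficiently.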
Molecular dynamics simulation on a parallel computer.
For the purpose of molecular dynamics simulations of large biopolymers, we have built a parallel computer with a systolic loop architecture, based on Transputers as computational units, and have programmed it in Occam II. The computational nodes of the computer are linked together in a systolic ring. With this topology, the program's computational throughput for large biopolymers increases nearly linearly with the number of computational nodes. The program developed is closely related to the simulation programs CHARMM and XPLOR; the input files required (force field, protein structure file, coordinates) and output files generated (sets of atomic coordinates representing dynamic trajectories and energies) are compatible with the corresponding files of those programs. Benchmark results for simulations of biopolymers comprising 66, 568, 3,634, 5,797 and 12,637 atoms are compared with XPLOR simulations on conventional computers (Cray, Convex, Vax). These results demonstrate that the software and hardware developed provide extremely cost-effective biopolymer simulations. We also present a simulation (equilibration of the X-ray structure) of the complete photosynthetic reaction center of Rhodopseudomonas viridis (12,637 atoms). The simulation accounts for the Coulomb forces exactly, i.e. no cut-off was assumed.
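The systolic-loop pattern can be sketched serially as follows (an illustrative reconstruction, not the Occam code): each node owns a block of atoms, travelling copies of the blocks shift around the ring, and after P-1 shifts every node has accumulated exact Coulomb forces from all atoms, with no cut-off. Block sizes and charges are assumptions.

```python
import numpy as np

P, NA = 4, 8                               # assumed: 4 ring nodes, 8 atoms each
rng = np.random.default_rng(1)
pos = [rng.normal(size=(NA, 3)) for _ in range(P)]     # per-node coordinates
chg = [rng.choice([-1.0, 1.0], NA) for _ in range(P)]  # per-node charges
frc = [np.zeros((NA, 3)) for _ in range(P)]

def coulomb(pa, qa, pb, qb, same_block):
    """Exact Coulomb force of block b on block a (units with k = 1)."""
    d = pa[:, None, :] - pb[None, :, :]    # pairwise displacement vectors
    r2 = (d * d).sum(-1)
    if same_block:
        np.fill_diagonal(r2, np.inf)       # exclude self-interaction
    return ((qa[:, None] * qb[None, :]) / r2**1.5)[..., None] * d

for n in range(P):                         # intra-block forces, computed locally
    frc[n] += coulomb(pos[n], chg[n], pos[n], chg[n], True).sum(axis=1)

trav_p, trav_q = list(pos), list(chg)      # travelling copies of each block
for _ in range(P - 1):                     # P-1 shifts around the systolic ring
    trav_p = trav_p[-1:] + trav_p[:-1]     # pass blocks to the next node
    trav_q = trav_q[-1:] + trav_q[:-1]
    for n in range(P):                     # every node meets one new block
        frc[n] += coulomb(pos[n], chg[n],
                          trav_p[n], trav_q[n], False).sum(axis=1)
```

Each shift keeps every node busy on a new block of coordinates, which is why throughput scales nearly linearly with the number of nodes in the ring.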
A strategy for mapping unstructured mesh computational mechanics programs onto distributed memory parallel architectures
The motivation of this thesis was to develop strategies that would enable unstructured mesh based computational mechanics codes to exploit the computational advantages offered by distributed memory parallel processors. Strategies that successfully map structured mesh codes onto parallel machines have been developed over the previous decade and used to build a toolkit for automation of the parallelisation process. Extension of the capabilities of this toolkit to include unstructured mesh codes requires new strategies to be developed.
This thesis examines the method of parallelisation by geometric domain decomposition using the single-program multiple-data (SPMD) programming paradigm with explicit message passing. This technique involves splitting (decomposing) the problem definition into P parts that may be distributed over the P processors of a parallel machine. Each processor runs the same program and operates only on its part of the problem. Messages passed between the processors allow data exchange to maintain consistency with the original algorithm.
The strategies developed to parallelise unstructured mesh codes should meet a number of requirements:
The algorithms are faithfully reproduced in parallel.
The code is largely unaltered in the parallel version.
The parallel efficiency is maximised.
The techniques should scale to highly parallel systems.
The parallelisation process should become automated.
Techniques and strategies that meet these requirements are developed and tested in this dissertation using a state-of-the-art integrated computational fluid dynamics and solid mechanics code. The results presented demonstrate the importance of the problem partition in defining inter-processor communication and hence parallel performance.
The classical measure of partition quality based on the number of cut edges in the mesh partition can be inadequate for real parallel machines. Taking the topology of the parallel machine into account in the mesh partition is demonstrated to be a more significant factor for the achieved parallel efficiency than the number of cut edges. It is shown to be advantageous to allow an increase in the volume of communication in order to achieve an efficient mapping dominated by localised communications. The limitation to parallel performance resulting from communication startup latency is clearly revealed, together with strategies to minimise its effect.
The generic application of the techniques to other unstructured mesh codes is discussed in the context of automating the parallelisation process. Automation based on the developed strategies is shown to be possible through the use of run-time inspector loops that accurately determine the dependencies defining the necessary inter-processor communication.
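A minimal mpi4py sketch of the SPMD, explicit message-passing pattern the thesis develops: each process owns one partition and exchanges halo (overlap) values with its neighbours so the parallel sweep reproduces the serial algorithm. The 1-D chain of partitions and the Jacobi-style update are illustrative stand-ins for the unstructured-mesh codes discussed above.

```python
# Run with: mpiexec -n 4 python halo.py   (file name is illustrative)
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_local = 100                           # cells owned by this process (assumed)
u = np.full(n_local + 2, rank, float)   # +2 halo cells, one per neighbour
left = rank - 1 if rank > 0 else MPI.PROC_NULL
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

for it in range(10):
    # localised communication: exchange boundary cells with neighbours only
    comm.Sendrecv(u[1:2], dest=left, recvbuf=u[-1:], source=right)
    comm.Sendrecv(u[-2:-1], dest=right, recvbuf=u[0:1], source=left)
    # update owned cells exactly as the serial algorithm would
    u[1:-1] = 0.5 * (u[:-2] + u[2:])
```

Note how both exchanges in each iteration are with immediate neighbours: a topology-aware partition keeps these messages on short network paths, which is the localised-communication effect the results above identify as more important than the raw cut-edge count.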
- …