151 research outputs found

    A Practical Hierarchical Model of Parallel Computation: The Model

    We introduce a model of parallel computation that retains the ideal properties of the PRAM by using it as a sub-model, while simultaneously being more reflective of realistic parallel architectures by accounting for and providing abstract control over communication and synchronization costs. The Hierarchical PRAM (H-PRAM) model controls conceptual complexity in the face of asynchrony in two ways. First, it provides the simplifying assumption of synchronization for the design of algorithms while allowing the algorithms to work asynchronously with each other, and it organizes this control asynchrony via an implicit hierarchy relation. Second, it allows communication asynchrony to be restricted in order to obtain determinate algorithms (thus greatly simplifying proofs of correctness). It is shown that the model is reflective of a variety of existing and proposed parallel architectures, particularly ones that can support massive parallelism. Relationships to programming languages are discussed. Since the PRAM is a sub-model, we can use PRAM algorithms as sub-algorithms in algorithms for the H-PRAM; thus results that have been established with respect to the PRAM are potentially transferable to this new model. The H-PRAM can be used as a flexible tool to investigate general degrees of locality (“neighborhoods of activity”) in problems, considering communication and synchronization simultaneously. This gives the potential of obtaining algorithms that map more efficiently to architectures, and of increasing the number of processors that can efficiently be used on a problem (in comparison to a PRAM that charges for communication and synchronization). The model presents a framework in which to study the extent to which general locality can be exploited in parallel computing. A companion paper demonstrates the usage of the H-PRAM via the design and analysis of various algorithms for computing the complete binary tree and the FFT/butterfly graph.

    Shared memory with hidden latency on a family of mesh-like networks


    Aspects of practical implementations of PRAM algorithms

    The PRAM is a shared memory model of parallel computation which abstracts away from inessential engineering details. It provides a very simple, architecture-independent model and a good programming environment. Theoreticians of the computer science community have proved that it is possible to emulate the theoretical PRAM model using current technology. Solutions have been found for effectively interconnecting processing elements, for routing data on these networks and for distributing the data among memory modules without hotspots. This thesis reviews this emulation and the possibilities it provides for large scale general purpose parallel computation. The emulation employs a bridging model which acts as an interface between the actual hardware and the PRAM model. We review the evidence that such a scheme can achieve scalable parallel performance and portable parallel software and that PRAM algorithms can be optimally implemented on such practical models. In the course of this review we present the following new results: 1. Concerning parallel approximation algorithms, we describe an NC algorithm for finding an approximation to a minimum weight perfect matching in a complete weighted graph. The algorithm is conceptually very simple and it is also the first NC-approximation algorithm for the task with a sub-linear performance ratio. 2. Concerning graph embedding, we describe dense edge-disjoint embeddings of the complete binary tree with n leaves in the following n-node communication networks: the hypercube, the de Bruijn and shuffle-exchange networks and the 2-dimensional mesh. In the embeddings the maximum distance from a leaf to the root of the tree is asymptotically optimally short. The embeddings facilitate efficient implementation of many PRAM algorithms on networks employing these graphs as interconnection networks. 3. Concerning bulk synchronous algorithmics, we describe scalable transportable algorithms for the following three commonly required types of computation: balanced tree computations, Fast Fourier Transforms and matrix multiplications.

    Efficient Execution of Sequential Instructions Streams by Physical Machines

    Any computational model which relies on a physical system is likely to be subject to the fact that information density and speed have intrinsic, ultimate limits. The RAM model, and in particular the underlying assumption that memory accesses can be carried out in time independent of memory size, is not physically implementable. This work has developed in the field of limiting technology machines, in which it is somewhat provocatively assumed that technology has achieved the physical limits. The ultimate goal is to tackle the problem of the intrinsic latencies of physical systems by encouraging scalable organizations for processors and memories. An algorithmic study is presented, which depicts the implementation of high concurrency programs for SP and SPE, sequential machine models able to compute direct-flow programs in optimal time. Then, a novel pipelined, hierarchical memory organization is presented, with optimal latency and bandwidth for a physical system. In order to both take full advantage of the memory capabilities and exploit the available instruction level parallelism of the code to be executed, a novel processor model is developed. Particular care is taken in devising an efficient information flow within the processor itself. Both designs are extremely scalable, as they are based on fixed capacity and fixed size nodes, which are connected as a multidimensional array. Performance analysis of the resulting machine design has led to the discovery that latencies internal to the processor can be the dominating source of complexity in instruction flow execution, which adds to the effects of processor-memory interaction. A characterization of instruction flows is then developed, which is based on the topology induced by instruction dependences.

    Algorithmic Motion Planning and Related Geometric Problems on Parallel Machines (Dissertation Proposal)

    The problem of algorithmic motion planning is one that has received considerable attention in recent years. The automatic planning of motion for a mobile object moving amongst obstacles is a fundamentally important problem with numerous applications in computer graphics and robotics. Numerous approximate techniques (AI-based, heuristics-based, potential field methods, for example) for motion planning have long been in existence, and have resulted in the design of experimental systems that work reasonably well under various special conditions [7, 29, 30]. Our interest in this problem, however, is in the use of algorithmic techniques for motion planning, with provable worst case performance guarantees. The study of algorithmic motion planning has been spurred by recent research that has established the mathematical depth of motion planning. Classical geometry, algebra, algebraic geometry and combinatorics are some of the fields of mathematics that have been used to prove various results that have provided better insight into the issues involved in motion planning [49]. In particular, the design and analysis of geometric algorithms has proved to be very useful for numerous important special cases. In the remainder of this proposal we will replace the more precise term algorithmic motion planning with simply motion planning.

    Fast Parallel Algorithms for Basic Problems

    Parallel processing is one of the most active research areas these days. We are interested in one aspect of parallel processing, i.e. the design and analysis of parallel algorithms. Here, we focus on non-numerical parallel algorithms for basic combinatorial problems, such as data structures, selection, searching, merging and sorting. The purposes of studying these types of problems are to obtain basic building blocks which will be useful in solving complex problems, and to develop fundamental algorithmic techniques. In this thesis, we study the following problems: priority queues, multiple search and multiple selection, and reconstruction of a binary tree from its traversals. The research on priority queues was motivated by their various applications. The purpose of studying multiple search and multiple selection is to explore the relationships between four of the most fundamental problems in algorithm design, that is, selection, searching, merging and sorting; our parallel solutions can also be used as subroutines in algorithms for other problems. The research on the last problem, reconstruction of a binary tree from its traversals, was stimulated by a challenge proposed in a recent paper by Berkman et al. (Highly Parallelizable Problems, STOC 89) to design doubly logarithmic time optimal parallel algorithms, because a remarkably small number of such parallel algorithms exist.

    A 3D measurement and computerized meshing study to promote bus ridership among people using powered mobility aids

    People who use powered mobility aids such as wheelchairs and scooters need and want to use public transport. Buses are the most affordable and efficient form of public transport, capable of connecting people across local communities. However, with curbside rather than platform boarding and internal space limitations, buses also present many accessibility challenges for people using mobility aids during ingress, egress, and interior maneuvering. In Australia, people using mobility aids board low floor buses, which are required to comply with the national bus accessibility standard, using the front doors. A new standard was recently created to provide a Blue Label identification for powered mobility aids suitable to access public transport. The accuracy of this standard in identifying mobility aids suitable for use on buses has not been verified. This research used a world-first methodology that included 3-dimensional (3D) scanning of 35 mobility aids and 21 buses. The resulting 735 scan combinations were efficiently meshed using MeshLab, an open-source software package. The research demonstrated that (i) although none of the buses were compliant with the relevant standard in 3D, many could still facilitate the boarding of a variety of mobility aids, and (ii) the Blue Label, while a valuable guide, did not accurately identify all mobility aids that would and would not be able to board buses. This research has shortlisted nine mobility aids that can be recommended to consumers as being able to fit all the full-size buses tested. The dimensions of mobility aids that appear to enable access on most buses were also identified for consumers to consider when purchasing a mobility aid. The novel 3D meshing methodology used in this research also revealed that most collision points between mobility aids and buses occur in the curved-corridor entry of the buses. To minimize this entry problem, future bus boarding designs should consider the option of double-door entry/exit in the middle of the bus, which is common in many other countries. Adoption of this strategy would mitigate some of the challenges that people using mobility aids encounter when accessing buses, thereby increasing public transport ridership among this group. Copyright © 2020 Unsworth, Chua and Gudimetla.