Search CORE

230 research outputs found

Shared memory with hidden latency on a family of mesh-like networks

Author: Harris Tim J.
Publication venue: The University of Edinburgh
Publication date: 01/01/1995
Field of study

Constant-Time Algorithms for Minimum Spanning Tree and Related Problems on Processor Array with Reconfigurable Bus Systems

Author: Pan Tien-Tai
Publication venue: Oxford University Press
Publication date
Field of study

[[abstract]]A processor array with a reconfigurable bus system is a parallel computation model that consists of a processor array and a reconfigurable bus system. In this paper, a constant-time algorithm is proposed on this model for finding the cycles in an undirected graph. We can use this algorithm to decide whether a specified edge belongs to the minimum spanning tree of the graph or not. This cycle-finding algorithm is designed on a two-dimensional

n\times n

processor array with a reconfigurable bus system, where

n

is the number of vertices in the graph. Based on this cycle-finding algorithm, the minimum spanning tree problem and the spanning tree problem can be solved in O(1) time by using fewer processors than before, O(

n\times m\times n

) and O(

n^3

) processors respectively. This is a substantial improvement over previous known results. Moreover, we also propose two constant-time algorithms for solving the minimum spanning tree verification problem and spanning tree verification problem by using O(

n^3

) and O(

n^2

) processors, respectively.

National Taiwan Normal University Repository

Aspects of practical implementations of PRAM algorithms

Author: Ravindran Somasundaram
Publication venue
Publication date
Field of study

The PRAM is a shared memory model of parallel computation which abstracts away from inessential engineering details. It provides a very simple architecture independent model and provides a good programming environment. Theoreticians of the computer science community have proved that it is possible to emulate the theoretical PRAM model using current technology. Solutions have been found for effectively interconnecting processing elements, for routing data on these networks and for distributing the data among memory modules without hotspots. This thesis reviews this emulation and the possibilities it provides for large scale general purpose parallel computation. The emulation employs a bridging model which acts as an interface between the actual hardware and the PRAM model. We review the evidence that such a scheme crn achieve scalable parallel performance and portable parallel software and that PRAM algorithms can be optimally implemented on such practical models. In the course of this review we presented the following new results: 1. Concerning parallel approximation algorithms, we describe an NC algorithm for finding an approximation to a minimum weight perfect matching in a complete weighted graph. The algorithm is conceptually very simple and it is also the first NC-approximation algorithm for the task with a sub-linear performance ratio. 2. Concerning graph embedding, we describe dense edge-disjoint embeddings of the complete binary tree with n leaves in the following n-node communication networks: the hypercube, the de Bruijn and shuffle-exchange networks and the 2-dimcnsional mesh. In the embeddings the maximum distance from a leaf to the root of the tree is asymptotically optimally short. The embeddings facilitate efficient implementation of many PRAM algorithms on networks employing these graphs as interconnection networks. 3. Concerning bulk synchronous algorithmics, we describe scalable transportable algorithms for the following three commonly required types of computation; balanced tree computations. Fast Fourier Transforms and matrix multiplications

Warwick Research Archives Portal Repository

Progress Report : 1991 - 1994

Author
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1994
Field of study

MPG.PuRe

Total Eclipse - an Efficient Architectural Realization of the Parallel Random Access Machine

Author: Martti Forsell
Publication venue: 'IntechOpen'
Publication date: 01/01/2010
Field of study

IntechOpen

VTT Research System

Scaling Simulations of Reconfigurable Meshes.

Author: Fernandez zepeda Jose Alberto
Publication venue: LSU Digital Commons
Publication date: 01/01/1999
Field of study

This dissertation deals with reconfigurable bus-based models, a new type of parallel machine that uses dynamically alterable connections between processors to allow efficient communication and to perform fast computations. We focus this work on the Reconfigurable Mesh (R-Mesh), one of the most widely studied reconfigurable models. We study the ability of the R-Mesh to adapt an algorithm instance of an arbitrary size to run on a given smaller model size without significant loss of efficiency. A scaling simulation achieves this adaptation, and the simulation overhead expresses the efficiency of the simulation. We construct a scaling simulation for the Fusing-Restricted Reconfigurable Mesh (FR-Mesh), an important restriction of the R-Mesh. The overhead of this simulation depends only on the simulating machine size and not on the simulated machine size. The results of this scaling simulation extend to a variety of concurrent write rules and also translate to an improved scaling simulation of the R-Mesh itself. We present a bus linearization procedure that transforms an arbitrary non-linear bus configuration of an R-Mesh into an equivalent acyclic linear bus configuration implementable on an Linear Reconfigurable Mesh (LR-Mesh), a weaker version of the R-Mesh. This procedure gives the algorithm designer the liberty of using buses of arbitrary shape, while automatically translating the algorithm to run on a simpler platform. We illustrate our bus linearization method through two important applications. The first leads to a faster scaling simulation of the R-Mesh. The second application adapts algorithms designed for R-Meshes to run on models with pipelined optical buses. We also present a simulation of a Directional Reconfigurable Mesh (DR-Mesh) on an LR-Mesh. This simulation has a much better efficiency compared to previous work. In addition to the LR-Mesh, this simulation also runs on models that use pipelined optical buses

Louisiana State University

A Practical Hierarchial Model of Parallel Computation: The Model

Author: Heywood Todd
Ranka Sanjay
Publication venue: SURFACE at Syracuse University
Publication date: 01/02/1991
Field of study

We introduce a model of parallel computation that retains the ideal properties of the PRAM by using it as a sub-model, while simultaneously being more reflective of realistic parallel architectures by accounting for and providing abstract control over communication and synchronization costs. The Hierarchical PRAM (H-PRAM) model controls conceptual complexity in the face of asynchrony in two ways. First, by providing the simplifying assumption of synchronization to the design of algorithms, but allowing the algorithms to work asynchronously with each other; and organizing this control asynchrony via an implicit hierarchy relation. Second, by allowing the restriction of communication asynchrony in order to obtain determinate algorithms (thus greatly simplifying proofs of correctness). It is shown that the model is reflective of a variety of existing and proposed parallel architectures, particularly ones that can support massive parallelism. Relationships to programming languages are discussed. Since the PRAM is a sub-model, we can use PRAM algorithms as sub-algorithms in algorithms for the H-PRAM; thus results that have been established with respect to the PRAM are potentially transferable to this new model. The H-PRAM can be used as a flexible tool to investigate general degrees of locality (“neighborhoods of activity) in problems, considering communication and synchronization simultaneously. This gives the potential of obtaining algorithms that map more efficiently to architectures, and of increasing the number of processors that can efficiently be used on a problem (in comparison to a PRAM that charges for communication and synchronization). The model presents a framework in which to study the extent that general locality can be exploited in parallel computing. A companion paper demonstrates the usage of the H-PRAM via the design and analysis of various algorithms for computing the complete binary tree and the FFT/butterfly graph

Syracuse University Research Facility and Collaborative Environment

High-Performance Bus-Based Architectures - Guest Editorial

Author: Lin Rong
Olariu Stephan
Publication venue: ODU Digital Commons
Publication date: 01/01/1999
Field of study

(First paragrapg) This special issue of VLSI Design presents a collection of seven papers selected out of more than 35 submissions received following the Call for Papers. Each submission was sent to three referees, all of them experts in the area of bus-based architectures. The result is impressive. The papers featured in this Special Issue cover a wide range of topics from sorting to string matching, to load balancing, to simulation, matrix operations, to robotics, to the design of high-performance scalable architectures

Directory of Open Access Journals

Old Dominion University