
5 research outputs found

Parallel RAM from Cyclic Circuits

Author: Heath David
Publication date: 10/09/2023

Known simulations of random access machines (RAMs) or parallel RAMs (PRAMs) by Boolean circuits incur significant polynomial blowup, due to the need to repeatedly simulate accesses to a large main memory. Consider two modifications to Boolean circuits: (1) remove the restriction that circuit graphs are acyclic and (2) enhance AND gates such that they output zero eagerly. If an AND gate has a zero input, it 'short circuits' and outputs zero without waiting for its second input. We call this the cyclic circuit model. Note, circuits in this model remain combinational, as they do not allow wire values to change over time. We simulate a bounded-word-size PRAM via a cyclic circuit, and the blowup from the simulation is only polylogarithmic. Consider a PRAM program $P$ that on a length-$n$ input uses an arbitrary number of processors to manipulate words of size $\Theta(\log n)$ bits and then halts within $W(n)$ work. We construct a size-$O(W(n)\cdot \log^4 n)$ cyclic circuit that simulates $P$. Suppose that on a particular input, $P$ halts in time $T$; our circuit computes the same output within $T \cdot O(\log^3 n)$ gate delay. This implies theoretical feasibility of powerful parallel machines. Cyclic circuits can be implemented in hardware, and our circuit achieves performance within polylog factors of PRAM. Our simulated PRAM synchronizes processors by simply leveraging logical dependencies between wires.
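The eager AND gate described in this abstract can be illustrated with a toy evaluator. This is a minimal sketch, not the paper's construction: it assumes wires take the values 0, 1, or "not yet determined" (`None`), that XOR gates wait for both inputs, and that values are settled by repeated sweeps until nothing changes. The names `eager_and`, `strict_xor`, `evaluate`, and the example circuit `GATES` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Tuple

Wire = Optional[int]          # None means "not yet determined"
Gate = Tuple[str, str, str]   # (operation, left input wire, right input wire)


def eager_and(a: Wire, b: Wire) -> Wire:
    """AND that outputs 0 as soon as either input is known to be 0."""
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return None


def strict_xor(a: Wire, b: Wire) -> Wire:
    """XOR that waits until both inputs are known."""
    if a is None or b is None:
        return None
    return a ^ b


def evaluate(gates: Dict[str, Gate], inputs: Dict[str, int]) -> Dict[str, Wire]:
    """Sweep over the gates until no wire changes. A wire is set at most once,
    so values never change over time (the circuit stays combinational)."""
    values: Dict[str, Wire] = {name: None for name in gates}
    values.update(inputs)
    changed = True
    while changed:
        changed = False
        for out, (op, left, right) in gates.items():
            if values[out] is not None:
                continue
            fn = eager_and if op == "AND" else strict_xor
            result = fn(values[left], values[right])
            if result is not None:
                values[out] = result
                changed = True
    return values


# Hypothetical cyclic circuit: x depends on y and y depends on x, but the
# selector s forces one of the two AND gates to 0, which breaks the cycle.
GATES: Dict[str, Gate] = {
    "ns": ("XOR", "s", "one"),   # ns = NOT s
    "sy": ("AND", "s", "y"),
    "x":  ("XOR", "a", "sy"),    # x = a XOR (s AND y)
    "nx": ("AND", "ns", "x"),
    "y":  ("XOR", "b", "nx"),    # y = b XOR ((NOT s) AND x)
}

if __name__ == "__main__":
    for s in (0, 1):
        out = evaluate(GATES, {"s": s, "a": 1, "b": 1, "one": 1})
        print(f"s={s}: x={out['x']}, y={out['y']}")
```

With `s = 0` the gate `sy` short-circuits to 0, so `x` is determined first and `y` follows; with `s = 1` the gate `nx` short-circuits instead and the dependency runs the other way. In both cases every wire ends up determined even though the circuit graph contains a cycle.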

arXiv.org e-Print Archive

Parallel routing algorithms in Benes-Clos networks.

Author: Soung-Yue Liew
Publication venue: Department of Cultural and Religious Studies, The Chinese University of Hong Kong
Publication date: 01/01/1996

by Soung-Yue Liew.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (leaves 55-57).

Chapter 1 --- Introduction
Chapter 2 --- The Basic Principles of Routing Algorithms
Chapter 2.1 --- The principles of sequential algorithms
Chapter 2.1.1 --- Edge-coloring of bipartite graph with maximum degree two
Chapter 2.1.2 --- Edge-coloring of bipartite graph with maximum degree M
Chapter 2.2 --- Looping algorithm
Chapter 2.2.1 --- Paull's Matrix
Chapter 2.2.2 --- Chain to be rearranged in Paull's Matrix
Chapter 2.3 --- The principles of parallel algorithms
Chapter 2.3.1 --- Edge-coloring of bipartite graph with maximum degree two
Chapter 2.3.2 --- Edge-coloring of bipartite graph with maximum degree 2m
Chapter 3 --- Parallel routing algorithm in Benes-Clos networks
Chapter 3.1 --- Routing properties of Benes networks
Chapter 3.1.1 --- Three-stage structure and routing constraints
Chapter 3.1.2 --- Algebraic interpretation of connection set up problem
Chapter 3.1.3 --- Equivalent classes
Chapter 3.2 --- Parallel routing algorithm
Chapter 3.2.1 --- Basic principles
Chapter 3.2.2 --- Initialization
Chapter 3.2.3 --- Algorithm
Chapter 3.2.4 --- Set up the states and determine π for next stage
Chapter 3.2.5 --- Simulation results
Chapter 3.2.6 --- Time complexity
Chapter 3.3 --- Contention resolution
Chapter 3.4 --- Algorithms applied to Clos network with 2m central switches
Chapter 3.5 --- Parallel algorithms in rearrangeability
Chapter 4 --- Conclusions

CUHK Digital Repository

A self-routing permutation network

Authors: Koppelman David M.; Yavuz Oruç A.
Publication venue: LSU Digital Commons
Publication date: 01/12/1989

A self-routing permutation network is a connector which can set its own switches to realize any one-to-one mapping of its inputs onto its outputs. Many permutation networks have been reported in the literature, but none with the self-routing property, except crossbars and cellular permutation arrays, which have excessive cost. This paper describes a self-routing permutation network which has $O(\log^3 n)$ bit-level delay and uses $O(n \log^3 n)$ bit-level hardware, where $n$ is the number of inputs to the network. The network is derived from a complementary Beneš network by replacing each of its two switches in its first stage by what is called a 1-sorter and recursively defining the switches in the third stage as self-routing networks. The use of 1-sorters results in substantial reduction in both propagation delay and hardware cost when contrasted with $O(n)$ delay and $O(n^{1.59})$ hardware of the recursively decomposed version of a complementary Beneš network. Furthermore, these complexities match the propagation delay and hardware cost of Batcher's sorters (the only networks, other than crossbars and cellular permutation arrays, which are known to behave like self-routing permutation networks). More specifically, it is shown that the network of this paper uses about half of the hardware with about four-thirds of the delay of a Batcher's sorter. © 1990

Louisiana State University