Parallel RAM from Cyclic Circuits

Abstract

Known simulations of random access machines (RAMs) or parallel RAMs (PRAMs) by Boolean circuits incur significant polynomial blowup, due to the need to repeatedly simulate accesses to a large main memory. Consider two modifications to Boolean circuits: (1) remove the restriction that circuit graphs are acyclic and (2) enhance AND gates such that they output zero eagerly. If an AND gate has a zero input, it 'short circuits' and outputs zero without waiting for its second input. We call this the cyclic circuit model. Note, circuits in this model remain combinational, as they do not allow wire values to change over time. We simulate a bounded-word-size PRAM via a cyclic circuit, and the blowup from the simulation is only polylogarithmic. Consider a PRAM program $P$ that on a length $n$ input uses an arbitrary number of processors to manipulate words of size $\Theta(\log n)$ bits and then halts within $W(n)$ work. We construct a size-$O(W(n) \cdot \log^4 n)$ cyclic circuit that simulates $P$. Suppose that on a particular input, $P$ halts in time $T$; our circuit computes the same output within $T \cdot O(\log^3 n)$ gate delay. This implies theoretical feasibility of powerful parallel machines. Cyclic circuits can be implemented in hardware, and our circuit achieves performance within polylog factors of PRAM. Our simulated PRAM synchronizes processors by simply leveraging logical dependencies between wires.
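To make the cyclic circuit model concrete, the following is a minimal Python sketch of the two modifications described above: a circuit graph that contains a cycle, and AND gates that output zero as soon as one input is zero. It is a toy illustration only and is not the paper's PRAM construction. All names (`eager_and`, `evaluate`, `run`, the wire labels) and the choice of functions f(v) = NOT v and g(v) = v AND c are assumptions made for the example. The circuit uses one copy of f and one copy of g, wired in a loop, and a select bit `s` decides whether the output is g(f(x)) or f(g(x)).

```python
from typing import Dict, Optional, Tuple

Bit = Optional[int]  # None means "this wire has not been determined yet"


def eager_and(a: Bit, b: Bit) -> Bit:
    """AND that short-circuits: a single 0 input decides the output."""
    if a == 0 or b == 0:
        return 0
    if a is None or b is None:
        return None  # no zero seen and at least one input still unknown
    return 1


def xor(a: Bit, b: Bit) -> Bit:
    """XOR needs both inputs before it can produce an output."""
    if a is None or b is None:
        return None
    return a ^ b


def evaluate(gates: Dict[str, Tuple[str, str, str]],
             inputs: Dict[str, int]) -> Dict[str, Bit]:
    """Set each wire at most once, repeating until nothing new resolves."""
    wires: Dict[str, Bit] = dict(inputs)
    changed = True
    while changed:
        changed = False
        for out, (op, a, b) in gates.items():
            if wires.get(out) is not None:
                continue  # a wire never changes once it has a value
            va, vb = wires.get(a), wires.get(b)
            value = eager_and(va, vb) if op == "AND" else xor(va, vb)
            if value is not None:
                wires[out] = value
                changed = True
    return wires


# Hypothetical cyclic circuit: f_in depends on g_out and g_in depends on f_out.
GATES = {
    "ns":    ("XOR", "s", "one"),      # NOT s
    "t1":    ("AND", "s", "x"),
    "t2":    ("AND", "ns", "g_out"),
    "f_in":  ("XOR", "t1", "t2"),      # x if s else g_out
    "f_out": ("XOR", "f_in", "one"),   # f(v) = NOT v
    "t3":    ("AND", "s", "f_out"),
    "t4":    ("AND", "ns", "x"),
    "g_in":  ("XOR", "t3", "t4"),      # f_out if s else x
    "g_out": ("AND", "g_in", "c"),     # g(v) = v AND c
    "t5":    ("AND", "s", "g_out"),
    "t6":    ("AND", "ns", "f_out"),
    "out":   ("XOR", "t5", "t6"),      # g_out if s else f_out
}


def run(s: int, x: int, c: int) -> Bit:
    """Evaluate the toy circuit; s=1 yields g(f(x)), s=0 yields f(g(x))."""
    return evaluate(GATES, {"s": s, "x": x, "c": c, "one": 1})["out"]
```

In this sketch the cycle is harmless because the eager AND gates cut whichever feedback path the select bit disables, so every wire resolves exactly once. With ordinary AND gates that wait for both inputs, `t2` and `t3` would each wait on the other side of the loop and the evaluation would stall.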
