797 research outputs found

    Predicting Non-linear Cellular Automata Quickly by Decomposing Them into Linear Ones

    Full text link
    We show that a wide variety of non-linear cellular automata (CAs) can be decomposed into a quasidirect product of linear ones. These CAs can be predicted by parallel circuits of depth O(log^2 t) using gates with binary inputs, or O(log t) depth if ``sum mod p'' gates with an unbounded number of inputs are allowed. Thus these CAs can be predicted by (idealized) parallel computers much faster than by explicit simulation, even though they are non-linear. This class includes any CA whose rule, when written as an algebra, is a solvable group. We also show that CAs based on nilpotent groups can be predicted in depth O(log t) or O(1) by circuits with binary or ``sum mod p'' gates respectively. We use these techniques to give an efficient algorithm for a CA rule which, like elementary CA rule 18, has diffusing defects that annihilate in pairs. This can be used to predict the motion of defects in rule 18 in O(log^2 t) parallel time

    Extensible sparse functional arrays with circuit parallelism

    Get PDF
    A longstanding open question in algorithms and data structures is the time and space complexity of pure functional arrays. Imperative arrays provide update and lookup operations that require constant time in the RAM theoretical model, but it is conjectured that there does not exist a RAM algorithm that achieves the same complexity for functional arrays, unless restrictions are placed on the operations. The main result of this paper is an algorithm that does achieve optimal unit time and space complexity for update and lookup on functional arrays. This algorithm does not run on a RAM, but instead it exploits the massive parallelism inherent in digital circuits. The algorithm also provides unit time operations that support storage management, as well as sparse and extensible arrays. The main idea behind the algorithm is to replace a RAM memory by a tree circuit that is more powerful than the RAM yet has the same asymptotic complexity in time (gate delays) and size (number of components). The algorithm uses an array representation that allows elements to be shared between many arrays with only a small constant factor penalty in space and time. This system exemplifies circuit parallelism, which exploits very large numbers of transistors per chip in order to speed up key algorithms. Extensible Sparse Functional Arrays (ESFA) can be used with both functional and imperative programming languages. The system comprises a set of algorithms and a circuit specification, and it has been implemented on a GPGPU with good performance
    • …
    corecore