
    Four-Dimensional Homogeneous Systolic Pyramid Automata

    The cellular automaton is well known as a kind of parallel automaton. Cellular automata have been investigated not only from the viewpoint of formal language theory but also from the viewpoint of pattern recognition, and they can be classified into several types. The systolic pyramid automaton is one such parallel model. A homogeneous systolic pyramid automaton with four-dimensional layers (4-HSPA) is a pyramid stack of four-dimensional arrays of cells in which the bottom four-dimensional layer (level 0) has size a^4 (a≥1), the next lowest layer has size (a-1)^4, and so forth, the (a-1)st four-dimensional layer (level a-1) consisting of a single cell, called the root. Each cell is an identical finite-state machine. The input is accepted if and only if the root cell ever enters an accepting state. A 4-HSPA is said to be a real-time 4-HSPA if, for every four-dimensional tape of size a^4 (a≥1), it accepts the tape in time a-1. Moreover, a 1-way four-dimensional cellular automaton (1-4CA) can be considered a natural extension of the 1-way two-dimensional cellular automaton to four dimensions. The initial configuration is accepted if the last special cell reaches a final state. A 1-4CA is said to be a real-time 1-4CA if, when started with a four-dimensional array of cells in a nonquiescent state, the special cell reaches a final state in real time. In this paper, we propose a homogeneous systolic pyramid automaton with four-dimensional layers (4-HSPA) and investigate some properties of real-time 4-HSPAs. Specifically, we first investigate the relationship between the accepting powers of real-time 4-HSPAs and real-time 1-4CAs, and then show the recognizability of four-dimensional connected tapes by real-time 4-HSPAs.
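
    As a rough illustration of the geometry described in this abstract, the short sketch below (an illustration inferred from the layer sizes stated above, not code from the paper) lists the cell count of each level of a four-dimensional pyramid with base edge a: level k holds (a-k)^4 cells, so the root at level a-1 is a single cell and a real-time computation has a-1 steps available.

```python
# Hypothetical sketch: cell counts per level of a 4-HSPA pyramid with base edge a.
# Level k is a four-dimensional array of (a - k)^4 cells; level a - 1 is the root.

def pyramid_layer_sizes(a: int) -> list:
    """Return the number of cells on each level, from the base (level 0) up to the root."""
    return [(a - k) ** 4 for k in range(a)]

if __name__ == "__main__":
    a = 4
    sizes = pyramid_layer_sizes(a)
    print(sizes)       # [256, 81, 16, 1]
    print(sum(sizes))  # total number of cells in the pyramid
    print(a - 1)       # steps available to a real-time 4-HSPA
```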

    Fault tolerance issues in nanoelectronics

    The astonishing success story of microelectronics cannot go on indefinitely. In fact, once devices reach the few-atom scale (nanoelectronics), transient quantum effects are expected to impair their behaviour. Fault-tolerant techniques will then be required. The aim of this thesis is to investigate the problem of transient errors in nanoelectronic devices. Transient error rates are estimated for a selection of nanoelectronic gates based upon quantum cellular automata and single-electron devices, in which the electrostatic interaction between electrons is used to create Boolean circuits. On the basis of these results, various fault-tolerant solutions are proposed for both logic and memory nanochips. For logic chips, traditional techniques are found to be unsuitable. A new technique, in which the voting approach of triple modular redundancy (TMR) is extended by cascading TMR units composed of nanogate clusters, is proposed and generalised to other voting approaches. For memory chips, an error-correcting-code approach is found to be suitable; various codes are considered and a lookup-table approach is proposed for encoding and decoding. We are then able to estimate the redundancy level to be provided on nanochips so as to make their mean time between failures acceptable. It is found that, for logic chips, space redundancies of up to a few tens are required if mean times between failures are to be of the order of a few years. Space redundancy can also be traded for time redundancy. As for memory chips, mean times between failures of the order of a few years are found to imply both space and time redundancies of the order of ten.
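
    As a quick illustration of the voting idea behind TMR, the sketch below (a toy model for illustration, not the thesis's nanogate-cluster scheme; the NAND gate and the per-gate error rate are assumptions) shows how a majority vote over three redundant copies masks any single transient fault, bringing the failure probability from roughly p down to roughly 3p^2.

```python
# Toy model of triple modular redundancy (TMR): three noisy copies of a gate
# feed a majority voter, which masks any single transient fault.
# noisy_nand and error_rate are assumptions for illustration only.

import random

def noisy_nand(a: int, b: int, error_rate: float) -> int:
    """NAND gate whose output flips with probability error_rate (transient-fault model)."""
    out = 1 - (a & b)
    return out ^ 1 if random.random() < error_rate else out

def tmr_nand(a: int, b: int, error_rate: float) -> int:
    """Evaluate three redundant copies of the gate and return the majority vote."""
    votes = [noisy_nand(a, b, error_rate) for _ in range(3)]
    return 1 if sum(votes) >= 2 else 0

if __name__ == "__main__":
    random.seed(0)
    trials, p = 100_000, 0.01
    plain = sum(noisy_nand(1, 1, p) != 0 for _ in range(trials)) / trials
    voted = sum(tmr_nand(1, 1, p) != 0 for _ in range(trials)) / trials
    print(plain, voted)  # single gate fails at ~p; the voted triple fails at ~3p^2
```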

    Report from the MPP Working Group to the NASA Associate Administrator for Space Science and Applications

    NASA's Office of Space Science and Applications (OSSA) gave a select group of scientists the opportunity to test and implement their computational algorithms on the Massively Parallel Processor (MPP) located at Goddard Space Flight Center, beginning in late 1985. One year later, the Working Group presented its report, which addressed the following: algorithms, programming languages, architecture, programming environments, the relation to theory, and measured performance. The findings point to a number of demonstrated computational techniques for which the MPP architecture is ideally suited. For example, besides executing much faster on the MPP than on conventional computers, systolic VLSI simulation (where distances are short), lattice simulation, neural network simulation, and image problems were found to be easier to program on the MPP's architecture than on a CYBER 205 or even a VAX. The report also makes technical recommendations covering all aspects of MPP use, together with recommendations concerning the future of the MPP and machines based on similar architectures, expansion of the Working Group, and study of the role of future parallel processors for the space station, EOS, and the Great Observatories era.

    Parallel Parsing of Context-Free Languages on an Array of Processors

    Kosaraju [Kosaraju 69] and, independently ten years later, Guibas, Kung and Thompson [Guibas 79] devised an algorithm (K-GKT) for solving, on an array of processors, a class of dynamic programming problems of which general context-free language (CFL) recognition is a member. I introduce an extension to K-GKT which allows parsing as well as recognition. The basic idea of the extension is to add counters to the processors; these act as pointers to other processors. The extended algorithm consists of three phases, which I call the recognition phase, the marking phase and the parse output phase. I first consider the case of unambiguous grammars. I show that in that case the algorithm has O(n^2 log n) space complexity and linear time complexity. To obtain these results I rely on a counter implementation that allows each of the following operations to be executed in constant time: set to zero, test if zero, increment by 1 and decrement by 1. I provide a proof of correctness of this implementation. I introduce the concept of efficient grammars. One factor in the multiplicative constant hidden behind the O(n^2 log n) space complexity measure is related to the number of non-terminals in the (unambiguous) grammar used. I say that a grammar is k-efficient if it allows the processors to store no more than k pointer pairs, and I call a 1-efficient grammar an efficient grammar. I show that two properties that I call nt-disjunction and rhs-disjunction, together with unambiguity, are sufficient but not necessary conditions for grammar efficiency. I also show that unambiguity itself is not a necessary condition for efficiency. I then consider the case of ambiguous grammars and present two methods for outputting multiple parses; both output each parse in linear time, one with O(n^3 log n) space complexity and the other with O(n^2 log n) space complexity. I then address the issue of problem decomposition: I show how part of my extension can be adapted, using a standard technique, to process inputs that would be too large for an array of some fixed size. I then discuss briefly some issues related to implementation and report on an actual implementation on the I.C.L. DAP. Finally, I show how another systolic CFL parsing algorithm, by Chang, Ibarra and Palis [Chang 87], can be generalized to output parses in preorder and inorder.
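
    For readers unfamiliar with the dynamic programming problem that K-GKT parallelizes, the sketch below shows a plain sequential CYK recognizer for a grammar in Chomsky normal form (a standard textbook algorithm with an assumed toy grammar, not the thesis's systolic formulation); the triangular table it fills is the structure that K-GKT distributes over an array of processors.

```python
# Sequential CYK recognition for a context-free grammar in Chomsky normal form.
# The triangular table below is the dynamic programming structure that the
# K-GKT algorithm maps onto an array of processors; the grammar is a toy example.

def cyk_recognize(word, start, unit_rules, binary_rules):
    """unit_rules: set of (A, terminal); binary_rules: set of (A, B, C) meaning A -> B C."""
    n = len(word)
    # table[i][j] holds the nonterminals that derive word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = {a for (a, t) in unit_rules if t == ch}
    for length in range(2, n + 1):            # span length
        for i in range(n - length + 1):       # span start
            for split in range(1, length):    # split point within the span
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for (a, b, c) in binary_rules:
                    if b in left and c in right:
                        table[i][length - 1].add(a)
    return start in table[0][n - 1]

# Toy CNF grammar for { a^n b^n : n >= 1 }:
#   S -> A B | A X,  X -> S B,  A -> a,  B -> b
unit = {("A", "a"), ("B", "b")}
binary = {("S", "A", "B"), ("S", "A", "X"), ("X", "S", "B")}
print(cyk_recognize("aabb", "S", unit, binary))  # True
print(cyk_recognize("abab", "S", unit, binary))  # False
```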

    Formal process for systolic array design using recurrences


    Spatial Computing as Intensional Data Parallelism

    In this paper, we show that various concepts and tools developed in the 1990s in the field of data parallelism provide a relevant spatial programming framework. It allows high-level spatial computation specifications to be translated into efficient low-level operations on processing units. We provide some short examples to illustrate this statement.

    Solution of partial differential equations on vector and parallel computers

    The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial-boundary value problems. The intent is to point out attractive methods, as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.
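
    As a concrete example of the kind of iterative method such a review covers, the sketch below (a standard textbook scheme, not taken from the report; grid size, sweep count and test problem are arbitrary choices) runs Jacobi sweeps for the two-dimensional Poisson equation. Each sweep updates every interior grid point independently from the previous iterate, which is exactly the structure that vector and parallel machines exploit.

```python
# Jacobi iteration for the 2-D Poisson problem -(u_xx + u_yy) = f on the unit
# square with zero boundary values. Every interior point is updated from the
# previous iterate only, so a sweep vectorizes/parallelizes naturally.

import numpy as np

def jacobi_poisson(f: np.ndarray, h: float, sweeps: int) -> np.ndarray:
    """Run a fixed number of Jacobi sweeps on an n x n grid; the boundary stays zero."""
    u = np.zeros_like(f)
    for _ in range(sweeps):
        u_new = u.copy()
        u_new[1:-1, 1:-1] = 0.25 * (
            u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
            + h * h * f[1:-1, 1:-1]
        )
        u = u_new
    return u

if __name__ == "__main__":
    n = 33
    h = 1.0 / (n - 1)
    x = np.linspace(0.0, 1.0, n)
    X, Y = np.meshgrid(x, x, indexing="ij")
    f = 2.0 * np.pi**2 * np.sin(np.pi * X) * np.sin(np.pi * Y)  # exact u = sin(pi x) sin(pi y)
    u = jacobi_poisson(f, h, sweeps=3000)
    exact = np.sin(np.pi * X) * np.sin(np.pi * Y)
    print(np.max(np.abs(u - exact)))  # approaches the O(h^2) discretization error
```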

    Three Highly Parallel Computer Architectures and Their Suitability for Three Representative Artificial Intelligence Problems

    Virtually all current Artificial Intelligence (AI) applications are designed to run on sequential (von Neumann) computer architectures. As a result, current systems do not scale up: as knowledge is added to these systems, a point is reached where their performance quickly degrades. The performance of a von Neumann machine is limited by the bandwidth between memory and processor (the von Neumann bottleneck). The bottleneck is avoided by distributing the processing power across the memory of the computer; in this scheme the memory becomes the processor (a "smart memory"). This paper highlights the relationship between three representative AI application domains, namely knowledge representation, rule-based expert systems, and vision, and their parallel hardware realizations. Three machines, covering a wide range of fundamental properties of parallel processors, namely module granularity, concurrency control, and communication geometry, are reviewed: the Connection Machine (a fine-grained SIMD hypercube), DADO (a medium-grained MIMD/SIMD/MSIMD tree machine), and the Butterfly (a coarse-grained MIMD butterfly-switch machine).

    Aspects of multi-resolutional foveal images for robot vision
