Four-Dimensional Homogeneous Systolic Pyramid Automata
Cellular automata are well known as a kind of parallel automaton. They have been investigated not only from the viewpoint of formal language theory but also from the viewpoint of pattern recognition, and they can be classified into several types. The systolic pyramid automaton is one such parallel model. A homogeneous systolic pyramid automaton with four-dimensional layers (4-HSPA) is a pyramid stack of four-dimensional arrays of cells in which the bottom four-dimensional layer (level 0) has size a^4 (a≥1), the next lowest (a-1)^4, and so forth, the (a-1)st four-dimensional layer (level a-1) consisting of a single cell, called the root. Each cell is an identical finite-state machine. The input is accepted if and only if the root cell ever enters an accepting state. A 4-HSPA is said to be a real-time 4-HSPA if, for every four-dimensional tape of size a^4 (a≥1), it accepts the tape in time a-1. Moreover, a 1-way four-dimensional cellular automaton (1-4CA) can be considered a natural extension of the 1-way two-dimensional cellular automaton to four dimensions. The initial configuration is accepted if the last special cell reaches a final state. A 1-4CA is said to be a real-time 1-4CA if, when started with a four-dimensional array of cells in a nonquiescent state, the special cell reaches a final state in real time. In this paper, we propose a homogeneous systolic pyramid automaton with four-dimensional layers (4-HSPA) and investigate some properties of real-time 4-HSPA's. Specifically, we first investigate the relationship between the accepting powers of real-time 4-HSPA's and real-time 1-4CA's. We next show the recognizability of four-dimensional connected tapes by real-time 4-HSPA's
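The pyramid reduction described above can be sketched in miniature. The following is a one-dimensional analogue (for readability, not the four-dimensional construction of the paper): level 0 holds the input of size a, each higher level shrinks by one, every cell applies the same transition function to the cells beneath it, and the root decides acceptance after a-1 steps. The transition function `all_equal` is an illustrative assumption, chosen so the root accepts exactly the tapes whose symbols all agree.

```python
# Illustrative 1-D analogue of a homogeneous systolic pyramid automaton:
# level 0 is the input layer of size a, level l has size a - l, and all
# cells share one transition function; the root (level a-1) decides.

def pyramid_accepts(tape, transition, accepting):
    level = list(tape)                 # level 0: the input layer
    while len(level) > 1:              # a - 1 systolic steps up to the root
        level = [transition(level[i], level[i + 1])
                 for i in range(len(level) - 1)]
    return level[0] in accepting       # the root cell's state decides

# Assumed example transition: pass a symbol upward when both children
# agree, otherwise fall into a rejecting sink state.
def all_equal(x, y):
    return x if x == y else "reject"

print(pyramid_accepts("aaaa", all_equal, {"a"}))   # True
print(pyramid_accepts("aaba", all_equal, {"a"}))   # False
```

The loop runs exactly a-1 times, mirroring the real-time acceptance condition (time a-1) in the abstract.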
Fault tolerance issues in nanoelectronics
The astonishing success story of microelectronics cannot go on indefinitely. In fact, once
devices reach the few-atom scale (nanoelectronics), transient quantum effects are expected
to impair their behaviour. Fault tolerant techniques will then be required. The aim of this
thesis is to investigate the problem of transient errors in nanoelectronic devices. Transient
error rates for a selection of nanoelectronic gates, based upon quantum cellular automata
and single electron devices, in which the electrostatic interaction between electrons is used
to create Boolean circuits, are estimated. On the basis of these results, various fault tolerant
solutions are proposed, for both logic and memory nanochips. As for logic chips, traditional
techniques are found to be unsuitable. A new technique, in which the voting approach of
triple modular redundancy (TMR) is extended by cascading TMR units composed of
nanogate clusters, is proposed and generalised to other voting approaches. For memory
chips, an error correcting code approach is found to be suitable. Various codes are
considered and a lookup table approach is proposed for encoding and decoding. We are
then able to give estimations for the redundancy level to be provided on nanochips, so as to
make their mean time between failures acceptable. It is found that, for logic chips, space
redundancies up to a few tens are required, if mean times between failures have to be of the
order of a few years. Space redundancy can also be traded for time redundancy. As for
memory chips, mean times between failures of the order of a few years are found to imply
both space and time redundancies of the order of ten
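The voting idea behind triple modular redundancy can be sketched as follows. This is a minimal Monte Carlo illustration, not the thesis's cascaded nanogate-cluster scheme: one gate is triplicated and a (here assumed fault-free) majority voter masks any single transient fault. The gate, the error model, and the rate `p` are all illustrative assumptions.

```python
# Sketch of triple modular redundancy (TMR): triplicate a gate whose
# output may flip transiently, then take a bit-wise majority vote.

import random

def majority(a, b, c):
    """Majority vote of three redundant single-bit outputs."""
    return (a & b) | (a & c) | (b & c)

def noisy_gate(x, y, error_rate, rng):
    """A NAND gate whose output flips with probability error_rate
    (a crude stand-in for a transient nanoelectronic fault)."""
    out = 1 - (x & y)
    return out ^ (1 if rng.random() < error_rate else 0)

def tmr_nand(x, y, error_rate, rng):
    """One TMR unit: three gate copies plus a voter, with the voter
    assumed fault-free for simplicity."""
    return majority(*(noisy_gate(x, y, error_rate, rng) for _ in range(3)))

rng = random.Random(0)
p, trials = 0.05, 100_000
plain_errors = sum(noisy_gate(1, 1, p, rng) != 0 for _ in range(trials))
tmr_errors = sum(tmr_nand(1, 1, p, rng) != 0 for _ in range(trials))
print(plain_errors / trials)   # ≈ p
print(tmr_errors / trials)     # ≈ 3p² - 2p³, well below p
```

The TMR unit fails only when at least two of the three copies fail, which is why cascading such units can push the chip-level error rate down at the cost of a space redundancy factor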
Report from the MPP Working Group to the NASA Associate Administrator for Space Science and Applications
NASA's Office of Space Science and Applications (OSSA) gave a select group of scientists the opportunity to test and implement their computational algorithms on the Massively Parallel Processor (MPP) located at Goddard Space Flight Center, beginning in late 1985. One year later, the Working Group presented its report, which addressed the following: algorithms, programming languages, architecture, programming environments, the relation of theory to practice, and measured performance. The findings point to a number of demonstrated computational techniques for which the MPP architecture is ideally suited. For example, besides executing much faster on the MPP than on conventional computers, systolic VLSI simulation (where distances are short), lattice simulation, neural network simulation, and image problems were found to be easier to program on the MPP's architecture than on a CYBER 205 or even a VAX. The report also makes technical recommendations covering all aspects of MPP use, and recommendations concerning the future of the MPP and machines based on similar architectures, expansion of the Working Group, and study of the role of future parallel processors for space station, EOS, and the Great Observatories era
Parallel Parsing of Context-Free Languages on an Array of Processors
Kosaraju [Kosaraju 69] and independently ten years later, Guibas, Kung and
Thompson [Guibas 79] devised an algorithm (K-GKT) for solving on an array of
processors a class of dynamic programming problems of which general context-free
language (CFL) recognition is a member. I introduce an extension to K-GKT
which allows parsing as well as recognition. The basic idea of the extension is to
add counters to the processors. These act as pointers to other processors. The
extended algorithm consists of three phases which I call the recognition phase, the
marking phase and the parse output phase. I first consider the case of unambiguous
grammars. I show that in that case, the algorithm has O(n² log n) space complexity
and a linear time complexity. To obtain these results I rely on a counter implementation
that allows the execution in constant time of each of the operations:
set to zero, test if zero, increment by 1 and decrement by 1. I provide a proof of
correctness of this implementation. I introduce the concept of efficient grammars.
One factor in the multiplicative constant hidden behind the O(n² log n) space complexity
measure for the algorithm is related to the number of non-terminals in the
(unambiguous) grammar used. I say that a grammar is k-efficient if it allows the
processors to store not more than k pointer pairs. I call a 1-efficient grammar an
efficient grammar. I show that two properties that I call nt-disjunction and rhs-disjunction,
together with unambiguity, are sufficient but not necessary conditions
for grammar efficiency. I also show that unambiguity itself is not a necessary condition
for efficiency. I then consider the case of ambiguous grammars. I present
two methods for outputting multiple parses. Both output each parse in linear time.
One method has O(n³ log n) space complexity while the other has O(n² log n) space
complexity. I then address the issue of problem decomposition. I show how part of
my extension can be adapted, using a standard technique, to process inputs that
would be too large for an array of some fixed size. I then discuss briefly some issues
related to implementation. I report on an actual implementation on the I.C.L.
DAP. Finally, I show how another systolic CFL parsing algorithm, by Chang,
Ibarra and Palis [Chang 87], can be generalized to output parses in preorder and
inorder
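The dynamic programming problem that K-GKT maps onto a processor array includes general CFL recognition. A minimal sequential sketch of that recognition step is the classic CYK table computation shown below; this is the underlying recurrence only, not the systolic array layout, counters, or parse-output phases of the thesis. The small Dyck-language grammar in Chomsky normal form is an illustrative assumption.

```python
# Sequential sketch of the CFL-recognition dynamic program (CYK) that
# array algorithms such as K-GKT parallelize. Grammar in Chomsky
# normal form: unary rules (A, terminal), binary rules (A, B, C).

def cyk_recognize(word, unary, binary, start):
    n = len(word)
    # table[i][j] = set of non-terminals deriving word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = {A for (A, t) in unary if t == ch}
    for span in range(1, n):               # substring length minus 1
        for i in range(n - span):
            for k in range(span):          # split point inside the span
                for (A, B, C) in binary:
                    if B in table[i][k] and C in table[i + k + 1][span - k - 1]:
                        table[i][span].add(A)
    return start in table[0][n - 1]

# Illustrative CNF grammar for balanced parentheses:
# S -> L R | L P | S S,  P -> S R,  L -> '(',  R -> ')'
unary = {("L", "("), ("R", ")")}
binary = {("S", "L", "R"), ("S", "L", "P"), ("P", "S", "R"), ("S", "S", "S")}
print(cyk_recognize("(())()", unary, binary, "S"))   # True
print(cyk_recognize("(()", unary, binary, "S"))      # False
```

Each anti-diagonal of the table depends only on shorter spans, which is what lets an array of processors fill it in a systolic, pipelined fashion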
Spatial Computing as Intensional Data Parallelism
In this paper, we show that various concepts and tools developed in the 1990s in the field of data-parallelism provide a relevant spatial programming framework. It allows high level spatial computation specifications to be translated into efficient low-level operations on processing units. We provide some short examples to illustrate this statement
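The data-parallel reading of a spatial computation can be illustrated in a few lines. In the sketch below, a spatial field is a whole array, neighbourhood communication is a uniform shift, and the computation is expressed as point-wise operations on whole fields rather than per-cell loops; NumPy (with circular `np.roll` boundaries) stands in for a data-parallel language, and the 3-point smoothing stencil is an illustrative assumption, not an example from the paper.

```python
# A spatial computation written intensionally: operations act on the
# whole field at once, and a compiler/runtime may map them to
# per-processing-unit operations.

import numpy as np

def shift(field, offset):
    """Uniform communication: the whole field shifted (circularly)."""
    return np.roll(field, offset)

def smooth(field):
    """Point-wise combination of shifted fields: a 3-point stencil."""
    return (shift(field, -1) + field + shift(field, 1)) / 3.0

field = np.array([0.0, 0.0, 3.0, 0.0, 0.0])
print(smooth(field))   # each cell averaged with its two neighbours
```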
Solution of partial differential equations on vector and parallel computers
The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed
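As a concrete instance of the explicit methods mentioned above, the following is a minimal forward-Euler finite-difference step for the 1-D heat equation u_t = u_xx, written with whole-array slice operations so that a vector computer (NumPy here) applies the update to all interior points at once. The grid size, spike initial condition, and stability factor r = 0.25 are illustrative assumptions.

```python
# Explicit finite-difference step for the 1-D heat equation,
# vectorized over all interior grid points.

import numpy as np

def explicit_step(u, r):
    """One forward-Euler step; r = dt/dx^2 must satisfy r <= 1/2
    for stability of the explicit scheme. Boundaries held fixed."""
    u_new = u.copy()
    u_new[1:-1] = u[1:-1] + r * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    return u_new

u = np.zeros(11)
u[5] = 1.0                     # a heat spike mid-rod
for _ in range(3):
    u = explicit_step(u, 0.25)
print(u.sum())                 # total heat, still ≈ 1.0 before it
                               # reaches the fixed boundaries
```

An implicit method would instead solve a linear system per step, trading per-step cost for an unconditional stability limit, which is one of the algorithm-selection trade-offs the review discusses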
Three Highly Parallel Computer Architectures and Their Suitability for Three Representative Artificial Intelligence Problems
Virtually all current Artificial Intelligence (AI) applications are designed to run on sequential (von Neumann) computer architectures. As a result, current systems do not scale up. As knowledge is added to these systems, a point is reached where their performance quickly degrades. The performance of a von Neumann machine is limited by the bandwidth between memory and processor (the von Neumann bottleneck). The bottleneck is avoided by distributing the processing power across the memory of the computer. In this scheme the memory becomes the processor (a "smart memory").
This paper highlights the relationship between three representative AI application domains, namely knowledge representation, rule-based expert systems, and vision, and their parallel hardware realizations. Three machines, covering a wide range of fundamental properties of parallel processors, namely module granularity, concurrency control, and communication geometry, are reviewed: the Connection Machine (a fine-grained SIMD hypercube), DADO (a medium-grained MIMD/SIMD/MSIMD tree machine), and the Butterfly (a coarse-grained MIMD Butterfly-switch machine)
Aspects of multi-resolutional foveal images for robot vision