
Evaluating local indirect addressing in SIMD processors

Authors: Middleton David; Tomboulian Sherryl

In the design of parallel computers, there exists a tradeoff between the number and power of individual processors. The single instruction stream, multiple data stream (SIMD) model of parallel computers lies at one extreme of the resulting spectrum. The available hardware resources are devoted to creating the largest possible number of processors, and consequently each individual processor must use the fewest possible resources. Disagreement exists as to whether SIMD processors should be able to generate addresses individually into their local data memory, or all processors should access the same address. The tradeoff is examined between the increased capability and the reduced number of processors that occurs in this single instruction stream, multiple, locally addressed, data (SIMLAD) model. Factors are assembled that affect this design choice, and the SIMLAD model is compared with the bare SIMD and the MIMD models

NASA Technical Reports Server

Memory Access Patterns for Cellular Automata Using GPGPUs

Author: Balasalle James Michael
Publication venue: Digital Commons @ DU
Publication date: 01/01/2011

Today's graphical processing units have hundreds of individual processing cores that can be used for general purpose computation of mathematical and scientific problems. Due to their hardware architecture, these devices are especially effective when solving problems that exhibit a high degree of spatial locality. Cellular automata use small, local neighborhoods to determine successive states of individual elements and therefore provide an excellent opportunity for the application of general purpose GPU computing. However, the GPU presents a challenging environment because it lacks many of the features of traditional CPUs, such as automatic, on-chip caching of data. To fully realize the potential of a GPU, specialized memory techniques and patterns must be employed to account for its unique architecture. Several techniques are presented which not only dramatically improve performance but, in many cases, also simplify implementation. Many of the approaches discussed relate to the organization of data in memory or patterns for accessing that data, while others detail methods of increasing the computation to memory access ratio. The ideas presented are generic and applicable to cellular automata models as a whole. Example implementations are given for several problems, including the Game of Life and Gaussian blurring, while performance characteristics, such as instruction and memory access counts, are analyzed and compared. A case study is detailed, showing the effectiveness of the various techniques when applied to a larger, real-world problem. Lastly, the reasoning behind each of the improvements is explained, providing general guidelines for determining when a given technique will be most and least effective
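To make the "small, local neighborhood" idea concrete, here is a minimal CPU-side sketch of one Game of Life generation in plain Python. It is not taken from the thesis: the function name `life_step` and the wrap-around (toroidal) boundary are assumptions chosen for illustration, and none of the GPU memory techniques are shown.

```python
def life_step(grid):
    """Advance a Game of Life grid (list of lists of 0/1) by one generation.

    Assumption for this sketch: the grid wraps around at its edges.
    """
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count the eight neighbours of cell (r, c).
            n = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # Birth on exactly 3 neighbours, survival on 2 or 3.
            new[r][c] = 1 if n == 3 or (grid[r][c] == 1 and n == 2) else 0
    return new


# A horizontal "blinker" on a 5x5 grid becomes vertical after one step.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
after_one = life_step(blinker)
```

Each cell reads nine values and writes one, which is the low computation to memory access ratio the abstract refers to.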

University of Denver

Structural Design using Cellular Automata

Authors: Gürdal Z.; Slotta D.J.; Tatting B.; Watson L.T.
Publication date: 01/01/2001

Traditional parallel methods for structural design do not scale well. This paper discusses the application of massively scalable cellular automata (CA) techniques to structural design. There are two sets of CA rules, one used to propagate stresses and strains, and one to perform design analysis. These rules can be applied serially, periodically, or concurrently, and Jacobi or Gauss-Seidel style updating can be done. These options are compared with respect to convergence, speed, and stability
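The difference between Jacobi and Gauss-Seidel style updating can be sketched on a toy problem. The example below is an assumption for illustration only, not the paper's CA rules: each interior cell of a 1D array is replaced by the average of its two neighbours, with the end cells held fixed. All function names are hypothetical.

```python
def jacobi_sweep(u):
    """Jacobi style: every cell is updated from the OLD values of its neighbours."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def gauss_seidel_sweep(u):
    """Gauss-Seidel style: each cell sees neighbours already updated in this sweep."""
    new = u[:]
    for i in range(1, len(new) - 1):
        new[i] = 0.5 * (new[i - 1] + new[i + 1])
    return new


def sweeps_to_converge(sweep, u, tol=1e-6, max_sweeps=10000):
    """Repeat a sweep until no cell changes by more than tol; return (count, state)."""
    for k in range(1, max_sweeps + 1):
        nxt = sweep(u)
        if max(abs(a - b) for a, b in zip(nxt, u)) < tol:
            return k, nxt
        u = nxt
    return max_sweeps, u


# Fixed end values 0 and 1; the converged state is a straight line between them.
start = [0.0] * 9 + [1.0]
jacobi_count, jacobi_state = sweeps_to_converge(jacobi_sweep, start)
gs_count, gs_state = sweeps_to_converge(gauss_seidel_sweep, start)
```

On this toy problem the Gauss-Seidel variant needs fewer sweeps, while the Jacobi variant updates all cells independently and is therefore the easier one to run concurrently.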

Computer Science Technical Reports @Virginia Tech

Designing a Novel Reversible Systolic Array Using QCA

Authors: Abdollahi Mohammad Mahdi; Tehrani Mohammad
Publication venue: 'Ital Publication'
Publication date: 01/11/2017

Many efforts have been made to date to design nano-based devices. One of these devices is Quantum Cellular Automata (QCA). Because of the astonishing growth of VLSI circuit designs at larger scales and the necessity of feature size reduction, there is a growing need to design complicated control systems using nano-based devices. Moreover, since temperature is critical in QCA devices, complicated systems using these devices should be designed reversibly. This article proposes a novel architecture for QCA circuits, for use in complicated control systems, based on systolic arrays with high throughput and the least power dissipation

Emerging Science Journal (ESJ)

Directory of Open Access Journals

A model for Intelligent Random Access Memory architecture (IRAM): cellular automata algorithms on the Associative String Processing machine (ASTRA)

Authors: Rohrbach F; Vesztergombi G; Ódor G
Publication date: 11/11/1997

In the near future, computer performance will be completely determined by how long it takes to access memory. There are bottlenecks in memory latency and memory-to-processor interface bandwidth. The IRAM initiative could be the answer by putting the processor in memory (Processor-In-Memory, PIM). Starting from the massively parallel processing concept, one reaches a similar conclusion. The MPPC (Massively Parallel Processing Collaboration) project and the 8K processor ASTRA machine (Associative String Test bench for Research & Applications) developed at CERN \cite{kuala} can be regarded as a forerunner of the IRAM concept. The computing power of the ASTRA machine, regarded as an IRAM with 64 one-bit processors on a 64 × 64 bit-matrix memory chip machine, has been demonstrated by running statistical physics algorithms: one-dimensional stochastic cellular automata, as a simple model for dynamical phase transitions. As a relevant result for physics, the damage spreading of this model has been investigated

CERN Document Server

Acta Cybernetica : Tomus 6. Fasciculus 4.

Publication date: 01/01/1984

University of Szeged

Massively Parallel Associative String Processor (ASP) for High Energy Physics

Author: Vesztergombi G
Publication venue: CERN
Publication date: 01/01/1995

CERN Document Server

Fault tolerance issues in nanoelectronics

Author: Spagocci S.
Publication venue: UCL (University College London)
Publication date: 01/11/2008

The astonishing success story of microelectronics cannot go on indefinitely. In fact, once devices reach the few-atom scale (nanoelectronics), transient quantum effects are expected to impair their behaviour. Fault tolerant techniques will then be required. The aim of this thesis is to investigate the problem of transient errors in nanoelectronic devices. Transient error rates are estimated for a selection of nanoelectronic gates, based upon quantum cellular automata and single electron devices, in which the electrostatic interaction between electrons is used to create Boolean circuits. On the basis of these results, various fault tolerant solutions are proposed, for both logic and memory nanochips. As for logic chips, traditional techniques are found to be unsuitable. A new technique, in which the voting approach of triple modular redundancy (TMR) is extended by cascading TMR units composed of nanogate clusters, is proposed and generalised to other voting approaches. For memory chips, an error correcting code approach is found to be suitable. Various codes are considered and a lookup table approach is proposed for encoding and decoding. We are then able to give estimates of the redundancy level to be provided on nanochips, so as to make their mean time between failures acceptable. It is found that, for logic chips, space redundancies up to a few tens are required if mean times between failures are to be of the order of a few years. Space redundancy can also be traded for time redundancy. As for memory chips, mean times between failures of the order of a few years are found to imply both space and time redundancies of the order of ten
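The voting idea behind TMR can be illustrated with a short sketch. It rests on textbook assumptions that are not taken from the thesis: the three replicas fail independently with the same probability `p`, and the voter itself never fails. The thesis's nanogate-cluster scheme is not reproduced here, and the function names are hypothetical.

```python
def majority(a, b, c):
    """Bitwise majority vote of three replica outputs (ints used as bit vectors)."""
    return (a & b) | (a & c) | (b & c)


def tmr_failure_prob(p):
    """Probability that at least two of three independent replicas are wrong.

    Assumes an ideal (never failing) voter.
    """
    return 3 * p**2 * (1 - p) + p**3


def cascaded_tmr_failure_prob(p, levels):
    """Failure probability when TMR units are themselves triplicated `levels` times.

    Assumes ideal voters at every level, which real hardware would not have.
    """
    for _ in range(levels):
        p = tmr_failure_prob(p)
    return p


single = tmr_failure_prob(0.01)               # one level of voting
double = cascaded_tmr_failure_prob(0.01, 2)   # two cascaded levels
```

Under these idealised assumptions each extra level lowers the failure probability as long as `p` is below 0.5, at the cost of tripling the hardware, which is the space redundancy trade-off the abstract describes.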

UCL Discovery

A Search for Good Pseudo-random Number Generators : Survey and Empirical Studies

Authors: Bhattacharjee Kamalika; Das Sukanta; Maity Krishnendu
Publication date: 03/11/2018

In today's world, several applications demand numbers which appear random but are generated by a background algorithm; that is, pseudo-random numbers. Since the late 19th century, researchers have been working on pseudo-random number generators (PRNGs). Several PRNGs continue to be developed, each one claiming to be better than the previous ones. In this scenario, this paper aims to verify the claims of so-called good generators and to rank the existing generators based on strong empirical tests on the same platforms. To do this, the PRNGs developed so far have been explored and classified into three groups: linear congruential generator based, linear feedback shift register based and cellular automata based. From each group, well-known generators have been chosen for empirical testing. Two types of empirical testing have been done on each PRNG: blind statistical tests with the Diehard battery of tests, the TestU01 library and the NIST statistical test suite, and graphical tests (lattice test and space-time diagram test). Finally, the selected 29 PRNGs are divided into 24 groups and are ranked according to their overall performance in all empirical tests
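The three generator families named in the abstract can each be sketched in a few lines. These are illustrative toy versions, not the generators tested in the paper. The parameters are assumptions: commonly quoted LCG constants (a = 1664525, c = 1013904223, m = 2^32), a 16-bit Fibonacci LFSR with taps at bits 16, 14, 13 and 11, and the centre column of the rule 30 cellular automaton.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Linear congruential generator: x -> (a*x + c) mod m."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state


def lfsr16(state=0xACE1):
    """16-bit Fibonacci LFSR; yields the register after each shift.

    The state must be non-zero, otherwise the register stays at zero.
    """
    while True:
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield state


def rule30_bits(width=64, steps=8):
    """Centre-column bits of rule 30 started from a single 1 (cyclic boundary)."""
    cells = [0] * width
    cells[width // 2] = 1
    out = []
    for _ in range(steps):
        out.append(cells[width // 2])
        # Rule 30: new cell = left XOR (centre OR right).
        cells = [
            cells[i - 1] ^ (cells[i] | cells[(i + 1) % width])
            for i in range(width)
        ]
    return out


first_lcg = next(lcg(0))
first_lfsr = next(lfsr16())
ca_bits = rule30_bits(steps=5)
```

None of these toy versions should be assumed to pass the test batteries mentioned above; they only show the update rule that characterises each family.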

arXiv.org e-Print Archive