1,327 research outputs found
Cellular Automata Can Reduce Memory Requirements of Collective-State Computing
Various non-classical approaches of distributed information processing, such
as neural networks, computation with Ising models, reservoir computing, vector
symbolic architectures, and others, employ the principle of collective-state
computing. In this type of computing, the variables relevant in a computation
are superimposed into a single high-dimensional state vector, the
collective-state. The variable encoding uses a fixed set of random patterns,
which has to be stored and kept available during the computation. Here we show
that an elementary cellular automaton with rule 90 (CA90) enables a space-time
tradeoff for collective-state computing models that use random dense binary
representations, i.e., memory requirements can be traded off against computation
by running CA90. We investigate the randomization behavior of CA90, in particular,
the relation between the length of the randomization period and the size of the
grid, and how CA90 preserves similarity in the presence of the initialization
noise. Based on these analyses we discuss how to optimize a collective-state
computing model, in which CA90 expands representations on the fly from short
seed patterns - rather than storing the full set of random patterns. The CA90
expansion is applied and tested in concrete scenarios using reservoir computing
and vector symbolic architectures. Our experimental results show that
collective-state computing with CA90 expansion performs on par with
traditional collective-state models, in which random patterns are generated
once by a pseudo-random number generator and then stored in a large memory.
Comment: 13 pages, 11 figures
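As a rough sketch of the expansion idea described above (function names and sizes are illustrative, not taken from the paper), rule 90 updates each cell to the XOR of its left and right neighbors, so a short stored seed can be unrolled on the fly into many pseudo-random binary rows instead of storing the full set of random patterns:

```python
import numpy as np

def ca90_step(state: np.ndarray) -> np.ndarray:
    """One update of elementary cellular automaton rule 90 on a cyclic grid:
    each cell becomes the XOR of its left and right neighbors."""
    return np.roll(state, 1) ^ np.roll(state, -1)

def expand(seed: np.ndarray, steps: int) -> np.ndarray:
    """Expand a short binary seed into a (steps+1) x len(seed) array of
    pseudo-random rows by iterating CA90, rather than storing each row."""
    rows = [seed]
    for _ in range(steps):
        rows.append(ca90_step(rows[-1]))
    return np.stack(rows)

rng = np.random.default_rng(0)
seed = rng.integers(0, 2, size=32, dtype=np.uint8)   # the only stored pattern
patterns = expand(seed, steps=7)                     # 8 rows rematerialized on the fly
```

Only the seed needs to be kept in memory; the remaining rows are recomputed whenever they are needed, which is the space-time tradeoff the abstract refers to.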
Evaluating local indirect addressing in SIMD processors
In the design of parallel computers, there exists a tradeoff between the number and power of individual processors. The single instruction stream, multiple data stream (SIMD) model of parallel computers lies at one extreme of the resulting spectrum. The available hardware resources are devoted to creating the largest possible number of processors, and consequently each individual processor must use the fewest possible resources. Disagreement exists as to whether SIMD processors should be able to generate addresses individually into their local data memory, or whether all processors should access the same address. The tradeoff is examined between the increased capability and the reduced number of processors that occurs in this single instruction stream, multiple, locally addressed, data (SIMLAD) model. Factors affecting this design choice are assembled, and the SIMLAD model is compared with the bare SIMD and MIMD models.
Hardware optimizations of dense binary hyperdimensional computing: Rematerialization of hypervectors, binarized bundling, and combinational associative memory
Brain-inspired hyperdimensional (HD) computing models neural activity patterns of the very size of the brain's circuits with points of a hyperdimensional space, that is, with hypervectors. Hypervectors are D-dimensional (pseudo)random vectors with independent and identically distributed (i.i.d.) components constituting ultra-wide holographic words: D = 10,000 bits, for instance. At its very core, HD computing manipulates a set of seed hypervectors to build composite hypervectors representing objects of interest. It demands memory optimizations with simple operations for an efficient hardware realization. In this article, we propose hardware techniques for optimizations of HD computing, in a synthesizable open-source VHDL library, to enable co-located implementation of both learning and classification tasks on only a small portion of Xilinx UltraScale FPGAs: (1) We propose simple logical operations to rematerialize the hypervectors on the fly rather than loading them from memory. These operations massively reduce the memory footprint by directly computing the composite hypervectors, whose individual seed hypervectors do not need to be stored in memory. (2) Bundling a series of hypervectors over time requires a multibit counter per hypervector component. We instead propose a binarized back-to-back bundling that requires no counters. This truly enables on-chip learning with minimal resources, as every hypervector component remains binary over the course of training, avoiding otherwise multibit components. (3) For every classification event, an associative memory is in charge of finding the closest match between a set of learned hypervectors and a query hypervector by using a distance metric. The latency of this operation is proportional to the hypervector dimension (D), and hence may take O(D) cycles per classification event.
Accordingly, we significantly improve the throughput of classification by proposing associative memories that steadily reduce the latency of classification to the extreme of a single cycle. (4) We perform a design space exploration incorporating the proposed techniques on FPGAs for a wearable biosignal processing application as a case study. Our techniques achieve up to 2.39× area saving, or 2,337× throughput improvement. The Pareto-optimal HD architecture is mapped on only 18,340 configurable logic blocks (CLBs) to learn and classify five hand gestures using four electromyography sensors.
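A minimal sketch of the operations this abstract optimizes, in software terms (names and sizes are illustrative; the paper's contribution is the counter-free hardware realization, not this baseline): binding composes hypervectors by componentwise XOR, and conventional bundling keeps one multibit counter per component before a majority threshold:

```python
import numpy as np

D = 10_000  # hypervector dimensionality, as in the abstract

rng = np.random.default_rng(42)
seeds = rng.integers(0, 2, size=(5, D), dtype=np.uint8)

# Binding: componentwise XOR of seed hypervectors. Composites built from
# such simple logical operations can be recomputed on the fly
# (rematerialized) instead of being stored.
composite = seeds[0] ^ seeds[1]

def bundle_majority(hvs: np.ndarray) -> np.ndarray:
    """Baseline bundling: sum each component over time (one multibit counter
    per component), then binarize by majority vote. The abstract's binarized
    back-to-back bundling avoids exactly these counters."""
    counts = hvs.sum(axis=0)                    # one counter per component
    return (counts * 2 > hvs.shape[0]).astype(np.uint8)

prototype = bundle_majority(seeds)              # learned class prototype
```

The counters are what cost multibit storage per component in hardware; keeping every component binary throughout training is what makes the on-chip learning described above cheap.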
Generalizations of the Hamming Associative Memory
This Letter reviews four models of associative memory which generalize the operation of the Hamming associative memory: the grounded Hamming memory, the cellular Hamming memory, the decoupled Hamming memory, and the two-level decoupled Hamming memory. These memory models offer high performance and allow for a more practical hardware realization than the Hamming net and other fully interconnected neural net architectures.
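The core operation that all of these models generalize is a nearest-match lookup under Hamming distance. A minimal sketch (function and variable names are illustrative, not from the Letter):

```python
import numpy as np

def hamming_nearest(memory: np.ndarray, query: np.ndarray) -> int:
    """Return the index of the stored binary pattern closest to the query
    in Hamming distance -- the core recall step of a Hamming associative
    memory. XOR marks differing bits; the row sum counts them."""
    distances = (memory ^ query).sum(axis=1)
    return int(distances.argmin())

memory = np.array([[0, 0, 0, 0],
                   [1, 1, 1, 1],
                   [1, 0, 1, 0]], dtype=np.uint8)
query = np.array([1, 1, 0, 1], dtype=np.uint8)
idx = hamming_nearest(memory, query)   # row 1 differs in only one bit
```

The generalized variants reviewed above restructure this lookup (e.g., decoupling it into independent sub-memories) so that it no longer requires the full interconnection of the original Hamming net.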
Considerations in designing a cybernetic simple 'learning' model; and an overview of the problem of modelling learning
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Learning is viewed as a central feature of living systems and must be manifested in any artifact that claims to exhibit general intelligence. The central aims of the thesis are twofold: (1) to review and critically assess the empirical and theoretical aspects of learning as addressed in a multitude of disciplines, with the aim of extracting fundamental features and elements; (2) to develop a more systematic approach to the cybernetic modelling of learning than has been achieved hitherto. In pursuit of aim (1), the following discussions are included: historical and philosophical backgrounds; natural learning, in both its physiological and psychological aspects; and hierarchies of learning identified in the evolutionary, functional and developmental senses. An extensive section on the general problem of modelling learning, and the formal tools for doing so, is included as a link between aims (1) and (2). Following this, a systematic and historically oriented study of cybernetic and other related approaches to the problem of modelling learning is presented. This leads to the development of a state-of-the-art general-purpose experimental cybernetic learning model. The programming and use of this model is also fully described, including an elaborate scheme for the manifestation of simple learning.