Integer Echo State Networks: Hyperdimensional Reservoir Computing
We propose an approximation of Echo State Networks (ESN) that can be
efficiently implemented on digital hardware based on the mathematics of
hyperdimensional computing. The reservoir of the proposed Integer Echo State
Network (intESN) is a vector containing only n-bit integers (where n<8 is
normally sufficient for a satisfactory performance). The recurrent matrix
multiplication is replaced with an efficient cyclic shift operation. The intESN
architecture is verified with typical tasks in reservoir computing: memorizing
a sequence of inputs, classifying time series, and learning dynamic processes.
Such an architecture results in dramatic improvements in memory footprint and
computational efficiency, with minimal performance loss.
Comment: 10 pages, 10 figures, 1 table
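The reservoir update described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the bipolar input codebook, the clipping threshold `kappa`, and all function names are assumptions introduced here. The abstract itself only states that the reservoir holds small integers and that the recurrent matrix multiplication is replaced by a cyclic shift.

```python
import random


def cyclic_shift(state, k=1):
    """Rotate the reservoir vector by k positions (stands in for the recurrent matrix)."""
    k %= len(state)
    return state[-k:] + state[:-k]


def clip(value, kappa):
    """Keep a component inside [-kappa, kappa] so it fits in a few bits (assumed scheme)."""
    return max(-kappa, min(kappa, value))


def intesn_step(state, input_vec, kappa=3):
    """One update: shift the old state, add the encoded input, clip to integers."""
    shifted = cyclic_shift(state)
    return [clip(s + u, kappa) for s, u in zip(shifted, input_vec)]


def random_bipolar(dim, rng):
    """Hypothetical input encoding: one random +1/-1 vector per symbol."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


rng = random.Random(0)
dim = 16
codebook = {symbol: random_bipolar(dim, rng) for symbol in "abc"}

state = [0] * dim
for symbol in "abcab":
    state = intesn_step(state, codebook[symbol], kappa=3)
```

With `kappa=3` every component takes one of seven values, so it fits in three bits, which is consistent with the n<8 figure quoted above.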
Connectionist-Symbolic Machine Intelligence using Cellular Automata based Reservoir-Hyperdimensional Computing
We introduce a novel reservoir computing framework that is capable of
both connectionist machine intelligence and symbolic computation. A cellular
automaton is used as the reservoir of dynamical systems. Input is randomly
projected onto the initial conditions of automaton cells and nonlinear
computation is performed on the input via application of a rule in the
automaton for a period of time. The evolution of the automaton creates a
space-time volume of the automaton state space, and it is used as the
reservoir. The proposed framework is capable of long short-term memory and it
requires orders of magnitude less computation compared to Echo State Networks.
We prove that the cellular automaton reservoir holds a distributed representation
of attribute statistics, which provides a more effective computation than local
representation. It is possible to estimate the kernel for linear cellular
automata via metric learning, which enables a much more efficient distance
computation in the support vector machine framework. Also, binary reservoir feature
vectors can be combined using Boolean operations as in hyperdimensional
computing, paving a direct way for concept building and symbolic processing.
Comment: Corrected typos. Responded to some comments on section 8. Added appendix
for details. Recurrent architecture emphasize
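The pipeline in this abstract (random projection of the input onto the initial cells, repeated application of an automaton rule, and use of the resulting space-time volume as the feature vector) might look like the sketch below. The periodic boundary, the default rule number, the projection by random cell positions, and the function names are assumptions made for illustration only.

```python
import random


def ca_step(cells, rule):
    """Apply one step of an elementary cellular automaton with periodic boundary."""
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right
        out.append((rule >> index) & 1)  # look up the rule's output bit
    return out


def ca_reservoir(input_bits, n_cells, steps, rule=110, seed=0):
    """Project input bits onto random cells, evolve, and flatten the space-time volume."""
    rng = random.Random(seed)
    positions = rng.sample(range(n_cells), len(input_bits))
    cells = [0] * n_cells
    for position, bit in zip(positions, input_bits):
        cells[position] = bit
    features = []
    for _ in range(steps):
        cells = ca_step(cells, rule)
        features.extend(cells)
    return features


features = ca_reservoir([1, 0, 1, 1], n_cells=16, steps=4)
```

Because the features are binary, two such vectors could be combined component-wise with Boolean operations such as XOR, in the spirit of the last sentence of the abstract.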
Cellular Automata Can Reduce Memory Requirements of Collective-State Computing
Various non-classical approaches to distributed information processing, such
as neural networks, computation with Ising models, reservoir computing, vector
symbolic architectures, and others, employ the principle of collective-state
computing. In this type of computing, the variables relevant in a computation
are superimposed into a single high-dimensional state vector, the
collective-state. The variable encoding uses a fixed set of random patterns,
which has to be stored and kept available during the computation. Here we show
that an elementary cellular automaton with rule 90 (CA90) enables a space-time
tradeoff for collective-state computing models that use random dense binary
representations, i.e., memory requirements can be traded off against the
computation of running CA90. We investigate the randomization behavior of CA90, in particular,
the relation between the length of the randomization period and the size of the
grid, and how CA90 preserves similarity in the presence of initialization
noise. Based on these analyses, we discuss how to optimize a collective-state
computing model, in which CA90 expands representations on the fly from short
seed patterns - rather than storing the full set of random patterns. The CA90
expansion is applied and tested in concrete scenarios using reservoir computing
and vector symbolic architectures. Our experimental results show that
collective-state computing with CA90 expansion performs similarly to
traditional collective-state models, in which random patterns are generated
initially by a pseudo-random number generator and then stored in a large
memory.
Comment: 13 pages, 11 figures
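The space-time tradeoff can be illustrated with a small sketch: instead of storing a long random pattern, only a short seed is kept and the rest is recomputed with rule 90, where each cell becomes the XOR of its two neighbours. Concatenating successive automaton states into one long pattern, the periodic boundary, and the names `ca90_step` and `expand_seed` are assumptions of this example, not details confirmed by the abstract.

```python
def ca90_step(cells):
    """Rule 90: each new cell is the XOR of its left and right neighbours."""
    n = len(cells)
    return [cells[i - 1] ^ cells[(i + 1) % n] for i in range(n)]


def expand_seed(seed, n_steps):
    """Regenerate a long binary pattern from a short seed on the fly."""
    pattern = list(seed)
    cells = list(seed)
    for _ in range(n_steps):
        cells = ca90_step(cells)
        pattern.extend(cells)
    return pattern


seed = [1, 0, 0, 1, 1, 0, 1]
pattern = expand_seed(seed, n_steps=3)  # 7 stored bits yield 28 pattern bits
```

The expansion is deterministic, so the same seed always reproduces the same pattern; only the seed has to be held in memory.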
The Hyperdimensional Transform for Distributional Modelling, Regression and Classification
Hyperdimensional computing (HDC) is an increasingly popular computing
paradigm with immense potential for future intelligent applications. Although
the main ideas already took form in the 1990s, HDC recently gained significant
attention, especially in the field of machine learning and data science. Next
to efficiency, interoperability and explainability, HDC offers attractive
properties for generalization as it can be seen as an attempt to combine
connectionist ideas from neural networks with symbolic aspects. In recent work,
we introduced the hyperdimensional transform, revealing deep theoretical
foundations for representing functions and distributions as high-dimensional
holographic vectors. Here, we present the power of the hyperdimensional
transform to a broad data science audience. We use the hyperdimensional
transform as a theoretical basis and provide insight into state-of-the-art HDC
approaches for machine learning. We show how existing algorithms can be
modified and how this transform can lead to a novel, well-founded toolbox. Next
to the standard regression and classification tasks of machine learning, our
discussion includes various aspects of statistical modelling, such as
representation, learning and deconvolving distributions, sampling, Bayesian
inference, and uncertainty estimation.
Hardware optimizations of dense binary hyperdimensional computing: Rematerialization of hypervectors, binarized bundling, and combinational associative memory
Brain-inspired hyperdimensional (HD) computing models neural activity patterns of the very size of the brain's circuits with points of a hyperdimensional space, that is, with hypervectors. Hypervectors are D-dimensional (pseudo)random vectors with independent and identically distributed (i.i.d.) components constituting ultra-wide holographic words: D = 10,000 bits, for instance. At its very core, HD computing manipulates a set of seed hypervectors to build composite hypervectors representing objects of interest. It demands memory optimizations with simple operations for an efficient hardware realization. In this article, we propose hardware techniques for optimizations of HD computing, in a synthesizable open-source VHDL library, to enable co-located implementation of both learning and classification tasks on only a small portion of Xilinx UltraScale FPGAs: (1) We propose simple logical operations to rematerialize the hypervectors on the fly rather than loading them from memory. These operations massively reduce the memory footprint by directly computing the composite hypervectors whose individual seed hypervectors do not need to be stored in memory. (2) Bundling a series of hypervectors over time requires a multibit counter for every hypervector component. We instead propose a binarized back-to-back bundling without requiring any counters. This truly enables on-chip learning with minimal resources as every hypervector component remains binary over the course of training to avoid otherwise multibit components. (3) For every classification event, an associative memory is in charge of finding the closest match between a set of learned hypervectors and a query hypervector by using a distance metric. The cost of this operation is proportional to the hypervector dimension (D), and hence it may take O(D) cycles per classification event. Accordingly, we significantly improve the throughput of classification by proposing associative memories that steadily reduce the latency of classification to the extreme of a single cycle. (4) We perform a design space exploration incorporating the proposed techniques on FPGAs for a wearable biosignal processing application as a case study. Our techniques achieve up to 2.39× area saving, or 2,337× throughput improvement. The Pareto optimal HD architecture is mapped on only 18,340 configurable logic blocks (CLBs) to learn and classify five hand gestures using four electromyography sensors.
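For orientation, the conventional software baseline that items (2) and (3) start from can be sketched as below: bundling by component-wise majority with one multibit count per component, and an associative memory that scans all D components to find the nearest prototype by Hamming distance. This is not the proposed counter-free bundling or the single-cycle associative memory; the gesture labels, the noise model, and every function name are hypothetical.

```python
import random


def random_hv(dim, rng):
    """Dense binary hypervector with i.i.d. components."""
    return [rng.getrandbits(1) for _ in range(dim)]


def add_noise(hv, flip_prob, rng):
    """Flip each bit independently (assumed stand-in for sensor variability)."""
    return [bit ^ int(rng.random() < flip_prob) for bit in hv]


def bundle(hvs):
    """Component-wise majority vote; each sum plays the role of a multibit counter."""
    counts = [sum(column) for column in zip(*hvs)]
    return [1 if 2 * count > len(hvs) else 0 for count in counts]


def hamming(a, b):
    """Distance metric: O(D) comparisons per pair of hypervectors."""
    return sum(x != y for x, y in zip(a, b))


def classify(query, memory):
    """Associative memory lookup: label of the closest learned hypervector."""
    return min(memory, key=lambda label: hamming(query, memory[label]))


rng = random.Random(1)
DIM = 2000
seeds = {label: random_hv(DIM, rng) for label in ("fist", "open_hand")}
memory = {
    label: bundle([add_noise(seed, 0.2, rng) for _ in range(5)])
    for label, seed in seeds.items()
}
query = add_noise(seeds["fist"], 0.2, rng)
predicted = classify(query, memory)
```

An odd number of training examples per class is used so that the majority vote never ties.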
Perceptron theory can predict the accuracy of neural networks
Multilayer neural networks set the current state of
the art for many technical classification problems. But these
networks are still, essentially, black boxes in terms of analyzing
them and predicting their performance. Here, we develop a
statistical theory for the one-layer perceptron and show that
it can predict performances of a surprisingly large variety of
neural networks with different architectures. A general theory
of classification with perceptrons is developed by generalizing
an existing theory for analyzing reservoir computing models
and connectionist models for symbolic reasoning known as
vector symbolic architectures. Our statistical theory offers three
formulas leveraging the signal statistics with increasing detail.
The formulas are analytically intractable, but can be evaluated
numerically. The description level that captures maximum details
requires stochastic sampling methods. Depending on the network
model, the simpler formulas already yield high prediction accuracy.
The quality of the theory predictions is assessed in three
experimental settings: a memorization task for echo state networks
(ESNs) from reservoir computing literature, a collection of
classification datasets for shallow randomly connected networks,
and the ImageNet dataset for deep convolutional neural networks.
We find that the second description level of the perceptron theory
can predict the performance of types of ESNs, which could not
be described previously. Furthermore, the theory can predict
the performance of deep multilayer neural networks when applied to
their output layer. While other methods for predicting neural network
performance commonly require training an estimator model,
the proposed theory requires only the first two moments of
the distribution of the postsynaptic sums in the output neurons.
Moreover, the perceptron theory compares favorably to other
methods that do not rely on training an estimator model.
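To give a feel for how accuracy can be estimated from only the first two moments of the postsynaptic sums, the sketch below assumes that the output neuron of the correct class and each of the other output neurons are independent Gaussians, and integrates numerically the probability that the correct neuron has the largest sum. This is an illustrative approximation under those stated assumptions and is not claimed to match any of the three formulas in the paper; the function and parameter names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def predicted_accuracy(mu_hit, sigma_hit, mu_rej, sigma_rej, n_classes,
                       n_points=2001, span=8.0):
    """P(correct neuron exceeds all n_classes - 1 others) under independent Gaussians.

    mu_hit, sigma_hit: mean and standard deviation of the correct neuron's sum.
    mu_rej, sigma_rej: mean and standard deviation of each incorrect neuron's sum.
    The integral over the correct neuron's value uses the trapezoidal rule.
    """
    step = 2.0 * span / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        z = -span + i * step
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        hit_value = mu_hit + sigma_hit * z
        p_all_lower = normal_cdf((hit_value - mu_rej) / sigma_rej) ** (n_classes - 1)
        total += weight * density * p_all_lower
    return total * step


chance = predicted_accuracy(0.0, 1.0, 0.0, 1.0, n_classes=10)
separated = predicted_accuracy(3.0, 1.0, 0.0, 1.0, n_classes=10)
```

When all output neurons share the same statistics, the estimate reduces to chance level (one over the number of classes), which is a convenient sanity check; separating the means raises the predicted accuracy.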