PULP-HD: Accelerating Brain-Inspired High-Dimensional Computing on a Parallel Ultra-Low Power Platform
Computing with high-dimensional (HD) vectors, also referred to as
hyperdimensional computing, is a brain-inspired alternative to computing with
scalars. Key properties of HD computing include a well-defined set of
arithmetic operations on hypervectors, generality, scalability, robustness,
fast learning, and ubiquitous parallel operations. HD computing is about
manipulating and comparing large patterns (binary hypervectors with 10,000
dimensions), making its efficient realization on minimalistic ultra-low-power
platforms challenging. This paper describes HD computing's acceleration and its
optimization of memory accesses and operations on a silicon prototype of the
PULPv3 4-core platform (1.5 mm², 2 mW), surpassing the state-of-the-art
classification accuracy (on average 92.4%) with simultaneous 3.7×
end-to-end speed-up and 2× energy saving compared to its single-core
execution. We further explore the scalability of our accelerator by increasing
the number of inputs and classification window on a new generation of the PULP
architecture featuring bit-manipulation instruction extensions and a larger
number of cores (8). These together enable a near ideal speed-up of 18.4×
compared to the single-core PULPv3.
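The "arithmetic operations on hypervectors" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the operations commonly used for dense binary hypervectors (XOR for binding, component-wise majority for bundling, normalized Hamming distance for comparison); the abstract itself does not name them, and the function names are hypothetical.

```python
import random

D = 10_000  # dimensionality quoted in the abstract

def random_hv(rng, d=D):
    """Draw a random dense binary hypervector as a list of 0/1 ints."""
    return [rng.getrandbits(1) for _ in range(d)]

def bind(a, b):
    """Bind two hypervectors with component-wise XOR (assumed operation)."""
    return [x ^ y for x, y in zip(a, b)]

def bundle(hvs):
    """Bundle hypervectors with a component-wise majority vote (ties give 0)."""
    n = len(hvs)
    return [1 if 2 * sum(bits) > n else 0 for bits in zip(*hvs)]

def hamming(a, b):
    """Normalized Hamming distance in [0, 1]."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
a, b, c = (random_hv(rng) for _ in range(3))
proto = bundle([a, b, c])  # prototype that stays similar to each input
```

Unrelated random hypervectors sit at a distance near 0.5, while a bundled prototype stays noticeably closer to each of its inputs. Every operation is a component-wise bit operation, so the work splits naturally across cores.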
An EMG Gesture Recognition System with Flexible High-Density Sensors and Brain-Inspired High-Dimensional Classifier
EMG-based gesture recognition shows promise for human-machine interaction.
Systems are often afflicted by signal and electrode variability which degrades
performance over time. We present an end-to-end system combating this
variability using a large-area, high-density sensor array and a robust
classification algorithm. EMG electrodes are fabricated on a flexible substrate
and interfaced to a custom wireless device for 64-channel signal acquisition
and streaming. We use brain-inspired high-dimensional (HD) computing for
processing EMG features in one-shot learning. The HD algorithm is tolerant to
noise and electrode misplacement and can quickly learn from few gestures
without gradient descent or back-propagation. We achieve an average
classification accuracy of 96.64% for five gestures, with only 7% degradation
when training and testing across different days. Our system maintains this
accuracy when trained with only three trials of gestures; it also demonstrates
accuracy comparable to the state of the art when trained with one trial.
One-shot Learning for iEEG Seizure Detection Using End-to-end Binary Operations: Local Binary Patterns with Hyperdimensional Computing
This paper presents an efficient binarized algorithm for both learning and
classification of human epileptic seizures from intracranial
electroencephalography (iEEG). The algorithm combines local binary patterns
with brain-inspired hyperdimensional computing to enable end-to-end learning
and inference with binary operations. The algorithm first transforms iEEG time
series from each electrode into local binary pattern codes. Then atomic
high-dimensional binary vectors are used to construct composite representations
of seizures across all electrodes. For the majority of our patients (10 out of
16), the algorithm quickly learns from one or two seizures (i.e., one-/few-shot
learning) and perfectly generalizes on 27 further seizures. For other patients,
the algorithm requires three to six seizures for learning. Overall, our
algorithm surpasses the state-of-the-art methods for detecting 65 novel
seizures with higher specificity and sensitivity, and lower memory footprint.
Comment: Published as a conference paper at the IEEE BioCAS 201
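The first step described in the abstract above, turning a time series into local binary pattern codes, can be sketched as follows. This is only an illustration under stated assumptions: each bit records whether the signal rose between consecutive samples, and a sliding window of `length` bits forms one code. The function name and the default code length are hypothetical and not taken from the paper.

```python
def lbp_codes(signal, length=6):
    """Return 1-D local binary pattern codes for a sequence of samples.

    Assumption: bit = 1 when the signal increases from one sample to the
    next, 0 otherwise; `length` consecutive bits are packed into one integer.
    """
    bits = [1 if nxt > cur else 0 for cur, nxt in zip(signal, signal[1:])]
    codes = []
    for start in range(len(bits) - length + 1):
        code = 0
        for bit in bits[start:start + length]:
            code = (code << 1) | bit  # most recent bit goes in the lowest position
        codes.append(code)
    return codes

example = lbp_codes([0, 1, 2, 1, 0], length=2)
```

Each code is a small integer, so it can index a table of atomic binary hypervectors, which keeps the later stages limited to binary operations.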
Laelaps: An Energy-Efficient Seizure Detection Algorithm from Long-term Human iEEG Recordings without False Alarms
We propose Laelaps, an energy-efficient and fast learning algorithm with no false alarms for epileptic seizure detection from long-term intracranial electroencephalography (iEEG) signals. Laelaps uses end-to-end binary operations by exploiting symbolic dynamics and brain-inspired hyperdimensional computing. Laelaps's results surpass those yielded by state-of-the-art (SoA) methods [1], [2], [3], including deep learning, on a new very large dataset containing 116 seizures of 18 drug-resistant epilepsy patients in 2656 hours of recordings, with each patient implanted with 24 to 128 iEEG electrodes. Laelaps trains 18 patient-specific models by using only 24 seizures: 12 models are trained with one seizure per patient, the others with two seizures. The trained models detect 79 out of 92 unseen seizures without any false alarms across all the patients as a big step forward in practical seizure detection. Importantly, a simple implementation of Laelaps on the Nvidia Tegra X2 embedded device achieves 1.7×-3.9× faster execution and 1.4×-2.9× lower energy consumption compared to the best result from the SoA methods. Our source code and anonymized iEEG dataset are freely available at http://ieeg-swez.ethz.ch
Hyperdimensional Computing-based Multimodality Emotion Recognition with Physiological Signals
To interact naturally and achieve mutual sympathy between humans and machines, emotion recognition is one of the most important functions for realizing advanced human-computer interaction devices. Due to the high correlation between emotion and involuntary physiological changes, physiological signals are a prime candidate for emotion analysis. However, because a high-quality machine learning model needs a huge amount of training data, computational complexity becomes a major bottleneck. To overcome this issue, brain-inspired hyperdimensional (HD) computing, an energy-efficient and fast learning computational paradigm, has a high potential to achieve a balance between accuracy and the amount of necessary training data. We propose an HD Computing-based Multimodality Emotion Recognition (HDC-MER). HDC-MER maps real-valued features to binary HD vectors using a random nonlinear function, further encodes them over time, and fuses them across different modalities including GSR, ECG, and EEG. The experimental results show that, compared to the best method using the full training data, HDC-MER achieves higher classification accuracy for both valence (83.2% vs. 80.1%) and arousal (70.1% vs. 68.4%) using only 1/4 of the training data. HDC-MER also achieves at least 5% higher averaged accuracy compared to all the other methods at any point along the learning curve.
4.4 A 1.3TOPS/W @ 32GOPS Fully Integrated 10-Core SoC for IoT End-Nodes with 1.7μW Cognitive Wake-Up from MRAM-Based State-Retentive Sleep Mode
This work was supported in part by the EU Horizon 2020 Research and Innovation projects OPRECOMP (Open trans-PREcision COMPuting, g.a. no. 732631) and WiPLASH (Wireless Plasticity for Heterogeneous Massive Computer Architectures, g.a. no. 863337) and by the ECSEL Horizon 2020 project AI4DI (Artificial Intelligence for Digital Industry, g.a. no. 826060).
The Internet-of-Things requires end-nodes with ultra-low-power always-on capability for long battery lifetime, as well as high performance, energy efficiency, and extreme flexibility to deal with complex and fast-evolving near-sensor analytics algorithms (NSAAs). We present Vega, an always-on IoT end-node SoC capable of scaling from a 1.7μW fully retentive COGNITIVE sleep mode up to 32.2GOPS (@49.4mW) peak performance on NSAAs, including mobile DNN inference, exploiting 1.6MB of state-retentive SRAM, and 4MB of non-volatile MRAM. To meet the performance and flexibility requirements of NSAAs, the SoC features 10 RISC-V cores: one core for SoC and IO management and a 9-core cluster supporting multi-precision SIMD integer and floating-point computation. Two programmable machine-learning (ML) accelerators boost energy efficiency in sleep and active state, respectively.
Rossi D.; Conti F.; Eggiman M.; Mach S.; Mauro A.D.; Guermandi M.; Tagliavini G.; Pullini A.; Loi I.; Chen J.; Flamand E.; Benini L.
QubitHD: A Stochastic Acceleration Method for HD Computing-Based Machine Learning
Machine Learning algorithms based on Brain-inspired Hyperdimensional (HD)
computing imitate cognition by exploiting statistical properties of
high-dimensional vector spaces. It is a promising solution for achieving high
energy-efficiency in different machine learning tasks, such as classification,
semi-supervised learning and clustering. A weakness of existing HD
computing-based ML algorithms is the fact that they have to be binarized for
achieving very high energy-efficiency. At the same time, binarized models reach
lower classification accuracies. To solve the problem of the trade-off between
energy-efficiency and classification accuracy, we propose the QubitHD
algorithm. It stochastically binarizes HD-based algorithms, while maintaining
comparable classification accuracies to their non-binarized counterparts. The
FPGA implementation of QubitHD provides a 65% improvement in terms of
energy-efficiency, and a 95% improvement in terms of the training time, as
compared to state-of-the-art HD-based ML algorithms. It also outperforms
state-of-the-art low-cost classifiers (like Binarized Neural Networks) in terms
of speed and energy-efficiency by an order of magnitude during training and
inference.
Comment: 8 pages, 7 figures, 3 tables
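To make the idea of stochastic binarization concrete, here is a minimal generic sketch. It is not the QubitHD rule itself, which the abstract does not spell out. It assumes each real-valued component x, clipped to [-1, 1], is mapped to +1 with probability (x + 1) / 2 and to -1 otherwise, so the binarized value equals x in expectation. The function name is hypothetical.

```python
import random

def stochastic_binarize(vec, rng):
    """Map each real component to +1 or -1 at random (assumed generic rule).

    A component x, clipped to [-1, 1], becomes +1 with probability
    (x + 1) / 2, so the expected output equals the clipped input.
    """
    out = []
    for x in vec:
        x = max(-1.0, min(1.0, x))
        p_plus = (x + 1.0) / 2.0
        out.append(1 if rng.random() < p_plus else -1)
    return out

rng = random.Random(0)
samples = stochastic_binarize([0.5] * 20_000, rng)
mean = sum(samples) / len(samples)  # close to 0.5 on average
```

A deterministic sign function would map every 0.5 to +1 and discard the magnitude. The randomized version keeps that information in the statistics of the bits, which is one way a binary model can stay closer to its non-binarized counterpart.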
Cellular Automata Can Reduce Memory Requirements of Collective-State Computing
Various non-classical approaches of distributed information processing, such
as neural networks, computation with Ising models, reservoir computing, vector
symbolic architectures, and others, employ the principle of collective-state
computing. In this type of computing, the variables relevant in a computation
are superimposed into a single high-dimensional state vector, the
collective-state. The variable encoding uses a fixed set of random patterns,
which has to be stored and kept available during the computation. Here we show
that an elementary cellular automaton with rule 90 (CA90) enables space-time
tradeoff for collective-state computing models that use random dense binary
representations, i.e., memory requirements can be traded off with computation
running CA90. We investigate the randomization behavior of CA90, in particular,
the relation between the length of the randomization period and the size of the
grid, and how CA90 preserves similarity in the presence of the initialization
noise. Based on these analyses we discuss how to optimize a collective-state
computing model, in which CA90 expands representations on the fly from short
seed patterns - rather than storing the full set of random patterns. The CA90
expansion is applied and tested in concrete scenarios using reservoir computing
and vector symbolic architectures. Our experimental results show that
collective-state computing with CA90 expansion performs similarly compared to
traditional collective-state models, in which random patterns are generated
initially by a pseudo-random number generator and then stored in a large
memory.
Comment: 13 pages, 11 figures
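The on-the-fly expansion described in the abstract above can be sketched as follows. Rule 90 sets each cell to the XOR of its two neighbors. The periodic (wrap-around) boundary and the choice to concatenate successive grid states into one long pattern are assumptions made for illustration, and the function names are hypothetical.

```python
def ca90_step(state):
    """Apply one step of elementary cellular automaton rule 90.

    Each new cell is the XOR of its left and right neighbors; the grid is
    treated as a ring (periodic boundary, an assumption of this sketch).
    """
    n = len(state)
    return [state[(i - 1) % n] ^ state[(i + 1) % n] for i in range(n)]

def ca90_expand(seed, steps):
    """Expand a short binary seed by concatenating successive CA90 states."""
    state = list(seed)
    expanded = list(state)
    for _ in range(steps):
        state = ca90_step(state)
        expanded.extend(state)
    return expanded

seed = [0, 0, 1, 0, 0]
pattern = ca90_expand(seed, steps=3)  # 4 states of 5 cells each
```

Only the seed needs to be stored. The longer pattern is recomputed when needed, which is the space-time tradeoff the abstract refers to.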