Bio-Inspired Multi-Layer Spiking Neural Network Extracts Discriminative Features from Speech Signals
Spiking neural networks (SNNs) enable power-efficient implementations due to their sparse, spike-based coding scheme. This paper develops a bio-inspired SNN that uses unsupervised learning to extract discriminative features from speech signals, which can subsequently be used in a classifier. The architecture consists of a spiking convolutional/pooling layer followed by a fully connected spiking layer for feature discovery. The convolutional layer of leaky integrate-and-fire (LIF) neurons represents primary acoustic features. The fully connected layer is equipped with a probabilistic spike-timing-dependent plasticity learning rule and represents the discriminative features through probabilistic LIF neurons. To assess the discriminative power of the learned features, they are used in a hidden Markov model (HMM) for spoken digit recognition. The experimental results show accuracy above 96%, which compares favorably with popular statistical feature extraction methods. Our results provide a novel demonstration of unsupervised feature acquisition in an SNN.
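The LIF neuron at the heart of the convolutional layer integrates its input with a leak and emits a spike on crossing a threshold. A minimal sketch of this dynamic is below; the parameter values (tau, threshold, reset-to-zero) are illustrative assumptions, not taken from the paper.

```python
def lif_simulate(input_current, tau=20.0, threshold=1.0, dt=1.0):
    """Simulate one leaky integrate-and-fire neuron.

    input_current: sequence of input values, one per time step.
    Returns the list of time-step indices at which spikes occur.
    """
    v = 0.0
    spikes = []
    for t, i_t in enumerate(input_current):
        # Leaky integration: the membrane potential decays toward 0
        # with time constant tau while accumulating the input.
        v += dt * (-v / tau + i_t)
        if v >= threshold:       # threshold crossing emits a spike
            spikes.append(t)
            v = 0.0              # reset after the spike
    return spikes
```

Under a constant drive the neuron fires periodically; with no input it stays silent, which is the sparse-coding property the abstract highlights.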
An Efficient Threshold-Driven Aggregate-Label Learning Algorithm for Multimodal Information Processing
The aggregate-label learning paradigm tackles the long-standing temporal credit assignment (TCA) problem in neuroscience and machine learning, enabling spiking neural networks to learn multimodal sensory clues from delayed feedback signals. However, existing aggregate-label learning algorithms only work for single spiking neurons and suffer from low learning efficiency, which limits their real-world applicability. To address these limitations, we first propose an efficient threshold-driven plasticity algorithm for spiking neurons, named ETDP. It enables spiking neurons to generate the desired number of spikes matching the magnitude of delayed feedback signals and to learn useful multimodal sensory clues embedded within spontaneous spiking activities. Furthermore, we extend the ETDP algorithm to support multi-layer spiking neural networks (SNNs), which significantly improves the applicability of aggregate-label learning algorithms. We also validate the multi-layer ETDP learning algorithm in a multimodal computation framework for audio-visual pattern recognition. Experimental results on both synthetic and realistic datasets show significant improvements in learning efficiency and model capacity over existing aggregate-label learning algorithms. The approach therefore provides many opportunities for solving real-world multimodal pattern recognition tasks with spiking neural networks.
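The core idea, adjusting a neuron until its output spike count matches the magnitude of a delayed feedback signal, can be sketched as follows. The neuron model and the simple error-driven weight update here are hedged stand-ins for illustration, not the actual ETDP rule from the paper.

```python
def spike_count(weight, inputs, threshold=1.0):
    """Count output spikes of a toy leaky neuron driven by weighted input."""
    v, count = 0.0, 0
    for x in inputs:
        v = 0.9 * v + weight * x  # leaky integration of the weighted input
        if v >= threshold:
            count += 1
            v = 0.0               # reset after each spike
    return count

def train_to_target(inputs, target_count, weight=0.1, lr=0.01, steps=200):
    """Nudge the weight until the spike count matches the aggregate label."""
    for _ in range(steps):
        err = target_count - spike_count(weight, inputs)
        if err == 0:
            break
        # Raise the weight if the neuron fires too few spikes, lower it
        # if it fires too many.
        weight += lr * err
    return weight
```

The count-matching objective is what lets a delayed scalar label ("how many relevant clues occurred") supervise learning without per-spike timing targets.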
Using K-fold cross validation proposed models for SpikeProp learning enhancements
Spiking Neural Networks (SNNs) use individual spikes in the time domain both to compute and to communicate, much as biological neurons do. SNNs received little early study because they were considered too complicated and too hard to analyze. Several limitations concerning the characteristics of SNNs have been resolved since the introduction of SpikeProp, a supervised SNN learning model, by Sander Bohte in 2000. This paper describes enhancements to SpikeProp learning evaluated with K-fold cross validation for dataset classification. It introduces two acceleration methods for SpikeProp: Radius Initial Weight and Differential Evolution (DE) weight initialization. Training and testing of the proposed methods, as improvements on Bohte's algorithm, were carried out with K-fold cross validation on datasets obtained from a machine learning benchmark repository. Performance was compared among the proposed methods, Backpropagation (BP), and standard SpikeProp. The findings reveal that the proposed methods outperform both standard SpikeProp and BP on all datasets evaluated with K-fold cross validation.
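The K-fold protocol used for evaluation partitions the data into k folds, holding each fold out once for testing while training on the rest. A plain-Python sketch of the index split (no particular SNN library assumed; the classifier itself is omitted):

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    The first n_samples % k folds get one extra sample so every
    sample appears in exactly one test fold.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples)
                     if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size
```

Averaging the per-fold test accuracy gives the cross-validated performance estimate reported for each method.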
PC-SNN: Supervised Learning with Local Hebbian Synaptic Plasticity based on Predictive Coding in Spiking Neural Networks
Deemed the third generation of neural networks, event-driven Spiking Neural Networks (SNNs) combined with bio-plausible local learning rules are promising for building low-power neuromorphic hardware. However, because of the non-linearity and discrete nature of spiking neural networks, training SNNs remains difficult and is still under discussion. Originating from gradient descent, backprop has achieved stunning success in multi-layer SNNs. Nevertheless, it is assumed to lack biological plausibility while consuming relatively high computational resources. In this paper, we propose a novel learning algorithm inspired by predictive coding theory and show that it can perform supervised learning fully autonomously and as successfully as backprop, utilizing only local Hebbian plasticity. Furthermore, this method achieves favorable performance compared to state-of-the-art multi-layer SNNs: test accuracy of 99.25% on the Caltech Face/Motorbike dataset, 84.25% on the ETH-80 dataset, 98.1% on the MNIST dataset, and 98.5% on the neuromorphic dataset N-MNIST. Our work also provides a new perspective on how supervised learning algorithms could be implemented directly in spiking neural circuitry, which may give new insights into neuromorphic computation in neuroscience.
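For intuition, in the single-layer linear case the predictive-coding weight update reduces to a purely local, Hebbian-style rule: an error unit computes e = y - Wx, and the weight change is the product of that error and the presynaptic activity. The rate-based scalar toy below illustrates this locality; it is a simplified stand-in, not the spiking PC-SNN model itself.

```python
def pc_train(pairs, w=0.0, lr=0.1, epochs=50):
    """Learn a scalar weight w so that w * x predicts y for each (x, y) pair.

    Both quantities in the update are locally available at the synapse:
    the prediction error e and the presynaptic input x. No global
    backward pass is needed.
    """
    for _ in range(epochs):
        for x, y in pairs:
            e = y - w * x      # local prediction error
            w += lr * e * x    # Hebbian update: error times presynaptic input
    return w
```

In the multi-layer case, predictive coding keeps this locality by letting each layer relax its own activity estimate against prediction errors from adjacent layers, which is what replaces the non-local backprop pass.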
Biologically inspired speaker verification
Speaker verification is an active research problem that has been addressed using a variety of classification techniques. In general, however, methods inspired by the human auditory system tend to show better verification performance than other methods. In this thesis, three biologically inspired speaker verification algorithms are presented.
Bio-inspired multisensory integration of social signals
Understanding emotions is a core aspect of human communication. Our social behaviours are closely linked to expressing our emotions and understanding others' emotional and mental states through social signals. Emotions are expressed in a multisensory manner: humans use social signals from different sensory modalities such as facial expressions, vocal changes, or body language. The human brain integrates all relevant information to create a new multisensory percept and derives emotional meaning.
There is great interest in emotion recognition in fields such as HCI, gaming, marketing, and assistive technologies. This demand is driving an increase in research on multisensory emotion recognition. The majority of existing work proceeds by extracting meaningful features from each modality and applying fusion techniques at either the feature level or the decision level. However, these techniques fail to capture the constant talk and feedback between different modalities. Such constant talk is particularly crucial in continuous emotion recognition, where one modality can predict, enhance, and complete the other.
This thesis proposes novel architectures for multisensory emotion recognition inspired by multisensory integration in the brain. First, we explore the use of bio-inspired unsupervised learning for unisensory emotion recognition in the audio and visual modalities. We then propose three multisensory integration models, based on different pathways for multisensory integration in the brain: integration by convergence, early cross-modal enhancement, and integration through neural synchrony. The proposed models are designed and implemented using third-generation neural networks, Spiking Neural Networks (SNNs), with unsupervised learning. The models are evaluated on widely adopted third-party datasets and compared to state-of-the-art multimodal fusion techniques such as early, late, and deep learning fusion. Evaluation results show that the three proposed models achieve results comparable to state-of-the-art supervised learning techniques. More importantly, this thesis presents models that maintain a constant talk between modalities during the training phase: each modality can predict, complement, and enhance the other through constant feedback. This cross-talk between modalities adds insight into emotions compared to traditional fusion techniques.