1,041 research outputs found
Deep Spiking Neural Network model for time-variant signals classification: a real-time speech recognition approach
Speech recognition has become an important task
to improve the human-machine interface. Taking into account
the limitations of current automatic speech recognition systems,
like non-real time cloud-based solutions or power demand,
recent interest for neural networks and bio-inspired systems has
motivated the implementation of new techniques.
Among them, a combination of spiking neural networks and
neuromorphic auditory sensors offer an alternative to carry
out the human-like speech processing task. In this approach,
a spiking convolutional neural network model was implemented,
in which the weights of connections were calculated by training
a convolutional neural network with specific activation functions,
using firing rate-based static images with the spiking information
obtained from a neuromorphic cochlea.
The system was trained and tested with a large dataset
that contains ”left” and ”right” speech commands, achieving
89.90% accuracy. A novel spiking neural network model has been
proposed to adapt the network that has been trained with static
images to a non-static processing approach, making it possible
to classify audio signals and time series in real time.Ministerio de Economía y Competitividad TEC2016-77785-
Musical notes classification with Neuromorphic Auditory System using FPGA and a Convolutional Spiking Network
In this paper, we explore the capabilities of a sound
classification system that combines both a novel FPGA cochlear
model implementation and a bio-inspired technique based on a
trained convolutional spiking network. The neuromorphic
auditory system that is used in this work produces a form of
representation that is analogous to the spike outputs of the
biological cochlea. The auditory system has been developed using
a set of spike-based processing building blocks in the frequency
domain. They form a set of band pass filters in the spike-domain
that splits the audio information in 128 frequency channels, 64
for each of two audio sources. Address Event Representation
(AER) is used to communicate the auditory system with the
convolutional spiking network. A layer of convolutional spiking
network is developed and trained on a computer with the ability
to detect two kinds of sound: artificial pure tones in the presence
of white noise and electronic musical notes. After the training
process, the presented system is able to distinguish the different
sounds in real-time, even in the presence of white noise.Ministerio de Economía y Competitividad TEC2012-37868-C04-0
Interfacing PDM sensors with PFM spiking systems: application for Neuromorphic Auditory Sensors
In this paper we present a sub-system to convert
audio information from low-power MEMS microphones with
pulse density modulation (PDM) output into rate coded spike
streams. These spikes represent the input signal of a Neuromorphic
Auditory Sensor (NAS), which is implemented with Spike
Signal Processing (SSP) building blocks. For this conversion, we
have designed a HDL component for FPGA able to interface
with PDM microphones and converts their pulses to temporal
distributed spikes following a pulse frequency modulation (PFM)
scheme with an accurate configurable Inter-Spike-Interval. The
new FPGA component has been tested in two scenarios, first as a
stand-alone circuit for its characterization, and then it has been
integrated with a full NAS design to verify its behavior. This
PDM interface demands less than 1% of a Spartan 6 FPGA
resources and has a power consumption below 5mW.Ministerio de Economía y Competitividad TEC2016-77785-
A spiking neural network for real-time Spanish vowel phonemes recognition
This paper explores neuromorphic approach capabilities applied to real-time speech processing. A spiking
recognition neural network composed of three types of neurons is proposed. These neurons are based on an
integrative and fire model and are capable of recognizing auditory frequency patterns, such as vowel phonemes;
words are recognized as sequences of vowel phonemes. For demonstrating real-time operation, a complete
spiking recognition neural network has been described in VHDL for detecting certain Spanish words, and it has
been tested in a FPGA platform. This is a stand-alone and fully hardware system that allows to embed it in a
mobile system. To stimulate the network, a spiking digital-filter-based cochlea has been implemented in VHDL.
In the implementation, an Address Event Representation (AER) is used for transmitting information between
neurons.Ministerio de Economía y Competitividad TEC2012-37868-C04-02/0
A Binaural Neuromorphic Auditory Sensor for FPGA: A Spike Signal Processing Approach
This paper presents a new architecture, design
flow, and field-programmable gate array (FPGA) implementation
analysis of a neuromorphic binaural auditory sensor, designed
completely in the spike domain. Unlike digital cochleae that
decompose audio signals using classical digital signal processing
techniques, the model presented in this paper processes information
directly encoded as spikes using pulse frequency modulation
and provides a set of frequency-decomposed audio information
using an address-event representation interface. In this case,
a systematic approach to design led to a generic process for
building, tuning, and implementing audio frequency decomposers
with different features, facilitating synthesis with custom features.
This allows researchers to implement their own parameterized
neuromorphic auditory systems in a low-cost FPGA in order to
study the audio processing and learning activity that takes place
in the brain. In this paper, we present a 64-channel binaural
neuromorphic auditory system implemented in a Virtex-5 FPGA
using a commercial development board. The system was excited
with a diverse set of audio signals in order to analyze its response
and characterize its features. The neuromorphic auditory system
response times and frequencies are reported. The experimental
results of the proposed system implementation with 64-channel
stereo are: a frequency range between 9.6 Hz and 14.6 kHz
(adjustable), a maximum output event rate of 2.19 Mevents/s,
a power consumption of 29.7 mW, the slices requirements
of 11 141, and a system clock frequency of 27 MHz.Ministerio de Economía y Competitividad TEC2012-37868-C04-02Junta de Andalucía P12-TIC-130
Neuromorphic audio processing through real-time embedded spiking neural networks.
In this work novel speech recognition and audio processing systems based on a spiking artificial cochlea and neural networks are proposed and implemented. First, the biological behavior of the animal’s auditory system is analyzed and studied, along with the classical mechanisms of audio signal processing for sound classification, including Deep Learning techniques. Based on these studies, novel audio processing and automatic audio signal recognition systems are proposed, using a bio-inspired auditory sensor as input. A desktop software tool called NAVIS (Neuromorphic Auditory VIsualizer) for post-processing the information obtained from spiking cochleae was implemented, allowing to analyze these data for further research.
Next, using a 4-chip SpiNNaker hardware platform and Spiking Neural Networks, a system is proposed for classifying different time-independent audio signals, making use of a Neuromorphic Auditory Sensor and frequency studies obtained with NAVIS. To prove the robustness and analyze the limitations of the system, the input audios were disturbed, simulating extreme noisy environments.
Deep Learning mechanisms, particularly Convolutional Neural Networks, are trained and used to differentiate between healthy persons and pathological patients by detecting murmurs from heart recordings after integrating the spike information from the signals using a neuromorphic auditory sensor.
Finally, a similar approach is used to train Spiking Convolutional Neural Networks for speech recognition tasks. A novel SCNN architecture for timedependent signals classification is proposed, using a buffered layer that adapts the information from a real-time input domain to a static domain. The system was deployed on a 48-chip SpiNNaker platform.
Finally, the performance and efficiency of these systems were evaluated, obtaining conclusions and proposing improvements for future works.Premio Extraordinario de Doctorado U
Feasibility study of a hand guided robotic drill for cochleostomy
The concept of a hand guided robotic drill has been inspired by an automated, arm supported robotic drill recently applied in clinical practice to produce cochleostomies without penetrating the endosteum ready for inserting cochlear electrodes. The smart tactile sensing scheme within the drill enables precise control of the state of interaction between tissues and tools in real-time. This paper reports development studies of the hand guided robotic drill where the same consistent outcomes, augmentation of surgeon control and skill, and similar reduction of induced disturbances on the hearing organ are achieved. The device operates with differing presentation of tissues resulting from variation in anatomy and demonstrates the ability to control or avoid penetration of tissue layers as required and to respond to intended rather than involuntary motion of the surgeon operator. The advantage of hand guided over an arm supported system is that it offers flexibility in adjusting the drilling trajectory. This can be important to initiate cutting on a hard convex tissue surface without slipping and then to proceed on the desired trajectory after cutting has commenced. The results for trials on phantoms show that drill unit compliance is an important factor in the design
Speech Development by Imitation
The Double Cone Model (DCM) is a model
of how the brain transforms sensory input to
motor commands through successive stages of
data compression and expansion. We have
tested a subset of the DCM on speech recognition, production and imitation. The experiments show that the DCM is a good candidate
for an artificial speech processing system that
can develop autonomously. We show that the
DCM can learn a repertoire of speech sounds
by listening to speech input. It is also able to
link the individual elements of speech to sequences that can be recognized or reproduced,
thus allowing the system to imitate spoken
language
- …