
    Deep Spiking Neural Network model for time-variant signals classification: a real-time speech recognition approach

    Speech recognition has become an important task for improving the human-machine interface. Given the limitations of current automatic speech recognition systems, such as non-real-time cloud-based solutions and high power demand, recent interest in neural networks and bio-inspired systems has motivated the implementation of new techniques. Among them, the combination of spiking neural networks and neuromorphic auditory sensors offers an alternative way to carry out human-like speech processing. In this approach, a spiking convolutional neural network model was implemented, in which the connection weights were calculated by training a convolutional neural network with specific activation functions on firing-rate-based static images built from the spiking information obtained from a neuromorphic cochlea. The system was trained and tested with a large dataset containing "left" and "right" speech commands, achieving 89.90% accuracy. A novel spiking neural network model is proposed to adapt the network trained on static images to a non-static processing approach, making it possible to classify audio signals and time series in real time.
    Ministerio de Economía y Competitividad TEC2016-77785-
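
    The rate-coding step described above can be illustrated with a short sketch: assuming the cochlea output arrives as (timestamp, channel) address events, the events are binned into a two-dimensional firing-rate image that a conventional CNN can be trained on. The function and parameter names below are illustrative, not taken from the paper.

        import numpy as np

        def spikes_to_rate_image(events, n_channels=64, n_bins=32, duration_s=1.0):
            """Accumulate (timestamp, channel) address events into a 2-D
            firing-rate image suitable for training a conventional CNN.
            Rows are cochlea frequency channels, columns are time bins."""
            image = np.zeros((n_channels, n_bins))
            bin_width = duration_s / n_bins
            for t, ch in events:
                b = min(int(t / bin_width), n_bins - 1)
                image[ch, b] += 1
            # Normalize counts to firing rates (spikes per second).
            return image / bin_width

        # Example: a synthetic 50-spike burst on channel 10 around t = 0.5 s.
        events = [(0.5 + 0.001 * i, 10) for i in range(50)]
        img = spikes_to_rate_image(events)
        print(img.shape, img.max())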

    Musical notes classification with Neuromorphic Auditory System using FPGA and a Convolutional Spiking Network

    In this paper, we explore the capabilities of a sound classification system that combines a novel FPGA cochlear model implementation with a bio-inspired technique based on a trained convolutional spiking network. The neuromorphic auditory system used in this work produces a representation analogous to the spike outputs of the biological cochlea. The auditory system has been developed using a set of spike-based processing building blocks in the frequency domain. They form a set of band-pass filters in the spike domain that splits the audio information into 128 frequency channels, 64 for each of two audio sources. Address Event Representation (AER) is used to communicate between the auditory system and the convolutional spiking network. A convolutional spiking network layer is developed and trained on a computer to detect two kinds of sound: artificial pure tones in the presence of white noise, and electronic musical notes. After the training process, the presented system is able to distinguish the different sounds in real time, even in the presence of white noise.
    Ministerio de Economía y Competitividad TEC2012-37868-C04-0
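
    As a rough software analogue of the filter bank described above (the actual system operates directly on spikes in FPGA hardware using SSP building blocks), the following sketch splits an audio signal into log-spaced band-pass channels; all names and parameter values are assumptions for illustration.

        import numpy as np
        from scipy.signal import butter, sosfilt

        def filter_bank(audio, fs, n_channels=64, f_lo=20.0, f_hi=14000.0):
            """Software approximation of a cochlear band-pass filter bank:
            split the signal into log-spaced frequency channels. The FPGA
            system performs the equivalent operation in the spike domain."""
            edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_channels + 1)
            bands = []
            for lo, hi in zip(edges[:-1], edges[1:]):
                sos = butter(2, [lo, hi], btype="band", fs=fs, output="sos")
                bands.append(sosfilt(sos, audio))
            return np.stack(bands)  # shape: (n_channels, n_samples)

        # One second of a 440 Hz tone split into 64 channels.
        fs = 44100
        t = np.arange(fs) / fs
        channels = filter_bank(np.sin(2 * np.pi * 440 * t), fs)
        print(channels.shape)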

    Interfacing PDM sensors with PFM spiking systems: application for Neuromorphic Auditory Sensors

    In this paper we present a sub-system that converts audio information from low-power MEMS microphones with pulse density modulation (PDM) output into rate-coded spike streams. These spikes form the input signal of a Neuromorphic Auditory Sensor (NAS), which is implemented with Spike Signal Processing (SSP) building blocks. For this conversion, we have designed an HDL component for FPGA able to interface with PDM microphones and convert their pulses into temporally distributed spikes following a pulse frequency modulation (PFM) scheme with an accurate, configurable inter-spike interval. The new FPGA component has been tested in two scenarios: first as a stand-alone circuit for its characterization, and then integrated with a full NAS design to verify its behavior. This PDM interface demands less than 1% of the resources of a Spartan-6 FPGA and has a power consumption below 5 mW.
    Ministerio de Economía y Competitividad TEC2016-77785-
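
    The PDM-to-PFM conversion can be modeled in software as a sigma-delta-style integrator: the 1-bit pulse density charges an accumulator, and a spike is emitted each time a threshold is crossed, so the output spike rate tracks the signal amplitude. This is a minimal behavioral sketch with assumed gain and threshold values, not the HDL component itself.

        def pdm_to_spikes(pdm_bits, gain=4, threshold=16):
            """Behavioral model of a PDM-to-PFM converter: integrate the
            1-bit pulse density and emit a spike each time the accumulator
            crosses a threshold, so spike rate follows signal amplitude."""
            acc = 0
            spikes = []
            for i, bit in enumerate(pdm_bits):
                acc += gain if bit else -1  # density above baseline charges the integrator
                if acc >= threshold:
                    spikes.append(i)  # spike time in PDM clock ticks
                    acc = 0
            return spikes

        # A denser PDM segment (louder signal) yields more output spikes.
        quiet = [1, 0, 0, 0] * 64
        loud = [1, 1, 1, 0] * 64
        print(len(pdm_to_spikes(quiet)), len(pdm_to_spikes(loud)))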

    A spiking neural network for real-time Spanish vowel phonemes recognition

    This paper explores the capabilities of a neuromorphic approach applied to real-time speech processing. A spiking recognition neural network composed of three types of neurons is proposed. These neurons are based on an integrate-and-fire model and are capable of recognizing auditory frequency patterns, such as vowel phonemes; words are recognized as sequences of vowel phonemes. To demonstrate real-time operation, a complete spiking recognition neural network has been described in VHDL for detecting certain Spanish words, and it has been tested on an FPGA platform. This is a stand-alone, fully hardware system, which allows it to be embedded in a mobile system. To stimulate the network, a spiking digital-filter-based cochlea has been implemented in VHDL. In the implementation, Address Event Representation (AER) is used for transmitting information between neurons.
    Ministerio de Economía y Competitividad TEC2012-37868-C04-02/0
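
    The integrate-and-fire model named above can be sketched in a few lines: the membrane potential accumulates weighted input spikes, leaks over time, and fires and resets at a threshold. This is a generic software illustration with assumed parameter values, not the authors' VHDL neuron.

        def integrate_and_fire(input_spikes, weight=0.3, threshold=1.0, leak=0.02):
            """Minimal leaky integrate-and-fire neuron on a binary input
            spike train: the membrane potential accumulates weighted input,
            decays each step, and fires (then resets) at the threshold."""
            v = 0.0
            output = []
            for s in input_spikes:
                v = max(v - leak, 0.0) + weight * s
                if v >= threshold:
                    output.append(1)
                    v = 0.0  # reset after firing
                else:
                    output.append(0)
            return output

        # Sustained input on one frequency channel drives the neuron to fire.
        train = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1] * 3
        print(sum(integrate_and_fire(train)), "spikes out of", len(train), "steps")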

    A Binaural Neuromorphic Auditory Sensor for FPGA: A Spike Signal Processing Approach

    This paper presents a new architecture, design flow, and field-programmable gate array (FPGA) implementation analysis of a neuromorphic binaural auditory sensor, designed completely in the spike domain. Unlike digital cochleae that decompose audio signals using classical digital signal processing techniques, the model presented in this paper processes information directly encoded as spikes using pulse frequency modulation, and provides a set of frequency-decomposed audio information through an address-event representation interface. In this case, a systematic design approach led to a generic process for building, tuning, and implementing audio frequency decomposers with different features, facilitating synthesis with custom features. This allows researchers to implement their own parameterized neuromorphic auditory systems on a low-cost FPGA in order to study the audio processing and learning activity that takes place in the brain. In this paper, we present a 64-channel binaural neuromorphic auditory system implemented on a Virtex-5 FPGA using a commercial development board. The system was excited with a diverse set of audio signals in order to analyze its response and characterize its features. The response times and frequencies of the neuromorphic auditory system are reported. The experimental results of the proposed 64-channel stereo implementation are: a frequency range between 9.6 Hz and 14.6 kHz (adjustable), a maximum output event rate of 2.19 Mevents/s, a power consumption of 29.7 mW, a requirement of 11,141 slices, and a system clock frequency of 27 MHz.
    Ministerio de Economía y Competitividad TEC2012-37868-C04-02
    Junta de Andalucía P12-TIC-130
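
    An address-event representation interface such as the one above typically packs the spike source into a compact address word. The sketch below assumes a hypothetical bit layout (ear select, channel index, polarity); the actual event format of the presented system is not specified in the abstract.

        class AEREvent:
            """Hypothetical address-event layout for a 64-channel binaural
            sensor: bit 7 selects the ear, bits 6..1 the frequency channel,
            bit 0 the spike polarity. The real bit layout may differ."""
            def __init__(self, address, timestamp_us):
                self.ear = (address >> 7) & 0x1       # 0 = left, 1 = right
                self.channel = (address >> 1) & 0x3F  # 64 channels per ear
                self.polarity = address & 0x1
                self.timestamp_us = timestamp_us

        # Decode one event: right ear, channel 22, positive polarity.
        ev = AEREvent(address=0b1_010110_1, timestamp_us=1523)
        print(ev.ear, ev.channel, ev.polarity)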

    Neuromorphic audio processing through real-time embedded spiking neural networks.

    In this work, novel speech recognition and audio processing systems based on a spiking artificial cochlea and neural networks are proposed and implemented. First, the biological behavior of the animal auditory system is analyzed and studied, along with classical audio signal processing mechanisms for sound classification, including Deep Learning techniques. Based on these studies, novel audio processing and automatic audio signal recognition systems are proposed, using a bio-inspired auditory sensor as input. A desktop software tool called NAVIS (Neuromorphic Auditory VIsualizer) for post-processing the information obtained from spiking cochleae was implemented, allowing these data to be analyzed for further research. Next, using a 4-chip SpiNNaker hardware platform and Spiking Neural Networks, a system is proposed for classifying different time-independent audio signals, making use of a Neuromorphic Auditory Sensor and frequency studies obtained with NAVIS. To prove the robustness and analyze the limitations of the system, the input audios were disturbed, simulating extremely noisy environments. Deep Learning mechanisms, particularly Convolutional Neural Networks, are trained and used to differentiate between healthy persons and pathological patients by detecting murmurs in heart recordings, after integrating the spike information from the signals using a neuromorphic auditory sensor. Then, a similar approach is used to train Spiking Convolutional Neural Networks for speech recognition tasks. A novel SCNN architecture for time-dependent signal classification is proposed, using a buffered layer that adapts information from a real-time input domain to a static domain. The system was deployed on a 48-chip SpiNNaker platform. Finally, the performance and efficiency of these systems were evaluated, leading to conclusions and proposed improvements for future work.
    Premio Extraordinario de Doctorado U
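
    The buffered layer mentioned above can be pictured as a sliding window of per-channel spike counts: at any instant the window contents form a static frame that a network trained on static images can consume. The class and method names below are hypothetical, intended only to illustrate the idea.

        from collections import deque

        class BufferedLayer:
            """Sketch of the buffering idea: keep the last `window` time bins
            of per-channel spike counts so that, at any instant, a static
            frame can be handed to a network trained on static images."""
            def __init__(self, n_channels=64, window=32):
                self.n_channels = n_channels
                self.window = window
                self.frames = deque(maxlen=window)

            def push_bin(self, counts):
                # One bin of spike counts, one entry per frequency channel.
                assert len(counts) == self.n_channels
                self.frames.append(list(counts))

            def current_frame(self):
                # Pad with empty bins until the buffer has filled once.
                pad = [[0] * self.n_channels] * (self.window - len(self.frames))
                return pad + list(self.frames)  # shape: window x n_channels

        layer = BufferedLayer(n_channels=4, window=3)
        layer.push_bin([1, 0, 2, 0])
        layer.push_bin([0, 3, 1, 1])
        frame = layer.current_frame()
        print(len(frame), len(frame[0]))  # 3 x 4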

    Feasibility study of a hand guided robotic drill for cochleostomy

    The concept of a hand-guided robotic drill was inspired by an automated, arm-supported robotic drill recently applied in clinical practice to produce cochleostomies without penetrating the endosteum, ready for inserting cochlear electrodes. The smart tactile sensing scheme within the drill enables precise control of the state of interaction between tissues and tools in real time. This paper reports development studies of the hand-guided robotic drill, in which the same consistent outcomes, augmentation of surgeon control and skill, and a similar reduction of induced disturbances on the hearing organ are achieved. The device operates with differing presentations of tissues resulting from variation in anatomy, and demonstrates the ability to control or avoid penetration of tissue layers as required and to respond to intended rather than involuntary motion of the surgeon operator. The advantage of a hand-guided over an arm-supported system is that it offers flexibility in adjusting the drilling trajectory. This can be important to initiate cutting on a hard convex tissue surface without slipping and then to proceed on the desired trajectory after cutting has commenced. The results of trials on phantoms show that drill unit compliance is an important factor in the design.

    Speech Development by Imitation

    The Double Cone Model (DCM) is a model of how the brain transforms sensory input into motor commands through successive stages of data compression and expansion. We have tested a subset of the DCM on speech recognition, production, and imitation. The experiments show that the DCM is a good candidate for an artificial speech processing system that can develop autonomously. We show that the DCM can learn a repertoire of speech sounds by listening to speech input. It is also able to link the individual elements of speech into sequences that can be recognized or reproduced, thus allowing the system to imitate spoken language.

    Analogue CMOS Cochlea Systems: A Historic Retrospective
