1,126 research outputs found

    A spiking neural network for real-time Spanish vowel phonemes recognition

    Get PDF
    This paper explores neuromorphic approach capabilities applied to real-time speech processing. A spiking recognition neural network composed of three types of neurons is proposed. These neurons are based on an integrative and fire model and are capable of recognizing auditory frequency patterns, such as vowel phonemes; words are recognized as sequences of vowel phonemes. For demonstrating real-time operation, a complete spiking recognition neural network has been described in VHDL for detecting certain Spanish words, and it has been tested in a FPGA platform. This is a stand-alone and fully hardware system that allows to embed it in a mobile system. To stimulate the network, a spiking digital-filter-based cochlea has been implemented in VHDL. In the implementation, an Address Event Representation (AER) is used for transmitting information between neurons.Ministerio de Economía y Competitividad TEC2012-37868-C04-02/0

    Deep Spiking Neural Network model for time-variant signals classification: a real-time speech recognition approach

    Get PDF
    Speech recognition has become an important task to improve the human-machine interface. Taking into account the limitations of current automatic speech recognition systems, like non-real time cloud-based solutions or power demand, recent interest for neural networks and bio-inspired systems has motivated the implementation of new techniques. Among them, a combination of spiking neural networks and neuromorphic auditory sensors offer an alternative to carry out the human-like speech processing task. In this approach, a spiking convolutional neural network model was implemented, in which the weights of connections were calculated by training a convolutional neural network with specific activation functions, using firing rate-based static images with the spiking information obtained from a neuromorphic cochlea. The system was trained and tested with a large dataset that contains ”left” and ”right” speech commands, achieving 89.90% accuracy. A novel spiking neural network model has been proposed to adapt the network that has been trained with static images to a non-static processing approach, making it possible to classify audio signals and time series in real time.Ministerio de Economía y Competitividad TEC2016-77785-

    An efficient implementation of lattice-ladder multilayer perceptrons in field programmable gate arrays

    Get PDF
    The implementation efficiency of electronic systems is a combination of conflicting requirements, as increasing volumes of computations, accelerating the exchange of data, at the same time increasing energy consumption forcing the researchers not only to optimize the algorithm, but also to quickly implement in a specialized hardware. Therefore in this work, the problem of efficient and straightforward implementation of operating in a real-time electronic intelligent systems on field-programmable gate array (FPGA) is tackled. The object of research is specialized FPGA intellectual property (IP) cores that operate in a real-time. In the thesis the following main aspects of the research object are investigated: implementation criteria and techniques. The aim of the thesis is to optimize the FPGA implementation process of selected class dynamic artificial neural networks. In order to solve stated problem and reach the goal following main tasks of the thesis are formulated: rationalize the selection of a class of Lattice-Ladder Multi-Layer Perceptron (LLMLP) and its electronic intelligent system test-bed – a speaker dependent Lithuanian speech recognizer, to be created and investigated; develop dedicated technique for implementation of LLMLP class on FPGA that is based on specialized efficiency criteria for a circuitry synthesis; develop and experimentally affirm the efficiency of optimized FPGA IP cores used in Lithuanian speech recognizer. The dissertation contains: introduction, four chapters and general conclusions. The first chapter reveals the fundamental knowledge on computer-aideddesign, artificial neural networks and speech recognition implementation on FPGA. In the second chapter the efficiency criteria and technique of LLMLP IP cores implementation are proposed in order to make multi-objective optimization of throughput, LLMLP complexity and resource utilization. The data flow graphs are applied for optimization of LLMLP computations. The optimized neuron processing element is proposed. The IP cores for features extraction and comparison are developed for Lithuanian speech recognizer and analyzed in third chapter. The fourth chapter is devoted for experimental verification of developed numerous LLMLP IP cores. The experiments of isolated word recognition accuracy and speed for different speakers, signal to noise ratios, features extraction and accelerated comparison methods were performed. The main results of the thesis were published in 12 scientific publications: eight of them were printed in peer-reviewed scientific journals, four of them in a Thomson Reuters Web of Science database, four articles – in conference proceedings. The results were presented in 17 scientific conferences

    LEGaTO: first steps towards energy-efficient toolset for heterogeneous computing

    Get PDF
    LEGaTO is a three-year EU H2020 project which started in December 2017. The LEGaTO project will leverage task-based programming models to provide a software ecosystem for Made-in-Europe heterogeneous hardware composed of CPUs, GPUs, FPGAs and dataflow engines. The aim is to attain one order of magnitude energy savings from the edge to the converged cloud/HPC.Peer ReviewedPostprint (author's final draft

    FPGA-based implementation of speech recognition for robocar control using MFCC

    Get PDF
    This research proposes a simulation of the logic series of speech recognition on the MFCC (Mel Frequency Spread Spectrum) based FPGA and Euclidean Distance to control the robotic car motion. The speech known would be used as a command to operate the robotic car. MFCC in this study was used in the feature extraction process, while Euclidean distance was applied in the feature classification process of each speech that later would be forwarded to the part of decision to give the control logic in robotic motor. The test that has been conducted showed that the logic series designed was precise here by measuring the Mel Frequency Warping and Power Cepstrum. With the achievement of logic design in this research proven with a comparison between the Matlab computation and Xilinx simulation, it enables to facilitate the researchers to continue its implementation to FPGA hardware

    Performance evaluation and implementations of MFCC, SVM and MLP algorithms in the FPGA board

    Get PDF
    One of the most difficult speech recognition tasks is accurate recognition of human-to-human communication. Advances in deep learning over the last few years have produced major speech improvements in recognition on the representative Switch-board conversational corpus. Word error rates that just a few years ago were 14% have dropped to 8.0%, then 6.6% and most recently 5.8%, and are now believed to be within striking range of human performance. This raises two issues - what is human performance, and how far down can we still drive speech recognition error rates? The main objective of this article is the development of a comparative study of the performance of Automatic Speech Recognition (ASR) algorithms using a database made up of a set of signals created by female and male speakers of different ages. We will also develop techniques for the Software and Hardware implementation of these algorithms and test them in an embedded electronic card based on a reconfigurable circuit (Field Programmable Gate Array FPGA). We will present an analysis of the results of classifications for the best Support Vector Machine architectures (SVM) and Artificial Neural Networks of Multi-Layer Perceptron (MLP). Following our analysis, we created NIOSII processors and we tested their operations as well as their characteristics. The characteristics of each processor are specified in this article (cost, size, speed, power consumption and complexity). At the end of this work, we physically implemented the architecture of the Mel Frequency Cepstral Coefficients (MFCC) extraction algorithm as well as the classification algorithm that provided the best results
    corecore