Automatic voice recognition using traditional and artificial neural network approaches

Author: Botros Nazeih M.

The main objective of this research is to develop an algorithm for isolated-word recognition. The research focuses on digital signal analysis rather than linguistic analysis of speech. Feature extraction is carried out by applying a Linear Predictive Coding (LPC) algorithm of order 10. Continuous-word and speaker-independent recognition will be considered in a future study once this isolated-word research is complete. To examine the similarity between the reference and the training sets, two approaches are explored. The first implements traditional pattern recognition techniques: a dynamic time warping algorithm is applied to align the two sets and calculate the probability of matching by measuring the Euclidean distance between them. The second implements a three-layer backpropagation artificial neural network as the pattern classifier. The adaptation rule implemented in this network is the generalized least mean square (LMS) rule. The first approach has been completed: a vocabulary of 50 words was selected and tested, and the accuracy of the algorithm was found to be around 85 percent. The second approach is in progress at the present time.
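
The template-matching approach described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names (`euclidean`, `dtw_distance`, `recognize`) are hypothetical, the step pattern is the basic symmetric one, and the sketch returns a raw cumulative distance rather than the matching probability mentioned in the abstract. Each utterance is assumed to be a list of per-frame feature vectors, for example order-10 LPC coefficients.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(ref, test):
    """Cumulative dynamic time warping cost between two frame sequences.

    Assumption: basic step pattern (insertion, deletion, match) with no
    slope constraint or path-length normalisation.
    """
    n, m = len(ref), len(test)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(ref[i - 1], test[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # ref frame repeated
                                 cost[i][j - 1],      # test frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def recognize(templates, utterance):
    """Return the label of the reference template with the lowest DTW cost."""
    return min(templates, key=lambda w: dtw_distance(templates[w], utterance))


# Toy one-dimensional "features"; real frames would be LPC vectors.
templates = {"yes": [[0.0], [1.0], [2.0]], "no": [[5.0], [5.0], [5.0]]}
utterance = [[0.0], [0.0], [1.0], [2.0]]  # a time-stretched "yes"
print(recognize(templates, utterance))
```

Because the warping path may repeat frames, the time-stretched utterance aligns with its template at zero cost even though the two sequences differ in length.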

NASA Technical Reports Server

A digital neural network approach to speech recognition

Author: Haider Najmi Ghani
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/1989

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. This thesis presents two novel methods for isolated word speech recognition based on sub-word components. A digital neural network is the fundamental processing strategy in both methods. The first design is based on the 'Separate Segmentation & Labelling' (SS&L) approach. The spectral data of the input utterance is first segmented into phoneme-like units, which are then time normalised by linear time normalisation. The neural network labels the time-normalised phoneme-like segments; 78.36% recognition accuracy is achieved for the phoneme-like unit. In the second design, no time normalisation is required. After segmentation, recognition is performed by classifying the data in a window as it is slid one frame at a time from the start to the end of each phoneme-like segment in the utterance. 73.97% recognition accuracy for the phoneme-like unit is achieved in this application. The parameters of the neural net have been optimised for maximum recognition performance. A segmentation strategy using the sum of the difference in filterbank channel energy over successive spectra produced 80.27% correct segmentation of isolated utterances into phoneme-like units. A linguistic processor based on that of Kashyap & Mittal [84] enables 93.11% and 93.49% word recognition accuracy to be achieved for the SS&L and 'Sliding Window' recognisers respectively. The linguistic processor has been redesigned to make it portable so that it can be easily applied to any phoneme-based isolated word speech recogniser. This work is funded by the Ministry of Science & Technology, Government of Pakistan.
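
The segmentation measure and the linear time normalisation mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the thesis's method: the abstract does not say how boundaries are chosen from the spectral-difference measure, so the absolute-value sum, the threshold, and the local-peak rule are assumptions, and all function names are hypothetical. Each frame is assumed to be a list of filterbank channel energies.

```python
def spectral_change(frames):
    """Sum of absolute channel-energy differences between successive frames."""
    return [sum(abs(c - p) for p, c in zip(prev, cur))
            for prev, cur in zip(frames, frames[1:])]


def segment_boundaries(frames, threshold):
    """Indices of frames that start a new segment.

    Assumption: a boundary is placed where the spectral change is a local
    peak above `threshold`; the original selection rule is not stated.
    """
    change = spectral_change(frames)
    bounds = []
    for k, value in enumerate(change):
        left = change[k - 1] if k > 0 else float("-inf")
        right = change[k + 1] if k + 1 < len(change) else float("-inf")
        if value > threshold and value >= left and value > right:
            bounds.append(k + 1)  # change[k] compares frames k and k + 1
    return bounds


def linear_time_normalise(segment, target_len):
    """Resample a segment to `target_len` frames by linear interpolation."""
    n = len(segment)
    if n == 1 or target_len == 1:
        return [list(segment[0]) for _ in range(target_len)]
    out = []
    for t in range(target_len):
        pos = t * (n - 1) / (target_len - 1)  # fractional source position
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b
                    for a, b in zip(segment[lo], segment[hi])])
    return out


# Toy two-channel spectra with one abrupt change between frames 1 and 2.
frames = [[1, 1], [1, 1], [5, 5], [5, 5], [5, 5]]
print(segment_boundaries(frames, threshold=4))
print(linear_time_normalise([[0.0], [2.0]], 3))
```

Normalising every segment to the same number of frames gives the classifier a fixed-size input, which is the step the second ('Sliding Window') design avoids.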

Brunel University Research Archive

Development of the Feature Extractor for Speech Recognition

Author: Añorga Irigoien Eneko
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/10/2009

Final degree project carried out in collaboration with the University of Maribor. With this diploma work we have attempted to continue the previous work done by other researchers, the Voice Operating Intelligent Wheelchair – VOIC [1]. This work presents the development of a voice-controlled wheelchair designed for physically disabled people who cannot control their movements, and describes the basic components of the speech recognition and wheelchair control system. In essence, a speech recognizer system comprises two distinct blocks, a Feature Extractor and a Recognizer. The present work is targeted at the realization of an adequate Feature Extractor block. It uses a standard LPC Cepstrum coder, which translates the incoming speech into a trajectory in the LPC Cepstrum feature space, followed by a Self Organizing Map, which classifies the outcome of the coder in order to produce optimal trajectory representations of words in reduced-dimension feature spaces. Experimental results indicate that trajectories in such reduced-dimension spaces can provide reliable representations of spoken words. The Recognizer block is left for future researchers. The main contributions of this work have been the research and approach of a new technology for development issues and the realization of applications such as a voice recorder and player and a complete Feature Extractor system.

A Speech Recognition System for Embedded Applications Using the SOM and TS-SOM Networks

Author: Amauri H. Souza Júnior
Antonio T. Varela
Guilherme A. Barreto
Publication venue: IntechOpen
Publication date: 21/01/2011

IntechOpen

Progress in Speech Recognition for Romanian Language

Author: Corneliu-Octavian Dumitru
Inge Gavat
Publication venue: IntechOpen
Publication date: 01/10/2008

IntechOpen

Evaluation of preprocessors for neural network speaker verification

Author: Salleh Sheikh-Hussain
Publication venue: The University of Edinburgh
Publication date: 01/01/1997

Edinburgh Research Archive

Patterns and symbols: a world through the eye of the machine

Author: Schomaker Lambertus
Publication venue: s.n.
Publication date: 01/01/2002

ARTS repository - University of Groningen

Continuous speech recognition with modified learning vector quantization algorithm and two-level DP-matching

Author: Endo Mitsuru
Kido Ken'iti
Makino Shozo
Sone Toshio
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 21/04/2010

PROCEEDINGS OF IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSIN

Tohoku University Repository (TOUR) / 東北大学機関リポジトリ

Institutional Repositories DataBase (IRDB)