An autopoietic approach to the development of speech recognition (pendekatan autopoietic dalam pembangunan pengecaman suara)

Author: Ahmad Abd. Manan
Publication venue: Fakulti Sains Komputer dan Sistem Maklumat
Publication date: 03/12/2006

This research focuses on implementing speech recognition through an autopoietic approach. The work culminated in a neural network architecture named the Homunculus Network, which was used to develop a speech recognition system for Bahasa Melayu. The system is an isolated-word, phoneme-level, speaker-independent recognizer with a vocabulary of 15 words. The research also identified several issues that merit further work; these issues form the basis for the design and development of a new autopoietic speech recognition system.

Universiti Teknologi Malaysia Institutional Repository

Digit recognition using neural networks

Author: Jantan Adznan
Tan Chin Luh
Publication venue: Faculty of Computer Science and Information Technology, University of Malaya
Publication date: 01/01/2004

This paper investigates the use of feed-forward multi-layer perceptrons (MLPs) trained by back-propagation for speech recognition, and proposes an automatic technique for both training and recognition. It studies neural networks for speaker-independent isolated-word recognition on small vocabularies. To evaluate the performance of the automatic speech recognition (ASR) system, an automated pipeline was developed that runs from the training stage to the recognition stage without manual cropping of the speech signals. Linear predictive coding (LPC) is applied at an early stage to represent the speech signal in frames. Features from the selected frames are used to train the MLPs using back-propagation. The same routine is applied to the speech signal during the recognition stage, and unknown test patterns are classified to the nearest patterns. In short, the selected frames represent the local features of the speech signal, and all of them contribute to the global similarity for the whole speech signal. The analysis, design and development of the automated system were done in MATLAB, in which a speaker-independent isolated-word digit recogniser was developed.
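As an illustration of the LPC front end mentioned in this abstract, the following is a minimal sketch of per-frame LPC analysis using the autocorrelation method with the Levinson-Durbin recursion. The paper's own implementation is in MATLAB and is not shown in the abstract; the function names `autocorrelation` and `lpc`, and the choice of the autocorrelation method, are assumptions made here for illustration.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Levinson-Durbin recursion (hypothetical helper, not from the paper).

    Returns (coeffs, err) where coeffs[k-1] is a_k in the predictor
    x[n] ~ sum_k a_k * x[n-k], and err is the final prediction error energy.
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err == 0:
            break  # silent frame: nothing left to predict
        # Reflection coefficient for model order i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


# A decaying exponential is a first-order autoregressive signal,
# so the first coefficient should be close to its decay factor.
frame = [0.9 ** n for n in range(200)]
coeffs, err = lpc(frame, 2)
```

In a recogniser of the kind described, the coefficient vector of each selected frame would serve as one input pattern for the MLP; how frames are selected is not specified in the abstract.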

Universiti Putra Malaysia Institutional Repository

Towards deep learning on speech recognition for Khmer language

Author: Lim Chanmann
Publication venue: University of Missouri--Columbia

To perform speech recognition well, a large amount of transcribed speech and textual data in the target language must be available for system training. This high demand for language resources constrains the development of speech recognition systems for new languages. This thesis investigates the development of a low-resourced isolated-word recognition system for the Khmer language. Speech data collected via mobile phone, covering a vocabulary of 194 words, is used in the experiments. Data pre-processing based on Voice Activity Detection (VAD) is discussed. As by-products of this work, a phoneme-based pronunciation lexicon and a state-tying question set for a Khmer speech recognizer are built from scratch. In addition to conventional statistical acoustic modeling using a Gaussian Mixture Model and hidden Markov Model (GMM-HMM), a hybrid acoustic model based on a Deep Neural Network (DNN-HMM) trained to predict context-dependent triphone states is evaluated. Dropout is used to improve the robustness of the DNN, and cross-lingual transfer learning that makes use of auxiliary training data in English is also investigated. As the first effort to use a DNN-HMM for low-resourced isolated-word recognition of Khmer, the system currently achieves 93.31% word accuracy in speaker-independent mode on the authors' test set.
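The abstract does not say which VAD method the thesis uses. As one simple possibility, the sketch below trims leading and trailing silence from an isolated-word recording by thresholding short-time frame energy. The function names, the frame length of 160 samples, and the threshold ratio are all assumptions chosen for illustration.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def trim_silence(samples, frame_len=160, threshold_ratio=0.1):
    """Keep the span from the first to the last 'active' frame.

    A frame counts as active when its energy is at least
    threshold_ratio times the peak frame energy (assumed rule).
    """
    energies = frame_energies(samples, frame_len)
    if not energies or max(energies) == 0:
        return []  # too short or completely silent
    limit = threshold_ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e >= limit]
    start = active[0] * frame_len
    end = (active[-1] + 1) * frame_len
    return samples[start:end]


# Two silent frames, two loud frames, one silent frame.
recording = [0.0] * 320 + [1.0, -1.0] * 160 + [0.0] * 160
word = trim_silence(recording)
```

An energy threshold of this kind is sensitive to background noise, which matters for speech collected by mobile phone; practical systems often add smoothing or a statistical model on top.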

University of Missouri: MOspace

Speech recognition system using MATLAB : design, implementation, and samples codes

Author: Abushariah Ahmad A. M.
Gunawan Teddy Surya
Publication venue: Lambert Academic Publishing
Publication date: 01/01/2011

Research in automatic speech recognition has been under way for almost four decades, and over that time the development of speech recognition applications has made invaluable contributions. Speech has the potential to be a better interface than other input devices such as the keyboard or mouse. This project aims to develop an automated speech recognition system for English digits. It relies heavily on the Hidden Markov Model (HMM), a well-known and widely used statistical method for characterizing speech patterns that provides a highly reliable way of recognizing speech. The project discusses the theory of the HMM and then extends the ideas to development and implementation by applying the method to computational speech recognition. The system recognizes spoken utterances by translating the speech waveform into a set of feature vectors using the Mel Frequency Cepstral Coefficients (MFCC) technique; the observation likelihood is then estimated using the Forward algorithm. The HMM parameters are estimated by applying the Baum-Welch algorithm to the training samples. The most likely sequence is then decoded using the Viterbi algorithm, producing the recognized word. The project covers the English digits zero through nine, based on an isolated-word structure. Two modules were developed: isolated-word speech recognition and continuous speech recognition. Both modules were tested in clean and noisy environments and showed relatively successful recognition rates. In the clean environment, the isolated-word module achieved 99.5% in multi-speaker mode and 79.5% in speaker-independent mode, while the continuous module achieved 70% in multi-speaker mode and 55% in speaker-independent mode. In the noisy environment, the isolated-word module achieved 88% in multi-speaker mode and 67% in speaker-independent mode, while the continuous module achieved 92.5% in multi-speaker mode and 75% in speaker-independent mode. These recognition rates are relatively successful compared to similar systems.
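The roles of the Forward and Viterbi algorithms named in this abstract can be sketched on a toy discrete HMM. This is a simplified illustration and not the book's MATLAB code: a real MFCC-based system would use continuous emission densities and log probabilities, and every name and number below (`forward_likelihood`, `viterbi`, `recognise`, the two word models) is hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) via the Forward algorithm."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


def viterbi(obs, start, trans, emit):
    """Most likely state path and its probability."""
    n = len(start)
    delta = [start[s] * emit[s][obs[0]] for s in range(n)]
    back = []
    for o in obs[1:]:
        prev = delta
        # Best predecessor of each state at this time step.
        ptr = [max(range(n), key=lambda p: prev[p] * trans[p][s])
               for s in range(n)]
        delta = [prev[ptr[s]] * trans[ptr[s]][s] * emit[s][o]
                 for s in range(n)]
        back.append(ptr)
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, max(delta)


def recognise(obs, models):
    """Pick the word whose HMM gives the highest Forward likelihood."""
    return max(models, key=lambda w: forward_likelihood(obs, *models[w]))


# Two toy left-to-right word models over a 2-symbol alphabet.
start = [1.0, 0.0]
trans = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "word_a": (start, trans, [[0.9, 0.1], [0.1, 0.9]]),
    "word_b": (start, trans, [[0.1, 0.9], [0.9, 0.1]]),
}
obs = [0, 0, 1, 1]
best_word = recognise(obs, models)
best_path, best_prob = viterbi(obs, *models["word_a"])
```

For isolated words, one model per digit is scored and the highest-scoring model wins; Baum-Welch training, which would estimate `start`, `trans` and `emit` from data, is omitted here.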

The International Islamic University Malaysia Repository

Speech recognition for smart homes

Author: McLoughlin Ian Vince
Sharifzadeh Hamid Reza
Publication venue: IntechOpen
Publication date: 01/11/2008

IntechOpen

Crossref

Kent Academic Repository

New Method for Optimization of License Plate Recognition system with Use of Edge Detection and Connected Component

Author: Azad Reza
Shayegh Hamid Reza
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 24/07/2014

License plate recognition plays an important role in traffic monitoring and parking management systems. This paper proposes a fast, real-time method that is suited to locating tilted and poor-quality plates. In the proposed method, the image is first converted to binary using an adaptive threshold. Then, using edge detection and morphological operations, the location of the plate number is identified. Finally, if the plate is tilted, the tilt is corrected. The method was tested on a data set from another paper, containing images with varied backgrounds, distances, and angles of view; the correct plate extraction rate reached 98.66%.
Comment: 3rd IEEE International Conference on Computer and Knowledge Engineering (ICCKE 2013), October 31 & November 1, 2013, Ferdowsi Universit Mashha
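The first step described in this abstract, binarisation with an adaptive threshold, can be sketched as a local-mean rule on a grayscale image stored as a list of rows. The paper does not state which adaptive rule or window size it uses; the function name `adaptive_threshold`, the 3x3 default window and the `offset` parameter are assumptions for illustration.

```python
def adaptive_threshold(image, window=3, offset=0):
    """Binarise a grayscale image with a local-mean threshold.

    A pixel becomes 1 when it is brighter than the mean of its
    window x window neighbourhood minus offset, else 0. Neighbourhoods
    are clipped at the image border.
    """
    height, width = len(image), len(image[0])
    radius = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            values = [image[j][i] for j in ys for i in xs]
            local_mean = sum(values) / len(values)
            row.append(1 if image[y][x] > local_mean - offset else 0)
        result.append(row)
    return result


# One bright pixel on a dark background.
image = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]
binary = adaptive_threshold(image)
```

Because each pixel is compared with its own neighbourhood and not with one global value, uneven lighting across the plate has less effect on the result, which is the usual reason for preferring an adaptive rule on poor-quality images.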

arXiv.org e-Print Archive

Crossref