88 research outputs found

Speech Recognition

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Chapters in the first part of the book cover the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and methods for speech feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and applications that operate in real-world environments, such as mobile communication services and smart homes.

Directory of Open Access Books (DOAB)

Auralization of an orchestra using multichannel and multisource technique (A)

Author: Rindel Jens Holger; Vigeant Michelle C.; Wang Lily M.
Publication date: 01/01/2006

Online Research Database In Technology

South African sign language dataset development and translation : a glove-based approach

Author: Mcinnes Ben
Publication venue: Department of Electrical Engineering
Publication date: 01/01/2014

Includes bibliographical references. There has been a definite breakdown of communication between the hearing and the Deaf communities. This communication gap drastically affects many facets of a Deaf person’s life, including education, job opportunities and quality of life. Researchers have turned to technology to remedy this issue using Automatic Sign Language Translation. While there has been successful research around the world, this is not possible in South Africa as there is no South African Sign Language (SASL) database available. This research aims to develop a SASL static gesture database using a data glove as the first step towards developing a comprehensive database that encapsulates the entire language. Commercial data gloves are expensive, so as part of this research a low-cost data glove will be developed for the application of Automatic Sign Language Translation. The database and data glove will be used together with neural networks to perform gesture classification, in order to evaluate the gesture data collected for the database. This research project has been broken down into three main sections: data glove development, database creation and gesture classification. The data glove was developed by critically reviewing the relevant literature, testing the sensors and then evaluating the overall glove for repeatability and reliability. The final data glove prototype was constructed and five participants were used to collect 31 different static gestures in three different scenarios, which range from isolated gesture collection to continuous data collection. This data was cleaned and used to train a neural network for classification. Several training algorithms were chosen and compared to see which attained the highest classification accuracy. The data glove performed well and achieved results superior to some published research and on par with other researchers’ results.
The data glove achieved a repeatable angle resolution of 3.27 degrees with a standard deviation of 1.418 degrees. This result is well within the 15 degrees resolution specified for the research. The device remained low-cost and was more than $100 cheaper than other custom research data gloves and hundreds of dollars cheaper than commercial data gloves. A database was created using five participants; 1550 type 1 gestures, 465 type 2 gestures and 93 type 3 gestures were collected. The Resilient Back-Propagation and Levenberg-Marquardt training algorithms were considered for training the neural network. The Levenberg-Marquardt algorithm had the superior classification accuracy, achieving 99.61%, 77.42% and 81.72% on the type 1, type 2 and type 3 data respectively.

Cape Town University OpenUCT

Automatic Phoneme Recognition using Mel-Frequency Cepstral Coefficient and Dynamic Time Warping

Author: AHMED MD SABBIR
Publication venue: 'Pisa University Press'
Publication date: 26/02/2017

Phoneme recognition is performed using the Mel-Frequency Cepstral Coefficient (MFCC) feature extraction technique: an unknown test pattern is compared with pre-recorded reference patterns using the Dynamic Time Warping (DTW) algorithm to determine the similarity between them.
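
The comparison step described in this abstract can be sketched as follows. This is a minimal illustration of textbook DTW, not the thesis's implementation. It assumes each pattern is already a sequence of feature vectors (MFCC extraction is not shown), that frames are compared by Euclidean distance, and that the best match is the reference with the smallest accumulated cost. The names `dtw_distance` and `recognize` are hypothetical.

```python
import math


def dtw_distance(test, reference):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: Euclidean distance between frames and the standard
    three-way step pattern (insertion, deletion, match).
    """
    n, m = len(test), len(reference)
    # cost[i][j] = best accumulated cost aligning test[:i] with reference[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_dist = math.dist(test[i - 1], reference[j - 1])
            cost[i][j] = frame_dist + min(
                cost[i - 1][j],      # test frame repeated against same reference frame
                cost[i][j - 1],      # reference frame repeated against same test frame
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def recognize(test, references):
    """Return the label of the reference pattern closest to `test` under DTW.

    `references` is a hypothetical dict mapping a label to its
    pre-recorded sequence of feature vectors.
    """
    return min(references, key=lambda label: dtw_distance(test, references[label]))
```

Because DTW allows frames to be stretched or compressed in time, two utterances of the same phoneme spoken at different speeds can still align with low cost, which is why it suits template matching against a small set of reference recordings.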

Electronic Thesis and Dissertation Archive - Università di Pisa

Hidden Markov Models

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Hidden Markov Models (HMMs), although known for decades, are now widely used and are still under active development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neurosciences, computational biology, bioinformatics, seismology, environmental protection and engineering. I hope that readers will find this book useful and helpful for their own research.

Directory of Open Access Books (DOAB)