48 research outputs found

    Evaluation of preprocessors for neural network speaker verification


    Audio Event Classification for Urban Soundscape Analysis

    The study of urban soundscapes has gained momentum in recent years as more people become concerned with the level of noise around them and its negative impact on comfort. Monitoring the sounds present in a sonic environment can be a laborious and time-consuming process if performed manually, so techniques for automated signal identification are gaining importance if soundscapes are to be objectively monitored. This thesis presents a novel approach to feature extraction for classifying urban audio events, adding to the library of techniques already established in the field. The research explores how techniques with their origins in the encoding of speech signals can be adapted to represent the complex everyday sounds around us and allow accurate classification. The analysis methods developed herein are based on the zero-crossing information contained within a signal. Time-Domain Signal Coding (TDSC), originally developed for the classification of bioacoustic signals, has the band-limited restrictions removed from its codebook to make it more generic. Classification using features extracted with the new codebook achieves accuracies of over 80% when combined with a Multilayer Perceptron classifier. Further advancements are made to the standard TDSC algorithm, drawing inspiration from wavelets, resulting in a novel dyadic representation of time-domain features. With these Multiscale TDSC (MTDSC) features, classification accuracies of 70% are achieved. Recommendations for further work focus on expanding the library of training data to improve the accuracy of the classification system; further research into classifier design is also suggested.
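    The core idea of TDSC, as the abstract describes it, is to derive features from the zero-crossing structure of the waveform rather than from its spectrum. A minimal sketch of that idea is shown below: the signal is split into epochs between successive zero crossings, each epoch is summarised by its duration and shape (here, the number of local extrema), and a histogram over quantised (duration, shape) pairs forms the feature vector. The uniform quantisation and the bin counts are illustrative assumptions, not the thesis's actual codebook.

    ```python
    import numpy as np

    def tdsc_features(signal, n_duration_bins=8, n_shape_bins=4):
        """Simplified sketch of zero-crossing-based (TDSC-style) features.

        Returns a normalised histogram over quantised (duration, shape)
        codes, one code per epoch between successive zero crossings.
        """
        x = np.asarray(signal, dtype=float)
        # Epoch boundaries: indices where the signal changes sign.
        crossings = np.where(np.diff(np.signbit(x)))[0]
        if len(crossings) < 2:
            return np.zeros(n_duration_bins * n_shape_bins)

        hist = np.zeros((n_duration_bins, n_shape_bins))
        max_dur = np.max(np.diff(crossings))  # longest epoch, for scaling
        for start, end in zip(crossings[:-1], crossings[1:]):
            epoch = x[start:end + 1]
            duration = end - start
            # Shape measure: number of local extrema inside the epoch.
            slope = np.diff(epoch)
            extrema = int(np.sum(np.diff(np.signbit(slope))))
            # Uniform quantisation of the (duration, shape) pair
            # (an illustrative stand-in for the TDSC codebook).
            d_bin = min(int(duration / max_dur * n_duration_bins),
                        n_duration_bins - 1)
            s_bin = min(extrema, n_shape_bins - 1)
            hist[d_bin, s_bin] += 1

        hist = hist.flatten()
        return hist / hist.sum()
    ```

    The resulting fixed-length vector could then be fed to a classifier such as the Multilayer Perceptron mentioned in the abstract.
    
    
    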

    Continuous speech phoneme recognition using neural networks and grammar correction.

    by Wai-Tat Fu. Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. Includes bibliographical references (leaves 104-[109]).

    Chapter 1 --- Introduction --- p.1
      1.1 Problem of Speech Recognition --- p.1
      1.2 Why continuous speech recognition? --- p.5
      1.3 Current status of continuous speech recognition --- p.6
      1.4 Research Goal --- p.10
      1.5 Thesis outline --- p.10
    Chapter 2 --- Current Approaches to Continuous Speech Recognition --- p.12
      2.1 Basic Steps for Continuous Speech Recognition --- p.12
      2.2 The Hidden Markov Model Approach --- p.16
        2.2.1 Introduction --- p.16
        2.2.2 Segmentation and Pattern Matching --- p.18
        2.2.3 Word Formation and Syntactic Processing --- p.22
        2.2.4 Discussion --- p.23
      2.3 Neural Network Approach --- p.24
        2.3.1 Introduction --- p.24
        2.3.2 Segmentation and Pattern Matching --- p.25
        2.3.3 Discussion --- p.27
      2.4 MLP/HMM Hybrid Approach --- p.28
        2.4.1 Introduction --- p.28
        2.4.2 Architecture of Hybrid MLP/HMM Systems --- p.29
        2.4.3 Discussions --- p.30
      2.5 Syntactic Grammar --- p.30
        2.5.1 Introduction --- p.30
        2.5.2 Word Formation and Syntactic Processing --- p.31
        2.5.3 Discussion --- p.32
      2.6 Summary --- p.32
    Chapter 3 --- Neural Network as Pattern Classifier --- p.34
      3.1 Introduction --- p.34
      3.2 Training Algorithms and Topologies --- p.35
        3.2.1 Multilayer Perceptrons --- p.35
        3.2.2 Recurrent Neural Networks --- p.39
        3.2.3 Self-organizing Maps --- p.41
        3.2.4 Learning Vector Quantization --- p.43
      3.3 Experiments --- p.44
        3.3.1 The Data Set --- p.44
        3.3.2 Preprocessing of the Speech Data --- p.45
        3.3.3 The Pattern Classifiers --- p.50
      3.4 Results and Discussions --- p.53
    Chapter 4 --- High Level Context Information --- p.56
      4.1 Introduction --- p.56
      4.2 Hidden Markov Model Approach --- p.57
      4.3 The Dynamic Programming Approach --- p.59
      4.4 The Syntactic Grammar Approach --- p.60
    Chapter 5 --- Finite State Grammar Network --- p.62
      5.1 Introduction --- p.62
      5.2 The Grammar Compilation --- p.63
        5.2.1 Introduction --- p.63
        5.2.2 K-Tails Clustering Method --- p.66
        5.2.3 Inference of finite state grammar --- p.67
        5.2.4 Error Correcting Parsing --- p.69
      5.3 Experiment --- p.71
      5.4 Results and Discussions --- p.73
    Chapter 6 --- The Integrated System --- p.81
      6.1 Introduction --- p.81
      6.2 Postprocessing of Neural Network Output --- p.82
        6.2.1 Activation Threshold --- p.82
        6.2.2 Duration Threshold --- p.85
        6.2.3 Merging of Phoneme Boundaries --- p.88
      6.3 The Error Correcting Parser --- p.90
      6.4 Results and Discussions --- p.96
    Chapter 7 --- Conclusions --- p.101
    Bibliography --- p.10