58 research outputs found

    Generalized Hidden Filter Markov Models Applied to Speaker Recognition

    Classification of time series has wide Air Force, DoD, and commercial interest, from automatic target recognition systems on munitions to recognition of speakers in diverse environments. The ability to effectively model the temporal information contained in a sequence is of paramount importance. Toward this goal, this research develops theoretical extensions to a class of stochastic models and demonstrates their effectiveness on the problem of text-independent (language-constrained) speaker recognition. Specifically, within the hidden Markov model architecture, additional constraints are implemented that better incorporate observation correlations and context where standard approaches fail. Two methods of modeling correlations are developed, and their mathematical properties of convergence and reestimation are analyzed. The two differ in whether they model correlations present in the raw time samples or those present in the processed features, such as Mel-frequency cepstral coefficients. The system models speaker-dependent phonemes, making use of word dictionary grammars, and recognition is based on normalized log-likelihood Viterbi decoding. Both closed-set identification and speaker verification using cohorts are performed on the YOHO database. YOHO is the only large-scale, multiple-session, high-quality speech database for speaker authentication and contains over one hundred speakers stating combination locks. Equal error rates of 0.21% for males and 0.31% for females are demonstrated. A critical error analysis using a hypothesis test formulation provides the maximum number of errors observable while still meeting the goal error rates of 1% False Reject and 0.1% False Accept. Our system achieves this goal.
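    The verification protocol this abstract describes, cohort-normalized log-likelihood scores thresholded into accept/reject decisions and summarized by an equal error rate, can be sketched compactly. The sketch below is a generic formulation: the arithmetic-mean cohort normalization, the threshold sweep, and all function names are illustrative assumptions, not the thesis's exact scoring rule.

```python
import numpy as np

def cohort_normalized_score(target_ll, cohort_lls):
    # Normalize the claimed speaker's log-likelihood by the cohort's
    # average likelihood, computed stably in the log domain.
    # (One common formulation; assumed here, not the thesis's exact rule.)
    cohort_lls = np.asarray(cohort_lls, dtype=float)
    m = cohort_lls.max()
    log_mean = m + np.log(np.mean(np.exp(cohort_lls - m)))
    return target_ll - log_mean

def equal_error_rate(genuine, impostor):
    # Sweep the decision threshold over every observed score and return
    # the operating point where false-reject and false-accept rates meet.
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    best_gap, eer = np.inf, None
    for th in np.sort(np.concatenate([genuine, impostor])):
        frr = np.mean(genuine < th)    # true-speaker trials rejected
        far = np.mean(impostor >= th)  # impostor trials accepted
        if abs(frr - far) < best_gap:
            best_gap, eer = abs(frr - far), (frr + far) / 2.0
    return eer
```

    At the equal-error operating point the two error rates coincide, which is why a single EER figure (the 0.21% and 0.31% quoted above) summarizes the whole accept/reject trade-off.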

    CONNECTIONIST SPEECH RECOGNITION - A Hybrid Approach


    Speech and neural network dynamics


    Global Maximum Likelihood Decoding with Hidden Markov Models

    This thesis presents a summary of research in areas related to speech communications on degraded channels using very low data rate (VLR) digital voice coders. Background is presented on the nature of voice encoding, problems encountered with real-world communications channels, and some traditional solutions to those problems. Recent developments that use the Hidden Markov Model (HMM) and Vector Quantization (VQ) to enhance performance are reviewed. A proposal for a new channel decoding technique is then presented. The proposed technique uses the Hidden Markov Model in conjunction with a VLR voice encoder based on Vector Quantization. It performs globally maximum-likelihood estimation of received vectors over the joint region of received channel signals and possible vector decisions. Finally, experimental results based on a simulation of the concept are presented.
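    As a rough illustration of the idea, such a globally maximum-likelihood decision can be cast as a Viterbi search over the joint trellis of channel observations and candidate VQ codebook indices. Everything below (the HMM-over-indices source model, the log-domain arrays, and the names) is an assumption for illustration, not the thesis's exact construction.

```python
import numpy as np

def joint_ml_decode(log_chan, log_A, log_pi):
    # log_chan : (T, K) log p(received signal r_t | codeword k was sent)
    # log_A    : (K, K) log transition probabilities of an HMM source
    #            model over VQ codebook indices (frame-to-frame correlation)
    # log_pi   : (K,)   log initial index probabilities
    T, K = log_chan.shape
    delta = log_pi + log_chan[0]        # best score ending in each index
    back = np.zeros((T, K), dtype=int)  # backpointers
    for t in range(1, T):
        scores = delta[:, None] + log_A  # predecessor index x current index
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(K)] + log_chan[t]
    # Trace back the globally most likely transmitted sequence rather
    # than making T independent per-frame hard decisions.
    seq = np.empty(T, dtype=int)
    seq[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):
        seq[t - 1] = back[t][seq[t]]
    return seq
```

    A per-frame hard decoder would simply take the argmax of each row of log_chan; the trellis search instead lets reliably received neighboring frames pull an ambiguous frame toward a sequence the source model considers likely.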

    Evaluation of preprocessors for neural network speaker verification


    Continuous speech phoneme recognition using neural networks and grammar correction.

    by Wai-Tat Fu. Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. Includes bibliographical references (leaves 104-[109]).

    Chapter 1 --- INTRODUCTION --- p.1
        Chapter 1.1 --- Problem of Speech Recognition --- p.1
        Chapter 1.2 --- Why continuous speech recognition? --- p.5
        Chapter 1.3 --- Current status of continuous speech recognition --- p.6
        Chapter 1.4 --- Research Goal --- p.10
        Chapter 1.5 --- Thesis outline --- p.10
    Chapter 2 --- Current Approaches to Continuous Speech Recognition --- p.12
        Chapter 2.1 --- BASIC STEPS FOR CONTINUOUS SPEECH RECOGNITION --- p.12
        Chapter 2.2 --- THE HIDDEN MARKOV MODEL APPROACH --- p.16
            Chapter 2.2.1 --- Introduction --- p.16
            Chapter 2.2.2 --- Segmentation and Pattern Matching --- p.18
            Chapter 2.2.3 --- Word Formation and Syntactic Processing --- p.22
            Chapter 2.2.4 --- Discussion --- p.23
        Chapter 2.3 --- NEURAL NETWORK APPROACH --- p.24
            Chapter 2.3.1 --- Introduction --- p.24
            Chapter 2.3.2 --- Segmentation and Pattern Matching --- p.25
            Chapter 2.3.3 --- Discussion --- p.27
        Chapter 2.4 --- MLP/HMM HYBRID APPROACH --- p.28
            Chapter 2.4.1 --- Introduction --- p.28
            Chapter 2.4.2 --- Architecture of Hybrid MLP/HMM Systems --- p.29
            Chapter 2.4.3 --- Discussions --- p.30
        Chapter 2.5 --- SYNTACTIC GRAMMAR --- p.30
            Chapter 2.5.1 --- Introduction --- p.30
            Chapter 2.5.2 --- Word formation and Syntactic Processing --- p.31
            Chapter 2.5.3 --- Discussion --- p.32
        Chapter 2.6 --- SUMMARY --- p.32
    Chapter 3 --- Neural Network As Pattern Classifier --- p.34
        Chapter 3.1 --- INTRODUCTION --- p.34
        Chapter 3.2 --- TRAINING ALGORITHMS AND TOPOLOGIES --- p.35
            Chapter 3.2.1 --- Multilayer Perceptrons --- p.35
            Chapter 3.2.2 --- Recurrent Neural Networks --- p.39
            Chapter 3.2.3 --- Self-organizing Maps --- p.41
            Chapter 3.2.4 --- Learning Vector Quantization --- p.43
        Chapter 3.3 --- EXPERIMENTS --- p.44
            Chapter 3.3.1 --- The Data Set --- p.44
            Chapter 3.3.2 --- Preprocessing of the Speech Data --- p.45
            Chapter 3.3.3 --- The Pattern Classifiers --- p.50
        Chapter 3.4 --- RESULTS AND DISCUSSIONS --- p.53
    Chapter 4 --- High Level Context Information --- p.56
        Chapter 4.1 --- INTRODUCTION --- p.56
        Chapter 4.2 --- HIDDEN MARKOV MODEL APPROACH --- p.57
        Chapter 4.3 --- THE DYNAMIC PROGRAMMING APPROACH --- p.59
        Chapter 4.4 --- THE SYNTACTIC GRAMMAR APPROACH --- p.60
    Chapter 5 --- Finite State Grammar Network --- p.62
        Chapter 5.1 --- INTRODUCTION --- p.62
        Chapter 5.2 --- THE GRAMMAR COMPILATION --- p.63
            Chapter 5.2.1 --- Introduction --- p.63
            Chapter 5.2.2 --- K-Tails Clustering Method --- p.66
            Chapter 5.2.3 --- Inference of finite state grammar --- p.67
            Chapter 5.2.4 --- Error Correcting Parsing --- p.69
        Chapter 5.3 --- EXPERIMENT --- p.71
        Chapter 5.4 --- RESULTS AND DISCUSSIONS --- p.73
    Chapter 6 --- The Integrated System --- p.81
        Chapter 6.1 --- INTRODUCTION --- p.81
        Chapter 6.2 --- POSTPROCESSING OF NEURAL NETWORK OUTPUT --- p.82
            Chapter 6.2.1 --- Activation Threshold --- p.82
            Chapter 6.2.2 --- Duration Threshold --- p.85
            Chapter 6.2.3 --- Merging of Phoneme boundaries --- p.88
        Chapter 6.3 --- THE ERROR CORRECTING PARSER --- p.90
        Chapter 6.4 --- RESULTS AND DISCUSSIONS --- p.96
    Chapter 7 --- Conclusions --- p.101
    Bibliography --- p.10
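    The error-correcting parsing named in Chapters 5 and 6 admits a compact sketch: a shortest-path search over (input position, grammar state) pairs that finds the minimum edit distance from the recognizer's phoneme string to any string the finite-state grammar accepts. The FSA encoding, the unit edit costs, and all names below are illustrative assumptions, not the thesis's exact parser.

```python
import heapq

def error_correcting_parse(phones, arcs, start, finals):
    # phones : recognized phoneme sequence (list of symbols)
    # arcs   : dict mapping state -> list of (symbol, next_state) grammar arcs
    # start  : initial grammar state; finals : set of accepting states
    # Returns the minimum number of substitutions, insertions, and
    # deletions needed to turn `phones` into a string the grammar accepts.
    n = len(phones)
    dist = {(0, start): 0}
    heap = [(0, 0, start)]
    while heap:
        cost, i, q = heapq.heappop(heap)
        if cost > dist.get((i, q), float("inf")):
            continue  # stale queue entry
        if i == n and q in finals:
            return cost  # Dijkstra: the first accepting pop is optimal
        moves = []
        if i < n:
            moves.append((cost + 1, i + 1, q))  # delete an input phone
        for sym, r in arcs.get(q, []):
            if i < n:
                # Consume one input phone along this arc: free if it
                # matches the arc symbol, one substitution otherwise.
                moves.append((cost + (sym != phones[i]), i + 1, r))
            moves.append((cost + 1, i, r))      # insert a grammar phone
        for c, j, r in moves:
            if c < dist.get((j, r), float("inf")):
                dist[(j, r)] = c
                heapq.heappush(heap, (c, j, r))
    return float("inf")  # grammar has no accepting path
```

    For example, with arcs = {0: [('s', 1)], 1: [('i', 2)], 2: [('k', 3), ('t', 3)]}, start = 0, and finals = {3}, the misrecognized sequence ['s', 'e', 'k'] parses with cost 1 (one substitution), correcting the middle phone to fit the grammar.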