3,458 research outputs found

    Bayesian Speaker Adaptation Based on a New Hierarchical Probabilistic Model

    In this paper, a new hierarchical Bayesian speaker adaptation method called HMAP is proposed that combines the advantages of three conventional algorithms, maximum a posteriori (MAP), maximum-likelihood linear regression (MLLR), and eigenvoice, resulting in excellent performance across a wide range of adaptation conditions. The new method efficiently utilizes intra-speaker and inter-speaker correlation information by modeling phone and speaker subspaces in a consistent hierarchical Bayesian way. The phone variations for a specific speaker are assumed to be located in a low-dimensional subspace. The phone coordinate, which is shared among different speakers, implicitly contains the intra-speaker correlation information. For a specific speaker, the phone variations, represented by speaker-dependent eigenphones, are concatenated into a supervector. The eigenphone supervector space is also a low-dimensional speaker subspace, which contains inter-speaker correlation information. Using principal component analysis (PCA), a new hierarchical probabilistic model for the generation of the speech observations is obtained. Speaker adaptation based on the new hierarchical model is derived using the maximum a posteriori criterion in a top-down manner. Both batch adaptation and online adaptation schemes are proposed. With tuned parameters, the new method can handle varying amounts of adaptation data automatically and efficiently. Experimental results on a Mandarin Chinese continuous speech recognition task show good performance under all testing conditions.

    On adaptive decision rules and decision parameter adaptation for automatic speech recognition

    Recent advances in automatic speech recognition are accomplished by designing a plug-in maximum a posteriori decision rule such that the forms of the acoustic and language model distributions are specified and the parameters of the assumed distributions are estimated from a collection of speech and language training corpora. Maximum-likelihood point estimation is by far the most prevalent training method. However, due to the problems of unknown speech distributions, sparse training data, high spectral and temporal variabilities in speech, and possible mismatch between training and testing conditions, a dynamic training strategy is needed. To cope with the changing speakers and speaking conditions in real operational conditions for high-performance speech recognition, such a strategy incorporates a small amount of speaker- and environment-specific adaptation data into the training process. Bayesian adaptive learning is an optimal way to combine prior knowledge in an existing collection of general models with a new set of condition-specific adaptation data. In this paper, the mathematical framework for Bayesian adaptation of acoustic and language model parameters is first described. Maximum a posteriori point estimation is then developed for hidden Markov models and a number of useful parameter densities commonly used in automatic speech recognition and natural language processing.
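
    The combination of prior knowledge with adaptation data described in this abstract can be illustrated with the simplest conjugate case: a MAP estimate of a single Gaussian mean with known variance. This is a minimal sketch under stated assumptions, not the paper's formulation for hidden Markov models. The function name `map_mean` and the prior weight `tau` (a pseudo-count expressing confidence in the prior) are hypothetical names chosen for illustration.

```python
def map_mean(prior_mean, tau, observations):
    """MAP estimate of a scalar Gaussian mean under a conjugate Gaussian prior.

    Assumption (illustrative only): the observation variance is known, and
    `tau` is the prior's weight expressed as an equivalent number of
    observations. With no data the estimate is the prior mean; with a lot
    of data it approaches the sample mean (the maximum-likelihood estimate).
    """
    n = len(observations)
    if n == 0:
        return prior_mean
    sample_mean = sum(observations) / n
    # Interpolate between prior mean and sample mean, weighted by tau and n.
    return (tau * prior_mean + n * sample_mean) / (tau + n)


if __name__ == "__main__":
    few = [2.0, 2.0]
    many = [2.0] * 1000
    print(map_mean(0.0, 10.0, few))   # stays close to the prior mean 0.0
    print(map_mean(0.0, 10.0, many))  # moves close to the sample mean 2.0
```

    With little adaptation data the estimate stays near the prior, and with more data it moves toward the maximum-likelihood estimate, which is the behaviour that makes this kind of estimator attractive when adaptation data are sparse.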

    On-line adaptive learning of the correlated continuous density hidden Markov models for speech recognition

    We extend our previously proposed quasi-Bayes adaptive learning framework to cope with the correlated continuous density hidden Markov models (HMMs) with Gaussian mixture state observation densities in which all mean vectors are assumed to be correlated and have a joint prior distribution. A successive approximation algorithm is proposed to update the correlated mean vectors. As an example, by applying the method to an on-line speaker adaptation application, the algorithm is experimentally shown to be asymptotically convergent and to enhance the efficiency and the effectiveness of the Bayes learning by taking into account the correlation information between different model parameters. The technique can be used to cope with the time-varying nature of some acoustic and environmental variabilities, including mismatches caused by changing speakers, channels, transducers, environments, and so on.

    On-Line Bayesian Speaker Adaptation By Using Tree-Structured Transformation and Robust Priors

    This paper presents new results obtained with our previously proposed on-line Bayesian learning approach for affine transformation parameter estimation in speaker adaptation. The on-line Bayesian learning technique allows updating parameter estimates after each utterance, and it can accommodate flexible forms of transformation functions as well as prior probability density functions. We show through experimental results the robustness of heavy-tailed priors to mismatch in prior density estimation. We also show that by properly choosing the transformation matrices and depths of hierarchical trees, recognition performance improved significantly.
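
    The idea of updating estimates after each utterance can be sketched with a much simpler model than the affine transformations studied in the paper: a scalar Gaussian mean with a conjugate prior, where the posterior after one utterance becomes the prior for the next. Everything here is an illustrative assumption, including the names `online_update`, `mean`, and `tau`, and the known-variance setting.

```python
def online_update(mean, tau, utterance):
    """One on-line Bayesian step for a scalar Gaussian mean (known variance).

    `mean` and `tau` describe the current prior: its mean and its weight in
    equivalent observations. The returned pair describes the posterior,
    which serves as the prior for the next utterance. Illustrative only.
    """
    n = len(utterance)
    if n == 0:
        return mean, tau
    new_tau = tau + n
    new_mean = (tau * mean + sum(utterance)) / new_tau
    return new_mean, new_tau


if __name__ == "__main__":
    utterances = [[1.0, 3.0], [2.0], [4.0, 0.0, 2.0]]
    mean, tau = 0.0, 4.0
    for frames in utterances:
        mean, tau = online_update(mean, tau, frames)
        print(mean, tau)
```

    In this conjugate case, processing the utterances one at a time gives the same final estimate as processing all frames in a single batch, so nothing is lost by updating incrementally.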

    Online adaptive learning of continuous-density hidden Markov models based on multiple-stream prior evolution and posterior pooling

    We introduce a new adaptive Bayesian learning framework, called multiple-stream prior evolution and posterior pooling, for online adaptation of the continuous density hidden Markov model (CDHMM) parameters. Among the three architectures we proposed for this framework, we study in detail a specific two-stream system where linear transformations are applied to the mean vectors of the CDHMMs to control the evolution of their prior distribution. This new stream of prior distribution can be combined with another stream of prior distribution evolved without any constraints applied. In a series of speaker adaptation experiments on the task of continuous Mandarin speech recognition, we show that the new adaptation algorithm achieves fast-adaptation performance similar to that of the incremental maximum likelihood linear regression (MLLR) when only a small amount of adaptation data is available, while maintaining the good asymptotic convergence property of our previously proposed quasi-Bayes adaptation algorithms.

    A review of domain adaptation without target labels

    Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting and representing features such that a source classifier performs well on the target domain. Inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research. Comment: 20 pages, 5 figures
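
    The sample-based idea of weighting source observations by their importance to the target domain can be sketched as follows. This is a toy illustration, not a method taken from the review: it assumes both domain densities are known one-dimensional Gaussians, whereas in practice the density ratio has to be estimated. The function names and the choice of source N(0, 1) and target N(1, 1) are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weight(x, source=(0.0, 1.0), target=(1.0, 1.0)):
    """Weight w(x) = p_target(x) / p_source(x) for known Gaussian domains."""
    return gaussian_pdf(x, *target) / gaussian_pdf(x, *source)


def weighted_risk(losses, weights):
    """Self-normalized importance-weighted average of per-sample losses."""
    total = sum(weights)
    return sum(w * l for w, l in zip(weights, losses)) / total


if __name__ == "__main__":
    source_samples = [-1.0, 0.0, 0.5, 1.5]
    losses = [1.0, 0.0, 0.0, 1.0]  # hypothetical per-sample losses
    weights = [importance_weight(x) for x in source_samples]
    print(weights)
    print(weighted_risk(losses, weights))
```

    Source samples that are more likely under the target domain receive larger weights, so the weighted average loss is dominated by the regions where the target data concentrate.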