586 research outputs found

    State-dependent time warping in the trended hidden Markov model

    In this paper we present an algorithm for estimating state-dependent polynomial coefficients in the nonstationary-state hidden Markov model (or trended HMM) that allows for linear time warping, or scaling, in individual model states. The need for state-dependent time warping arises because, owing to speaking-rate variation and other temporal factors in speech, the multiple state-segmented speech data sequences used to train a single set of polynomial coefficients often vary appreciably in length. The algorithm is developed within a general framework that uses auxiliary parameters which, although of no interest in themselves, provide an intermediate tool for achieving maximal accuracy in estimating the polynomial coefficients of the trended HMM. It is proved that the proposed estimation algorithm converges to a solution equivalent to the state-optimized maximum likelihood estimate. The effectiveness of the algorithm is demonstrated in experiments designed to fit a single trended HMM simultaneously to multiple sequences of speech data that are different renditions of the same word yet vary over a wide range in sequence length. Speech recognition experiments have been performed on the standard acoustic-phonetic TIMIT database. The results demonstrate the advantage of time-warping trended HMMs over regular trended HMMs, with an improvement of about 10 to 15% in recognition rate. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/31358/1/0000269.pd

    A Bayesian Network View on Acoustic Model-Based Techniques for Robust Speech Recognition

    This article provides a unifying Bayesian network view on various approaches for acoustic model adaptation, missing feature, and uncertainty decoding that are well known in the literature on robust automatic speech recognition. The representatives of these classes can often be deduced from a Bayesian network that extends the conventional hidden Markov models used in speech recognition. These extensions, in turn, can in many cases be motivated by an underlying observation model that relates clean and distorted feature vectors. By converting the observation models into a Bayesian network representation, we formulate the corresponding compensation rules, leading to a unified view on known derivations as well as to new formulations for certain approaches. The generic Bayesian perspective provided in this contribution thus highlights structural differences and similarities between the analyzed approaches.

    Spline-based nonparametric inference in general state-switching models

    State-switching models combine immense flexibility with relative mathematical simplicity and computational tractability and, as a consequence, have established themselves as general-purpose models for time series data. In this paper, we provide an overview of ways to use penalized splines to allow for flexible nonparametric inference within state-switching models, and provide a critical discussion of the use of corresponding classes of models. The methods are illustrated using animal acceleration data and energy price data. Postprint. Peer reviewed.

    Psychological Behavior Analysis Using Advanced Signal Processing Techniques for fMRI Data

    Data for a psychological analysis of voluntary reciprocal trust games were obtained using functional magnetic resonance imaging (fMRI) hyperscanning for 44 pairs of strangers throughout 36 trust games (TG) and 16 control games (CG). Hidden Markov models (HMMs) are proposed to train on and classify the fMRI data acquired from these brain regions and to extract the essential features of the first player's initial decision to trust or not trust the second player. The results are evaluated using different versions of the multifold cross-validation technique and compared to other speech data and other advanced signal processing techniques, including linear classification, support vector machines (SVMs), and HMMs. With above 80% classification accuracy for the HMM, compared to no more than 66% for a linear classifier and an SVM, the experimental results demonstrate that HMMs can be adopted as an outstanding paradigm for predicting the psychological financial (trust/non-trust) activities reflected by the neural responses recorded using fMRI. Additionally, extracting the specific decision period and clustering the continuous time series increased the classification accuracy by almost 20%.