
    A sticky HDP-HMM with application to speaker diarization

    We consider the problem of speaker diarization: segmenting an audio recording of a meeting into temporal segments corresponding to individual speakers. The problem is rendered particularly difficult by the fact that we are not allowed to assume knowledge of the number of people participating in the meeting. To address this problem, we take a Bayesian nonparametric approach to speaker diarization that builds on the hierarchical Dirichlet process hidden Markov model (HDP-HMM) of Teh et al. [J. Amer. Statist. Assoc. 101 (2006) 1566-1581]. Although the basic HDP-HMM tends to over-segment the audio data, creating redundant states and rapidly switching among them, we describe an augmented HDP-HMM that provides effective control over the switching rate. We also show that this augmentation makes it possible to treat emission distributions nonparametrically. To scale the resulting architecture to realistic diarization problems, we develop a sampling algorithm that employs a truncated approximation of the Dirichlet process to jointly resample the full state sequence, greatly improving mixing rates. Working with a benchmark NIST data set, we show that our Bayesian nonparametric architecture yields state-of-the-art speaker diarization results. Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics; DOI: http://dx.doi.org/10.1214/10-AOAS395.
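
    The following is a minimal, illustrative sketch of the "sticky" idea described in the abstract, not the authors' code: transition rows are drawn for a finite truncation of a sticky HDP-HMM, with an extra mass kappa on the diagonal that discourages rapid switching. The function name, the truncation level L, and all parameter values are assumptions chosen for illustration.

        import numpy as np

        def sticky_hdp_hmm_transitions(L=10, gamma=1.0, alpha=5.0, kappa=20.0, seed=0):
            """Draw transition rows for an L-state truncated sticky HDP-HMM.

            beta ~ Dirichlet(gamma/L, ..., gamma/L)   shared top-level weights
            pi_j ~ Dirichlet(alpha*beta + kappa*e_j)  row j gets extra self-transition mass
            """
            rng = np.random.default_rng(seed)
            beta = rng.dirichlet(np.full(L, gamma / L))
            pi = np.empty((L, L))
            for j in range(L):
                concentration = alpha * beta + kappa * (np.arange(L) == j)
                pi[j] = rng.dirichlet(concentration)
            return beta, pi

        beta, pi = sticky_hdp_hmm_transitions()
        print(np.diag(pi).mean())  # self-transition probabilities inflated by kappa

    Setting kappa = 0 recovers a plain truncated HDP-HMM prior, which is what tends to switch states too rapidly.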

    HMM based scenario generation for an investment optimisation problem

    Geometric Brownian motion (GBM) is a standard method for modelling financial time series. An important criticism of this method is that the parameters of the GBM are assumed to be constant; as a result, important features of the time series, such as extreme behaviour or volatility clustering, cannot be captured. We propose an approach in which the parameters of the GBM are able to switch between regimes; more precisely, they are governed by a hidden Markov chain. Thus, we model the financial time series via a hidden Markov model (HMM) with a GBM in each state. Using this approach, we generate scenarios for a financial portfolio optimisation problem in which the portfolio CVaR is minimised. Numerical results are presented. This study was funded by NET ACE at OptiRisk Systems.
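
    A minimal sketch of the scenario-generation idea described in the abstract, assuming a two-regime HMM with a GBM in each state; the transition matrix, drift and volatility values below are illustrative assumptions, not estimates from the paper.

        import numpy as np

        def simulate_regime_switching_gbm(n_steps=250, s0=100.0, dt=1/250, seed=0,
                                          trans=((0.98, 0.02), (0.05, 0.95)),
                                          mu=(0.10, -0.05), sigma=(0.15, 0.40)):
            """Simulate one price path from a 2-state HMM with a GBM in each state."""
            rng = np.random.default_rng(seed)
            trans = np.asarray(trans)
            state = 0
            prices = [s0]
            for _ in range(n_steps):
                state = rng.choice(len(mu), p=trans[state])   # hidden regime switch
                z = rng.standard_normal()
                growth = (mu[state] - 0.5 * sigma[state] ** 2) * dt \
                         + sigma[state] * np.sqrt(dt) * z
                prices.append(prices[-1] * np.exp(growth))    # GBM step in current regime
            return np.array(prices)

        # a bundle of such paths can serve as scenarios for a CVaR optimisation
        scenarios = np.stack([simulate_regime_switching_gbm(seed=s) for s in range(100)])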

    Labor Market Entry and Earnings Dynamics: Bayesian Inference Using Mixtures-of-Experts Markov Chain Clustering

    This paper analyzes patterns in the earnings development of young labor market entrants over their life cycle. We identify four distinct types of transition patterns between discrete earnings states in a large administrative data set. Further, we investigate the effects of labor market conditions at the time of entry on the probability of belonging to each transition type. To estimate our statistical model we use a model-based clustering approach. The statistical challenge in our application comes from the difficulty of extending distance-based clustering approaches to the problem of identifying groups of similar time series in a panel of discrete-valued time series. We use Markov chain clustering, proposed by Pamminger and Frühwirth-Schnatter (2010), an approach for clustering discrete-valued time series obtained by observing a categorical variable with several states. This method is based on finite mixtures of first-order time-homogeneous Markov chain models. In order to analyze group membership we present an extension to this approach by formulating a probabilistic model for the latent group indicators within the Bayesian classification rule using a multinomial logit model.
    Keywords: labor market entry conditions, transition data, Markov chain Monte Carlo, multinomial logit, panel data, auxiliary mixture sampler, Bayesian statistics
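
    As a rough illustration of the clustering idea, not the authors' implementation, the sketch below scores one discrete-valued sequence against a finite mixture of first-order Markov chains and returns posterior group-membership probabilities. In the paper the prior group probabilities come from a multinomial logit on entry conditions; here they are fixed assumed values, and the transition matrices are toy numbers.

        import numpy as np

        def sequence_loglik(seq, trans):
            """Log-likelihood of a discrete sequence under a first-order Markov chain."""
            trans = np.asarray(trans)
            return sum(np.log(trans[i, j]) for i, j in zip(seq[:-1], seq[1:]))

        def cluster_posteriors(seq, cluster_trans, prior):
            """Posterior probability that the sequence belongs to each cluster,
            combining the Markov chain likelihood with prior group probabilities."""
            logp = np.array([np.log(p) + sequence_loglik(seq, T)
                             for p, T in zip(prior, cluster_trans)])
            logp -= logp.max()            # numerical stabilisation
            w = np.exp(logp)
            return w / w.sum()

        # toy example: two clusters over three earnings states, assumed values
        T_stable = [[0.90, 0.08, 0.02], [0.10, 0.80, 0.10], [0.05, 0.15, 0.80]]
        T_mobile = [[0.50, 0.30, 0.20], [0.30, 0.40, 0.30], [0.20, 0.30, 0.50]]
        print(cluster_posteriors([0, 0, 1, 2, 2, 2], [T_stable, T_mobile], prior=[0.6, 0.4]))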

    MCMC implementation for Bayesian hidden semi-Markov models with illustrative applications

    Hidden Markov models (HMMs) are flexible, well-established models useful in a diverse range of applications. However, one potential limitation of such models lies in their inability to explicitly structure the holding times of each hidden state. Hidden semi-Markov models (HSMMs) are more useful in the latter respect, as they incorporate additional temporal structure by explicitly modelling the holding times. However, HSMMs have generally received less attention in the literature, mainly due to their intensive computational requirements. Here a Bayesian implementation of HSMMs is presented. Recursive algorithms are proposed in conjunction with Metropolis-Hastings in such a way as to avoid sampling from the distribution of the hidden state sequence in the MCMC sampler. This provides a computationally tractable estimation framework for HSMMs, avoiding the limitations associated with the conventional EM algorithm regarding model flexibility. Performance of the proposed implementation is demonstrated through simulation experiments as well as an illustrative application relating to recurrent failures in a network of underground water pipes, where random effects are also included in the HSMM to allow for pipe heterogeneity. The final published version is available via http://dx.doi.org/10.1007/s11222-013-9399-z.
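
    To make the notion of explicit holding times concrete, here is a small generative sketch of a hidden semi-Markov model with Poisson-distributed state durations. It illustrates the model class only, not the paper's MCMC implementation; the function name and all parameter values are assumptions.

        import numpy as np

        def sample_hsmm_path(n_obs, trans, dur_means, emit_means, emit_sd=1.0, seed=0):
            """Generate observations from a simple HSMM: each visit to a state lasts
            an explicit (Poisson-distributed) holding time, unlike an HMM whose
            implicit holding times are geometric."""
            rng = np.random.default_rng(seed)
            trans = np.asarray(trans)
            states, obs = [], []
            s = rng.integers(len(dur_means))
            while len(obs) < n_obs:
                d = 1 + rng.poisson(dur_means[s])             # explicit holding time
                for _ in range(d):
                    states.append(s)
                    obs.append(rng.normal(emit_means[s], emit_sd))
                s = rng.choice(len(dur_means), p=trans[s])    # jump to a new state
            return np.array(states[:n_obs]), np.array(obs[:n_obs])

        # toy 2-state example with assumed parameters
        states, obs = sample_hsmm_path(200, trans=[[0.0, 1.0], [1.0, 0.0]],
                                       dur_means=[20, 5], emit_means=[0.0, 3.0])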

    Distributions associated with general runs and patterns in hidden Markov models

    This paper gives a method for computing distributions associated with patterns in the state sequence of a hidden Markov model, conditional on observing all or part of the observation sequence. Probabilities are computed for very general classes of patterns (competing patterns and generalized later patterns), and thus the theory includes as special cases results for a large class of problems with wide application. The unobserved state sequence is assumed to be Markovian with a general order of dependence. An auxiliary Markov chain is associated with the state sequence and is used to simplify the computations. Two examples are given to illustrate the use of the methodology. The first application mainly illustrates the basic steps in applying the theory; the second is a more detailed application to DNA sequences and shows that the methods can be adapted to include restrictions related to biological knowledge. Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics; DOI: http://dx.doi.org/10.1214/07-AOAS125.
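
    A toy sketch of the auxiliary-chain idea (finite Markov chain imbedding): it computes the probability that a plain Markov chain produces a run of a given length in a target state within n steps, by augmenting each state with the current run length. The paper's method additionally conditions on the HMM observations and covers far more general pattern classes; this simplified, unconditional version and its parameter values are for illustration only.

        import numpy as np

        def prob_run_in_markov_chain(trans, init, target, run_len, n_steps):
            """Probability that the state sequence contains run_len consecutive
            visits to `target` within n_steps, via an auxiliary chain whose extra
            coordinate tracks the current run length (absorbing once the run occurs)."""
            trans, init = np.asarray(trans), np.asarray(init)
            K = trans.shape[0]
            n_aux = K * run_len + 1                    # (state, run length) pairs + absorbing
            idx = lambda s, r: s * run_len + r
            P = np.zeros((n_aux, n_aux))
            P[-1, -1] = 1.0                            # absorbing: run already seen
            for s in range(K):
                for r in range(run_len):
                    for t in range(K):
                        r_next = r + 1 if t == target else 0
                        dest = -1 if r_next == run_len else idx(t, r_next)
                        P[idx(s, r), dest] += trans[s, t]
            p = np.zeros(n_aux)
            for s in range(K):
                r0 = 1 if s == target else 0
                if r0 == run_len:
                    p[-1] += init[s]
                else:
                    p[idx(s, r0)] += init[s]
            for _ in range(n_steps - 1):
                p = p @ P
            return p[-1]

        # toy example: chance of 3 consecutive visits to state 1 within 10 steps
        print(prob_run_in_markov_chain([[0.7, 0.3], [0.4, 0.6]], [0.5, 0.5],
                                       target=1, run_len=3, n_steps=10))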