A sticky HDP-HMM with application to speaker diarization
We consider the problem of speaker diarization: segmenting an
audio recording of a meeting into temporal segments corresponding to individual
speakers. The problem is rendered particularly difficult by the fact that the
number of people participating in the meeting cannot be assumed known in
advance. To address this problem, we take a Bayesian nonparametric approach
to speaker diarization that builds on the hierarchical Dirichlet process hidden
Markov model (HDP-HMM) of Teh et al. [J. Amer. Statist. Assoc. 101 (2006)
1566--1581]. Although the basic HDP-HMM tends to over-segment the audio
data---creating redundant states and rapidly switching among them---we describe
an augmented HDP-HMM that provides effective control over the switching rate.
We also show that this augmentation makes it possible to treat emission
distributions nonparametrically. To scale the resulting architecture to
realistic diarization problems, we develop a sampling algorithm that employs a
truncated approximation of the Dirichlet process to jointly resample the full
state sequence, greatly improving mixing rates. Working with a benchmark NIST
data set, we show that our Bayesian nonparametric architecture yields
state-of-the-art speaker diarization results.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS395 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
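The "sticky" augmentation described above can be illustrated with a short sketch. Under a finite truncated approximation of the Dirichlet process, each transition row of the HDP-HMM is drawn from a Dirichlet distribution whose concentration parameters mix a shared base measure beta with extra mass kappa on the self-transition entry; large kappa suppresses the rapid state switching of the basic HDP-HMM. The function name and parameter choices below are illustrative, not taken from the paper:

```python
import random

def sticky_transition_rows(beta, alpha, kappa, seed=0):
    """Sketch of the sticky HDP-HMM transition prior under a finite
    (truncated) approximation: row j is drawn from
    Dirichlet(alpha * beta + kappa * e_j), so the extra mass kappa on the
    diagonal entry biases each state toward self-transitions.

    beta is the shared base measure over the L truncated states; alpha is
    the concentration parameter; kappa is the sticky self-transition bias.
    (Illustrative sketch, not the authors' implementation.)
    """
    rng = random.Random(seed)
    L = len(beta)
    rows = []
    for j in range(L):
        # Dirichlet draw via normalized Gamma variates.
        conc = [alpha * b + (kappa if i == j else 0.0)
                for i, b in enumerate(beta)]
        draws = [rng.gammavariate(c, 1.0) for c in conc]
        total = sum(draws)
        rows.append([d / total for d in draws])
    return rows

# With a large kappa the self-transition probabilities dominate.
beta = [0.2] * 5
rows = sticky_transition_rows(beta, alpha=1.0, kappa=50.0)
```

Setting kappa = 0 recovers the transition prior of the basic HDP-HMM, which is exactly the regime in which the over-segmentation described above appears.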
Interleaved Factorial Non-Homogeneous Hidden Markov Models for Energy Disaggregation
To reduce energy demand in households it is useful to know which electrical
appliances are in use at what times. Monitoring individual appliances is costly
and intrusive, whereas data on overall household electricity use is more easily
obtained. In this paper, we consider the energy disaggregation problem where a
household's electricity consumption is disaggregated into the component
appliances. The factorial hidden Markov model (FHMM) is a natural model to fit
this data. We enhance this generic model by introducing two constraints on the
state sequence of the FHMM. The first is to use a non-homogeneous Markov chain,
modelling how appliance usage varies over the day, and the other is to enforce
that at most one chain changes state at each time step. This yields a new model
which we call the interleaved factorial non-homogeneous hidden Markov model
(IFNHMM). We evaluated the ability of this model to perform disaggregation in
an ultra-low frequency setting, over a data set of 251 English households. In
this new setting, the IFNHMM outperforms the FHMM in terms of recovering the
energy used by the component appliances, because stronger constraints have
been imposed on the states of the hidden Markov chains. Interestingly, we find
that model performance varies significantly across households,
underscoring the importance of using larger-scale data in the disaggregation
problem.
Comment: 5 pages, 1 figure, conference, The NIPS workshop on Machine Learning
for Sustainability, Lake Tahoe, NV, USA, 201
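The two constraints above can be sketched generatively: each appliance is one Markov chain, the transition row may depend on the time step (non-homogeneity), and at each step at most one chain is allowed to change state (interleaving), so the aggregate reading changes due to at most one appliance at a time. All names and parameter values here are illustrative, not from the paper:

```python
import random

def sample_ifnhmm(trans, power, T, seed=0):
    """Generative sketch of the interleaved factorial non-homogeneous HMM
    constraints: at each time step only one chain (chosen uniformly at
    random here) is permitted to change state.

    trans(k, s, t) returns the transition row for chain k currently in
    state s at time t (non-homogeneous: it may vary over the day);
    power[k][s] is the power draw of appliance k in state s.
    """
    rng = random.Random(seed)
    n_chains = len(power)
    states = [0] * n_chains               # all appliances start in state 0 ("off")
    history, aggregate = [], []
    for t in range(T):
        k = rng.randrange(n_chains)       # only chain k may switch at step t
        row = trans(k, states[k], t)
        u, acc = rng.random(), 0.0
        for s, p in enumerate(row):       # inverse-CDF draw from the row
            acc += p
            if u < acc:
                states[k] = s
                break
        history.append(tuple(states))
        aggregate.append(sum(power[c][states[c]] for c in range(n_chains)))
    return history, aggregate

# Two hypothetical appliances with two states each (off/on).
trans = lambda k, s, t: [0.9, 0.1] if s == 0 else [0.1, 0.9]
power = [[0.0, 100.0], [0.0, 2000.0]]     # e.g. fridge, kettle (illustrative)
history, aggregate = sample_ifnhmm(trans, power, T=200)
```

Because only one chain moves per step, consecutive state vectors in `history` differ in at most one position; inference under the IFNHMM exploits exactly this restriction of the joint state space.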
Metropolis Sampling
Monte Carlo (MC) sampling methods are widely applied in Bayesian inference,
system simulation and optimization problems. The Markov Chain Monte Carlo
(MCMC) algorithms are a well-known class of MC methods which generate a Markov
chain with the desired invariant distribution. In this document, we focus on
the Metropolis-Hastings (MH) sampler, which can be considered the atom of
MCMC techniques, introducing the basic notions and different properties. We
describe in detail all the elements involved in the MH algorithm and the most
relevant variants. Several improvements and recent extensions proposed in the
literature are also briefly discussed, providing a quick but thorough
overview of the current landscape of Metropolis-based sampling.
Comment: Wiley StatsRef-Statistics Reference Online, 201
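A minimal random-walk Metropolis sketch makes the mechanism concrete: propose a perturbed state, then accept it with probability min(1, pi(x')/pi(x)). With a symmetric Gaussian proposal the Hastings correction cancels, so only the (unnormalized) target ratio is needed; the target and tuning values below are illustrative:

```python
import math
import random

def metropolis_hastings(log_target, proposal_std, x0, n_samples, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal.

    Because the proposal q(x'|x) is symmetric, the acceptance probability
    reduces to min(1, pi(x') / pi(x)), computed here in log space for
    numerical stability. log_target may be unnormalized.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        x_prop = x + rng.gauss(0.0, proposal_std)
        log_alpha = log_target(x_prop) - log_target(x)
        if math.log(rng.random()) < log_alpha:
            x = x_prop                     # accept; otherwise keep current x
        samples.append(x)
    return samples

# Target: standard normal, known only up to a constant.
log_target = lambda x: -0.5 * x * x
samples = metropolis_hastings(log_target, proposal_std=1.0,
                              x0=5.0, n_samples=20000)
burned = samples[5000:]                    # discard burn-in
mean = sum(burned) / len(burned)
```

The chain is started far from the mode (x0 = 5) to show that, after burn-in, the empirical moments approach those of the invariant distribution regardless of initialization.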