Clustering hidden Markov models with variational HEM
The hidden Markov model (HMM) is a widely-used generative model that copes
with sequential data, assuming that each observation is conditioned on the
state of a hidden Markov chain. In this paper, we derive a novel algorithm to
cluster HMMs based on the hierarchical EM (HEM) algorithm. The proposed
algorithm i) clusters a given collection of HMMs into groups of HMMs that are
similar, in terms of the distributions they represent, and ii) characterizes
each group by a "cluster center", i.e., a novel HMM that is representative for
the group, in a manner that is consistent with the underlying generative model
of the HMM. To cope with intractable inference in the E-step, the HEM algorithm
is formulated as a variational optimization problem, and efficiently solved for
the HMM case by leveraging an appropriate variational approximation. The
benefits of the proposed algorithm, which we call variational HEM (VHEM), are
demonstrated on several tasks involving time-series data, such as hierarchical
clustering of motion capture sequences, and automatic annotation and retrieval
of music and of online hand-writing data, showing improvements over current
methods. In particular, our variational HEM algorithm effectively leverages
large amounts of data when learning annotation models by using an efficient
hierarchical estimation procedure, which reduces learning times and memory
requirements, while improving model robustness through better regularization.
Comment: 44 pages
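As an illustration of the generative assumption stated in this abstract (each observation is conditioned on the state of a hidden Markov chain), the following is a minimal sketch of the standard forward algorithm for a discrete HMM. It is not the VHEM algorithm from the paper. The function name `forward_likelihood` and the two-state parameters are hypothetical values chosen only for the example, and the observation sequence is assumed to be non-empty.

```python
def forward_likelihood(init, trans, emit, observations):
    """Return P(observations) under a discrete HMM via the forward algorithm.

    init[j]     : probability of starting in hidden state j
    trans[i][j] : probability of moving from state i to state j
    emit[j][o]  : probability of emitting symbol o while in state j
    """
    n_states = len(init)
    # alpha[j] = P(o_1..o_t, state_t = j), initialised with the first symbol.
    alpha = [init[j] * emit[j][observations[0]] for j in range(n_states)]
    for obs in observations[1:]:
        # Propagate through the Markov chain, then weight by the emission.
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs]
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state HMM over a binary alphabet (illustrative values only).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
likelihood = forward_likelihood(init, trans, emit, [0, 1, 0])
```

Because the forward recursion marginalizes over all hidden state paths, the likelihoods of all observation sequences of a fixed length sum to one, which is a convenient sanity check on any implementation.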
Parsimonious HMMs for Offline Handwritten Chinese Text Recognition
Recently, hidden Markov models (HMMs) have achieved promising results for
offline handwritten Chinese text recognition. However, because the vocabulary
of Chinese characters is large and each character is modeled by a uniform, fixed
number of hidden states, memory and computation demands are high. To address
this issue, we present parsimonious HMMs based on state tying, which can fully
exploit the similarities among different Chinese characters. A two-step
algorithm with a data-driven question set is adopted to generate the tied-state
pool using a likelihood measure. The proposed parsimonious HMMs, with both
Gaussian mixture models (GMMs) and deep neural networks (DNNs) as the emission
distributions, not only lead to a compact model but also improve recognition
accuracy through data sharing among the tied states and reduced confusion among
state classes. Tested on the ICDAR-2013 competition database, in the
best-configured case, the new parsimonious DNN-HMM yields a relative character
error rate (CER) reduction of 6.2%, a 25% reduction in model size, and a 60%
reduction in decoding time over the conventional DNN-HMM. In the compact
setting with an average of one state per HMM, our parsimonious DNN-HMM
significantly outperforms the conventional DNN-HMM with a relative CER
reduction of 35.5%.
Comment: Accepted by ICFHR201
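The figures in this abstract are relative reductions, which are easy to misread as absolute percentage points. The short sketch below shows the arithmetic. The abstract does not report the baseline CER, so the baseline of 10.0% used here is a hypothetical value for illustration only, and both function names are assumptions.

```python
def relative_reduction(baseline, improved):
    """Relative reduction of a metric: (baseline - improved) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


def apply_relative_reduction(baseline, reduction):
    """Value obtained after reducing `baseline` by a relative `reduction`."""
    return baseline * (1.0 - reduction)


# Hypothetical baseline CER of 10.0% (NOT taken from the paper).
baseline_cer = 10.0
# A 6.2% relative reduction lowers the CER by 0.62 points, not 6.2 points.
new_cer = apply_relative_reduction(baseline_cer, 0.062)
```

Under that assumed baseline, a 6.2% relative reduction corresponds to a CER of 9.38%, whereas a 35.5% relative reduction would correspond to 6.45%.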
Inference of Dynamic Regimes in the Microbiome
Many studies have been performed to characterize the dynamics and stability
of the microbiome across a range of environmental contexts [Costello et al.,
2012, Faust et al., 2015]. For example, it is often of interest to identify
time intervals within which certain subsets of taxa have an interesting pattern
of behavior. Viewed abstractly, these problems often have a flavor not just of
time series modeling but also of regime detection, a problem with a rich
history across a variety of applications, including speech recognition [Fox et
al., 2011], finance [Lee, 2009], EEG analysis [Camilleri et al., 2014], and
geophysics [Weatherley and Mora, 2002]. In spite of the parallels, regime
detection methods are rarely used in microbiome analysis, most likely due to
the fact that references for these methods are scattered across several
literatures, descriptions are inaccessible outside limited research
communities, and implementations are difficult to come across.
We distill the core ideas of different regime detection methods, provide
example applications, and share reproducible code, making these techniques more
accessible to microbiome researchers. We re-analyze data of Dethlefsen and
Relman [2011], a study of the effects of antibiotics on the microbiome, using
Classification and Regression Trees (CART) [Breiman et al., 1984], Hidden
Markov Models (HMMs) [Rabiner and Juang, 1986], Bayesian nonparametric HMMs
[Teh and Jordan, 2010, Fox et al., 2008], mixtures of Gaussian Processes (GPs)
[Rasmussen and Ghahramani, 2002], switching dynamical systems [Linderman et
al., 2016], and multiple changepoint detection [Fan and Mackey, 2015]. Along
the way, we summarize each method, its relevance to the microbiome, and the
tradeoffs associated with using it. Ultimately, our goal is to describe types
of temporal or regime switching structure that can be incorporated into studies
of microbiome dynamics.
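To make the idea of HMM-based regime detection concrete, here is a minimal sketch of Viterbi decoding for a Gaussian-emission HMM. It is not the authors' code. It assumes a uniform initial distribution, a single self-transition probability `stay_prob` shared by all regimes, known regime means and standard deviations, and at least two regimes; the function names and the toy series are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi_regimes(series, means, stds, stay_prob):
    """Most likely regime sequence for `series` (log-space Viterbi decoding)."""
    k = len(means)
    log_stay = math.log(stay_prob)
    # Probability mass for leaving a regime is split evenly over the others.
    log_switch = math.log((1.0 - stay_prob) / (k - 1))
    score = [
        math.log(1.0 / k) + gaussian_logpdf(series[0], means[j], stds[j])
        for j in range(k)
    ]
    backpointers = []
    for x in series[1:]:
        new_score, pointers = [], []
        for j in range(k):
            candidates = [
                score[i] + (log_stay if i == j else log_switch) for i in range(k)
            ]
            best = max(range(k), key=lambda i: candidates[i])
            pointers.append(best)
            new_score.append(candidates[best] + gaussian_logpdf(x, means[j], stds[j]))
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final regime.
    state = max(range(k), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy series with an obvious shift in level (illustrative values only).
toy_series = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2]
regimes = viterbi_regimes(toy_series, means=[0.0, 5.0], stds=[1.0, 1.0], stay_prob=0.9)
```

A high `stay_prob` penalizes switching, so isolated noisy points tend to be absorbed into the surrounding regime rather than reported as separate regimes.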
Tech Report A Variational HEM Algorithm for Clustering Hidden Markov Models
The hidden Markov model (HMM) is a generative model that treats sequential
data under the assumption that each observation is conditioned on the state of
a discrete hidden variable that evolves in time as a Markov chain. In this
paper, we derive a novel algorithm to cluster HMMs through their probability
distributions. We propose a hierarchical EM algorithm that i) clusters a given
collection of HMMs into groups of HMMs that are similar, in terms of the
distributions they represent, and ii) characterizes each group by a "cluster
center", i.e., a novel HMM that is representative for the group. We present
several empirical studies that illustrate the benefits of the proposed
algorithm.
Comment: 13 pages, 1 figure
Statistical Modeling in Continuous Speech Recognition (CSR) (Invited Talk)
Automatic continuous speech recognition (CSR) is sufficiently mature that a
variety of real world applications are now possible including large vocabulary
transcription and interactive spoken dialogues. This paper reviews the
evolution of the statistical modelling techniques which underlie current-day
systems, specifically hidden Markov models (HMMs) and N-grams. Starting from a
description of the speech signal and its parameterisation, the various
modelling assumptions and their consequences are discussed. It then describes
various techniques by which the effects of these assumptions can be mitigated.
Despite the progress that has been made, the limitations of current modelling
techniques are still evident. The paper therefore concludes with a brief review
of some of the more fundamental modelling work now in progress.
Comment: Appears in Proceedings of the Seventeenth Conference on Uncertainty
in Artificial Intelligence (UAI2001)
Speech Recognition by Machine, A Review
This paper presents a brief survey on Automatic Speech Recognition and
discusses the major themes and advances made in the past 60 years of research,
so as to provide a technological perspective and an appreciation of the
fundamental progress that has been accomplished in this important area of
speech communication. After years of research and development, the accuracy of
automatic speech recognition remains one of the important research challenges
(e.g., variations of the context, speakers, and environment). The design of a
speech recognition system requires careful attention to the following issues:
definition of various types of speech classes, speech representation, feature
extraction techniques, speech classifiers, databases, and performance
evaluation. The problems that exist in ASR, and the various techniques that
researchers have developed to solve them, are presented in chronological
order. The authors hope that this work will be a contribution to the area of
speech recognition. The objective of this review paper is to summarize and
compare some of the well-known methods used in the various stages of a speech
recognition system and to identify research topics and applications that are at
the forefront of this exciting and challenging field.
Comment: 25 pages IEEE format, International Journal of Computer Science and
Information Security, IJCSIS December 2009, ISSN 1947 5500,
http://sites.google.com/site/ijcsis
Unsupervised Discovery of Linguistic Structure Including Two-level Acoustic Patterns Using Three Cascaded Stages of Iterative Optimization
Techniques for unsupervised discovery of acoustic patterns are getting
increasingly attractive, because huge quantities of speech data are becoming
available but manual annotations remain hard to acquire. In this paper, we
propose an approach for unsupervised discovery of linguistic structure for the
target spoken language given raw speech data. This linguistic structure
includes two-level (subword-like and word-like) acoustic patterns, the lexicon
of word-like patterns in terms of subword-like patterns and the N-gram language
model based on word-like patterns. All patterns, models, and parameters can be
automatically learned from the unlabelled speech corpus. This is achieved by an
initialization step followed by three cascaded stages for acoustic, linguistic,
and lexical iterative optimization. The lexicon of word-like patterns defines
the allowed consecutive sequences of HMMs for subword-like patterns. In each
iteration, model training and decoding produce updated labels from which the
lexicon and HMMs can be further updated. In this way, model parameters and
decoded labels are respectively optimized in each iteration, and the knowledge
about the linguistic structure is learned gradually layer after layer. The
proposed approach was tested in preliminary experiments on a corpus of Mandarin
broadcast news, including a task of spoken term detection with performance
compared to a parallel test using models trained in a supervised way. Results
show that the proposed system not only yields reasonable performance on its
own, but is also complementary to existing large vocabulary ASR systems.
Comment: Accepted by ICASSP 201
Survey on Incremental Approaches for Network Anomaly Detection
As the communication industry has connected distant corners of the globe
using advances in network technology, intruders or attackers have also
increased attacks on networking infrastructure commensurately. System
administrators can attempt to prevent such attacks using intrusion detection
tools and systems. There are many commercially available signature-based
Intrusion Detection Systems (IDSs). However, most IDSs lack the capability to
detect novel or previously unknown attacks. A special type of IDS, called an
Anomaly Detection System, develops models based on normal system or network
behavior, with the goal of detecting both known and unknown attacks. Anomaly
detection systems face many problems, including a high false-alarm rate, the
ability to work in online mode, and scalability. This paper presents a
selective survey of incremental approaches for detecting anomalies in normal
system or network traffic. The technological trends, open problems, and
challenges of anomaly detection using incremental approaches are also
discussed.
Comment: 14 pages, 1 figure, 11 tables referred journal publication
Unsupervised Discovery of Structured Acoustic Tokens with Applications to Spoken Term Detection
In this paper, we compare two paradigms for unsupervised discovery of
structured acoustic tokens directly from speech corpora without any human
annotation. The Multigranular Paradigm seeks to capture all available
information in the corpora with multiple sets of tokens for different model
granularities. The Hierarchical Paradigm attempts to jointly learn several
levels of signal representations in a hierarchical structure. The two paradigms
are unified within a theoretical framework in this paper. Query-by-Example
Spoken Term Detection (QbE-STD) experiments on the QUESST dataset of MediaEval
2015 verify the competitiveness of the acoustic tokens. The Enhanced
Relevance Score (ERS) proposed in this work improves both paradigms for the
task of QbE-STD. We also list results on the ABX evaluation task of the Zero
Resource Challenge 2015 for comparison of the paradigms.
Maximum a Posteriori Adaptation of Network Parameters in Deep Models
We present a Bayesian approach to adapting parameters of a well-trained
context-dependent, deep-neural-network, hidden Markov model (CD-DNN-HMM) to
improve automatic speech recognition performance. Given an abundance of DNN
parameters but with only a limited amount of data, the effectiveness of the
adapted DNN model can often be compromised. We formulate maximum a posteriori
(MAP) adaptation of parameters of a specially designed CD-DNN-HMM with
augmented linear hidden networks connected to the output tied states, or
senones, and compare it to the previously proposed feature-space MAP linear
regression. Experimental evidence on the 20,000-word open vocabulary Wall Street
Journal task demonstrates the feasibility of the proposed framework. In
supervised adaptation, the proposed MAP adaptation approach provides more than
10% relative error reduction and consistently outperforms the conventional
transformation-based methods. Furthermore, we present an initial attempt to
generate hierarchical priors to improve adaptation efficiency and effectiveness
with limited adaptation data by exploiting similarities among senones.