The EM Algorithm and the Rise of Computational Biology
In the past decade computational biology has grown from a cottage industry
with a handful of researchers to an attractive interdisciplinary field,
catching the attention and imagination of many quantitatively-minded
scientists. Of interest to us is the key role played by the EM algorithm during
this transformation. We survey the use of the EM algorithm in a few important
computational biology problems surrounding the "central dogma"; of molecular
biology: from DNA to RNA and then to proteins. Topics of this article include
sequence motif discovery, protein sequence alignment, population genetics,
evolutionary models and mRNA expression microarray data analysis.
Comment: Published at http://dx.doi.org/10.1214/09-STS312 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
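As a toy illustration of the EM iteration surveyed above (this sketch is not from the paper; the two-component 1-D Gaussian mixture, the initialisation, and all parameter choices are illustrative), each pass alternates an E-step (posterior responsibilities) with an M-step (weighted re-estimation):

```python
import math
import random

def em_gmm_1d(data, iters=50):
    """EM for a two-component 1-D Gaussian mixture (illustrative sketch).

    Returns (weights, means, variances) after `iters` EM iterations.
    """
    # Crude initialisation: split the data around its median.
    s = sorted(data)
    half = len(s) // 2
    mu = [sum(s[:half]) / half, sum(s[half:]) / (len(s) - half)]
    var = [1.0, 1.0]
    w = [0.5, 0.5]

    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                 for k in range(2)]
            z = p[0] + p[1]
            resp.append([p[0] / z, p[1] / z])
        # M-step: re-estimate weights, means, variances from responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2
                         for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)  # guard against variance collapse

    return w, mu, var

# Synthetic data: two well-separated modes at -3 and +3.
random.seed(0)
data = ([random.gauss(-3.0, 1.0) for _ in range(200)]
        + [random.gauss(3.0, 1.0) for _ in range(200)])
w, mu, var = em_gmm_1d(data)
```

The same alternation, with component densities swapped for position-weight-matrix likelihoods or alignment models, underlies the motif-discovery and alignment applications the survey discusses.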
Prosody-Based Automatic Segmentation of Speech into Sentences and Topics
A crucial step in processing speech audio data for information extraction,
topic detection, or browsing/playback is to segment the input into sentence and
topic units. Speech segmentation is challenging, since the cues typically
present for segmenting text (headers, paragraphs, punctuation) are absent in
spoken language. We investigate the use of prosody (information gleaned from
the timing and melody of speech) for these tasks. Using decision tree and
hidden Markov modeling techniques, we combine prosodic cues with word-based
approaches, and evaluate performance on two speech corpora, Broadcast News and
Switchboard. Results show that the prosodic model alone performs on par with,
or better than, word-based statistical language models -- for both true and
automatically recognized words in news speech. The prosodic model achieves
comparable performance with significantly less training data, and requires no
hand-labeling of prosodic events. Across tasks and corpora, we obtain a
significant improvement over word-only models using a probabilistic combination
of prosodic and lexical information. Inspection reveals that the prosodic
models capture language-independent boundary indicators described in the
literature. Finally, cue usage is task and corpus dependent. For example, pause
and pitch features are highly informative for segmenting news speech, whereas
pause, duration and word-based cues dominate for natural conversation.
Comment: 30 pages, 9 figures. To appear in Speech Communication 32(1-2), Special Issue on Accessing Information in Spoken Audio, September 200
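One simple way to realise a probabilistic combination of prosodic and lexical evidence, in the spirit of the abstract above (this is a hedged sketch, not the paper's exact scheme; the interpolation weight and threshold are hypothetical values one would tune on held-out data), is a log-linear interpolation of the two models' boundary posteriors:

```python
def combine_boundary_scores(prosody_post, lm_post, lam=0.5, threshold=0.5):
    """Log-linear combination of per-word boundary posteriors.

    prosody_post[i], lm_post[i]: P(boundary after word i) from the prosodic
    model and the word-based language model, respectively.
    lam: interpolation weight (illustrative; tuned on held-out data).
    Returns a list of booleans marking hypothesized sentence boundaries.
    """
    decisions = []
    for pp, lp in zip(prosody_post, lm_post):
        # Log-linear interpolation, renormalised over {boundary, no-boundary}.
        num = (pp ** lam) * (lp ** (1 - lam))
        den = num + ((1 - pp) ** lam) * ((1 - lp) ** (1 - lam))
        decisions.append(num / den >= threshold)
    return decisions
```

With lam = 0 or 1 this degenerates to using only one knowledge source, which makes the interpolation weight a direct handle on the task- and corpus-dependent cue usage the abstract reports.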
The performance of modularity maximization in practical contexts
Although widely used in practice, the behavior and accuracy of the popular
module identification technique called modularity maximization is not well
understood in practical contexts. Here, we present a broad characterization of
its performance in such situations. First, we revisit and clarify the
resolution limit phenomenon for modularity maximization. Second, we show that
the modularity function Q exhibits extreme degeneracies: it typically admits an
exponential number of distinct high-scoring solutions and typically lacks a
clear global maximum. Third, we derive the limiting behavior of the maximum
modularity Q_max for one model of infinitely modular networks, showing that it
depends strongly both on the size of the network and on the number of modules
it contains. Finally, using three real-world metabolic networks as examples, we
show that the degenerate solutions can fundamentally disagree on many, but not
all, partition properties such as the composition of the largest modules and
the distribution of module sizes. These results imply that the output of any
modularity maximization procedure should be interpreted cautiously in
scientific contexts. They also explain why many heuristics are often successful
at finding high-scoring partitions in practice and why different heuristics can
disagree on the modular structure of the same network. We conclude by
discussing avenues for mitigating some of these behaviors, such as combining
information from many degenerate solutions or using generative models.
Comment: 20 pages, 14 figures, 6 appendices; code available at http://www.santafe.edu/~aaronc/modularity
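For readers unfamiliar with the objective being maximized, the modularity Q of a partition is Q = (1/2m) * sum_ij (A_ij - k_i k_j / 2m) * delta(c_i, c_j). A minimal sketch (the toy graph and labels are illustrative, not from the paper):

```python
def modularity(adj, partition):
    """Newman-Girvan modularity Q of a partition of an undirected graph.

    adj: symmetric 0/1 adjacency matrix (list of lists).
    partition: partition[i] is the module label of node i.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)  # 2m for an undirected graph
    q = 0.0
    for i in range(n):
        for j in range(n):
            if partition[i] == partition[j]:
                # Observed edge minus the configuration-model expectation.
                q += adj[i][j] - deg[i] * deg[j] / two_m
    return q / two_m

# Two triangles joined by a single edge: the natural two-module split.
adj = [[0, 1, 1, 0, 0, 0],
       [1, 0, 1, 0, 0, 0],
       [1, 1, 0, 1, 0, 0],
       [0, 0, 1, 0, 1, 1],
       [0, 0, 0, 1, 0, 1],
       [0, 0, 0, 1, 1, 0]]
good = modularity(adj, [0, 0, 0, 1, 1, 1])      # the "obvious" partition
trivial = modularity(adj, [0, 0, 0, 0, 0, 0])   # everything in one module
```

Even on this six-node toy graph, several partitions score close to the maximum, which is a small-scale glimpse of the exponential degeneracy the paper characterizes.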
Factor Analysed Hidden Markov Models for Conditionally Heteroscedastic Financial Time Series
In this article we develop a new approach within the framework of asset pricing models that incorporates two key features of the latent volatility: co-movement among conditionally heteroscedastic financial returns and switching between different unobservable regimes. By combining latent factor models with hidden Markov chain models (HMM) we derive a dynamical local model for segmentation and prediction of multivariate conditionally heteroscedastic financial time series. We concentrate, more precisely, on situations where the factor variances are modeled by univariate GQARCH processes. The intuition behind our approach is the use of a piecewise multivariate and linear process - which we can also regard as a mixed-state dynamic linear system - for modeling the regime switches. In particular, we suppose that the observed series can be modeled using a time-varying parameter model with the assumption that the evolution of these parameters is governed by a first-order hidden Markov process. The EM algorithm that we have developed for maximum likelihood estimation is based on a quasi-optimal switching Kalman filter approach combined with a Viterbi approximation, which yields inferences about the unobservable path of the common factors, their variances and the latent variable of the state process. Extensive Monte Carlo simulations and preliminary experiments obtained with daily foreign exchange rate returns of eight currencies show promising results.
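The factor-variance building block mentioned above can be sketched with a univariate GQARCH(1,1)-style recursion (following Sentana's formulation with a linear asymmetry term; the function name and all parameter values here are illustrative, not taken from the article):

```python
def gqarch_variances(eps, omega=0.1, psi=-0.05, alpha=0.2, beta=0.7,
                     sigma2_0=1.0):
    """GQARCH(1,1)-style conditional-variance recursion (sketch).

    sigma2_t = omega + psi*eps_{t-1} + alpha*eps_{t-1}**2 + beta*sigma2_{t-1}

    The linear term psi*eps_{t-1} (with psi < 0) lets negative shocks raise
    the variance more than positive shocks of the same size (asymmetry),
    which plain GARCH(1,1) cannot capture. Parameter values are illustrative.
    """
    sigma2 = sigma2_0
    out = []
    for e in eps:
        sigma2 = omega + psi * e + alpha * e * e + beta * sigma2
        sigma2 = max(sigma2, 1e-8)  # keep the conditional variance positive
        out.append(sigma2)
    return out
```

In the full model this recursion drives each latent factor's variance, while the hidden Markov chain switches the parameters of the linear state-space system between regimes.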
Discriminative, generative, and imitative learning
Thesis (Ph.D.) -- Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2002. Includes bibliographical references (leaves 201-212).
I propose a common framework that combines three different paradigms in machine learning: generative, discriminative and imitative learning. A generative probabilistic distribution is a principled way to model many machine learning and machine perception problems. Therein, one provides domain-specific knowledge in terms of structure and parameter priors over the joint space of variables. Bayesian networks and Bayesian statistics provide a rich and flexible language for specifying this knowledge and subsequently refining it with data and observations. The final result is a distribution that is a good generator of novel exemplars. Conversely, discriminative algorithms adjust a possibly non-distributional model to data, optimizing for a specific task, such as classification or prediction. This typically leads to superior performance yet compromises the flexibility of generative modeling. I present Maximum Entropy Discrimination (MED) as a framework to combine both discriminative estimation and generative probability densities. Calculations involve distributions over parameters, margins, and priors and are provably and uniquely solvable for the exponential family. Extensions include regression, feature selection, and transduction. SVMs are also naturally subsumed and can be augmented with, for example, feature selection, to obtain substantial improvements. To extend to mixtures of exponential families, I derive a discriminative variant of the Expectation-Maximization (EM) algorithm for latent discriminative learning (or latent MED). While EM and Jensen lower-bound the log-likelihood, a dual upper bound is made possible via a novel reverse-Jensen inequality.
The variational upper bound on the latent log-likelihood has the same form as the EM bounds, is efficiently computable and is globally guaranteed. It permits powerful discriminative learning with the wide range of contemporary probabilistic mixture models (mixtures of Gaussians, mixtures of multinomials and hidden Markov models). We provide empirical results on standardized data sets that demonstrate the viability of the hybrid discriminative-generative approaches of MED and reverse-Jensen bounds over state-of-the-art discriminative techniques or generative approaches. Subsequently, imitative learning is presented as another variation on generative modeling which also learns from exemplars from an observed data source. However, the distinction is that the generative model is an agent interacting in a much more complex surrounding external world. It is not efficient to model the aggregate space in a generative setting. I demonstrate that imitative learning (under appropriate conditions) can be adequately addressed as a discriminative prediction task which outperforms the usual generative approach. This discriminative-imitative learning approach is applied with a generative perceptual system to synthesize a real-time agent that learns to engage in social interactive behavior.
by Tony Jebara. Ph.D.