Estimating ensemble flows on a hidden Markov chain
We propose a new framework to estimate the evolution of an ensemble of
indistinguishable agents on a hidden Markov chain using only aggregate output
data. This work can be viewed as an extension of the recent developments in
optimal mass transport and Schrödinger bridges to the finite state space
hidden Markov chain setting. The flow of the ensemble is estimated by solving a
maximum likelihood problem, which has a convex formulation at the
infinite-particle limit, and we develop a fast numerical algorithm for it. We
illustrate in two numerical examples how this framework can be used to track
the flow of identical and indistinguishable dynamical systems.
Comment: 8 pages, 4 figures
Retrospective Higher-Order Markov Processes for User Trails
Users form information trails as they browse the web, check in with a
geolocation, rate items, or consume media. A common problem is to predict what
a user might do next for the purposes of guidance, recommendation, or
prefetching. First-order and higher-order Markov chains have been widely used
to study such sequences of data. First-order Markov chains are easy to
estimate, but lack accuracy when history matters. Higher-order Markov chains,
in contrast, have too many parameters and suffer from overfitting the training
data. Fitting these parameters with regularization and smoothing only offers
mild improvements. In this paper we propose the retrospective higher-order
Markov process (RHOMP) as a low-parameter model for such sequences. This model
is a special case of a higher-order Markov chain where the transitions depend
retrospectively on a single history state instead of an arbitrary combination
of history states. There are two immediate computational advantages: the number
of parameters is linear in the order of the Markov chain and the model can be
fit to large state spaces. Furthermore, by providing a specific structure to
the higher-order chain, RHOMPs improve the model accuracy by efficiently
utilizing history states without risking overfitting the data. We show
how to estimate a RHOMP from data and demonstrate the effectiveness of our
method on various real application datasets spanning geolocation data, review
sequences, and business locations. The RHOMP model uniformly outperforms
higher-order Markov chains, Kneser-Ney regularization, and tensor
factorizations in terms of prediction accuracy.
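
The first-order baseline mentioned in the abstract above can be sketched as follows. This is a minimal illustration of count-based maximum likelihood estimation for a first-order Markov chain, not the RHOMP model itself; the function names `fit_first_order` and `predict_next` and the toy trails are assumptions made for this example.

```python
from collections import Counter, defaultdict


def fit_first_order(trails):
    """Estimate first-order transition probabilities by counting transitions."""
    counts = defaultdict(Counter)
    for trail in trails:
        # Each consecutive pair (prev, nxt) is one observed transition.
        for prev, nxt in zip(trail, trail[1:]):
            counts[prev][nxt] += 1
    # Normalise each row of counts into a probability distribution.
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


def predict_next(model, state):
    """Return the most probable next state, or None if the state was never left."""
    row = model.get(state)
    if not row:
        return None
    return max(row, key=row.get)


# Hypothetical user trails, invented for illustration only.
trails = [
    ["home", "search", "item", "cart"],
    ["home", "search", "item", "search", "item"],
    ["home", "item", "cart"],
]
model = fit_first_order(trails)
print(predict_next(model, "home"))
```

A full higher-order chain would instead condition on tuples of previous states, which is where the parameter count grows quickly with the order.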
Coalescent-based species delimitation in the sand lizards of the Liolaemus wiegmannii complex (Squamata: Liolaemidae)
Coalescent-based algorithms coupled with the access to genome-wide data have become powerful tools for assessing questions on recent or rapid diversification, as well as delineating species boundaries in the absence of reciprocal monophyly. In southern South America, the diversification of Liolaemus lizards during the Pleistocene is well documented and has been attributed to the climatic changes that characterized this recent period of time. Past climatic changes had harsh effects at extreme latitudes, including Patagonia, but habitat changes at intermediate latitudes of South America have also been recorded, including expansion of sand fields over northern Patagonia and Pampas. In this work, we apply a coalescent-based approach to study the diversification of the Liolaemus wiegmannii species complex, a morphologically conservative clade that inhabits sandy soils across northwest and south-central Argentina, and the south shores of Uruguay. Using four standard sequence markers (mitochondrial DNA and three nuclear loci) along with ddRADseq data we inferred species limits and a time-calibrated species tree for the L. wiegmannii complex in order to evaluate the influence of Quaternary sand expansion/retraction cycles on diversification. We also evaluated the evolutionary independence of the recently described L. gardeli and inferred its phylogenetic position relative to L. wiegmannii. We find strong evidence for six allopatric candidate species within L. wiegmannii, which diversified during the Pleistocene. The Great Patagonian Glaciation (~1 million years before present) likely split the species complex into two main groups: one composed of lineages associated with sub-Andean sedimentary formations, and the other mostly related to sand fields in the Pampas and northern Patagonia. We hypothesize that early speciation within L. wiegmannii was influenced by the expansion of sand dunes throughout central Argentina and Pampas. Finally, L. gardeli is supported as a distinct lineage nested within the L. wiegmannii complex.
Affiliation: Villamil, Joaquín. Universidad de la República. Facultad de Ciencias; Uruguay
Affiliation: Avila, Luciano Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Instituto Patagónico para el Estudio de los Ecosistemas Continentales; Argentina
Affiliation: Morando, Mariana. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Instituto Patagónico para el Estudio de los Ecosistemas Continentales; Argentina
Affiliation: Sites, Jack W. University Brigham Young; United States
Affiliation: Leaché, Adam D. University of Washington; United States
Affiliation: Maneyro, Raúl. Universidad de la República. Facultad de Ciencias; Uruguay
Affiliation: Camargo Bentaberry, Arley. Universidad de la República; Uruguay
Subspace estimation and prediction methods for hidden Markov models
Hidden Markov models (HMMs) are probabilistic functions of finite Markov
chains, or, put in other words, state space models with finite state space. In
this paper, we examine subspace estimation methods for HMMs whose output lies in a
finite set as well. In particular, we study the geometric structure arising
from the nonminimality of the linear state space representation of HMMs, and
consistency of a subspace algorithm arising from a certain factorization of the
singular value decomposition of the estimated linear prediction matrix. For
this algorithm, we show that the estimates of the transition and emission
probability matrices are consistent up to a similarity transformation, and that
the $m$-step linear predictor computed from the estimated system matrices is
consistent, i.e., converges to the true optimal linear $m$-step predictor.
Comment: Published at http://dx.doi.org/10.1214/09-AOS711 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
A Simple Approach to Maximum Intractable Likelihood Estimation
Approximate Bayesian Computation (ABC) can be viewed as an analytic
approximation of an intractable likelihood coupled with an elementary
simulation step. Such a view, combined with a suitable instrumental prior
distribution, permits maximum-likelihood (or maximum-a-posteriori) inference to
be conducted, approximately, using essentially the same techniques. An
elementary approach to this problem is developed here: it obtains a
nonparametric approximation of the likelihood surface, which is then used as a
smooth proxy for the likelihood in a subsequent maximisation step. The
convergence of this class of algorithms is characterised theoretically. The use
of non-sufficient summary statistics in this context is considered. Applying
the proposed method to four problems demonstrates good performance. The
proposed approach provides an alternative for approximating the maximum
likelihood estimator (MLE) in complex scenarios.
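
The idea described in the abstract above can be illustrated with a toy sketch. It assumes a Gaussian model with unknown mean and unit variance, a uniform instrumental prior on a hypothetical interval, the sample mean as summary statistic, and a hand-written Gaussian kernel density estimate as the nonparametric likelihood proxy. The function name `abc_mle` and all parameter values are choices made for this example, not the authors' implementation.

```python
import numpy as np


def abc_mle(observed, n_sims=20000, eps=0.1, lo=-5.0, hi=5.0, seed=0):
    """Approximate the MLE of a Gaussian mean via ABC rejection plus a KDE."""
    rng = np.random.default_rng(seed)
    n = observed.size
    s_obs = observed.mean()  # summary statistic of the observed data

    # 1. Draw parameters from a uniform instrumental prior (an assumption).
    theta = rng.uniform(lo, hi, size=n_sims)

    # 2. Simulate one data set per parameter and compute its summary.
    sims = rng.normal(theta[:, None], 1.0, size=(n_sims, n)).mean(axis=1)

    # 3. Keep parameters whose simulated summary is close to the observed one.
    accepted = theta[np.abs(sims - s_obs) < eps]

    # 4. Under a uniform prior the accepted draws follow a density roughly
    #    proportional to the likelihood; smooth it with a Gaussian KDE
    #    (Silverman's rule-of-thumb bandwidth).
    h = 1.06 * accepted.std() * accepted.size ** (-1 / 5)
    grid = np.linspace(lo, hi, 1001)
    z = (grid[:, None] - accepted[None, :]) / h
    density = np.exp(-0.5 * z**2).sum(axis=1)

    # 5. Maximise the smooth proxy over the grid.
    return grid[np.argmax(density)], accepted.size


# Synthetic data, generated only for this illustration.
observed = np.random.default_rng(1).normal(2.0, 1.0, size=50)
estimate, n_accepted = abc_mle(observed)
print(estimate, observed.mean(), n_accepted)
```

For this model the exact MLE is the sample mean, so the printed estimate can be compared against it directly. The tolerance `eps` and the bandwidth both trade bias against variance in the proxy.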