17,451 research outputs found

    Estimating ensemble flows on a hidden Markov chain

    We propose a new framework to estimate the evolution of an ensemble of indistinguishable agents on a hidden Markov chain using only aggregate output data. This work can be viewed as an extension of the recent developments in optimal mass transport and Schrödinger bridges to the finite state space hidden Markov chain setting. The flow of the ensemble is estimated by solving a maximum likelihood problem, which has a convex formulation in the infinite-particle limit, and we develop a fast numerical algorithm for it. We illustrate in two numerical examples how this framework can be used to track the flow of identical and indistinguishable dynamical systems. Comment: 8 pages, 4 figures

    Retrospective Higher-Order Markov Processes for User Trails

    Users form information trails as they browse the web, check in with a geolocation, rate items, or consume media. A common problem is to predict what a user might do next for the purposes of guidance, recommendation, or prefetching. First-order and higher-order Markov chains have been widely used to study such sequences of data. First-order Markov chains are easy to estimate, but lack accuracy when history matters. Higher-order Markov chains, in contrast, have too many parameters and suffer from overfitting the training data. Fitting these parameters with regularization and smoothing offers only mild improvements. In this paper we propose the retrospective higher-order Markov process (RHOMP) as a low-parameter model for such sequences. This model is a special case of a higher-order Markov chain where the transitions depend retrospectively on a single history state instead of an arbitrary combination of history states. There are two immediate computational advantages: the number of parameters is linear in the order of the Markov chain, and the model can be fit to large state spaces. Furthermore, by providing a specific structure to the higher-order chain, RHOMPs improve the model accuracy by efficiently utilizing history states without risk of overfitting the data. We show how to estimate a RHOMP from data and demonstrate the effectiveness of our method on various real application datasets spanning geolocation data, review sequences, and business locations. The RHOMP model uniformly outperforms higher-order Markov chains, Kneser-Ney regularization, and tensor factorizations in terms of prediction accuracy.
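
    The contrast drawn above between first-order and unrestricted higher-order chains can be illustrated with a minimal sketch. It is not the RHOMP estimator from the paper: it only shows the standard maximum-likelihood fit of a first-order chain by transition counting, plus the free-parameter count of an unrestricted order-m chain. The function names and the toy trails are hypothetical, chosen for illustration.

```python
from collections import Counter, defaultdict


def fit_first_order(trails):
    """Maximum-likelihood first-order transition probabilities.

    `trails` is a list of state sequences; each probability is the
    observed transition count divided by the row total.
    """
    counts = defaultdict(Counter)
    for trail in trails:
        for prev, nxt in zip(trail, trail[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


def full_order_param_count(n_states, order):
    """Free parameters of an unrestricted order-m chain.

    There are n^m possible histories, each with n - 1 free probabilities.
    """
    return n_states ** order * (n_states - 1)


# Toy data (hypothetical): two short trails over states a, b, c.
trails = [["a", "b", "a", "c"], ["a", "b", "b"]]
P = fit_first_order(trails)

# Parameter growth for a 10-state chain at orders 1 to 3.
growth = [full_order_param_count(10, m) for m in (1, 2, 3)]
```

    With 10 states the unrestricted count grows from 90 at order 1 to 9,000 at order 3, which is the overfitting pressure the abstract describes.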

    Coalescent-based species delimitation in the sand lizards of the Liolaemus wiegmannii complex (Squamata: Liolaemidae)

    Coalescent-based algorithms coupled with access to genome-wide data have become powerful tools for assessing questions on recent or rapid diversification, as well as for delineating species boundaries in the absence of reciprocal monophyly. In southern South America, the diversification of Liolaemus lizards during the Pleistocene is well documented and has been attributed to the climatic changes that characterized this recent period. Past climatic changes had harsh effects at extreme latitudes, including Patagonia, but habitat changes at intermediate latitudes of South America have also been recorded, including the expansion of sand fields over northern Patagonia and the Pampas. In this work, we apply a coalescent-based approach to study the diversification of the Liolaemus wiegmannii species complex, a morphologically conservative clade that inhabits sandy soils across northwest and south-central Argentina and the south shores of Uruguay. Using four standard sequence markers (mitochondrial DNA and three nuclear loci) along with ddRADseq data, we inferred species limits and a time-calibrated species tree for the L. wiegmannii complex in order to evaluate the influence of Quaternary sand expansion/retraction cycles on diversification. We also evaluated the evolutionary independence of the recently described L. gardeli and inferred its phylogenetic position relative to L. wiegmannii. We find strong evidence for six allopatric candidate species within L. wiegmannii, which diversified during the Pleistocene. The Great Patagonian Glaciation (~1 million years before present) likely split the species complex into two main groups: one composed of lineages associated with sub-Andean sedimentary formations, and the other mostly related to sand fields in the Pampas and northern Patagonia. We hypothesize that early speciation within L. wiegmannii was influenced by the expansion of sand dunes throughout central Argentina and the Pampas. Finally, L. gardeli is supported as a distinct lineage nested within the L. wiegmannii complex.
    Affiliation: Villamil, Joaquín. Universidad de la República. Facultad de Ciencias; Uruguay
    Affiliation: Avila, Luciano Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Instituto Patagónico para el Estudio de los Ecosistemas Continentales; Argentina
    Affiliation: Morando, Mariana. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Instituto Patagónico para el Estudio de los Ecosistemas Continentales; Argentina
    Affiliation: Sites, Jack W. University Brigham Young; United States
    Affiliation: Leaché, Adam D. University of Washington; United States
    Affiliation: Maneyro, Raúl. Universidad de la República. Facultad de Ciencias; Uruguay
    Affiliation: Camargo Bentaberry, Arley. Universidad de la República; Uruguay

    Subspace estimation and prediction methods for hidden Markov models

    Hidden Markov models (HMMs) are probabilistic functions of finite Markov chains, or, in other words, state space models with finite state space. In this paper, we examine subspace estimation methods for HMMs whose output lies in a finite set as well. In particular, we study the geometric structure arising from the nonminimality of the linear state space representation of HMMs, and the consistency of a subspace algorithm arising from a certain factorization of the singular value decomposition of the estimated linear prediction matrix. For this algorithm, we show that the estimates of the transition and emission probability matrices are consistent up to a similarity transformation, and that the m-step linear predictor computed from the estimated system matrices is consistent, i.e., converges to the true optimal linear m-step predictor. Comment: Published at http://dx.doi.org/10.1214/09-AOS711 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    A Simple Approach to Maximum Intractable Likelihood Estimation

    Approximate Bayesian Computation (ABC) can be viewed as an analytic approximation of an intractable likelihood coupled with an elementary simulation step. Such a view, combined with a suitable instrumental prior distribution, permits maximum-likelihood (or maximum-a-posteriori) inference to be conducted, approximately, using essentially the same techniques. An elementary approach to this problem is developed here: it obtains a nonparametric approximation of the likelihood surface, which is then used as a smooth proxy for the likelihood in a subsequent maximisation step. The convergence of this class of algorithms is characterised theoretically. The use of non-sufficient summary statistics in this context is considered. Applying the proposed method to four problems demonstrates good performance. The proposed approach provides an alternative for approximating the maximum likelihood estimator (MLE) in complex scenarios.