10,314 research outputs found

    Hierarchical relational models for document networks

    Full text link
    We develop the relational topic model (RTM), a hierarchical model of both network structure and node attributes. We focus on document networks, where the attributes of each document are its words, that is, discrete observations taken from a fixed vocabulary. For each pair of documents, the RTM models their link as a binary random variable that is conditioned on their contents. The model can be used to summarize a network of documents, predict links between them, and predict words within them. We derive efficient inference and estimation algorithms based on variational methods that take advantage of sparsity and scale with the number of links. We evaluate the predictive performance of the RTM for large networks of scientific abstracts, web documents, and geographically tagged news.Comment: Published in at http://dx.doi.org/10.1214/09-AOAS309 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    New methods for generating populations in Markov network based EDAs: Decimation strategies and model-based template recombination

    Get PDF
    Methods for generating a new population are a fundamental component of estimation of distribution algorithms (EDAs). They serve to transfer the information contained in the probabilistic model to the new generated population. In EDAs based on Markov networks, methods for generating new populations usually discard information contained in the model to gain in efficiency. Other methods like Gibbs sampling use information about all interactions in the model but are computationally very costly. In this paper we propose new methods for generating new solutions in EDAs based on Markov networks. We introduce approaches based on inference methods for computing the most probable configurations and model-based template recombination. We show that the application of different variants of inference methods can increase the EDAs’ convergence rate and reduce the number of function evaluations needed to find the optimum of binary and non-binary discrete functions

    Getting started in probabilistic graphical models

    Get PDF
    Probabilistic graphical models (PGMs) have become a popular tool for computational analysis of biological data in a variety of domains. But, what exactly are they and how do they work? How can we use PGMs to discover patterns that are biologically relevant? And to what extent can PGMs help us formulate new hypotheses that are testable at the bench? This note sketches out some answers and illustrates the main ideas behind the statistical approach to biological pattern discovery.Comment: 12 pages, 1 figur

    A Mathematical Analysis of the Long-run Behavior of Genetic Algorithms for Social Modeling

    Get PDF
    We present a mathematical analysis of the long-run behavior of genetic algorithms that are used for modeling social phenomena. The analysis relies on commonly used mathematical techniques in evolutionary game theory. Assuming a positive but infinitely small mutation rate, we derive results that can be used to calculate the exact long-run behavior of a genetic algorithm. Using these results, the need to rely on computer simulations can be avoided. We also show that if the mutation rate is infinitely small the crossover rate has no effect on the long-run behavior of a genetic algorithm. To demonstrate the usefulness of our mathematical analysis, we replicate a well-known study by Axelrod in which a genetic algorithm is used to model the evolution of strategies in iterated prisoner’s dilemmas. The theoretically predicted long-run behavior of the genetic algorithm turns out to be in perfect agreement with the long-run behavior observed in computer simulations. Also, in line with our theoretically informed expectations, computer simulations indicate that the crossover rate has virtually no long-run effect. Some general new insights into the behavior of genetic algorithms in the prisoner’s dilemma context are provided as well.genetic algorithm;economics;evolutionary game theory;long-run behavior;social modeling

    Statistical Inference for Partially Observed Markov Processes via the R Package pomp

    Get PDF
    Partially observed Markov process (POMP) models, also known as hidden Markov models or state space models, are ubiquitous tools for time series analysis. The R package pomp provides a very flexible framework for Monte Carlo statistical investigations using nonlinear, non-Gaussian POMP models. A range of modern statistical methods for POMP models have been implemented in this framework including sequential Monte Carlo, iterated filtering, particle Markov chain Monte Carlo, approximate Bayesian computation, maximum synthetic likelihood estimation, nonlinear forecasting, and trajectory matching. In this paper, we demonstrate the application of these methodologies using some simple toy problems. We also illustrate the specification of more complex POMP models, using a nonlinear epidemiological model with a discrete population, seasonality, and extra-demographic stochasticity. We discuss the specification of user-defined models and the development of additional methods within the programming environment provided by pomp.Comment: In press at the Journal of Statistical Software. A version of this paper is provided at the pomp package website: http://kingaa.github.io/pom

    Markov Chain Analysis of Cumulative Step-size Adaptation on a Linear Constrained Problem

    Get PDF
    This paper analyzes a (1, λ\lambda)-Evolution Strategy, a randomized comparison-based adaptive search algorithm, optimizing a linear function with a linear constraint. The algorithm uses resampling to handle the constraint. Two cases are investigated: first the case where the step-size is constant, and second the case where the step-size is adapted using cumulative step-size adaptation. We exhibit for each case a Markov chain describing the behaviour of the algorithm. Stability of the chain implies, by applying a law of large numbers, either convergence or divergence of the algorithm. Divergence is the desired behaviour. In the constant step-size case, we show stability of the Markov chain and prove the divergence of the algorithm. In the cumulative step-size adaptation case, we prove stability of the Markov chain in the simplified case where the cumulation parameter equals 1, and discuss steps to obtain similar results for the full (default) algorithm where the cumulation parameter is smaller than 1. The stability of the Markov chain allows us to deduce geometric divergence or convergence , depending on the dimension, constraint angle, population size and damping parameter, at a rate that we estimate. Our results complement previous studies where stability was assumed.Comment: Evolutionary Computation, Massachusetts Institute of Technology Press (MIT Press): STM Titles, 201
    corecore