Hierarchical relational models for document networks
We develop the relational topic model (RTM), a hierarchical model of both
network structure and node attributes. We focus on document networks, where the
attributes of each document are its words, that is, discrete observations taken
from a fixed vocabulary. For each pair of documents, the RTM models their link
as a binary random variable that is conditioned on their contents. The model
can be used to summarize a network of documents, predict links between them,
and predict words within them. We derive efficient inference and estimation
algorithms based on variational methods that take advantage of sparsity and
scale with the number of links. We evaluate the predictive performance of the
RTM for large networks of scientific abstracts, web documents, and
geographically tagged news.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS309 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
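As a rough illustration of the link model this abstract describes: in the RTM, the link between a document pair is a Bernoulli variable whose probability depends on the two documents' contents via their topic proportions. The sketch below uses a logistic function of the weighted element-wise product of topic vectors, one common parameterization in the RTM family; the weights `eta`, bias `nu`, and all numbers are hypothetical.

```python
import numpy as np

def link_probability(z_i, z_j, eta, nu):
    """Bernoulli link probability for a document pair, conditioned on
    their topic proportions: a logistic function of the weighted
    element-wise product of the two topic vectors."""
    score = eta @ (z_i * z_j) + nu       # weighted topic overlap plus bias
    return 1.0 / (1.0 + np.exp(-score))

# Documents sharing topic mass should link with higher probability
# than documents on disjoint topics.
z_a = np.array([0.8, 0.1, 0.1])
z_b = np.array([0.7, 0.2, 0.1])          # similar to z_a
z_c = np.array([0.0, 0.1, 0.9])          # dissimilar to z_a
eta = np.array([3.0, 3.0, 3.0])          # hypothetical learned weights
nu = -2.0                                # hypothetical bias

print(link_probability(z_a, z_b, eta, nu) > link_probability(z_a, z_c, eta, nu))  # True
```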
New methods for generating populations in Markov network based EDAs: Decimation strategies and model-based template recombination
Methods for generating a new population are a fundamental component of estimation of distribution algorithms (EDAs). They serve to transfer the information contained in the probabilistic model to the newly generated population. In EDAs based on Markov networks, methods for generating new populations usually discard information contained in the model to gain in efficiency. Other methods like Gibbs sampling use information about all interactions in the model but are computationally very costly. In this paper we propose new methods for generating new solutions in EDAs based on Markov networks. We introduce approaches based on inference methods for computing the most probable configurations and model-based template recombination. We show that the application of different variants of inference methods can increase the EDAs’ convergence rate and reduce the number of function evaluations needed to find the optimum of binary and non-binary discrete functions.
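The abstract contrasts cheap but lossy generation methods with costly Gibbs sampling, which consults every interaction in the model. A minimal sketch of the Gibbs approach on a pairwise binary Markov network (toy model and coupling values invented for illustration; this is not the paper's EDA):

```python
import math
import random

def gibbs_sample(edges, n_vars, sweeps=100, rng=None):
    """Draw one sample from a pairwise binary (+1/-1) Markov network by
    Gibbs sampling: sweep over the variables, resampling each from its
    conditional distribution given its neighbours.  Every interaction
    in the model is consulted, which is what makes this faithful to the
    model but computationally costly inside an EDA.

    `edges` maps (i, j) pairs to coupling weights; positive weights
    favour x[i] == x[j]."""
    rng = rng or random.Random(42)
    x = [rng.choice([-1, 1]) for _ in range(n_vars)]
    for _ in range(sweeps):
        for i in range(n_vars):
            # local field on variable i from its neighbours
            field = sum(w * x[j if k == i else k]
                        for (k, j), w in edges.items() if i in (k, j))
            p_plus = 1.0 / (1.0 + math.exp(-2.0 * field))  # P(x_i = +1 | rest)
            x[i] = 1 if rng.random() < p_plus else -1
    return x

# A 3-variable chain with strong positive couplings: the sampled
# configuration is almost surely all-(+1) or all-(-1).
edges = {(0, 1): 10.0, (1, 2): 10.0}
sample = gibbs_sample(edges, 3)
```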
Getting started in probabilistic graphical models
Probabilistic graphical models (PGMs) have become a popular tool for
computational analysis of biological data in a variety of domains. But, what
exactly are they and how do they work? How can we use PGMs to discover patterns
that are biologically relevant? And to what extent can PGMs help us formulate
new hypotheses that are testable at the bench? This note sketches out some
answers and illustrates the main ideas behind the statistical approach to
biological pattern discovery.
Comment: 12 pages, 1 figure
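As a minimal illustration of the kind of question a PGM answers, consider a two-node network (transcription factor -> target expression) queried with Bayes' rule. All probabilities here are invented for illustration:

```python
def posterior_tf_given_expr(p_tf, p_expr_given_tf, p_expr_given_not_tf):
    """Bayes' rule on a two-node network TF -> Expression: given that
    the target gene is observed expressed, how probable is it that the
    transcription factor was active?"""
    joint_active = p_tf * p_expr_given_tf
    joint_inactive = (1.0 - p_tf) * p_expr_given_not_tf
    return joint_active / (joint_active + joint_inactive)

# Hypothetical numbers: the TF is active 30% of the time; expression
# is likely when it is active (0.9) and rare otherwise (0.1).
p = posterior_tf_given_expr(0.3, 0.9, 0.1)
print(round(p, 3))  # 0.794
```

Observing expression raises the probability of TF activity from the 0.3 prior to about 0.79; larger networks answer the same kind of query with more structure.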
A Mathematical Analysis of the Long-run Behavior of Genetic Algorithms for Social Modeling
We present a mathematical analysis of the long-run behavior of genetic algorithms that are used for modeling social phenomena. The analysis relies on commonly used mathematical techniques in evolutionary game theory. Assuming a positive but infinitely small mutation rate, we derive results that can be used to calculate the exact long-run behavior of a genetic algorithm. Using these results, the need to rely on computer simulations can be avoided. We also show that if the mutation rate is infinitely small the crossover rate has no effect on the long-run behavior of a genetic algorithm. To demonstrate the usefulness of our mathematical analysis, we replicate a well-known study by Axelrod in which a genetic algorithm is used to model the evolution of strategies in iterated prisoner’s dilemmas. The theoretically predicted long-run behavior of the genetic algorithm turns out to be in perfect agreement with the long-run behavior observed in computer simulations. Also, in line with our theoretically informed expectations, computer simulations indicate that the crossover rate has virtually no long-run effect. Some general new insights into the behavior of genetic algorithms in the prisoner’s dilemma context are provided as well.
Keywords: genetic algorithm; economics; evolutionary game theory; long-run behavior; social modeling
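The computational idea behind this abstract (replacing long simulations with an exact long-run calculation) can be sketched by computing the stationary distribution of a small Markov chain as a mutation-like rate shrinks. The two-state chain below is a toy stand-in, not the paper's actual model:

```python
import numpy as np

def stationary_distribution(P):
    """Long-run (stationary) distribution of an ergodic Markov chain:
    the left eigenvector of the transition matrix for eigenvalue 1,
    normalised to sum to one."""
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

# Hypothetical 2-state chain standing in for two population states
# (say, "all defect" vs "all tit-for-tat").  Escaping state 1 needs
# two rare events (rate eps**2) while escaping state 0 needs only one
# (rate eps), so as eps -> 0 the long-run mass concentrates on state 1
# -- computed exactly, with no simulation.
for eps in (0.1, 0.01, 0.001):
    P = np.array([[1 - eps, eps],
                  [eps ** 2, 1 - eps ** 2]])
    pi = stationary_distribution(P)
```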
Statistical Inference for Partially Observed Markov Processes via the R Package pomp
Partially observed Markov process (POMP) models, also known as hidden Markov
models or state space models, are ubiquitous tools for time series analysis.
The R package pomp provides a very flexible framework for Monte Carlo
statistical investigations using nonlinear, non-Gaussian POMP models. A range
of modern statistical methods for POMP models have been implemented in this
framework including sequential Monte Carlo, iterated filtering, particle Markov
chain Monte Carlo, approximate Bayesian computation, maximum synthetic
likelihood estimation, nonlinear forecasting, and trajectory matching. In this
paper, we demonstrate the application of these methodologies using some simple
toy problems. We also illustrate the specification of more complex POMP models,
using a nonlinear epidemiological model with a discrete population,
seasonality, and extra-demographic stochasticity. We discuss the specification
of user-defined models and the development of additional methods within the
programming environment provided by pomp.
Comment: In press at the Journal of Statistical Software. A version of this
paper is provided at the pomp package website: http://kingaa.github.io/pom
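pomp itself is an R package, but the bootstrap particle filter (sequential Monte Carlo) at the core of several of the methods listed above is easy to sketch on a toy random-walk state space model. The Python code below illustrates the idea only; it is not pomp's API, and the model and parameters are invented for illustration:

```python
import math
import random

def bootstrap_filter(ys, n_particles=500, rng=None):
    """Bootstrap particle filter (sequential Monte Carlo) for a toy
    random-walk state space model:
        x_t = x_{t-1} + N(0, 1),    y_t = x_t + N(0, 1).
    Each step: propagate particles through the process model, weight
    them by the measurement likelihood, record the weighted mean, and
    resample.  Returns the filtered mean at each time step."""
    rng = rng or random.Random(1)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    means = []
    for y in ys:
        particles = [x + rng.gauss(0.0, 1.0) for x in particles]       # propagate
        weights = [math.exp(-0.5 * (y - x) ** 2) for x in particles]   # likelihood
        total = sum(weights)
        means.append(sum(w * x for w, x in zip(weights, particles)) / total)
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means

# Constant observations at 5: the filtered mean locks on within a few steps.
est = bootstrap_filter([5.0] * 10)
```

Only simulation from the process model and pointwise evaluation of the measurement density are needed, which is what makes this "plug-and-play" approach so flexible for nonlinear, non-Gaussian models.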
Markov Chain Analysis of Cumulative Step-size Adaptation on a Linear Constrained Problem
This paper analyzes a (1, λ)-Evolution Strategy, a randomized
comparison-based adaptive search algorithm, optimizing a linear function with a
linear constraint. The algorithm uses resampling to handle the constraint. Two
cases are investigated: first the case where the step-size is constant, and
second the case where the step-size is adapted using cumulative step-size
adaptation. We exhibit for each case a Markov chain describing the behaviour of
the algorithm. Stability of the chain implies, by applying a law of large
numbers, either convergence or divergence of the algorithm. Divergence is the
desired behaviour. In the constant step-size case, we show stability of the
Markov chain and prove the divergence of the algorithm. In the cumulative
step-size adaptation case, we prove stability of the Markov chain in the
simplified case where the cumulation parameter equals 1, and discuss steps to
obtain similar results for the full (default) algorithm where the cumulation
parameter is smaller than 1. The stability of the Markov chain allows us to
deduce geometric divergence or convergence, depending on the dimension,
constraint angle, population size and damping parameter, at a rate that we
estimate. Our results complement previous studies where stability was assumed.
Comment: Evolutionary Computation, Massachusetts Institute of Technology Press
(MIT Press): STM Titles, 201
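The constant step-size case analysed in this abstract can be sketched as follows: a (1, λ)-ES maximising a linear function under a linear constraint, with infeasible offspring resampled. The setup below (constraint direction, parameter values) is a simplified toy version, not the paper's exact formulation; as in the constant step-size analysis, the objective diverges:

```python
import random

def one_comma_lambda_es(lmbda=10, sigma=1.0, steps=200, rng=None):
    """Constant step-size (1, lambda)-ES maximising the linear
    objective f(x) = x[0] under the linear constraint x[1] <= 0.
    Infeasible offspring are resampled until feasible, and comma
    selection keeps only the best offspring of each generation.
    With constant step-size the objective grows without bound
    (divergence, the desired behaviour in this setting)."""
    rng = rng or random.Random(7)
    x = [0.0, -1.0]                             # feasible starting point
    for _ in range(steps):
        offspring = []
        for _ in range(lmbda):
            while True:                         # resampling handles the constraint
                cand = [x[0] + sigma * rng.gauss(0.0, 1.0),
                        x[1] + sigma * rng.gauss(0.0, 1.0)]
                if cand[1] <= 0.0:
                    break
            offspring.append(cand)
        x = max(offspring, key=lambda c: c[0])  # (1, lambda) comma selection
    return x

final = one_comma_lambda_es()
```

With λ = 10 the expected per-step gain in the objective is roughly σ times the expectation of the maximum of λ standard normals, so after 200 steps the first coordinate is far from the origin while the constraint coordinate stays feasible.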