A closed-form approach to Bayesian inference in tree-structured graphical models
We consider the inference of the structure of an undirected graphical model
in a Bayesian framework. To avoid convergence issues and highly demanding
Monte Carlo sampling, we focus on exact inference. More specifically, we aim
at achieving the inference with closed-form posteriors, avoiding any sampling
step. This task would be intractable without any restriction on the considered
graphs, so we restrict the set of considered graphs to mixtures of spanning
trees. We investigate under which conditions on the priors (on both tree
structures and parameters) exact Bayesian inference can be achieved. Under
these conditions, we derive a fast and exact algorithm to compute the posterior
probability for an edge to belong to the tree model using an algebraic result
called the Matrix-Tree theorem. We show that the assumption we have made does
not prevent our approach from performing well on synthetic and flow cytometry
data.
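The role of the Matrix-Tree theorem can be illustrated with a minimal sketch. It assumes that the distribution over spanning trees is proportional to a product of non-negative edge weights; how such weights would be obtained from priors and data is not shown here. The function names `determinant`, `tree_partition_function`, and `edge_posteriors`, and the example weights, are hypothetical and not taken from the paper.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def tree_partition_function(n, weights):
    """Sum over all spanning trees of the product of their edge weights.

    By the Matrix-Tree theorem this equals any cofactor of the weighted
    graph Laplacian. `weights` maps node pairs (i, j) to edge weights.
    """
    lap = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), w in weights.items():
        lap[i][i] += w
        lap[j][j] += w
        lap[i][j] -= w
        lap[j][i] -= w
    # Drop the last row and column, then take the determinant.
    minor = [row[:-1] for row in lap[:-1]]
    return determinant(minor)


def edge_posteriors(n, weights):
    """P(edge in tree) = 1 - Z(edge removed) / Z, for each edge."""
    z = tree_partition_function(n, weights)
    posteriors = {}
    for edge in weights:
        without_edge = dict(weights)
        without_edge[edge] = 0
        posteriors[edge] = 1 - tree_partition_function(n, without_edge) / z
    return posteriors


# Hypothetical weights on a triangle; the probabilities sum to n - 1 = 2.
example = edge_posteriors(3, {(0, 1): 1, (1, 2): 2, (0, 2): 3})
print(example)
```

Each edge probability here costs one determinant, so no enumeration of trees or sampling is needed. This is only one simple way to use the theorem and may differ from the algorithm derived in the paper.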
Modularity and the predictive mind
Modular approaches to the architecture of the mind claim that some mental mechanisms, such as sensory input processes, operate in special-purpose subsystems that are functionally independent from the rest of the mind. This assumption of modularity seems to be in tension with recent claims that the mind has a predictive architecture. Predictive approaches propose that both sensory processing and higher-level processing are part of the same Bayesian information-processing hierarchy, with no clear boundary between perception and cognition. Furthermore, it is not clear how any part of the predictive architecture could be functionally independent, given that each level of the hierarchy is influenced by the level above. Both the assumption of continuity across the predictive architecture and the seeming non-isolability of parts of the predictive architecture seem to be at odds with the modular approach. I explore and ultimately reject the predictive approach’s apparent commitments to continuity and non-isolation. I argue that predictive architectures can be modular architectures, and that we should in fact expect predictive architectures to exhibit some form of modularity.
A network approach to topic models
One of the main computational and scientific challenges in the modern age is
to extract useful information from unstructured texts. Topic models are one
popular machine-learning approach which infers the latent topical structure of
a collection of documents. Despite their success, in particular that of their
most widely used variant, Latent Dirichlet Allocation (LDA), and numerous
applications in sociology, history, and linguistics, topic models are known to
suffer from severe conceptual and practical problems, e.g. a lack of
justification for the Bayesian priors, discrepancies with statistical
properties of real texts, and the inability to properly choose the number of
topics. Here we obtain a fresh view on the problem of identifying topical
structures by relating it to the problem of finding communities in complex
networks. This is achieved by representing text corpora as bipartite networks
of documents and words. By adapting existing community-detection methods,
using a stochastic block model (SBM) with non-parametric priors, we obtain a
more versatile and principled framework for topic modeling (e.g., it
automatically detects the number of topics and hierarchically clusters both the
words and documents). The analysis of artificial and real corpora demonstrates
that our SBM approach leads to better topic models than LDA in terms of
statistical model selection. More importantly, our work shows how to formally
relate methods from community detection and topic modeling, opening the
possibility of cross-fertilization between these two fields.
Comment: 22 pages, 10 figures, code available at https://topsbm.github.io
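The representation of a corpus as a bipartite network of documents and words can be sketched as follows. This is a minimal illustration that assumes whitespace tokenization and lowercasing; the function name `build_bipartite_network` and the example documents are hypothetical, and the stochastic block model fitting step itself is not shown.

```python
from collections import Counter


def build_bipartite_network(documents):
    """Build a document-word multigraph from raw text.

    Returns the edge multiplicities {(doc_index, word): count} together
    with the degree of every document node and every word node.
    """
    edges = Counter()
    for doc_index, text in enumerate(documents):
        # Assumption: naive tokenization by whitespace, case-insensitive.
        for word in text.lower().split():
            edges[(doc_index, word)] += 1

    doc_degree = Counter()
    word_degree = Counter()
    for (doc_index, word), count in edges.items():
        doc_degree[doc_index] += count   # document length in tokens
        word_degree[word] += count       # corpus-wide word frequency
    return edges, doc_degree, word_degree


# Hypothetical two-document corpus.
edges, doc_degree, word_degree = build_bipartite_network(
    ["tree graph tree", "graph topic"]
)
print(sorted(edges.items()))
```

Every word token becomes one edge between a document node and a word node, so a community-detection method applied to this network groups documents and words at the same time.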
Joint estimation of multiple related biological networks
Graphical models are widely used to make inferences concerning interplay in
multivariate systems. In many applications, data are collected from multiple
related but nonidentical units whose underlying networks may differ but are
likely to share features. Here we present a hierarchical Bayesian formulation
for joint estimation of multiple networks in this nonidentically distributed
setting. The approach is general: given a suitable class of graphical models,
it uses an exchangeability assumption on networks to provide a corresponding
joint formulation. Motivated by emerging experimental designs in molecular
biology, we focus on time-course data with interventions, using dynamic
Bayesian networks as the graphical models. We introduce a computationally
efficient, deterministic algorithm for exact joint inference in this setting.
We provide an upper bound on the gains that joint estimation offers relative to
separate estimation for each network and empirical results that support and
extend the theory, including an extensive simulation study and an application
to proteomic data from human cancer cell lines. Finally, we describe
approximations that are still more computationally efficient than the exact
algorithm and that also demonstrate good empirical performance.
Comment: Published at http://dx.doi.org/10.1214/14-AOAS761 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
A Coverage Study of the CMSSM Based on ATLAS Sensitivity Using Fast Neural Networks Techniques
We assess the coverage properties of confidence and credible intervals on the
CMSSM parameter space inferred from a Bayesian posterior and the profile
likelihood based on an ATLAS sensitivity study. In order to make those
calculations feasible, we introduce a new method based on neural networks to
approximate the mapping between CMSSM parameters and weak-scale particle
masses. Our method reduces the computational effort needed to sample the CMSSM
parameter space by a factor of ~ 10^4 with respect to conventional techniques.
We find that both the Bayesian posterior and the profile likelihood intervals
can significantly over-cover, and we trace the origin of this effect to physical
boundaries in the parameter space. Finally, we point out that the effects
intrinsic to the statistical procedure are conflated with simplifications to
the likelihood functions from the experiments themselves.
Comment: Further checks about accuracy of neural network approximation, fixed
typos, added refs. Main results unchanged. Matches version accepted by JHE
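The notion of coverage used above can be made concrete with a toy sketch: coverage is the fraction of repeated pseudo-experiments in which the reported interval contains the true parameter value. The example assumes a single Gaussian measurement with known width and a symmetric interval around the observed value. It is not the CMSSM analysis and does not reproduce the boundary effect described in the abstract; the function name `interval_coverage` and all numbers are hypothetical.

```python
import random


def interval_coverage(true_mean, sigma, half_width_in_sigma, n_trials, seed=0):
    """Estimate coverage of the interval observed +/- k * sigma.

    Each trial draws one Gaussian measurement around true_mean and checks
    whether the resulting interval contains true_mean.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    half_width = half_width_in_sigma * sigma
    covered = 0
    for _ in range(n_trials):
        observed = rng.gauss(true_mean, sigma)
        if observed - half_width <= true_mean <= observed + half_width:
            covered += 1
    return covered / n_trials


# A 1.96-sigma interval should cover the truth in roughly 95% of trials.
print(interval_coverage(true_mean=1.0, sigma=2.0,
                        half_width_in_sigma=1.96, n_trials=4000))
```

An interval construction over-covers when this fraction exceeds the nominal level, and under-covers when it falls short.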
Algebraic Bayesian analysis of contingency tables with possibly zero-probability cells
In this paper we consider a Bayesian analysis of contingency tables allowing
for the possibility that cells may have probability zero. In this sense we
depart from standard log-linear modeling that implicitly assumes a positivity
constraint. Our approach leads us to consider mixture models for contingency
tables, where the components of the mixture, which we call model instances,
have distinct support. We rely on ideas from polynomial algebra in order to
identify the various model instances. We also provide a method to assign prior
probabilities to each instance of the model, and we describe methods for
constructing priors on the parameter space of each instance. We illustrate our
methodology through a table involving two structural zeros, as
well as a zero count. The results we obtain show that our analysis may lead to
conclusions that are substantively different from those that would obtain in a
standard framework, wherein the possibility of zero-probability cells is not
explicitly accounted for.