Parametric Modelling of Multivariate Count Data Using Probabilistic Graphical Models
Multivariate count data are defined as the numbers of items in different
categories obtained by sampling within a population whose individuals are
grouped into categories. The analysis of multivariate count data is a recurrent
and crucial issue in numerous modelling problems, particularly in the fields of
biology and ecology (where the data can represent, for example, children counts
associated with multitype branching processes), sociology and econometrics. We
focus on I) identifying categories that appear simultaneously or, on the
contrary, are mutually exclusive, which is achieved by identifying
conditional independence relationships between the variables; II) building
parsimonious parametric models consistent with these relationships; III)
characterising and testing the effects of covariates on the joint distribution
of the counts. To achieve these goals, we propose an approach based on
graphical probabilistic models, and more specifically partially directed
acyclic graphs
Partition MCMC for inference on acyclic digraphs
Acyclic digraphs are the underlying representation of Bayesian networks, a
widely used class of probabilistic graphical models. Learning the underlying
graph from data is a way of gaining insights about the structural properties of
a domain. Structure learning forms one of the inference challenges of
statistical graphical models.
MCMC methods that sample graphs from the posterior distribution given the
data, notably structure MCMC, are probably the only viable option for Bayesian
model averaging. Score modularity and restrictions on the number of parents of
each node allow the graphs to be grouped into larger collections, which can be
scored as a whole to improve the chain's convergence. Current examples of
algorithms taking advantage of grouping are biased order MCMC, which acts
on the alternative space of permuted triangular matrices, and non-ergodic edge
reversal moves.
Here we propose a novel algorithm, which employs the underlying combinatorial
structure of DAGs to define a new grouping. As a result, convergence is improved
compared to structure MCMC, while still retaining the property of producing an
unbiased sample. Finally, the method can be combined with edge reversal moves to
improve the sampler further.
Comment: Revised version. 34 pages, 16 figures. R code available at
https://github.com/annlia/partitionMCM
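To give a sense of why sampling over acyclic digraphs is hard, the sketch below counts the labelled DAGs on n nodes with Robinson's recurrence, which removes the k nodes that have no incoming edges at each step. It is only an illustration of the combinatorial structure the abstract alludes to, not the partition MCMC algorithm itself; the function name `count_dags` is an assumption made for this example.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labelled DAGs on n nodes, via Robinson's recurrence."""
    if n == 0:
        return 1  # the empty graph
    # Inclusion-exclusion over k chosen nodes with no incoming edges:
    # each of them may point to any of the remaining n - k nodes.
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_dags(n))
```

The count grows super-exponentially with the number of nodes, which is why exhaustive scoring is infeasible beyond a handful of variables and sampling schemes are used instead.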
Learning to Address Health Inequality in the United States with a Bayesian Decision Network
Life expectancy is a complex outcome driven by genetic, socio-demographic,
environmental and geographic factors. Increasing socio-economic and health
disparities in the United States are propagating the longevity gap, making it a
cause for concern. Earlier studies have probed individual factors, but an
integrated picture to reveal quantifiable actions has been missing. There is a
growing concern about a further widening of healthcare inequality caused by
Artificial Intelligence (AI) due to differential access to AI-driven services.
Hence, it is imperative to explore and exploit the potential of AI for
illuminating biases and enabling transparent policy decisions for positive
social and health impact. In this work, we reveal actionable interventions for
decreasing the longevity gap in the United States by analyzing a county-level
data resource containing healthcare, socio-economic, behavioral, education and
demographic features. We learn an ensemble-averaged structure, draw inferences
using the joint probability distribution and extend it to a Bayesian Decision
Network for identifying policy actions. We draw quantitative estimates for the
impact of diversity, preventive-care quality and stable families within the
unified framework of our decision network. Finally, we make this analysis and
dashboard available as an interactive web application, enabling users and
policy-makers to validate our reported findings and to explore the impact of
interventions beyond those reported in this work.
Comment: 8 pages, 4 figures, 1 table (excluding the supplementary material),
accepted for publication in AAAI 201
On the Prior and Posterior Distributions Used in Graphical Modelling
Graphical model learning and inference are often performed using Bayesian
techniques. In particular, learning is usually performed in two separate steps.
First, the graph structure is learned from the data; then the parameters of the
model are estimated conditional on that graph structure. While the probability
distributions involved in this second step have been studied in depth, the ones
used in the first step have not been explored in as much detail.
In this paper, we will study the prior and posterior distributions defined
over the space of the graph structures for the purpose of learning the
structure of a graphical model. In particular, we will provide a
characterisation of the behaviour of those distributions as a function of the
possible edges of the graph. We will then use the properties resulting from
this characterisation to define measures of structural variability for both
Bayesian and Markov networks, and we will point out some of their possible
applications.
Comment: 28 pages, 6 figures
Probabilistic Graphical Model Representation in Phylogenetics
Recent years have seen a rapid expansion of the model space explored in
statistical phylogenetics, emphasizing the need for new approaches to
statistical model representation and software development. Clear communication
and representation of the chosen model is crucial for: (1) reproducibility of
an analysis, (2) model development and (3) software design. Moreover, a
unified, clear and understandable framework for model representation lowers the
barrier for beginners and non-specialists to grasp complex phylogenetic models,
including their assumptions and parameter/variable dependencies.
Graphical modeling is a unifying framework that has gained in popularity in
the statistical literature in recent years. The core idea is to break complex
models into conditionally independent distributions. The strength lies in the
comprehensibility, flexibility, and adaptability of this formalism, and the
large body of computational work based on it. Graphical models are well suited
to teaching statistical models, facilitating communication among
phylogeneticists, and developing generic software for simulation and
statistical inference.
Here, we provide an introduction to graphical models for phylogeneticists and
extend the standard graphical model representation to the realm of
phylogenetics. We introduce a new graphical model component, tree plates, to
capture the changing structure of the subgraph corresponding to a phylogenetic
tree. We describe a range of phylogenetic models using the graphical model
framework and introduce modules to simplify the representation of standard
components in large and complex models. Phylogenetic model graphs can be
readily used in simulation, maximum likelihood inference, and Bayesian
inference using, for example, Metropolis-Hastings or Gibbs sampling of the
posterior distribution
Tractability through Exchangeability: A New Perspective on Efficient Probabilistic Inference
Exchangeability is a central notion in statistics and probability theory. The
assumption that an infinite sequence of data points is exchangeable is at the
core of Bayesian statistics. However, finite exchangeability as a statistical
property that renders probabilistic inference tractable is less
well-understood. We develop a theory of finite exchangeability and its relation
to tractable probabilistic inference. The theory is complementary to that of
independence and conditional independence. We show that tractable inference in
probabilistic models with high treewidth and millions of variables can be
understood using the notion of finite (partial) exchangeability. We also show
that existing lifted inference algorithms implicitly utilize a combination of
conditional independence and partial exchangeability.
Comment: In Proceedings of the 28th AAAI Conference on Artificial Intelligence
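As a small illustration of how finite exchangeability can make inference cheap, the sketch below uses a Pólya urn, a textbook exchangeable but non-independent model that is an assumption of this example and is not taken from the paper. Because every ordering of a draw sequence has the same probability, the probability of observing k red draws in n follows from a single representative sequence times a binomial coefficient, rather than from a sum over all orderings. The function names and the default urn contents are hypothetical.

```python
from fractions import Fraction
from math import comb


def polya_sequence_prob(seq, red=1, blue=1):
    """Exact probability of a 0/1 draw sequence from a Polya urn (1 = red).

    After each draw, the drawn ball is returned together with one extra
    ball of the same colour, so the draws are dependent.
    """
    prob = Fraction(1)
    for x in seq:
        total = red + blue
        if x == 1:
            prob *= Fraction(red, total)
            red += 1
        else:
            prob *= Fraction(blue, total)
            blue += 1
    return prob


def prob_of_count(n, k, red=1, blue=1):
    """P(exactly k red draws in n), using exchangeability.

    All orderings with k reds are equally likely, so one representative
    sequence multiplied by C(n, k) is enough.
    """
    representative = [1] * k + [0] * (n - k)
    return comb(n, k) * polya_sequence_prob(representative, red, blue)


if __name__ == "__main__":
    print(polya_sequence_prob([1, 0, 0]), polya_sequence_prob([0, 0, 1]))
    print([prob_of_count(4, k) for k in range(5)])
```

The draws are clearly not independent, yet the distribution over counts needs only n + 1 evaluations, which is the kind of saving that complements conditional independence.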
Structure Learning of Partitioned Markov Networks
We learn the structure of a Markov Network between two groups of random
variables from joint observations. Since modelling and learning the full MN
structure may be hard, learning the links between two groups directly may be a
preferable option. We introduce a novel concept called the partitioned
ratio, whose factorization directly associates with the Markovian properties of
random variables across two groups. A simple one-shot convex optimization
procedure is proposed for learning the sparse factorizations of the
partitioned ratio and it is theoretically guaranteed to recover the correct
inter-group structure under mild conditions. The performance of the proposed
method is experimentally compared with state-of-the-art MN structure
learning methods using ROC curves. Real applications to analyzing
bipartisanship in the US Congress and pairwise DNA/time-series alignments are also
reported.
Comment: Camera Ready for ICML 2016. Fixed some minor typos
Who Learns Better Bayesian Network Structures: Accuracy and Speed of Structure Learning Algorithms
Three classes of algorithms to learn the structure of Bayesian networks from
data are common in the literature: constraint-based algorithms, which use
conditional independence tests to learn the dependence structure of the data;
score-based algorithms, which use goodness-of-fit scores as objective functions
to maximise; and hybrid algorithms that combine both approaches.
Constraint-based and score-based algorithms have been shown to learn the same
structures when conditional independence and goodness of fit are both assessed
using entropy and the topological ordering of the network is known (Cowell,
2001).
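The link between the two families when both rely on entropy can be seen on a single candidate arc between two discrete variables. In the sketch below, which is an illustration under that simplifying assumption and not the benchmark procedure of the paper, the G statistic of the independence test for X and Y equals twice the gain in maximised log-likelihood obtained by adding X as a parent of Y. All function names are hypothetical.

```python
from collections import Counter
from math import log


def loglik_no_parent(ys):
    """Maximised multinomial log-likelihood of Y with no parents."""
    n = len(ys)
    return sum(c * log(c / n) for c in Counter(ys).values())


def loglik_with_parent(xs, ys):
    """Maximised log-likelihood of Y given a single parent X."""
    joint = Counter(zip(xs, ys))
    parent = Counter(xs)
    return sum(c * log(c / parent[x]) for (x, _), c in joint.items())


def g_statistic(xs, ys):
    """G statistic (log-likelihood ratio test) for independence of X and Y."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Observed count times log(observed / expected), expected = px * py / n.
    return 2 * sum(
        c * log(c * n / (px[x] * py[y])) for (x, y), c in joint.items()
    )


if __name__ == "__main__":
    xs = [0, 0, 0, 1, 1, 1, 0, 1]
    ys = [0, 0, 1, 1, 1, 0, 0, 1]
    gain = loglik_with_parent(xs, ys) - loglik_no_parent(ys)
    print(g_statistic(xs, ys), 2 * gain)
```

Both quantities are the sample size times the empirical mutual information, up to a factor of two, so a test threshold and a score penalty act on the same number; differences in practice come from how each algorithm applies the criterion.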
In this paper, we investigate how these three classes of algorithms perform
outside the assumptions above in terms of speed and accuracy of network
reconstruction for both discrete and Gaussian Bayesian networks. We approach
this question by recognising that structure learning is defined by the
combination of a statistical criterion and an algorithm that determines how the
criterion is applied to the data. Removing the confounding effect of different
choices for the statistical criterion, we find using both simulated and
real-world complex data that constraint-based algorithms are often less
accurate than score-based algorithms, but are seldom faster (even at large
sample sizes); and that hybrid algorithms are neither faster nor more accurate
than constraint-based algorithms. This suggests that commonly held beliefs on
structure learning in the literature are strongly influenced by the choice of
particular statistical criteria rather than just by the properties of the
algorithms themselves.
Comment: 27 pages, 8 figures