Learning Markov networks with context-specific independences
Learning the Markov network structure from data is a problem that has
received considerable attention in machine learning, and in many other
application fields. This work focuses on a particular approach for this purpose
called independence-based learning. This approach is guaranteed to learn
the correct structure efficiently whenever the data are sufficient to represent
the underlying distribution. However, an important limitation of this approach is
that the learned structures are encoded in an undirected graph. The problem
with graphs is that they cannot encode some types of independence relations,
such as context-specific independences (CSIs). These are conditional
independences that hold only for a certain assignment of the conditioning
set, in contrast to ordinary conditional independences, which must hold for
all assignments. In this work we present CSPC, an independence-based
algorithm that learns structures encoding context-specific independences
and represents them in a log-linear model instead of a graph. The central idea
of CSPC is combining the theoretical guarantees provided by the
independence-based approach with the benefits of representing complex
structures by using features in a log-linear model. We present experiments in a
synthetic case, showing that CSPC is more accurate than the state-of-the-art
independence-based (IB) algorithms when the underlying distribution contains CSIs.
Comment: 8 pages, 6 figures
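To make the notion of a context-specific independence concrete, here is a minimal sketch in Python. It is not the CSPC algorithm; the joint distribution, the variable names X, Y, Z, and the probabilities are all hypothetical, chosen so that X and Y are independent in the context Z=0 but dependent in the context Z=1. An undirected graph would need an X-Y edge to cover the Z=1 case and so could not express the independence that holds under Z=0.

```python
import itertools
import math

def joint(x, y, z):
    """Hypothetical joint P(X, Y, Z) over three binary variables."""
    p_z = 0.5
    if z == 0:
        # Context Z=0: X and Y behave as independent fair coins.
        return p_z * 0.5 * 0.5
    # Context Z=1: X and Y tend to agree, so they are dependent.
    return p_z * (0.4 if x == y else 0.1)

def context_mi(z):
    """Mutual information (in nats) between X and Y within the context Z=z."""
    pairs = list(itertools.product((0, 1), repeat=2))
    p_z = sum(joint(x, y, z) for x, y in pairs)
    cond = {(x, y): joint(x, y, z) / p_z for x, y in pairs}  # P(X, Y | Z=z)
    p_x = {x: cond[(x, 0)] + cond[(x, 1)] for x in (0, 1)}
    p_y = {y: cond[(0, y)] + cond[(1, y)] for y in (0, 1)}
    return sum(
        p * math.log(p / (p_x[x] * p_y[y]))
        for (x, y), p in cond.items()
        if p > 0
    )

# Zero mutual information in a context means X and Y are independent there.
mi_by_context = {z: context_mi(z) for z in (0, 1)}
```

In this toy case the mutual information vanishes only for Z=0, which is the kind of relation a feature-based log-linear representation can record and a plain graph cannot.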
Sparse Nested Markov models with Log-linear Parameters
Hidden variables are ubiquitous in practical data analysis, and therefore
modeling marginal densities and doing inference with the resulting models is an
important problem in statistics, machine learning, and causal inference.
Recently, a new type of graphical model, called the nested Markov model, was
developed which captures equality constraints found in marginals of directed
acyclic graph (DAG) models. Some of these constraints, such as the so-called
`Verma constraint', strictly generalize conditional independence. To make
modeling and inference with nested Markov models practical, it is necessary to
limit the number of parameters in the model, while still correctly capturing
the constraints in the marginal of a DAG model. Placing such limits is similar
in spirit to sparsity methods for undirected graphical models, and regression
models. In this paper, we give a log-linear parameterization which allows
sparse modeling with nested Markov models. We illustrate the advantages of this
parameterization with a simulation study.
Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty
in Artificial Intelligence (UAI2013)
The Grow-Shrink strategy for learning Markov network structures constrained by context-specific independences
Markov networks are models for compactly representing complex probability
distributions. They are composed of a structure and a set of numerical weights.
The structure qualitatively describes independences in the distribution, which
can be exploited to factorize the distribution into a set of compact functions.
A key application for learning structures from data is to automatically
discover knowledge. In practice, structure learning algorithms focused on
"knowledge discovery" present a limitation: they use a coarse-grained
representation of the structure. As a result, this representation cannot
describe context-specific independences. Very recently, an algorithm called
CSPC was designed to overcome this limitation, but it has a high computational
complexity. This work tries to mitigate this downside by presenting CSGS, an
algorithm that uses the Grow-Shrink strategy to reduce unnecessary
computations. In an empirical evaluation, the structures learned by CSGS
achieve accuracies competitive with those obtained by CSPC, at a lower
computational complexity.
Comment: 12 pages and 8 figures. This work was presented in IBERAMIA 201
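The abstract does not describe how CSGS applies Grow-Shrink, so the following is only a sketch of the classical Grow-Shrink idea for recovering the Markov blanket of one variable, not CSGS itself. The function names and the graph-separation oracle are assumptions for illustration; on real data the oracle would be replaced by a statistical independence test.

```python
from collections import deque

def make_separation_oracle(edges):
    """Return independent(x, y, given) for a hypothetical undirected graph.

    x and y are treated as independent given `given` when every path
    between them passes through a node in `given`.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def independent(x, y, given):
        seen, queue = {x}, deque([x])
        while queue:
            node = queue.popleft()
            for nb in adj.get(node, ()):
                if nb == y:
                    return False  # reached y without crossing `given`
                if nb not in seen and nb not in given:
                    seen.add(nb)
                    queue.append(nb)
        return True

    return independent

def grow_shrink_blanket(target, variables, independent):
    """Classical Grow-Shrink: estimate the Markov blanket of `target`."""
    blanket = []
    # Grow phase: add any variable still dependent on the target
    # given the current blanket, until nothing changes.
    changed = True
    while changed:
        changed = False
        for v in variables:
            if v == target or v in blanket:
                continue
            if not independent(target, v, frozenset(blanket)):
                blanket.append(v)
                changed = True
    # Shrink phase: drop variables that are independent of the target
    # given the rest of the blanket (false positives from the grow phase).
    for v in list(blanket):
        rest = frozenset(blanket) - {v}
        if independent(target, v, rest):
            blanket.remove(v)
    return set(blanket)

# Hypothetical chain A - B - C - D, visited in an order that forces a shrink.
oracle = make_separation_oracle([("A", "B"), ("B", "C"), ("C", "D")])
blanket_b = grow_shrink_blanket("B", ["D", "C", "B", "A"], oracle)
```

The saving comes from conditioning only on the current candidate blanket rather than on many subsets of variables; the shrink pass is what removes D in this example, since it is added early and later separated from B by C.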
Concepts and a case study for a flexible class of graphical Markov models
With graphical Markov models, one can investigate complex dependences,
summarize some results of statistical analyses with graphs and use these graphs
to understand implications of well-fitting models. The models have a rich
history and form an area that has been intensively studied and developed in
recent years. We give a brief review of the main concepts and describe in more
detail a flexible subclass of models, called traceable regressions. These are
sequences of joint response regressions for which regression graphs permit one
to trace and thereby understand pathways of dependence. We use these methods to
reanalyze and interpret data from a prospective study of child development, now
known as the Mannheim Study of Children at Risk. The two related primary
features are cognitive and motor development of a child at ages 4.5 and 8
years. Deficits in these features form a sequence of joint responses.
Several possible risks are assessed at the birth of the child and when the
child reached the ages of 3 months and 2 years.
Comment: 21 pages, 7 figures, 7 tables; invited, refereed chapter in a book
The IBMAP approach for Markov networks structure learning
In this work we consider the problem of learning the structure of Markov
networks from data. We present an approach for tackling this problem called
IBMAP, together with an efficient instantiation of the approach: the IBMAP-HC
algorithm, designed for avoiding important limitations of existing
independence-based algorithms. These algorithms proceed by performing
statistical independence tests on data, completely trusting the outcome of each
test. In practice, tests may be incorrect, resulting in potential cascading
errors and a consequent reduction in the quality of the learned structures.
IBMAP accounts for this uncertainty in the outcome of the tests through a
probabilistic maximum-a-posteriori approach. The approach is instantiated in
the IBMAP-HC algorithm, a structure selection strategy that performs a
polynomial heuristic local search in the space of possible structures. We
present an extensive empirical evaluation on synthetic and real data, showing
that our algorithm significantly outperforms current independence-based
algorithms in terms of data efficiency and quality of learned structures, with
equivalent computational complexity. We also show the performance of IBMAP-HC
in a real-world application of knowledge discovery: estimation of distribution
algorithms (EDAs), which are evolutionary algorithms that use structure
learning in each generation to model the distribution of populations. The
experiments show that when IBMAP-HC is used to learn the structure, EDAs
improve their convergence to the optimum.
Bayesian optimization of the PC algorithm for learning Gaussian Bayesian networks
The PC algorithm is a popular method for learning the structure of Gaussian
Bayesian networks. It carries out statistical tests to determine absent edges
in the network. It is hence governed by two parameters: (i) the type of test,
and (ii) its significance level. These parameters are usually set to values
recommended by an expert. Nevertheless, such an approach can suffer from human
bias, leading to suboptimal reconstruction results. In this paper we consider a
more principled approach for choosing these parameters in an automatic way. For
this we optimize a reconstruction score evaluated on a set of different
Gaussian Bayesian networks. This objective is expensive to evaluate and lacks a
closed-form expression, which means that Bayesian optimization (BO) is a
natural choice. BO methods use a model to guide the search and are hence able
to exploit smoothness properties of the objective surface. We show that the
parameters found by a BO method outperform those found by a random search
strategy and the expert recommendation. Importantly, we have found that an
often overlooked statistical test provides the best overall reconstruction
results.
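The two parameters named in the abstract, the type of test and its significance level, can be illustrated with a small sketch. The abstract does not say which tests were compared, so the Fisher z test on partial correlations used here is an assumption (it is a common choice for Gaussian data), and the data, variable names, and `alpha` value are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(cols, i, j, cond):
    """Partial correlation of columns i and j given `cond` (recursive formula)."""
    if not cond:
        return pearson(cols[i], cols[j])
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(cols, i, j, rest)
    r_ik = partial_corr(cols, i, k, rest)
    r_jk = partial_corr(cols, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))

def fisher_z_pvalue(cols, i, j, cond=()):
    """Two-sided p-value for H0: zero partial correlation (Fisher z test)."""
    n = len(cols[i])
    r = partial_corr(cols, i, j, tuple(cond))
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    stat = math.sqrt(n - len(cond) - 3) * abs(math.atanh(r))
    return math.erfc(stat / math.sqrt(2))  # equals 2 * (1 - Phi(stat))

# Hypothetical Gaussian chain x -> y -> z with unit-variance noise.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [a + rng.gauss(0, 1) for a in x]
z = [b + rng.gauss(0, 1) for b in y]
cols = {"x": x, "y": y, "z": z}

alpha = 0.05  # hypothetical significance level; one of the two tuned parameters
p_marginal = fisher_z_pvalue(cols, "x", "z")
p_given_y = fisher_z_pvalue(cols, "x", "z", cond=("y",))
keep_edge_marginal = p_marginal < alpha  # dependence detected, edge stays
```

A PC-style procedure removes the edge between two variables as soon as some conditioning set yields a p-value above `alpha`, so both the choice of test and the value of `alpha` directly shape the recovered structure.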