Sparse Nested Markov models with Log-linear Parameters
Hidden variables are ubiquitous in practical data analysis, and therefore
modeling marginal densities and doing inference with the resulting models is an
important problem in statistics, machine learning, and causal inference.
Recently, a new type of graphical model, called the nested Markov model, was
developed which captures equality constraints found in marginals of directed
acyclic graph (DAG) models. Some of these constraints, such as the so-called
`Verma constraint', strictly generalize conditional independence. To make
modeling and inference with nested Markov models practical, it is necessary to
limit the number of parameters in the model, while still correctly capturing
the constraints in the marginal of a DAG model. Placing such limits is similar
in spirit to sparsity methods for undirected graphical models, and regression
models. In this paper, we give a log-linear parameterization which allows
sparse modeling with nested Markov models. We illustrate the advantages of this
parameterization with a simulation study.
Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty
in Artificial Intelligence (UAI2013)
Graphical methods for inequality constraints in marginalized DAGs
We present a graphical approach to deriving inequality constraints for
directed acyclic graph (DAG) models, where some variables are unobserved. In
particular we show that the observed distribution of a discrete model is always
restricted if any two observed variables are neither adjacent in the graph, nor
share a latent parent; this generalizes the well-known instrumental inequality.
The method also provides inequalities on interventional distributions, which
can be used to bound causal effects. All these constraints are characterized in
terms of a new graphical separation criterion, providing an easy and intuitive
method for their derivation.
Comment: A final version will appear in the proceedings of the 22nd Workshop
on Machine Learning and Signal Processing, 201
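As an illustration of the instrumental inequality mentioned in the abstract above, here is a minimal sketch of a check for Pearl's inequality, max over x of the sum over y of the max over z of P(x, y | z), which must not exceed 1 for discrete variables. The function name, the nested-list layout `p[z][x][y]`, and the tolerance are assumptions made for this example and do not come from the paper; the paper's own graphical method is more general and is not reproduced here.

```python
def instrumental_inequality_holds(p, tol=1e-12):
    """Check Pearl's instrumental inequality for discrete Z, X, Y.

    p[z][x][y] is assumed to hold P(X=x, Y=y | Z=z), so each p[z]
    sums to 1. The layout and name are illustrative assumptions.
    """
    num_z = len(p)
    num_x = len(p[0])
    num_y = len(p[0][0])
    for x in range(num_x):
        # For each y, take the largest P(x, y | z) over z, then sum over y.
        total = sum(
            max(p[z][x][y] for z in range(num_z))
            for y in range(num_y)
        )
        if total > 1.0 + tol:
            return False  # inequality violated for this x
    return True


# Compatible example: X copies Z, Y copies X.
compatible = [
    [[1.0, 0.0], [0.0, 0.0]],  # Z = 0
    [[0.0, 0.0], [0.0, 1.0]],  # Z = 1
]

# Incompatible example: X is constant, yet Y copies Z directly.
incompatible = [
    [[1.0, 0.0], [0.0, 0.0]],  # Z = 0
    [[0.0, 1.0], [0.0, 0.0]],  # Z = 1
]

print(instrumental_inequality_holds(compatible))
print(instrumental_inequality_holds(incompatible))
```

In the second example Y tracks Z although X never changes, so Z must affect Y by a route other than X. That is the kind of dependence the inequality rules out in the instrumental model.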
Graphical Markov models, unifying results and their interpretation
Graphical Markov models combine conditional independence constraints with
graphical representations of stepwise data generating processes. The models
started to be formulated about 40 years ago and vigorous development is
ongoing. Longitudinal observational studies as well as intervention studies are
best modeled via a subclass called regression graph models and, especially,
traceable regressions. Regression graphs include two types of undirected graph
and directed acyclic graphs in ordered sequences of joint responses. Response
components may correspond to discrete or continuous random variables and may
depend exclusively on variables which have been generated earlier. These
aspects are essential when causal hypotheses are the motivation for the
planning of empirical studies.
To turn the graphs into useful tools for tracing developmental pathways and
for predicting structure in alternative models, the generated distributions
have to mimic some properties of joint Gaussian distributions. Here, relevant
results concerning these aspects are spelled out and illustrated by examples.
With regression graph models, it becomes feasible, for the first time, to
derive structural effects of (1) ignoring some of the variables, of (2)
selecting subpopulations via fixed levels of some other variables or of (3)
changing the order in which the variables might get generated. Thus, the most
important future applications of these models will aim at the best possible
integration of knowledge from related studies.
Comment: 34 Pages, 11 figures, 1 table
The Inflation Technique for Causal Inference with Latent Variables
The problem of causal inference is to determine if a given probability
distribution on observed variables is compatible with some causal structure.
The difficult case is when the causal structure includes latent variables. We
here introduce the inflation technique for tackling this problem. An
inflation of a causal structure is a new causal structure that can contain
multiple copies of each of the original variables, but where the ancestry of
each copy mirrors that of the original. To every distribution of the observed
variables that is compatible with the original causal structure, we assign a
family of marginal distributions on certain subsets of the copies that are
compatible with the inflated causal structure. It follows that compatibility
constraints for the inflation can be translated into compatibility constraints
for the original causal structure. Even if the constraints at the level of
inflation are weak, such as observable statistical independences implied by
disjoint causal ancestry, the translated constraints can be strong. We apply
this method to derive new inequalities whose violation by a distribution
witnesses that distribution's incompatibility with the causal structure (of
which Bell inequalities and Pearl's instrumental inequality are prominent
examples). We describe an algorithm for deriving all such inequalities for the
original causal structure that follow from ancestral independences in the
inflation. For three observed binary variables with pairwise common causes, it
yields inequalities that are stronger in at least some aspects than those
obtainable by existing methods. We also describe an algorithm that derives a
weaker set of inequalities but is more efficient. Finally, we discuss which
inflations are such that the inequalities one obtains from them remain valid
even for quantum (and post-quantum) generalizations of the notion of a causal
model.
Comment: Minor final corrections, updated to match the published version as
closely as possible
Nested Markov Properties for Acyclic Directed Mixed Graphs
Directed acyclic graph (DAG) models may be characterized in at least four
different ways: via a factorization, the d-separation criterion, the
moralization criterion, and the local Markov property. As pointed out by Robins
(1986, 1999), Verma and Pearl (1990), and Tian and Pearl (2002b), marginals of
DAG models also imply equality constraints that are not conditional
independences. The well-known `Verma constraint' is an example. Constraints of
this type were used for testing edges (Shpitser et al., 2009), and an efficient
marginalization scheme via variable elimination (Shpitser et al., 2011).
We show that equality constraints like the `Verma constraint' can be viewed
as conditional independences in kernel objects obtained from joint
distributions via a fixing operation that generalizes conditioning and
marginalization. We use these constraints to define, via Markov properties and
a factorization, a graphical model associated with acyclic directed mixed
graphs (ADMGs). We show that marginal distributions of DAG models lie in this
model, prove that a characterization of these constraints given in (Tian and
Pearl, 2002b) gives an alternative definition of the model, and finally show
that the fixing operation we used to define the model can be used to give a
particularly simple characterization of identifiable causal effects in hidden
variable graphical causal models.
Comment: 67 pages (not including appendix and references), 8 figures
Smooth, identifiable supermodels of discrete DAG models with latent variables
We provide a parameterization of the discrete nested Markov model, which is a
supermodel that approximates DAG models (Bayesian network models) with latent
variables. Such models are widely used in causal inference and machine
learning. We explicitly evaluate their dimension, show that they are curved
exponential families of distributions, and fit them to data. The
parameterization avoids the irregularities and unidentifiability of latent
variable models. The parameters used are all fully identifiable and
causally-interpretable quantities.
Comment: 30 pages
Concepts and a case study for a flexible class of graphical Markov models
With graphical Markov models, one can investigate complex dependences,
summarize some results of statistical analyses with graphs and use these graphs
to understand implications of well-fitting models. The models have a rich
history and form an area that has been intensively studied and developed in
recent years. We give a brief review of the main concepts and describe in more
detail a flexible subclass of models, called traceable regressions. These are
sequences of joint response regressions for which regression graphs permit one
to trace and thereby understand pathways of dependence. We use these methods to
reanalyze and interpret data from a prospective study of child development, now
known as the Mannheim Study of Children at Risk. The two related primary
features concern cognitive and motor development, at the age of 4.5 and 8 years
of a child. Deficits in these features form a sequence of joint responses.
Several possible risks are assessed at birth of the child and when the child
reached age 3 months and 2 years.
Comment: 21 pages, 7 figures, 7 tables; invited, refereed chapter in a book
A Stronger Bell Argument for (Some Kind of) Parameter Dependence
It is widely accepted that the violation of Bell inequalities excludes local
theories of the quantum realm. This paper presents a new derivation of the
inequalities from non-trivial non-local theories and formulates a stronger Bell
argument excluding also these non-local theories. Taking into account all
possible theories, the conclusion of this stronger argument provably is the
strongest possible consequence from the violation of Bell inequalities on a
qualitative probabilistic level (given usual background assumptions). Among the
forbidden theories is a subset of outcome dependent theories showing that
outcome dependence is not sufficient for explaining a violation of Bell
inequalities. Non-local theories which can violate Bell inequalities (among
them quantum theory) are rather characterised by the fact that at least one of
the measurement outcomes in some sense (which is made precise)
probabilistically depends both on its local as well as on its distant
measurement setting ('parameter'). When Bell inequalities are found to be
violated, the true choice is not 'outcome dependence or parameter dependence'
but between two kinds of parameter dependences, one of them being what is
usually called 'parameter dependence'. Against the received view established by
Jarrett and Shimony that on a probabilistic level quantum non-locality amounts
to outcome dependence, this result confirms and makes precise Maudlin's claim
that some kind of parameter dependence is required.
Comment: forthcoming in: Studies in the History and Philosophy of Modern
Physics