Binary Models for Marginal Independence
Log-linear models are a classical tool for the analysis of contingency
tables. In particular, the subclass of graphical log-linear models provides a
general framework for modelling conditional independences. However, with the
exception of special structures, marginal independence hypotheses cannot be
accommodated by these traditional models. Focusing on binary variables, we
present a model class that provides a framework for modelling marginal
independences in contingency tables. The approach taken is graphical and draws
on analogies to multivariate Gaussian models for marginal independence. For the
graphical model representation we use bi-directed graphs, which are in the
tradition of path diagrams. We show how the models can be parameterized in a
simple fashion, and how maximum likelihood estimation can be performed using a
version of the Iterated Conditional Fitting algorithm. Finally, we consider
combining these models with symmetry restrictions.
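To make the distinction between marginal and conditional independence concrete, here is a minimal sketch in Python. It is not the paper's parameterization or fitting algorithm. It builds a hypothetical 2x2x2 table (all probabilities are invented for illustration) in which X1 and X3 are marginally independent, as a bi-directed graph 1 <-> 2 <-> 3 would require, yet become dependent once X2 is conditioned on.

```python
import itertools
import numpy as np

# Hypothetical joint table, indexed as joint[x1, x2, x3].
# X1 and X3 are independent fair coins; X2 depends on both of them.
joint = np.zeros((2, 2, 2))
for x1, x2, x3 in itertools.product([0, 1], repeat=3):
    p_x2_is_1 = 0.1 + 0.4 * x1 + 0.4 * x3  # assumed P(X2 = 1 | x1, x3)
    p_x2 = p_x2_is_1 if x2 == 1 else 1.0 - p_x2_is_1
    joint[x1, x2, x3] = 0.5 * 0.5 * p_x2


def factorizes(table):
    """Return True if a 2x2 table equals the product of its margins."""
    table = table / table.sum()  # normalise, so slices can be passed in
    outer = np.outer(table.sum(axis=1), table.sum(axis=0))
    return bool(np.allclose(table, outer, rtol=0.0, atol=1e-12))


# Marginal table of (X1, X3): sum X2 out.
marginally_independent = factorizes(joint.sum(axis=1))
# Table of (X1, X3) within the slice X2 = 1.
conditionally_independent = factorizes(joint[:, 1, :])

print("X1 independent of X3 marginally:", marginally_independent)
print("X1 independent of X3 given X2 = 1:", conditionally_independent)
```

The table satisfies the marginal constraint without satisfying the corresponding conditional one, which is the kind of hypothesis the abstract says graphical log-linear models generally cannot express.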
Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses
We investigate the relationship between the structure of a discrete graphical
model and the support of the inverse of a generalized covariance matrix. We
show that for certain graph structures, the support of the inverse covariance
matrix of indicator variables on the vertices of a graph reflects the
conditional independence structure of the graph. Our work extends results that
have previously been established only in the context of multivariate Gaussian
graphical models, thereby addressing an open question about the significance of
the inverse covariance matrix of a non-Gaussian distribution. The proof
exploits a combination of ideas from the geometry of exponential families,
junction tree theory and convex analysis. These population-level results have
various consequences for graph selection methods, both known and novel,
including a novel method for structure estimation for missing or corrupted
observations. We provide nonasymptotic guarantees for such methods and
illustrate the sharpness of these predictions via simulations.
Comment: Published at http://dx.doi.org/10.1214/13-AOS1162 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
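As a small illustration of the population-level claim, the sketch below assumes the simplest tree-structured case: a hypothetical binary Markov chain X1 - X2 - X3 with made-up transition probabilities. It computes the exact covariance matrix of the three indicator variables and inverts it. The (1, 3) entry of the inverse vanishes even though X1 and X3 are marginally correlated. This is a toy check, not the estimation method analysed in the paper.

```python
import itertools
import numpy as np

# Hypothetical chain parameters (assumptions for illustration only).
p_x1 = 0.3                          # P(X1 = 1)
p_x2_given_x1 = {0: 0.2, 1: 0.7}    # P(X2 = 1 | X1)
p_x3_given_x2 = {0: 0.4, 1: 0.9}    # P(X3 = 1 | X2)


def bern(p, x):
    """Probability that a Bernoulli(p) variable takes the value x."""
    return p if x == 1 else 1.0 - p


# Enumerate all 8 states and their exact probabilities.
states = list(itertools.product([0, 1], repeat=3))
probs = np.array([
    bern(p_x1, x1) * bern(p_x2_given_x1[x1], x2) * bern(p_x3_given_x2[x2], x3)
    for x1, x2, x3 in states
])

X = np.array(states, dtype=float)
mean = probs @ X
centered = X - mean
cov = centered.T @ (centered * probs[:, None])  # exact population covariance
precision = np.linalg.inv(cov)

np.set_printoptions(precision=4, suppress=True)
print("covariance:\n", cov)
print("inverse covariance:\n", precision)
```

The zero pattern of `precision` matches the chain: entries (1, 2) and (2, 3) are nonzero, while (1, 3) is zero up to floating-point error.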
Marginal log-linear parameters for graphical Markov models
Marginal log-linear (MLL) models provide a flexible approach to multivariate
discrete data. MLL parametrizations under linear constraints induce a wide
variety of models, including models defined by conditional independences. We
introduce a sub-class of MLL models which correspond to Acyclic Directed Mixed
Graphs (ADMGs) under the usual global Markov property. We characterize for
precisely which graphs the resulting parametrization is variation independent.
The MLL approach provides the first description of ADMG models in terms of a
minimal list of constraints. The parametrization is also easily adapted to
sparse modelling techniques, which we illustrate using several examples of real
data.
Comment: 36 pages
Chain graph models of multivariate regression type for categorical data
We discuss a class of chain graph models for categorical variables defined by
what we call a multivariate regression chain graph Markov property. First, the
set of local independencies of these models is shown to be Markov equivalent to
those of a chain graph model recently defined in the literature. Next we
provide a parametrization based on a sequence of generalized linear models with
a multivariate logistic link function that captures all independence
constraints in any chain graph model of this kind.
Comment: Published at http://dx.doi.org/10.3150/10-BEJ300 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
Multivariate Bernoulli distribution
In this paper, we consider the multivariate Bernoulli distribution as a model
to estimate the structure of graphs with binary nodes. This distribution is
discussed in the framework of the exponential family, and its statistical
properties regarding independence of the nodes are demonstrated. Importantly,
the model can estimate not only the main effects and pairwise interactions
among the nodes but also higher-order interactions,
allowing for the existence of complex clique effects. We compare the
multivariate Bernoulli model with existing graphical inference models, namely the
Ising model and the multivariate Gaussian model, where only the pairwise
interactions are considered. On the other hand, the multivariate Bernoulli
distribution has an interesting property in that independence and
uncorrelatedness of the component random variables are equivalent. Both the
marginal and conditional distributions of a subset of variables in the
multivariate Bernoulli distribution still follow the multivariate Bernoulli
distribution. Furthermore, the multivariate Bernoulli logistic model is
developed under generalized linear model theory by utilizing the canonical link
function in order to include covariate information on the nodes, edges and
cliques. We also consider variable selection techniques such as LASSO in the
logistic model to impose sparsity structure on the graph. Finally, we discuss
extending the smoothing spline ANOVA approach to the multivariate Bernoulli
logistic model to enable estimation of non-linear effects of the predictor
variables.
Comment: Published at http://dx.doi.org/10.3150/12-BEJSP10 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
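The equivalence of uncorrelatedness and independence can be checked directly in the bivariate case. The sketch below uses two hypothetical 2x2 joint tables (the numbers are invented): for a pair of binary variables, Cov(X, Y) = p11 - P(X = 1) P(Y = 1), and this single quantity is zero exactly when every cell factorizes into the product of its margins. It illustrates the pairwise statement only, not the paper's general result.

```python
import numpy as np


def covariance(table):
    """Cov(X, Y) for a 2x2 joint table with table[x, y] = P(X = x, Y = y)."""
    p_x1 = table[1, :].sum()
    p_y1 = table[:, 1].sum()
    return table[1, 1] - p_x1 * p_y1


def is_independent(table):
    """Return True if every cell equals the product of its margins."""
    outer = np.outer(table.sum(axis=1), table.sum(axis=0))
    return bool(np.allclose(table, outer, rtol=0.0, atol=1e-12))


# Hypothetical tables: one built as a product of margins, one not.
independent_table = np.outer([0.4, 0.6], [0.3, 0.7])
dependent_table = np.array([[0.4, 0.1],
                            [0.1, 0.4]])

for name, table in [("independent", independent_table),
                    ("dependent", dependent_table)]:
    print(name, "cov =", round(covariance(table), 6),
          "independent =", is_independent(table))
```

For variables with more than two levels the same check would fail in general, since zero covariance no longer pins down every cell of the table.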
Star graphs induce tetrad correlations: for Gaussian as well as for binary variables
Tetrad correlations were obtained historically for Gaussian distributions
when tasks are designed to measure an ability or attitude so that a single
unobserved variable may generate the observed, linearly increasing dependences
among the tasks. We connect such generating processes to a particular type of
directed graph, the star graph, and to the notion of traceable regressions.
Tetrad correlation conditions for the existence of a single latent variable are
derived. These are needed for positive dependences not only in joint Gaussian
but also in joint binary distributions. Three applications with binary items
are given.
Comment: 21 pages, 2 figures, 5 tables
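The classical Gaussian tetrad condition can be sketched numerically. The example assumes a standardized one-factor model in which item i has a hypothetical loading l_i on a single latent variable, so that the correlation between items i and j is l_i * l_j. Under that assumption, all three tetrad differences among four items vanish. The loadings are invented, and the sketch does not reproduce the paper's derivation for binary variables.

```python
import numpy as np

# Hypothetical loadings of four items on one latent variable.
loadings = np.array([0.9, 0.8, 0.6, 0.5])

# Implied correlation matrix: r_ij = l_i * l_j off the diagonal, 1 on it.
corr = np.outer(loadings, loadings)
np.fill_diagonal(corr, 1.0)


def tetrad_differences(r):
    """Return the three tetrad differences for items 0..3 of a correlation matrix."""
    return np.array([
        r[0, 1] * r[2, 3] - r[0, 2] * r[1, 3],
        r[0, 1] * r[2, 3] - r[0, 3] * r[1, 2],
        r[0, 2] * r[1, 3] - r[0, 3] * r[1, 2],
    ])


tetrads = tetrad_differences(corr)
print("tetrad differences:", tetrads)
```

Each product of correlations equals l_1 l_2 l_3 l_4 regardless of how the four items are paired, which is why the differences cancel.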