Positivity for Gaussian graphical models
Gaussian graphical models are parametric statistical models for jointly
normal random variables whose dependence structure is determined by a graph. In
previous work, we introduced trek separation, which gives a necessary and
sufficient condition in terms of the graph for when a subdeterminant is zero
for all covariance matrices that belong to the Gaussian graphical model. Here
we extend this result to give explicit cancellation-free formulas for the
expansions of nonzero subdeterminants. Comment: 16 pages, 3 figures
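To make the determinantal statement concrete, here is a small sketch (the path graph, edge weights, and use of sympy are my own choices, not taken from the paper): trek separation predicts that the subdeterminant of the covariance matrix with rows {1,2} and columns {2,3} vanishes identically for the linear Gaussian model on the path 1 -> 2 -> 3, because node 2 t-separates the two sets.

```python
# Minimal sketch (my own toy example, not the paper's): covariance matrix of a
# linear Gaussian model on the directed path 1 -> 2 -> 3, and the vanishing of
# the subdeterminant with rows {1,2} and columns {2,3}.
import sympy as sp

a, b, w1, w2, w3 = sp.symbols("a b w1 w2 w3", positive=True)

# X1 = e1, X2 = a*X1 + e2, X3 = b*X2 + e3; M maps (e1, e2, e3) to (X1, X2, X3).
M = sp.Matrix([[1,     0, 0],
               [a,     1, 0],
               [a * b, b, 1]])
Sigma = M * sp.diag(w1, w2, w3) * M.T

# Every trek between {1, 2} and {2, 3} passes through node 2, so trek
# separation predicts this 2x2 subdeterminant is zero for all parameter values.
sub = Sigma.extract([0, 1], [1, 2])   # rows {1,2}, columns {2,3} (0-indexed)
print(sp.simplify(sub.det()))         # prints 0
```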
Total positivity in exponential families with application to binary variables
We study exponential families of distributions that are multivariate totally
positive of order 2 (MTP2), show that these are convex exponential families,
and derive conditions for existence of the MLE. Quadratic exponential families
of MTP2 distributions contain attractive Gaussian graphical models and
ferromagnetic Ising models as special examples. We show that these are defined
by intersecting the space of canonical parameters with a polyhedral cone whose
faces correspond to conditional independence relations. Hence MTP2 serves as an
implicit regularizer for quadratic exponential families and leads to sparsity
in the estimated graphical model. We prove that the maximum likelihood
estimator (MLE) in an MTP2 binary exponential family exists if and only if both
of the sign patterns (+,-) and (-,+) are represented in the sample for
every pair of variables; in particular, this implies that the MLE may exist
with as few observations as there are variables, in stark contrast to
unrestricted binary exponential families, where the number of observations
required grows exponentially in the number of variables. Finally, we provide a novel and
globally convergent algorithm for computing the MLE for MTP2 Ising models
similar to iterative proportional scaling and apply it to the analysis of data
from two psychological disorders.
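For readers unfamiliar with MTP2, the following sketch (my own toy example, not taken from the paper) checks the defining inequality p(x)p(y) <= p(x ∧ y)p(x ∨ y) by brute force for a small binary distribution and confirms that a two-variable ferromagnetic Ising model satisfies it.

```python
# Minimal sketch (my own illustration of the MTP2 definition): a distribution
# p on {0,1}^d is MTP2 if  p(x) * p(y) <= p(x ∧ y) * p(x ∨ y)  for all x, y,
# where ∧ / ∨ are the coordinatewise minimum / maximum.
import itertools

import numpy as np


def is_mtp2(p, d, tol=1e-12):
    """p maps each binary tuple of length d to a probability."""
    states = list(itertools.product([0, 1], repeat=d))
    for x in states:
        for y in states:
            lo = tuple(min(a, b) for a, b in zip(x, y))   # x ∧ y
            hi = tuple(max(a, b) for a, b in zip(x, y))   # x ∨ y
            if p[x] * p[y] > p[lo] * p[hi] + tol:
                return False
    return True


# A ferromagnetic Ising model on two spins (positive interaction) is MTP2.
theta = 0.8
unnorm = {s: np.exp(theta * (2 * s[0] - 1) * (2 * s[1] - 1))
          for s in itertools.product([0, 1], repeat=2)}
Z = sum(unnorm.values())
p = {s: v / Z for s, v in unnorm.items()}
print(is_mtp2(p, 2))   # True
```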
On the Geometry of Message Passing Algorithms for Gaussian Reciprocal Processes
Reciprocal processes are acausal generalizations of Markov processes
introduced by Bernstein in 1932. In the literature, a significant amount of
attention has been focused on developing dynamical models for reciprocal
processes. Recently, probabilistic graphical models for reciprocal processes
have been provided. This opens the way to the application of efficient
inference algorithms in the machine learning literature to solve the smoothing
problem for reciprocal processes. Such algorithms are known to converge if the
underlying graph is a tree. This is not the case for a reciprocal process,
whose associated graphical model is a single loop network. The contribution of
this paper is twofold. First, we introduce belief propagation for Gaussian
reciprocal processes. Second, we establish a link between convergence analysis
of belief propagation for Gaussian reciprocal processes and stability theory
for differentially positive systems. Comment: 15 pages; Typos corrected; This paper introduces belief propagation
for Gaussian reciprocal processes and extends the convergence analysis in
arXiv:1603.04419 to the Gaussian case.
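As a rough illustration of the message-passing machinery involved (the four-node cycle, parameter values, and update schedule below are my own choices, not the paper's construction), here is standard Gaussian belief propagation in information form on a single-loop pairwise model; at convergence the recovered means match the exact posterior means, a known property of Gaussian BP.

```python
# Minimal sketch (my own toy example): Gaussian belief propagation in
# information form on a single-loop pairwise model p(x) ∝ exp(-0.5 x'Ax + b'x),
# the graph topology associated with reciprocal processes in the abstract.
import numpy as np

# Diagonally dominant precision matrix on the 4-cycle 0-1-2-3-0.
A = np.array([[2.0, 0.5, 0.0, 0.5],
              [0.5, 2.0, 0.5, 0.0],
              [0.0, 0.5, 2.0, 0.5],
              [0.5, 0.0, 0.5, 2.0]])
b = np.array([1.0, 0.0, -1.0, 0.5])
n = len(b)
neighbors = [[j for j in range(n) if j != i and A[i, j] != 0.0] for i in range(n)]

# Message i -> j: m_ij(x_j) ∝ exp(-0.5 * Lam[i,j] * x_j**2 + eta[i,j] * x_j)
Lam = np.zeros((n, n))
eta = np.zeros((n, n))

for _ in range(100):
    for i in range(n):
        for j in neighbors[i]:
            # Collect everything node i knows except what came from j.
            alpha = A[i, i] + sum(Lam[k, i] for k in neighbors[i] if k != j)
            beta = b[i] + sum(eta[k, i] for k in neighbors[i] if k != j)
            # Integrate out x_i against the edge potential exp(-A[i,j]*x_i*x_j).
            Lam[i, j] = -A[i, j] ** 2 / alpha
            eta[i, j] = -A[i, j] * beta / alpha

# Posterior means from the converged messages; for a Gaussian model these
# coincide with the exact means A^{-1} b even though the graph has a loop.
means = np.array([(b[i] + sum(eta[k, i] for k in neighbors[i])) /
                  (A[i, i] + sum(Lam[k, i] for k in neighbors[i]))
                  for i in range(n)])
print(np.allclose(means, np.linalg.solve(A, b)))   # True
```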
A Graphical Model Formulation of Collaborative Filtering Neighbourhood Methods with Fast Maximum Entropy Training
Item neighbourhood methods for collaborative filtering learn a weighted graph
over the set of items, where each item is connected to those it is most similar
to. The prediction of a user's rating on an item is then given by the user's ratings
of neighbouring items, weighted by their similarity. This paper presents a new
neighbourhood approach which we call item fields, whereby an undirected
graphical model is formed over the item graph. The resulting prediction rule is
a simple generalization of the classical approaches, which takes into account
non-local information in the graph, allowing its best results to be obtained
when using drastically fewer edges than other neighbourhood approaches. A fast
approximate maximum entropy training method based on the Bethe approximation is
presented, which uses a simple gradient ascent procedure. When using
precomputed sufficient statistics on the Movielens datasets, our method is
faster than maximum likelihood approaches by two orders of magnitude. Comment: ICML 2012
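For context, the classical item-neighbourhood prediction rule that the item-fields model is said to generalize looks roughly like the following sketch (the cosine similarity, the variable names, and the top-k truncation are my own illustrative choices, not the paper's method).

```python
# Minimal sketch of the classical item-based neighbourhood prediction rule
# (my own illustration, not the paper's item-fields model).
import numpy as np


def predict(ratings, user, item, k=20):
    """Predict ratings[user, item] from the user's ratings on similar items.

    ratings: (n_users, n_items) array, 0 marks "unrated"; the target item is
    assumed unrated by this user."""
    rated = np.flatnonzero(ratings[user] > 0)
    target = ratings[:, item]
    # Cosine similarity between the target item's column and each rated item's column.
    sims = np.array([
        ratings[:, j] @ target /
        (np.linalg.norm(ratings[:, j]) * np.linalg.norm(target) + 1e-12)
        for j in rated
    ])
    order = np.argsort(-sims)
    top, w = rated[order][:k], sims[order][:k]   # k most similar rated items
    return float(w @ ratings[user, top] / (np.abs(w).sum() + 1e-12))
```

The item-fields approach, as described in the abstract, replaces this purely local weighted average with inference in an undirected graphical model over the item graph, which is how non-local information enters the prediction.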
The correlation space of Gaussian latent tree models and model selection without fitting
We provide a complete description of possible covariance matrices consistent
with a Gaussian latent tree model for any tree. We then present techniques for
utilising these constraints to assess whether observed data is compatible with
that Gaussian latent tree model. Our method does not require us first to fit
such a tree. We demonstrate the usefulness of the inverse-Wishart distribution
for performing preliminary assessments of tree-compatibility using
semialgebraic constraints. Using results from Drton et al. (2008) we then
provide the appropriate moments required for test statistics for assessing
adherence to these equality constraints. These are shown to be effective even
for small sample sizes and can be easily adjusted to test either the entire
model or only certain macrostructures hypothesized within the tree. We
illustrate our exploratory tetrad analysis using a linguistic application and
our confirmatory tetrad analysis using a biological application. Comment: 15 pages
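To illustrate the kind of equality constraints the tetrad analysis targets (my own single-factor example, not the paper's test statistics): when four observed variables load on one latent variable, every off-diagonal covariance factorizes as sigma_ij = lam_i * lam_j, so tetrad differences such as sigma_12*sigma_34 - sigma_13*sigma_24 vanish in the population and should be near zero in a sample.

```python
# Minimal sketch (my own illustration of tetrad constraints): data from a
# one-factor latent tree, and the resulting near-zero tetrad differences.
import numpy as np

rng = np.random.default_rng(0)
lam = np.array([0.9, 0.7, 0.8, 0.6])       # loadings on a single latent variable
n = 100_000
z = rng.standard_normal(n)                 # latent variable
x = np.outer(z, lam) + 0.5 * rng.standard_normal((n, 4))   # observed leaves

S = np.cov(x, rowvar=False)
tetrads = [S[0, 1] * S[2, 3] - S[0, 2] * S[1, 3],
           S[0, 1] * S[2, 3] - S[0, 3] * S[1, 2]]
print(tetrads)   # both approximately 0, up to sampling noise
```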
Lower Bounds for Two-Sample Structural Change Detection in Ising and Gaussian Models
The change detection problem is to determine if the Markov network structures
of two Markov random fields differ from one another given two sets of samples
drawn from the respective underlying distributions. We study the trade-off
between the sample sizes and the reliability of change detection, measured as a
minimax risk, for the important cases of the Ising models and the Gaussian
Markov random fields restricted to the models which have network structures
with nodes and degree at most , and obtain information-theoretic lower
bounds for reliable change detection over these models. We show that for the
Ising model, samples are
required from each dataset to detect even the sparsest possible changes, and
that for the Gaussian, samples are
required from each dataset to detect change, where is the smallest
ratio of off-diagonal to diagonal terms in the precision matrices of the
distributions. These bounds are compared to the corresponding results in
structure learning, and closely match them under mild conditions on the model
parameters. Thus, our change detection bounds inherit partial tightness from
the structure learning schemes in previous literature, demonstrating that in
certain parameter regimes, the naive structure learning based approach to
change detection is minimax optimal up to constant factors. Comment: Presented at the 55th Annual Allerton Conference on Communication,
Control, and Computing, Oct. 2017
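The "naive structure learning based approach" mentioned at the end can be sketched as follows (my own illustration; scikit-learn's graphical lasso is just one possible structure estimator, and the alpha and threshold values are arbitrary): estimate each Gaussian Markov random field's edge set separately and declare a change whenever the two edge sets differ.

```python
# Minimal sketch (my own illustration) of change detection via separate
# structure estimation for the two sample sets.
import numpy as np
from sklearn.covariance import GraphicalLasso


def estimated_edges(X, alpha=0.1, tol=1e-3):
    """Edge set of a Gaussian MRF estimated from samples X of shape (n_samples, p)."""
    prec = GraphicalLasso(alpha=alpha).fit(X).precision_
    p = prec.shape[0]
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(prec[i, j]) > tol}


def detect_change(X1, X2):
    """Declare a structural change if the two estimated edge sets differ."""
    return estimated_edges(X1) != estimated_edges(X2)
```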