A sparse Ising model with covariates
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/109784/1/biom12202.pdf
http://deepblue.lib.umich.edu/bitstream/2027.42/109784/2/biom12202-sm-0001-SupData-S1.pd
Sparse Nonparametric Graphical Models
We present some nonparametric methods for graphical modeling. In the discrete
case, where the data are binary or drawn from a finite alphabet, Markov random
fields are already essentially nonparametric, since the cliques can take only a
finite number of values. Continuous data are different. The Gaussian graphical
model is the standard parametric model for continuous data, but it makes
distributional assumptions that are often unrealistic. We discuss two
approaches to building more flexible graphical models. One allows arbitrary
graphs and a nonparametric extension of the Gaussian; the other uses kernel
density estimation and restricts the graphs to trees and forests. Examples of
both methods are presented. We also discuss possible future research directions
for nonparametric graphical modeling.
Comment: Published at http://dx.doi.org/10.1214/12-STS391 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
TIGER: A Tuning-Insensitive Approach for Optimally Estimating Gaussian Graphical Models
We propose a new procedure for estimating high dimensional Gaussian graphical
models. Our approach is asymptotically tuning-free and non-asymptotically
tuning-insensitive: it requires very little effort to choose the tuning parameter
in finite sample settings. Computationally, our procedure is significantly
faster than existing methods due to its tuning-insensitive property.
Theoretically, the obtained estimator is simultaneously minimax optimal for
precision matrix estimation under different norms. Empirically, we illustrate
the advantages of our method using thorough simulated and real examples. The R
package bigmatrix implementing the proposed methods is available on the
Comprehensive R Archive Network: http://cran.r-project.org/
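The abstract above concerns estimating the precision matrix of a Gaussian graphical model. As background, a zero off-diagonal entry in the precision matrix means the two variables are conditionally independent given the rest, so the graph can be read off the nonzero pattern. The following is a minimal sketch of that correspondence only, not the TIGER estimator; the function names, the tolerance, and the toy matrix are illustrative assumptions.

```python
import numpy as np


def partial_correlations(precision):
    """Convert a precision matrix into a matrix of partial correlations."""
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def edges_from_precision(precision, tol=1e-8):
    """List the edges (i, j), i < j, whose precision entry is nonzero.

    The tolerance `tol` is an assumed threshold for treating an entry as zero.
    """
    p = precision.shape[0]
    return [(i, j)
            for i in range(p)
            for j in range(i + 1, p)
            if abs(precision[i, j]) > tol]


# Toy example (assumed): a chain graph 0 - 1 - 2 has a tridiagonal precision matrix.
theta = np.array([[2.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 2.0]])
sigma = np.linalg.inv(theta)  # implied covariance matrix

edges = edges_from_precision(theta)
pcor = partial_correlations(theta)
```

In this toy example, variables 0 and 2 have a nonzero covariance in `sigma` yet no edge in the graph, because they are independent once variable 1 is conditioned on.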
Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data
In a traditional Gaussian graphical model, data homogeneity is routinely
assumed with no extra variables affecting the conditional independence. In
modern genomic datasets, there is an abundance of auxiliary information, which
often gets under-utilized in determining the joint dependency structure. In
this article, we consider a Bayesian approach to model undirected graphs
underlying heterogeneous multivariate observations with additional assistance
from covariates. Building on product partition models, we propose a novel
covariate-dependent Gaussian graphical model that allows graphs to vary with
covariates so that observations whose covariates are similar share a similar
undirected graph. To efficiently embed Gaussian graphical models into our
proposed framework, we explore both Gaussian likelihood and pseudo-likelihood
functions. For Gaussian likelihood, a G-Wishart distribution is used as a
natural conjugate prior, and for the pseudo-likelihood, a product of
Gaussian-conditionals is used. Moreover, the proposed model has large prior
support and is flexible enough to approximate any -Hölder conditional
variance-covariance matrices with . We further show that based on
the theory of fractional likelihood, the rate of posterior contraction is
minimax optimal assuming the true density to be a Gaussian mixture with a known
number of components. The efficacy of the approach is demonstrated via
simulation studies and an analysis of a protein network for a breast cancer
dataset assisted by mRNA gene expression as covariates.
Comment: 58 pages, 12 figures, accepted by Journal of the American Statistical
Association
Network Regression with Graph Laplacians
Network data are increasingly available in various research fields,
motivating statistical analysis for populations of networks where a network as
a whole is viewed as a data point. Due to the non-Euclidean nature of networks,
basic statistical tools available for scalar and vector data are no longer
applicable when one aims to relate networks as outcomes to Euclidean
covariates, while the study of how a network changes in dependence on
covariates is often of paramount interest. This motivates extending the notion
of regression to the case of responses that are network data. Here we propose
to adopt conditional Fréchet means implemented with both global least
squares regression and local weighted least squares smoothing, extending the
Fréchet regression concept to networks that are quantified by their graph
Laplacians. The challenge is to characterize the space of graph Laplacians so
as to justify the application of Fréchet regression. This characterization
then leads to asymptotic rates of convergence for the corresponding
M-estimators by applying empirical process methods. We demonstrate the
usefulness and good practical performance of the proposed framework with
simulations and with network data arising from resting-state fMRI in
neuroimaging, as well as New York taxi records.
Comment: 41 pages, 13 figures
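To make the idea of representing networks by graph Laplacians concrete, here is a minimal sketch that builds Laplacians from adjacency matrices and computes their unconditional sample Fréchet mean. It assumes the Frobenius metric, under which the Fréchet mean of Laplacians is simply their entrywise average, since that average is again a graph Laplacian. This is not the regression procedure of the abstract above; the function names and toy graphs are illustrative assumptions.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Return L = D - A for a symmetric, non-negative adjacency matrix A."""
    a = np.asarray(adjacency, dtype=float)
    return np.diag(a.sum(axis=1)) - a


def frechet_mean_frobenius(laplacians):
    """Sample Frechet mean under the Frobenius metric (assumed metric).

    The minimizer of the summed squared Frobenius distances is the
    entrywise arithmetic mean of the matrices.
    """
    return np.mean(np.stack(laplacians), axis=0)


def is_graph_laplacian(l, tol=1e-10):
    """Check symmetry, zero row sums, and non-positive off-diagonal entries."""
    off_diagonal = l - np.diag(np.diag(l))
    return (np.allclose(l, l.T, atol=tol)
            and np.allclose(l.sum(axis=1), 0.0, atol=tol)
            and bool(np.all(off_diagonal <= tol)))


# Two toy networks on three nodes (assumed data).
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # edges 0-1 and 1-2
star = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]   # edges 0-1 and 0-2

mean_laplacian = frechet_mean_frobenius(
    [graph_laplacian(path), graph_laplacian(star)]
)
```

The resulting mean is a weighted graph: the edge shared by both networks keeps weight 1, while each edge present in only one network gets weight 0.5.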
Graph-Valued Models for Dimensionality Reduction and Regression
International audience
Extension of generalized additive models to graph-valued data with application to Covid-19 impacts on European air transportation
The development of statistical methods for objects with geometric structure enables researchers to analyze complex data in the space where the data naturally live. Graph-valued data with unlabeled vertices can be represented mathematically in a non-Euclidean metric space. Using this representation, a linear regression framework is extended to additive and generalized additive regression frameworks for graph-valued responses. Possible interpretations and the form of the corresponding regression functions are studied. Finally, the generalized additive framework is applied to an air passenger network covering parts of the European Union during the Covid-19 pandemic.