18 research outputs found

    A sparse Ising model with covariates

    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/109784/1/biom12202.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/109784/2/biom12202-sm-0001-SupData-S1.pd

    Sparse Nonparametric Graphical Models

    We present some nonparametric methods for graphical modeling. In the discrete case, where the data are binary or drawn from a finite alphabet, Markov random fields are already essentially nonparametric, since the cliques can take only a finite number of values. Continuous data are different. The Gaussian graphical model is the standard parametric model for continuous data, but it makes distributional assumptions that are often unrealistic. We discuss two approaches to building more flexible graphical models. One allows arbitrary graphs and a nonparametric extension of the Gaussian; the other uses kernel density estimation and restricts the graphs to trees and forests. Examples of both methods are presented. We also discuss possible future research directions for nonparametric graphical modeling.
    Comment: Published at http://dx.doi.org/10.1214/12-STS391 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
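As a small illustration of the Gaussian graphical model mentioned in this abstract, the sketch below uses the standard fact that, for a multivariate Gaussian, a zero entry in the precision (inverse covariance) matrix corresponds to a missing edge. The chain graph, the matrix values, and the helper name `edges_from_precision` are assumptions made up for the example; none of them come from the paper.

```python
import numpy as np

def edges_from_precision(precision, tol=1e-8):
    """Return the undirected edges (i, j), i < j, whose precision entry is nonzero."""
    p = precision.shape[0]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(precision[i, j]) > tol]

# Hypothetical chain graph 0 - 1 - 2 - 3, encoded as a tridiagonal precision matrix.
precision = np.array([
    [2.0, -1.0, 0.0, 0.0],
    [-1.0, 2.0, -1.0, 0.0],
    [0.0, -1.0, 2.0, -1.0],
    [0.0, 0.0, -1.0, 2.0],
])

covariance = np.linalg.inv(precision)   # dense: every pair of variables is correlated
recovered = np.linalg.inv(covariance)   # inverting back exposes the sparse structure
chain_edges = edges_from_precision(recovered)
print(chain_edges)
```

The covariance matrix here has no zero entries, yet the graph has only three edges. That gap between marginal correlation and conditional dependence is why graph estimation targets the precision matrix.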

    TIGER: A Tuning-Insensitive Approach for Optimally Estimating Gaussian Graphical Models

    We propose a new procedure for estimating high dimensional Gaussian graphical models. Our approach is asymptotically tuning-free and non-asymptotically tuning-insensitive: it requires very little effort to choose the tuning parameter in finite sample settings. Computationally, our procedure is significantly faster than existing methods due to its tuning-insensitive property. Theoretically, the obtained estimator is simultaneously minimax optimal for precision matrix estimation under different norms. Empirically, we illustrate the advantages of our method using thorough simulated and real examples. The R package bigmatrix implementing the proposed methods is available on the Comprehensive R Archive Network: http://cran.r-project.org/

    Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data

    In a traditional Gaussian graphical model, data homogeneity is routinely assumed with no extra variables affecting the conditional independence. In modern genomic datasets, there is an abundance of auxiliary information, which often gets under-utilized in determining the joint dependency structure. In this article, we consider a Bayesian approach to model undirected graphs underlying heterogeneous multivariate observations with additional assistance from covariates. Building on product partition models, we propose a novel covariate-dependent Gaussian graphical model that allows graphs to vary with covariates so that observations whose covariates are similar share a similar undirected graph. To efficiently embed Gaussian graphical models into our proposed framework, we explore both Gaussian likelihood and pseudo-likelihood functions. For Gaussian likelihood, a G-Wishart distribution is used as a natural conjugate prior, and for the pseudo-likelihood, a product of Gaussian-conditionals is used. Moreover, the proposed model has large prior support and is flexible enough to approximate any ν-Hölder conditional variance-covariance matrices with ν ∈ (0,1]. We further show that based on the theory of fractional likelihood, the rate of posterior contraction is minimax optimal assuming the true density to be a Gaussian mixture with a known number of components. The efficacy of the approach is demonstrated via simulation studies and an analysis of a protein network for a breast cancer dataset assisted by mRNA gene expression as covariates.
    Comment: 58 pages, 12 figures, accepted by Journal of the American Statistical Association

    Network Regression with Graph Laplacians

    Network data are increasingly available in various research fields, motivating statistical analysis for populations of networks where a network as a whole is viewed as a data point. Due to the non-Euclidean nature of networks, basic statistical tools available for scalar and vector data are no longer applicable when one aims to relate networks as outcomes to Euclidean covariates, while the study of how a network changes in dependence on covariates is often of paramount interest. This motivates extending the notion of regression to the case of responses that are network data. Here we propose to adopt conditional Fréchet means implemented with both global least squares regression and local weighted least squares smoothing, extending the Fréchet regression concept to networks that are quantified by their graph Laplacians. The challenge is to characterize the space of graph Laplacians so as to justify the application of Fréchet regression. This characterization then leads to asymptotic rates of convergence for the corresponding M-estimators by applying empirical process methods. We demonstrate the usefulness and good practical performance of the proposed framework with simulations and with network data arising from resting-state fMRI in neuroimaging, as well as New York taxi records.
    Comment: 41 pages, 13 figures
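To make the idea of regressing graph Laplacians on a covariate concrete, here is a minimal sketch, not the paper's estimator. It assumes a scalar covariate and the Frobenius distance between Laplacians, and it uses the usual global least-squares weights s_i(x) = 1 + (X_i - mean)(x - mean)/var. It computes only the weighted mean of the sample Laplacians. A complete fit would also project that mean onto the space of valid graph Laplacians, a step omitted here. All function names and the toy three-node networks are hypothetical.

```python
import numpy as np

def graph_laplacian(adjacency):
    """L = D - A for a symmetric, nonnegative weighted adjacency matrix."""
    adjacency = np.asarray(adjacency, dtype=float)
    return np.diag(adjacency.sum(axis=1)) - adjacency

def global_weights(x_samples, x_new):
    """Least-squares weights s_i(x) for a scalar covariate (population variance)."""
    x_samples = np.asarray(x_samples, dtype=float)
    centered = x_samples - x_samples.mean()
    return 1.0 + centered * (x_new - x_samples.mean()) / np.mean(centered ** 2)

def weighted_mean_laplacian(laplacians, x_samples, x_new):
    """Weighted average of Laplacians; the projection step is deliberately left out."""
    weights = global_weights(x_samples, x_new)
    return np.tensordot(weights, np.asarray(laplacians), axes=1) / len(x_samples)

def toy_adjacency(x):
    """Hypothetical 3-node network whose 0-1 edge weight equals the covariate x."""
    return np.array([[0.0, x, 1.0], [x, 0.0, 1.0], [1.0, 1.0, 0.0]])

x_samples = [1.0, 2.0, 3.0, 4.0]
laplacians = [graph_laplacian(toy_adjacency(x)) for x in x_samples]
fit_at_5 = weighted_mean_laplacian(laplacians, x_samples, 5.0)
print(fit_at_5)
```

Because the toy Laplacians are exactly linear in the covariate, the weighted mean at x = 5 reproduces the Laplacian of the network with edge weight 5. With real data the weights can be negative, so the weighted mean may leave the space of Laplacians, which is why the projection matters.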

    Extension of generalized additive models to graph-valued data with application to Covid-19 impacts on European air transportation

    The development of statistics for objects with geometric structure enables researchers to account appropriately for the space of complex data in their analysis. Graph-valued data with unlabeled vertices can be represented mathematically in a non-Euclidean, metric space. Utilizing this representation, an extension of a linear regression framework to an additive as well as generalized additive regression framework for graph-valued data as response is developed. Potential interpretations for and the form of the respective regression functions are studied. Moreover, the generalized additive framework is applied to an air passenger network of parts of the European Union during the Covid-19 pandemic.