Generalized Pseudolikelihood Methods for Inverse Covariance Estimation
We introduce PseudoNet, a new pseudolikelihood-based estimator of the inverse
covariance matrix that has a number of useful statistical and computational
properties. We show, through detailed experiments on synthetic data as well as
real-world finance and wind power data, that PseudoNet outperforms related
methods in terms of estimation error and support recovery, making it
well-suited for downstream applications where low estimation error is
important. We also show, under regularity conditions, that
PseudoNet is consistent. Our proof assumes the existence of accurate estimates
of the diagonal entries of the underlying inverse covariance matrix; we
additionally provide a two-step method to obtain these estimates, even in a
high-dimensional setting, going beyond the proofs for related methods. Unlike
other pseudolikelihood-based methods, we also show that PseudoNet does not
saturate, i.e., in high dimensions, there is no hard limit on the number of
nonzero entries in the PseudoNet estimate. We present a fast algorithm as well
as screening rules that make computing the PseudoNet estimate over a range of
tuning parameters tractable.
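As an illustration of the kind of objective such estimators minimize, the sketch below fits a CONCORD-style regularized pseudolikelihood by proximal gradient descent. The function name, step-size rule, and iteration budget are illustrative assumptions, not the authors' PseudoNet algorithm (which additionally provides screening rules and a diagonal estimation step).

```python
import numpy as np

def soft_threshold(a, t):
    """Elementwise l1 proximal operator."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def concord_pseudolikelihood(X, lam=0.1, step=None, n_iter=500):
    """Sparse inverse covariance estimate from a CONCORD-style regularized
    pseudolikelihood (illustrative sketch, not the PseudoNet algorithm):
        f(O) = -sum_i log O_ii + 0.5 * tr(O S O) + lam * ||offdiag(O)||_1
    minimized by proximal gradient descent."""
    n, p = X.shape
    S = X.T @ X / n                        # sample covariance
    if step is None:
        # conservative fixed step size based on the largest eigenvalue of S
        step = 1.0 / (np.linalg.eigvalsh(S)[-1] + 1.0)
    O = np.eye(p)
    for _ in range(n_iter):
        grad = S @ O                       # gradient of the quadratic term
        grad = 0.5 * (grad + grad.T)       # keep the iterate symmetric
        grad -= np.diag(1.0 / np.diag(O))  # gradient of -sum_i log O_ii
        O_new = O - step * grad
        O = soft_threshold(O_new, step * lam)       # shrink off-diagonals
        np.fill_diagonal(O, np.maximum(np.diag(O_new), 1e-6))  # keep O_ii > 0
    return O
```

Larger values of `lam` drive more off-diagonal entries of the estimate exactly to zero, which is the mechanism behind support recovery.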
Bayesian model selection for exponential random graph models via adjusted pseudolikelihoods
Models with intractable likelihood functions arise in areas including network
analysis and spatial statistics, especially those involving Gibbs random
fields. Posterior parameter estimation in these settings is termed a
doubly-intractable problem because both the likelihood function and the
posterior distribution are intractable. The comparison of Bayesian models is
often based on the statistical evidence, the integral of the un-normalised
posterior distribution over the model parameters, which is rarely available in
closed form. For doubly-intractable models, estimating the evidence adds
another layer of difficulty. Consequently, the selection of the model that best
describes an observed network among a collection of exponential random graph
models for network analysis is a daunting task. Pseudolikelihoods offer a
tractable approximation to the likelihood but should be treated with caution
because they can lead to unreasonable inference. This paper presents a
method to adjust pseudolikelihoods in order to obtain a reasonable, yet
tractable, approximation to the likelihood. This allows implementation of
widely used computational methods for evidence estimation and pursuit of
Bayesian model selection of exponential random graph models for the analysis of
social networks. Empirical comparisons to existing methods show that our
procedure yields similar evidence estimates, but at a lower computational cost.
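For concreteness, the sketch below computes the plain, unadjusted maximum pseudolikelihood estimate for a toy edge-plus-triangle ERGM: a logistic regression of each dyad indicator on its change statistics. This is exactly the approximation the paper cautions about; the adjustment procedure it proposes is not shown, and all names here are illustrative.

```python
import numpy as np

def ergm_change_stats(A, i, j):
    """Change statistics for toggling edge (i, j) of an undirected graph:
    the edge count changes by 1, and the triangle count changes by the
    number of common neighbours of i and j (toy edge+triangle ERGM)."""
    tri = np.sum(A[i] * A[j])
    return np.array([1.0, tri])

def mple(A, lr=0.05, n_iter=2000):
    """Maximum pseudolikelihood estimate for the toy ERGM: logistic
    regression of each dyad on its change statistics, fit by gradient
    ascent on the logistic log-likelihood (illustrative sketch)."""
    n = A.shape[0]
    X, y = [], []
    for i in range(n):
        for j in range(i + 1, n):
            X.append(ergm_change_stats(A, i, j))
            y.append(A[i, j])
    X, y = np.array(X), np.array(y)
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ theta))  # dyad probabilities
        theta += lr * X.T @ (y - p) / len(y)  # log-likelihood gradient
    return theta
```

The MPLE treats dyads as conditionally independent given the rest of the graph, which is why its curvature (and hence any posterior built on it) can be badly calibrated without the kind of adjustment the paper develops.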
Residuals and goodness-of-fit tests for stationary marked Gibbs point processes
The inspection of residuals is a fundamental step in assessing how well a
parametric model fits the data. For spatial point processes, the concept of
residuals was recently proposed by Baddeley et al. (2005) as an
empirical counterpart of the {\it Campbell equilibrium} equation for marked
Gibbs point processes. The present paper focuses on stationary marked Gibbs
point processes and deals with asymptotic properties of residuals for such
processes. In particular, consistency and asymptotic normality are established
for a wide class of residuals, including the classical ones (raw
residuals, inverse residuals, Pearson residuals). Based on these asymptotic
results, we define goodness-of-fit tests with theoretically controlled Type-I
error. One of these tests extends the quadrat counting test widely used to
test the null hypothesis of a homogeneous Poisson point process.
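The classical quadrat counting test mentioned above can be sketched as follows: partition the observation window into k x k quadrats and compare the observed counts to their common mean with a Pearson chi-square statistic. This is a minimal hypothetical implementation (assuming SciPy for the p-value), not the extended test developed in the paper.

```python
import numpy as np
from scipy.stats import chi2

def quadrat_test(points, window=(1.0, 1.0), k=4):
    """Pearson chi-square quadrat counting test of the homogeneous
    Poisson hypothesis: under the null, counts in equal-area quadrats
    share a common mean, and the statistic is approximately
    chi-square with k*k - 1 degrees of freedom."""
    w, h = window
    counts = np.zeros((k, k))
    for x, y in points:
        ix = min(int(x / w * k), k - 1)   # quadrat index, clamped at edge
        iy = min(int(y / h * k), k - 1)
        counts[ix, iy] += 1
    expected = counts.sum() / k**2        # common mean under the null
    stat = np.sum((counts - expected) ** 2) / expected
    df = k**2 - 1
    return stat, chi2.sf(stat, df)        # statistic and p-value
```

A strongly clustered or strongly inhibited pattern inflates the statistic relative to the chi-square reference, which is what the test detects.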
Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Estimation
Across a variety of scientific disciplines, sparse inverse covariance
estimation is a popular tool for capturing the underlying dependency
relationships in multivariate data. Unfortunately, most estimators are not
scalable enough to handle the sizes of modern high-dimensional data sets (often
on the order of terabytes), and assume Gaussian samples. To address these
deficiencies, we introduce HP-CONCORD, a highly scalable optimization method
for estimating a sparse inverse covariance matrix based on a regularized
pseudolikelihood framework, without assuming Gaussianity. Our parallel proximal
gradient method uses a novel communication-avoiding linear algebra algorithm
and runs across a multi-node cluster with up to 1k nodes (24k cores), achieving
parallel scalability on problems with up to ~819 billion parameters (1.28
million dimensions); even on a single node, HP-CONCORD demonstrates
scalability, outperforming a state-of-the-art method. We also use HP-CONCORD to
estimate the underlying dependency structure of the brain from fMRI data, and
use the result to identify functional regions automatically. The results show
good agreement with a clustering from the neuroscience literature.
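A greatly simplified stand-in for that last step, recovering groups of variables from the support of an estimated inverse covariance matrix, might look like the following. The paper applies a proper clustering to the fMRI dimensions; this sketch merely takes connected components of the estimated dependency graph, and the function name is illustrative.

```python
def connected_components(adj):
    """Group variables using the support of a sparse inverse covariance
    estimate: nonzero off-diagonal entries define graph edges, and each
    connected component (found by union-find) is one candidate group."""
    n = len(adj)
    parent = list(range(n))

    def find(a):
        # Path-halving union-find lookup.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j]:
                parent[find(i)] = find(j)   # merge the two components

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

In practice one would threshold small estimated entries before building the adjacency structure, since numerical noise otherwise merges unrelated components.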