Graph model selection using maximum likelihood

Author: Bezáková Ivona
Kalai Adam
Santhanam Rahul
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2006

In recent years, there has been a proliferation of theoretical graph models, e.g., preferential attachment and small-world models, motivated by real-world graphs such as the Internet topology. To address the natural question of which model is best for a particular data set, we propose a model selection criterion for graph models. Since each model is in fact a probability distribution over graphs, we suggest using Maximum Likelihood to compare graph models and select their parameters. Interestingly, for the case of graph models, computing likelihoods is a difficult algorithmic task. However, we design and implement MCMC algorithms for computing the maximum likelihood for four popular models: a power-law random graph model, a preferential attachment model, a small-world model, and a uniform random graph model. We hope that this novel use of ML will objectify comparisons between graph models.
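As a rough illustration of the maximum likelihood idea in this abstract, the sketch below evaluates the log-likelihood of an observed graph under the simplest random graph model, G(n, p), where every pair of nodes is joined independently with probability p. This is an assumption chosen for illustration only: it is not necessarily the "uniform random graph model" of the paper, and the function names are hypothetical. For G(n, p) the likelihood has a closed form, so no MCMC is needed, unlike the harder models the abstract mentions.

```python
import math


def num_pairs(n: int) -> int:
    """Number of unordered node pairs in a simple graph on n nodes."""
    return n * (n - 1) // 2


def gnp_mle(n: int, m: int) -> float:
    """Maximum likelihood estimate of p for G(n, p): the observed edge density."""
    pairs = num_pairs(n)
    if pairs == 0:
        raise ValueError("need at least two nodes")
    return m / pairs


def gnp_log_likelihood(n: int, m: int, p: float) -> float:
    """Log-probability of one specific labeled graph with n nodes and m edges.

    Each of the m present edges contributes log(p) and each of the
    absent pairs contributes log(1 - p).
    """
    pairs = num_pairs(n)
    if p <= 0.0:
        # p = 0 can only generate the empty graph.
        return 0.0 if m == 0 else float("-inf")
    if p >= 1.0:
        # p = 1 can only generate the complete graph.
        return 0.0 if m == pairs else float("-inf")
    return m * math.log(p) + (pairs - m) * math.log(1.0 - p)


if __name__ == "__main__":
    n, m = 4, 3                      # hypothetical observed graph
    p_hat = gnp_mle(n, m)
    print(p_hat, gnp_log_likelihood(n, m, p_hat))
```

Comparing models in this spirit would mean computing the maximized log-likelihood of the same observed graph under each candidate model and preferring the larger value.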

CiteSeerX

Edinburgh Research Explorer

High-Dimensional Joint Estimation of Multiple Directed Gaussian Graphical Models

Author: Segarra Santiago
Uhler Caroline
Wang Yuhao
Publication date: 27/06/2020

We consider the problem of jointly estimating multiple related directed acyclic graph (DAG) models based on high-dimensional data from each graph. This problem is motivated by the task of learning gene regulatory networks based on gene expression data from different tissues, developmental stages or disease states. We prove that under certain regularity conditions, the proposed ℓ0-penalized maximum likelihood estimator converges in Frobenius norm to the adjacency matrices consistent with the data-generating distributions and has the correct sparsity. In particular, we show that this joint estimation procedure leads to a faster convergence rate than estimating each DAG model separately. As a corollary, we also obtain high-dimensional consistency results for causal inference from a mix of observational and interventional data. For practical purposes, we propose jointGES, consisting of Greedy Equivalence Search (GES) to estimate the union of all DAG models followed by variable selection using lasso to obtain the different DAGs, and we analyze its consistency guarantees. The proposed method is illustrated through an analysis of simulated data as well as epithelial ovarian cancer gene expression data.

arXiv.org e-Print Archive

DSpace@MIT

Graphical LASSO Based Model Selection for Time Series

Author: Görtz Norbert
Hannak Gabor
Jung Alexander
Publication date: 28/10/2014

We propose a novel graphical model selection (GMS) scheme for high-dimensional stationary time series or discrete-time processes. The method is based on a natural generalization of the graphical LASSO (gLASSO), introduced originally for GMS based on i.i.d. samples, and estimates the conditional independence graph (CIG) of a time series from a finite-length observation. The gLASSO for time series is defined as the solution of an l1-regularized maximum (approximate) likelihood problem. We solve this optimization problem using the alternating direction method of multipliers (ADMM). Our approach is nonparametric as we do not assume a finite-dimensional (e.g., an autoregressive) parametric model for the observed process. Instead, we require the process to be sufficiently smooth in the spectral domain. For Gaussian processes, we characterize the performance of our method theoretically by deriving an upper bound on the probability that our algorithm fails to correctly identify the CIG. Numerical experiments demonstrate the ability of our method to recover the correct CIG from a limited number of samples.
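For orientation, the standard graphical LASSO for i.i.d. samples, which this abstract says it generalizes, is usually written as the following l1-regularized Gaussian maximum likelihood problem. Here S is the sample covariance matrix, Θ the precision (inverse covariance) matrix, and λ ≥ 0 a regularization weight; the symbols are conventional choices rather than notation taken from the paper, and the time series formulation itself is not given in the abstract.

```latex
\[
\hat{\Theta}
  = \arg\min_{\Theta \succ 0}
    \Big\{ -\log\det\Theta + \operatorname{tr}(S\,\Theta)
           + \lambda \,\lVert \Theta \rVert_{1} \Big\}
\]
```

A zero entry in the estimated Θ corresponds to a missing edge in the estimated conditional independence graph.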

arXiv.org e-Print Archive

Nonparanormal Graph Quilting with Applications to Calcium Imaging

Author: Allen Genevera I.
Chang Andersen
Dasarathy Gautam
Zheng Lili
Publication date: 22/05/2023

Probabilistic graphical models have become an important unsupervised learning tool for detecting network structures for a variety of problems, including the estimation of functional neuronal connectivity from two-photon calcium imaging data. However, in the context of calcium imaging, technological limitations only allow for partially overlapping layers of neurons in a brain region of interest to be jointly recorded. In this case, graph estimation for the full data requires inference for edge selection when many pairs of neurons have no simultaneous observations. This leads to the Graph Quilting problem, which seeks to estimate a graph in the presence of block-missingness in the empirical covariance matrix. Solutions for the Graph Quilting problem have previously been studied for Gaussian graphical models; however, neural activity data from calcium imaging are often non-Gaussian, thereby requiring a more flexible modeling approach. Thus, in our work, we study two approaches for nonparanormal Graph Quilting based on the Gaussian copula graphical model, namely a maximum likelihood procedure and a low-rank based framework. We provide theoretical guarantees on edge recovery for the former approach under similar conditions to those previously developed for the Gaussian setting, and we investigate the empirical performance of both methods using simulations as well as real calcium imaging data. Our approaches yield more scientifically meaningful functional connectivity estimates compared to existing Gaussian graph quilting methods for this calcium imaging data set.

arXiv.org e-Print Archive

Modeling unobserved heterogeneity in social network data analysis

Author: Kevork Sevag
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 27/04/2022

The analysis of network data has become a challenging and growing field in statistics in recent years. In this context, the so-called Exponential Random Graph Model (ERGM) is a promising approach for modeling network data. However, parameter estimation proves to be demanding, not only because of computational and stability problems, especially in large networks, but also because of the unobserved presence of nodal heterogeneity in the network. This thesis begins with a general introduction to graph theory, followed by a detailed discussion of Exponential Random Graph Models and the conventional parameter estimation approaches. In addition, the advantages of this class of models are presented, and the problem of model degeneracy is discussed. The first contribution of the thesis proposes a new iterative estimation approach for Exponential Random Graph Models incorporating node-specific random effects that account for unobserved nodal heterogeneity in unipartite networks. The approach combines maximum likelihood and pseudolikelihood estimation methods for estimating the structural effects and the nodal random effects, respectively, to ensure stable parameter estimation. Furthermore, a model selection strategy is developed to assess the presence of nodal heterogeneity in the network. In the second contribution, the iterative estimation approach is extended to bipartite networks, explaining the estimation and the evaluation techniques. Furthermore, a thorough investigation and interpretation of nodal random effects in bipartite networks for the proposed model is discussed. Simulation studies and data examples are provided to illustrate both contributions. All developed methods are implemented using the open-source statistical software R.
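For readers unfamiliar with the model class, an ERGM in its standard textbook form assigns a probability to each network y on a fixed node set through a vector of network statistics s(y) (for example edge or triangle counts) and a parameter vector θ. The notation below is the conventional one and is assumed here; the random-effects extension proposed in the thesis is not reproduced.

```latex
\[
P_{\theta}(Y = y) = \frac{\exp\!\big(\theta^{\top} s(y)\big)}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{y' \in \mathcal{Y}} \exp\!\big(\theta^{\top} s(y')\big)
\]
```

The normalizing constant κ(θ) sums over all possible networks in the sample space, which is what makes exact maximum likelihood estimation computationally demanding for large networks.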

Digitale Hochschulschriften der LMU

Graph-constrained Analysis for Multivariate Functional Data

Author: Banerjee Sudipto
Datta Abhirup
Dey Debangan
Lindquist Martin
Publication date: 14/08/2023

Functional Gaussian graphical models (GGM) used for analyzing multivariate functional data customarily estimate an unknown graphical model representing the conditional relationships between the functional variables. However, in many applications of multivariate functional data, the graph is known and existing functional GGM methods cannot preserve a given graphical constraint. In this manuscript, we demonstrate how to conduct multivariate functional analysis that exactly conforms to a given inter-variable graph. We first show the equivalence between partially separable functional GGM and graphical Gaussian processes (GP), proposed originally for constructing optimal covariance functions for multivariate spatial data that retain the conditional independence relations in a given graphical model. The theoretical connection helps design a new algorithm that leverages Dempster's covariance selection to calculate the maximum likelihood estimate of the covariance function for multivariate functional data under graphical constraints. We also show that the finite term truncation of functional GGM basis expansion used in practice is equivalent to a low-rank graphical GP, which is known to oversmooth marginal distributions. To remedy this, we extend our algorithm to better preserve marginal distributions while still respecting the graph and retaining computational scalability. The insights obtained from the new results presented in this manuscript will help practitioners better understand the relationship between these graphical models and decide on the appropriate method for their specific multivariate data analysis task. The benefits of the proposed algorithms are illustrated using empirical experiments and an application to functional modeling of neuroimaging data using the connectivity graph among regions of the brain. Comment: 23 pages, 6 figures

arXiv.org e-Print Archive

Graphical Modelling of Multivariate Time Series

Author: Chen Chloe Chen
Publication venue: Mathematics, Imperial College London
Publication date: 01/09/2011

This thesis focuses on the parametric graphical modelling of multivariate time series. The idea of the graphical model is that each missing edge in the graph corresponds to zero partial coherence between a pair of component processes. A vector autoregressive (VAR) process together with its associated partial correlation graph defines a graphical interaction (GI) model. Current estimation methodologies for fitting GI models are few and lacking in detail. Given a realization of the VAR process, we seek to determine its graph via the GI model; we proceed by assuming each possible graph and a range of possible autoregressive orders, carrying out the estimation, and then using the model-selection criteria AIC and/or BIC to select amongst the graphs and orders. We first consider a purely time-domain approach that maximizes the conditional likelihood function with zero constraints; this non-convex problem is made convex by a ‘relaxation’ step and solved via convex optimization. The solution is exact with high probability (and would always be exact if a certain covariance matrix were block-Toeplitz). Alternatively, we look at an iterative algorithm that switches between the time and frequency domains. It updates the spectral estimates using equations that incorporate information from the graph, and then solves the multivariate Yule-Walker equations to estimate the VAR process parameters. We show that both methods work very well on simulated data from GI models. The methods are then applied to real EEG data recorded from schizophrenia patients, who suffer from abnormalities of brain connectivity. Although pretreatment has been carried out to remove improper information, the raw methods do not provide any interpretable results. An essential modification is made to the iterative algorithm, spectral up-weighting, which efficiently solves the instability problem of spectral inversion. Similarly, in the convex optimization method, adding noise also seems to work, but the interpretation of the eigenvalues (small/large) is less clear. Both methods delivered essentially the same results via GI models; encouragingly, the results are consistent with those from a completely different method based on nonparametric/multiple hypothesis testing.
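The model-selection step described in this abstract (fit every candidate graph and autoregressive order, then pick by AIC and/or BIC) can be sketched as follows. This is a minimal illustration using the standard definitions AIC = 2k - 2 log L and BIC = k log n - 2 log L; the candidate names, log-likelihood values and parameter counts are made up for the example, and the function names are hypothetical rather than taken from the thesis.

```python
import math
from typing import List, Tuple

# A candidate is (label, maximized log-likelihood, number of free parameters).
Candidate = Tuple[str, float, int]


def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike information criterion; lower is better."""
    return 2 * num_params - 2 * log_likelihood


def bic(log_likelihood: float, num_params: int, num_samples: int) -> float:
    """Bayesian information criterion; lower is better."""
    return num_params * math.log(num_samples) - 2 * log_likelihood


def select_model(candidates: List[Candidate], num_samples: int,
                 criterion: str = "bic") -> str:
    """Return the label of the candidate with the smallest criterion value."""
    if criterion not in ("aic", "bic"):
        raise ValueError("criterion must be 'aic' or 'bic'")

    def score(c: Candidate) -> float:
        _, loglik, k = c
        return aic(loglik, k) if criterion == "aic" else bic(loglik, k, num_samples)

    return min(candidates, key=score)[0]


if __name__ == "__main__":
    # Hypothetical fitted candidates: a sparse graph and a dense graph.
    fitted = [("sparse graph, order 1", -120.0, 4),
              ("dense graph, order 2", -118.0, 12)]
    print(select_model(fitted, num_samples=200, criterion="bic"))
```

In this made-up example the dense model fits slightly better but pays for eight extra parameters, so both criteria prefer the sparse graph.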

Spiral - Imperial College Digital Repository