
    Distributed learning of Gaussian graphical models via marginal likelihoods

    We consider distributed estimation of the inverse covariance matrix, also called the concentration matrix, in Gaussian graphical models. Traditional centralized estimation often requires iterative and expensive global inference and is therefore difficult in large distributed networks. In this paper, we propose a general framework for distributed estimation based on a maximum marginal likelihood (MML) approach. Each node independently computes a local estimate by maximizing a marginal likelihood defined with respect to data collected from its local neighborhood. Due to the non-convexity of the MML problem, we derive and consider solving a convex relaxation. The local estimates are then combined into a global estimate without the need for iterative message-passing between neighborhoods. We prove that this relaxed MML estimator is asymptotically consistent. Through numerical experiments on several synthetic and real-world data sets, we demonstrate that the two-hop version of the proposed estimator is significantly better than the one-hop version, and nearly closes the gap to the centralized maximum likelihood estimator in many situations.
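    To make the neighborhood-based idea concrete, here is a minimal sketch, not the paper's relaxed MML estimator: each node inverts the sample covariance restricted to its k-hop neighborhood, keeps its own row of that local inverse, and the rows are combined by symmetrization with no iterative message passing. The function name and the use of networkx are illustrative assumptions, and the local inversions require enough samples relative to the neighborhood sizes.

        import numpy as np
        import networkx as nx

        def neighborhood_precision_estimate(X, G, hops=2):
            # X: (n_samples, p) data matrix; G: networkx graph on nodes 0..p-1.
            n, p = X.shape
            S = np.cov(X, rowvar=False)                       # sample covariance
            K_hat = np.zeros((p, p))
            for i in range(p):
                # node i plus its k-hop neighborhood (one hop or two hops)
                nbrs = nx.single_source_shortest_path_length(G, i, cutoff=hops)
                idx = np.array(sorted(nbrs))
                K_local = np.linalg.inv(S[np.ix_(idx, idx)])  # local marginal fit
                row = int(np.where(idx == i)[0][0])
                K_hat[i, idx] = K_local[row]                  # keep node i's row only
            return 0.5 * (K_hat + K_hat.T)                    # combine without message passing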

    Distributed Parameter Estimation in Probabilistic Graphical Models

    This paper presents foundational theoretical results on distributed parameter estimation for undirected probabilistic graphical models. It introduces a general condition on composite likelihood decompositions of these models that guarantees the global consistency of distributed estimators, provided the local estimators are consistent.
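    A familiar instance of such a decomposition, shown here as an illustration of the kind of estimator the result covers rather than the paper's construction, is node-wise pseudo-likelihood for an Ising model: each node's conditional distribution is fit separately, and the consistent local estimates are merged into one symmetric parameter matrix. The helper name and the use of scikit-learn are assumptions; the sketch also assumes both spin values appear in every column.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def pseudolikelihood_ising(X):
            # X: (n, p) matrix of +/-1 spins.  One composite-likelihood
            # component (a logistic regression) per node, fitted independently.
            n, p = X.shape
            W = np.zeros((p, p))
            for i in range(p):
                y = (X[:, i] > 0).astype(int)
                Z = np.delete(X, i, axis=1)
                clf = LogisticRegression(C=1e6).fit(Z, y)      # near-unpenalized local fit
                # Ising conditional: logit = 2 * sum_j W_ij x_j, so halve the coefficients.
                W[i, np.arange(p) != i] = 0.5 * clf.coef_[0]
            return 0.5 * (W + W.T)                             # merge the local estimates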

    Linear and Parallel Learning of Markov Random Fields

    We introduce a new embarrassingly parallel parameter learning algorithm for Markov random fields with untied parameters that is efficient for a large class of practical models. Our algorithm parallelizes naturally over cliques and, for graphs of bounded degree, its complexity is linear in the number of cliques. Unlike its competitors, our algorithm is fully parallel and, for log-linear models, it is also data efficient, requiring only the local sufficient statistics of the data to estimate parameters.
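    A minimal sketch of this clique-local, embarrassingly parallel flavor follows; the helper names and the choice of first and second empirical moments as the local sufficient statistics are assumptions made here for illustration, not the paper's algorithm. Each clique's task touches only its own columns of the data and runs with no communication between tasks.

        from concurrent.futures import ProcessPoolExecutor
        import numpy as np

        def clique_stats(args):
            # Local sufficient statistics for one clique: empirical first and
            # second moments of that clique's variables only.
            X, clique = args
            Xc = X[:, clique]
            return clique, (Xc.mean(axis=0), (Xc[:, :, None] * Xc[:, None, :]).mean(axis=0))

        def parallel_local_stats(X, cliques, workers=4):
            # One independent task per clique; call from under
            # `if __name__ == "__main__":` on spawn-based platforms.
            jobs = [(X, tuple(c)) for c in cliques]
            with ProcessPoolExecutor(max_workers=workers) as pool:
                return dict(pool.map(clique_stats, jobs))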

    Testing the Structure of a Gaussian Graphical Model with Reduced Transmissions in a Distributed Setting

    Testing a covariance matrix that follows a Gaussian graphical model (GGM) is considered in this paper, based on observations made at a set of distributed sensors grouped into clusters. Ordered transmissions are proposed to achieve the same Bayes risk as the optimum centralized, energy-unconstrained approach, but with fewer transmissions and a completely distributed implementation. In this approach, we represent the Bayes optimum test statistic as a sum of local test statistics, each of which can be calculated using only the observations available at one cluster. In each cluster, we select one sensor to be the cluster head (CH) that collects and summarizes the observed data, and intercluster communications are assumed to be inexpensive. The CHs with more informative observations transmit their data to the fusion center (FC) first. By halting before all transmissions have taken place, transmissions can be saved without performance loss. It is shown that this ordering approach guarantees a lower bound on the average number of transmissions saved for any given GGM, and that the lower bound can approach approximately half the number of clusters when the minimum eigenvalue of the covariance matrix under the alternative hypothesis in each cluster becomes sufficiently large.
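    The halting rule can be sketched as follows; this is an illustrative early-stopping scheme under the stated sum decomposition, not the paper's exact Bayes-optimal procedure. Because cluster statistics arrive in decreasing order of magnitude, once the partial sum can no longer be pushed across the threshold by the untransmitted terms, the FC stops requesting transmissions.

        import numpy as np

        def ordered_transmission_test(local_stats, threshold):
            # local_stats: one test statistic per cluster; the global statistic
            # is their sum.  Returns (decision, number of transmissions used).
            stats = np.asarray(local_stats, dtype=float)
            order = np.argsort(-np.abs(stats))          # most informative clusters first
            partial = 0.0
            for k, idx in enumerate(order):
                partial += stats[idx]                   # one more transmission
                remaining = len(order) - k - 1
                bound = remaining * abs(stats[idx])     # later terms are no larger in magnitude
                if partial - bound > threshold:         # decision is H1 regardless of the rest
                    return 1, k + 1
                if partial + bound < threshold:         # decision is H0 regardless of the rest
                    return 0, k + 1
            return int(partial > threshold), len(order)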

    DPP-PMRF: Rethinking Optimization for a Probabilistic Graphical Model Using Data-Parallel Primitives

    We present a new parallel algorithm for probabilistic graphical model optimization. The algorithm relies on data-parallel primitives (DPPs), which provide portable performance across hardware architectures. We evaluate results on CPUs and GPUs for an image segmentation problem. Compared to a serial baseline, we observe runtime speedups of up to 13X (CPU) and 44X (GPU). We also compare our performance to a reference, OpenMP-based algorithm, and find speedups of up to 7X (CPU). Comment: LDAV 2018, October 2018.
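    As a rough illustration of what expressing graphical-model optimization with data-parallel primitives looks like (a toy sketch, not the DPP-PMRF algorithm; the Potts energy, toroidal boundaries, and synchronous ICM-style update are assumptions made here for brevity), every per-pixel operation is written as an array-wide map or reduce instead of an explicit loop over pixels.

        import numpy as np

        def parallel_icm_step(labels, unary, beta):
            # labels: (H, W) integer labels; unary: (H, W, K) data costs;
            # beta: smoothness weight for a Potts prior.
            H, W, K = unary.shape
            cost = unary.copy()
            for k in range(K):
                disagree = np.zeros((H, W))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    # neighbor comparison as a whole-array (map-like) operation
                    disagree += (np.roll(labels, (dy, dx), axis=(0, 1)) != k)
                cost[:, :, k] += beta * disagree
            return cost.argmin(axis=2)                  # per-pixel reduce over the K labels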

    Distributed Learning, Prediction and Detection in Probabilistic Graphs.

    Critical to high-dimensional statistical estimation is to exploit the structure in the data distribution. Probabilistic graphical models provide an efficient framework for representing complex joint distributions of random variables through their conditional dependency graph, and can be adapted to many high-dimensional machine learning applications. This dissertation develops the probabilistic graphical modeling technique for three statistical estimation problems arising in real-world applications: distributed and parallel learning in networks, missing-value prediction in recommender systems, and emerging topic detection in text corpora. The common theme behind all proposed methods is a combination of parsimonious representation of uncertainties in the data, optimization surrogate that leads to computationally efficient algorithms, and fundamental limits of estimation performance in high dimension. More specifically, the dissertation makes the following theoretical contributions: (1) We propose a distributed and parallel framework for learning the parameters in Gaussian graphical models that is free of iterative global message passing. The proposed distributed estimator is shown to be asymptotically consistent, improve with increasing local neighborhood sizes, and have a high-dimensional error rate comparable to that of the centralized maximum likelihood estimator. (2) We present a family of latent variable Gaussian graphical models whose marginal precision matrix has a “low-rank plus sparse” structure. Under mild conditions, we analyze the high-dimensional parameter error bounds for learning this family of models using regularized maximum likelihood estimation. (3) We consider a hypothesis testing framework for detecting emerging topics in topic models, and propose a novel surrogate test statistic for the standard likelihood ratio. By leveraging the theory of empirical processes, we prove asymptotic consistency for the proposed test and provide guarantees of the detection performance.
    PhD dissertation, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/110499/1/mengzs_1.pd
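    For contribution (2), a commonly used regularized maximum-likelihood formulation for this model class, in the sparse-plus-low-rank spirit, reads as follows; the notation is assumed here for illustration rather than quoted from the dissertation:

        \min_{S,\,L}\; -\log\det(S - L) + \operatorname{tr}\!\big(\widehat{\Sigma}\,(S - L)\big) + \lambda\,\lVert S \rVert_{1} + \gamma\,\operatorname{tr}(L)
        \quad\text{subject to}\quad S - L \succ 0,\; L \succeq 0,

    where \widehat{\Sigma} is the sample covariance of the observed variables, the sparse component S models conditional dependencies among them, and the positive semidefinite low-rank component L accounts for the marginalized latent variables.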

    Graphical model driven methods in adaptive system identification

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, September 2016.
    Identifying and tracking an unknown linear system from observations of its inputs and outputs is a problem at the heart of many different applications. Due to the complexity and rapid variability of modern systems, there is extensive interest in solving the problem with as little data and computation as possible. This thesis introduces the novel approach of reducing problem dimension by exploiting statistical structure on the input. By modeling the input to the system of interest as a graph-structured random process, it is shown that a large parameter identification problem can be reduced to several smaller pieces, making the overall problem considerably simpler. Algorithms that can leverage this property in order to either improve the performance or reduce the computational complexity of the estimation problem are developed. The first of these, termed the graphical expectation-maximization least squares (GEM-LS) algorithm, can utilize the reduced dimensional problems induced by the structure to improve the accuracy of system identification in the low sample regime over conventional methods for linear learning with limited data, including regularized least squares methods. Next, a relaxation of the GEM-LS algorithm termed the relaxed approximate graph structured least squares (RAGS-LS) algorithm is obtained that exploits structure to perform highly efficient estimation. The RAGS-LS algorithm is then recast into a recursive framework termed the relaxed approximate graph structured recursive least squares (RAGS-RLS) algorithm, which can be used to track time-varying linear systems with low complexity while achieving tracking performance comparable to much more computationally intensive methods. The performance of the algorithms developed in the thesis in applications such as channel identification, echo cancellation, and adaptive equalization demonstrates that the gains admitted by the graph framework are realizable in practice. The methods have wide applicability, and in particular show promise as the estimation and adaptation algorithms for a new breed of fast, accurate underwater acoustic modems. The contributions of the thesis illustrate the power of graphical model structure in simplifying difficult learning problems, even when the target system is not directly structured.
    The work in this thesis was supported primarily by the Office of Naval Research through an ONR Special Research Award in Ocean Acoustics, and at various times by the National Science Foundation, the WHOI Academic Programs Office, and the MIT Presidential Fellowship Program.
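    One way to picture the dimension reduction the thesis exploits (an illustrative sketch under an assumed block structure on the input, not the GEM-LS, RAGS-LS, or RAGS-RLS algorithms): if the graph model renders groups of input regressors approximately independent, the large identification problem splits into several small per-block regressions.

        import numpy as np

        def blockwise_identification(U, d, blocks):
            # U: (n, p) matrix of input regressors; d: (n,) observed output;
            # blocks: index arrays that the input's graph structure renders
            # (approximately) mutually independent.
            h = np.zeros(U.shape[1])
            for idx in blocks:
                # each block is a small least-squares problem solved on its own
                h[idx], *_ = np.linalg.lstsq(U[:, idx], d, rcond=None)
            return h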