    Iterative Maximum Likelihood on Networks

    We consider n agents located on the vertices of a connected graph. Each agent v receives a signal X_v(0) ~ N(s, 1), where s is an unknown quantity. A natural iterative way of estimating s is the following procedure: at iteration t + 1, let X_v(t + 1) be the average of X_v(t) and of X_w(t) over all the neighbors w of v. In this paper we consider a variant of simple iterative averaging, which models "greedy" behavior of the agents. At iteration t, each agent v declares the value of its estimator X_v(t) to all of its neighbors. It then updates X_v(t + 1) to the maximum likelihood (or minimum variance) estimator of s, given X_v(t) and X_w(t) for all neighbors w of v, and the structure of the graph. We give an explicit efficient procedure for calculating X_v(t), study the convergence of the process as t goes to infinity, and show that if the limit exists then it is the same for all v and w. For graphs that are symmetric under actions of transitive groups, we show that the process is efficient. Finally, we show that the greedy process is in some cases more efficient than simple averaging, while in other cases the converse is true, so that, in this model, "greed" of the individual agents may or may not have an adverse effect on the outcome. The model discussed here may be viewed as the maximum-likelihood version of models studied in Bayesian economics. The ML variant is more accessible and, in particular, makes it possible to show the significance of symmetry in the efficiency of estimators using networks of agents.
    Comment: 13 pages, two figures
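    The two update rules contrasted above can be made concrete in a short sketch. The Python snippet below is an illustrative reconstruction, not the authors' code: it tracks each estimator as a linear combination of the initial signals, so the covariance needed for the minimum-variance combination is available; the function names, the use of a pseudoinverse for degenerate covariances, and the example path graph are all assumptions.

```python
import numpy as np

def simple_averaging_step(W, A):
    """One step of plain averaging: X_v(t+1) is the mean of X_v(t) and X_w(t)
    over the neighbors w of v.  W encodes X(t) = W @ X(0); A is the adjacency matrix."""
    n = A.shape[0]
    M = (A + np.eye(n)) / (A.sum(axis=1, keepdims=True) + 1.0)
    return M @ W

def greedy_ml_step(W, A):
    """One step of the greedy update: each vertex combines its own and its
    neighbors' current estimators with minimum-variance unbiased weights,
    computed from the covariance W @ W.T implied by X_v(0) ~ N(s, 1) i.i.d."""
    cov = W @ W.T
    W_next = np.zeros_like(W)
    for v in range(A.shape[0]):
        idx = np.flatnonzero(A[v]).tolist() + [v]     # v together with its neighbors
        C = cov[np.ix_(idx, idx)]
        w = np.linalg.pinv(C) @ np.ones(len(idx))     # generalized least-squares weights
        w /= w.sum()                                  # keep the estimator unbiased
        W_next[v] = w @ W[idx]
    return W_next

# Path graph on 4 vertices: iterate both schemes and compare estimator variances.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
W_avg, W_ml = np.eye(4), np.eye(4)
for _ in range(10):
    W_avg, W_ml = simple_averaging_step(W_avg, A), greedy_ml_step(W_ml, A)
print(np.diag(W_avg @ W_avg.T))   # per-vertex variance under simple averaging
print(np.diag(W_ml @ W_ml.T))     # per-vertex variance under the greedy update
```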

    A Comparison of Algorithms for Learning Hidden Variables in Normal Graphs

    A Bayesian factor graph reduced to normal form consists of the interconnection of diverter units (or equal-constraint units) and Single-Input/Single-Output (SISO) blocks. In this framework, localized adaptation rules are explicitly derived from a constrained maximum likelihood (ML) formulation and from a minimum KL-divergence criterion using KKT conditions. The learning algorithms are compared with two other updating equations, based on a Viterbi-like approximation and on a variational approximation, respectively. The performance of the various algorithms is verified on synthetic data sets for various architectures. The objective of this paper is to provide the programmer with explicit algorithms for rapid deployment of Bayesian graphs in applications.
    Comment: Submitted for journal publication
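    To make the normal-form building blocks concrete, here is a minimal Python sketch (assumed notation, not the paper's adaptation rules) of sum-product message handling at the two unit types: a diverter, whose outgoing message on a branch is the normalized elementwise product of the incoming messages on the other branches, and a SISO block defined by a row-stochastic conditional probability matrix.

```python
import numpy as np

def diverter_outgoing(incoming, k):
    """Outgoing message on branch k of a diverter (equality-constraint) unit:
    the normalized elementwise product of the messages on all other branches."""
    out = np.ones_like(incoming[0])
    for j, msg in enumerate(incoming):
        if j != k:
            out = out * msg
    return out / out.sum()

def siso_forward(msg_x, P):
    """Forward message through a SISO block with matrix P[x, y] = P(y | x)."""
    out = msg_x @ P
    return out / out.sum()

def siso_backward(msg_y, P):
    """Backward message through the same block: P @ msg_y, renormalized."""
    out = P @ msg_y
    return out / out.sum()
```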

    Optimal Clustering under Uncertainty

    Classical clustering algorithms typically either lack an underlying probability framework to make them predictive or focus on parameter estimation rather than defining and minimizing a notion of error. Recent work addresses these issues by developing a probabilistic framework based on the theory of random labeled point processes and characterizing a Bayes clusterer that minimizes the number of misclustered points. The Bayes clusterer is analogous to the Bayes classifier. Whereas determining a Bayes classifier requires full knowledge of the feature-label distribution, deriving a Bayes clusterer requires full knowledge of the point process. When uncertain of the point process, one would like to find a robust clusterer that is optimal over the uncertainty, just as one may find optimal robust classifiers under uncertain feature-label distributions. Herein, we derive an optimal robust clusterer by first finding an effective random point process that incorporates all randomness within its own probabilistic structure; a Bayes clusterer derived from this effective process is then optimal relative to the uncertainty. This is analogous to the use of effective class-conditional distributions in robust classification. After evaluating the performance of robust clusterers on synthetic Gaussian mixture models, we apply the framework to granular imaging, where we use the asymptotic granulometric moment theory for granular images to relate robust clustering theory to the application.
    Comment: 19 pages, 5 EPS figures, 1 table
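    As a toy illustration of the "effective" construction (a per-point surrogate only; the paper's Bayes clusterer minimizes the expected number of misclustered points over partitions), the sketch below represents model uncertainty as a finite set of candidate two-component Gaussian mixtures with prior weights, averages the per-model label posteriors into an effective posterior, and assigns each point to its most probable label. The candidate models, their weights, and the label alignment across models are all assumptions made for the example.

```python
import numpy as np
from scipy.stats import norm

def label_posteriors(x, means, sigmas, pis):
    """Per-point posterior over the two components for one candidate mixture model."""
    lik = np.stack([pis[k] * norm.pdf(x, means[k], sigmas[k]) for k in range(2)], axis=1)
    return lik / lik.sum(axis=1, keepdims=True)

def effective_clusterer(x, candidate_models, model_weights):
    """Average label posteriors over the model uncertainty, then take the argmax."""
    post = sum(w * label_posteriors(x, *m) for w, m in zip(model_weights, candidate_models))
    return post.argmax(axis=1)

# Example: three candidate models differing only in the second component's mean.
x = np.concatenate([np.random.normal(0, 1, 50), np.random.normal(3, 1, 50)])
models = [((0.0, mu1), (1.0, 1.0), (0.5, 0.5)) for mu1 in (2.5, 3.0, 3.5)]
labels = effective_clusterer(x, models, [0.25, 0.5, 0.25])
```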

    Non-stationary continuous dynamic Bayesian networks


    Inference via low-dimensional couplings

    We investigate the low-dimensional structure of deterministic transformations between random variables, i.e., transport maps between probability measures. In the context of statistics and machine learning, these transformations can be used to couple a tractable "reference" measure (e.g., a standard Gaussian) with a target measure of interest. Direct simulation from the desired measure can then be achieved by pushing forward reference samples through the map. Yet characterizing such a map---e.g., representing and evaluating it---grows challenging in high dimensions. The central contribution of this paper is to establish a link between the Markov properties of the target measure and the existence of low-dimensional couplings, induced by transport maps that are sparse and/or decomposable. Our analysis not only facilitates the construction of transformations in high-dimensional settings, but also suggests new inference methodologies for continuous non-Gaussian graphical models. For instance, in the context of nonlinear state-space models, we describe new variational algorithms for filtering, smoothing, and sequential parameter inference. These algorithms can be understood as the natural generalization---to the non-Gaussian case---of the square-root Rauch-Tung-Striebel Gaussian smoother.
    Comment: 78 pages, 25 figures
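    For intuition about how Markov structure induces sparse triangular maps, the following sketch uses an assumed Gaussian AR(1) target, far simpler than the paper's general non-Gaussian setting: it pushes standard Gaussian reference samples through a lower-triangular map whose k-th component depends only on the previous state and the k-th reference variable, so the coupling is sparse.

```python
import numpy as np

def push_forward(z, a=0.7):
    """Triangular map S: z -> x with x_1 = z_1 and
    x_k = a * x_{k-1} + sqrt(1 - a^2) * z_k (unit marginal variances)."""
    x = np.empty_like(z)
    x[:, 0] = z[:, 0]
    for k in range(1, z.shape[1]):
        x[:, k] = a * x[:, k - 1] + np.sqrt(1.0 - a**2) * z[:, k]
    return x

# Draw reference samples and push them through the sparse map; the sample
# lag-1 correlation of the output should be close to a = 0.7.
z = np.random.standard_normal((100_000, 10))
x = push_forward(z)
print(np.corrcoef(x[:, 3], x[:, 4])[0, 1])
```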

    Constraint Complexity of Realizations of Linear Codes on Arbitrary Graphs

    A graphical realization of a linear code C consists of an assignment of the coordinates of C to the vertices of a graph, along with a specification of linear state spaces and linear "local constraint" codes to be associated with the edges and vertices, respectively, of the graph. The κ-complexity of a graphical realization is defined to be the largest dimension of any of its local constraint codes. κ-complexity is a reasonable measure of the computational complexity of a sum-product decoding algorithm specified by a graphical realization. The main focus of this paper is on the following problem: given a linear code C and a graph G, how small can the κ-complexity of a realization of C on G be? As useful tools for attacking this problem, we introduce the Vertex-Cut Bound and the notion of "vc-treewidth" for a graph, which is closely related to the well-known graph-theoretic notion of treewidth. Using these tools, we derive tight lower bounds on the κ-complexity of any realization of C on G. Our bounds enable us to conclude that good error-correcting codes can have low-complexity realizations only on graphs with large vc-treewidth. Along the way, we also prove the interesting result that the ratio of the κ-complexity of the best conventional trellis realization of a length-n code C to the κ-complexity of the best cycle-free realization of C grows at most logarithmically with the code length n. Such a logarithmic growth rate is, in fact, achievable.
    Comment: Submitted to IEEE Transactions on Information Theory
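    A closely related, classical quantity is the state-complexity profile of a conventional trellis realization, which can be read directly off a generator matrix in minimal-span (trellis-oriented) form: the state-space dimension at a boundary is the number of generator rows whose span crosses it. The sketch below is a standard textbook computation, not the paper's Vertex-Cut Bound, and assumes the supplied matrix is already in minimal-span form.

```python
import numpy as np

def state_profile(G):
    """State-complexity profile of a conventional trellis, for a binary generator
    matrix G of shape (k, n) in minimal-span (trellis-oriented) form."""
    k, n = G.shape
    starts = [int(np.flatnonzero(row)[0]) for row in G]
    ends = [int(np.flatnonzero(row)[-1]) for row in G]
    # Boundary b sits between positions b-1 and b (b = 0..n); a row is active
    # across boundary b if its span [start, end] satisfies start < b <= end.
    return [sum(1 for s, e in zip(starts, ends) if s < b <= e) for b in range(n + 1)]

# Cyclic [7,4] Hamming code with a trellis-oriented generator matrix.
G = np.array([[1, 1, 0, 1, 0, 0, 0],
              [0, 1, 1, 0, 1, 0, 0],
              [0, 0, 1, 1, 0, 1, 0],
              [0, 0, 0, 1, 1, 0, 1]])
print(state_profile(G))   # [0, 1, 2, 3, 3, 2, 1, 0] -> at most 2**3 = 8 states
```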

    Optimized Realization of Bayesian Networks in Reduced Normal Form using Latent Variable Model

    Bayesian networks in their Factor Graph Reduced Normal Form (FGrn) are a powerful paradigm for implementing inference graphs. Unfortunately, the computational and memory costs of these networks may be considerable, even for relatively small networks, and this is one of the main reasons why these structures have often been underused in practice. In this work, through a detailed algorithmic and structural analysis, various solutions for cost reduction are proposed. An online version of the classic batch learning algorithm is also analyzed, showing very similar results (in an unsupervised context); an online algorithm is essential when multilevel structures are to be built. The proposed solutions, together with the online learning algorithm, are included in a C++ library that is quite efficient, especially compared to the direct use of the well-known sum-product and Maximum Likelihood (ML) algorithms. The results are discussed with particular reference to a Latent Variable Model (LVM) structure.
    Comment: 20 pages, 8 figures
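    The batch-versus-online distinction can be illustrated with a small sketch (hypothetical update rules, not the paper's C++ library): given per-example joint "responsibilities" q_t(x, y) for a SISO block, a batch ML estimate of P(y | x) row-normalizes the accumulated counts, while an online variant blends each new example into the current estimate with a small learning rate, so the matrix keeps adapting as data stream in and multilevel structures are stacked.

```python
import numpy as np

def batch_update(Q_list, eps=1e-12):
    """Batch ML-style estimate of P(y | x) from a list of (nx, ny) joint
    responsibility matrices, each summing to one."""
    counts = np.sum(Q_list, axis=0) + eps
    return counts / counts.sum(axis=1, keepdims=True)

def online_update(P, Q, lr=0.05, eps=1e-12):
    """Online variant: blend one example's row-normalized responsibilities
    into the current estimate of P(y | x) with learning rate lr."""
    target = (Q + eps) / (Q + eps).sum(axis=1, keepdims=True)
    return (1.0 - lr) * P + lr * target

# Streaming use: start from a uniform matrix and fold in examples one at a time.
P = np.full((3, 4), 0.25)
for Q in (np.random.dirichlet(np.ones(12)).reshape(3, 4) for _ in range(100)):
    P = online_update(P, Q)
```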

    Reactive Probabilistic Programming for Scalable Bayesian Inference
