
    The Libra Toolkit for Probabilistic Models

    The Libra Toolkit is a collection of algorithms for learning and inference with discrete probabilistic models, including Bayesian networks, Markov networks, dependency networks, and sum-product networks. Compared to other toolkits, Libra places a greater emphasis on learning the structure of tractable models in which exact inference is efficient. It also includes a variety of algorithms for learning graphical models in which inference is potentially intractable, and for performing exact and approximate inference. Libra is released under a 2-clause BSD license to encourage broad use in academia and industry.
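
    The tractability the abstract emphasizes comes from the SPN evaluation rule: a single bottom-up pass computes an exact joint or marginal probability in time linear in the network size. A minimal sketch of that pass follows; the node classes and toy network are hypothetical illustrations, not Libra's actual interface.

```python
# Sketch of exact SPN inference: one bottom-up pass computes a joint or
# marginal probability in time linear in the network size. The classes and
# example network are hypothetical, not Libra's API.

class Leaf:
    """Bernoulli leaf over one binary variable."""
    def __init__(self, var, p):
        self.var, self.p = var, p
    def value(self, evidence):
        x = evidence.get(self.var)          # None = marginalized out
        if x is None:
            return 1.0
        return self.p if x == 1 else 1.0 - self.p

class Sum:
    def __init__(self, weights, children):
        self.weights, self.children = weights, children
    def value(self, evidence):
        return sum(w * c.value(evidence)
                   for w, c in zip(self.weights, self.children))

class Product:
    def __init__(self, children):
        self.children = children
    def value(self, evidence):
        out = 1.0
        for c in self.children:
            out *= c.value(evidence)
        return out

# Toy SPN over variables A, B: mixture of two independent factorizations.
spn = Sum([0.6, 0.4],
          [Product([Leaf('A', 0.9), Leaf('B', 0.2)]),
           Product([Leaf('A', 0.1), Leaf('B', 0.7)])])

print(spn.value({'A': 1, 'B': 0}))   # joint P(A=1, B=0)
print(spn.value({'A': 1}))           # exact marginal P(A=1)
```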

    Conditional Sum-Product Networks: Imposing Structure on Deep Probabilistic Architectures

    Probabilistic graphical models are a central tool in AI; however, they are generally not as expressive as deep neural models, and inference is notoriously hard and slow. In contrast, deep probabilistic models such as sum-product networks (SPNs) capture joint distributions in a tractable fashion, but still lack the expressive power of intractable models based on deep neural networks. Therefore, we introduce conditional SPNs (CSPNs), conditional density estimators for multivariate and potentially hybrid domains which allow harnessing the expressive power of neural networks while still maintaining tractability guarantees. One way to implement CSPNs is to use an existing SPN structure and condition its parameters on the input, e.g., via a deep neural network. This approach, however, might misrepresent the conditional independence structure present in the data. Consequently, we also develop a structure-learning approach that derives both the structure and parameters of CSPNs from data. Our experimental evidence demonstrates that CSPNs are competitive with other probabilistic models and yield superior performance on multilabel image classification compared to mean field and mixture density networks. Furthermore, they can successfully be employed as building blocks for structured probabilistic models, such as autoregressive image models. Comment: 13 pages, 6 figures
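
    The first construction mentioned above, a fixed SPN structure whose parameters are emitted by a neural network conditioned on the input, can be sketched compactly. This is a hedged illustration only: the two-layer gating network, the Bernoulli leaves, and all dimensions below are assumptions for the example, not the architecture from the paper.

```python
import numpy as np

# Sketch of a conditional SPN: a small (hypothetical) gating network maps
# the conditioning input x to the weights of a sum node, so the fixed SPN
# structure models p(y1, y2 | x). Not the paper's architecture.

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 3)), np.zeros(8)   # toy MLP parameters
W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)

def mixture_weights(x):
    """MLP -> softmax: input-dependent weights for a 2-child sum node."""
    h = np.tanh(W1 @ x + b1)
    z = W2 @ h + b2
    e = np.exp(z - z.max())
    return e / e.sum()

# Fixed SPN structure: a sum over two product nodes, each a product of
# Bernoulli leaves over the binary targets y1, y2.
LEAF_P = np.array([[0.9, 0.2],    # component 0: P(y1=1), P(y2=1)
                   [0.1, 0.7]])   # component 1

def cspn_prob(x, y):
    w = mixture_weights(x)                        # condition on x
    leaf = np.where(np.asarray(y) == 1, LEAF_P, 1.0 - LEAF_P)
    return float(w @ leaf.prod(axis=1))           # sum over product nodes

x = np.array([0.5, -1.0, 2.0])
print(cspn_prob(x, [1, 0]))                       # p(y1=1, y2=0 | x)
```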

    Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks

    We discuss the computational complexity of approximating maximum a posteriori inference in sum-product networks. We first show NP-hardness in trees of height two by a reduction from maximum independent set; this implies non-approximability within a sublinear factor. We show that this is a tight bound, as we can find an approximation within a linear factor in networks of height two. We then show that, in trees of height three, it is NP-hard to approximate the problem within a factor $2^{f(n)}$ for any sublinear function $f$ of the size of the input $n$. Again, this bound is tight, as we prove that the usual max-product algorithm finds (in any network) approximations within factor $2^{c \cdot n}$ for some constant $c < 1$. Last, we present a simple algorithm, and show that it provably produces solutions at least as good as, and potentially much better than, the max-product algorithm. We empirically analyze the proposed algorithm against max-product using synthetic and realistic networks. Comment: 18 pages
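
    The max-product baseline referenced above replaces each sum node with a max on the upward pass and then backtracks through the maximizing children to read off an assignment. Below is a minimal sketch under the usual decomposability assumption; the tuple encoding of nodes is ours, purely for illustration.

```python
# Sketch of the max-product heuristic for MAP in an SPN: take maxes in
# place of sums on the upward pass, then backtrack through the maximizing
# children to recover an assignment. Node encoding is hypothetical.

def max_product(node, assignment):
    """Upward max pass + downward backtracking; fills `assignment`."""
    kind = node[0]
    if kind == 'leaf':                       # ('leaf', var, p)
        _, var, p = node
        assignment[var] = 1 if p >= 0.5 else 0
        return max(p, 1.0 - p)
    if kind == 'sum':                        # ('sum', weights, children)
        _, weights, children = node
        best_val, best_sub = -1.0, None
        for w, c in zip(weights, children):
            sub = {}
            v = w * max_product(c, sub)
            if v > best_val:
                best_val, best_sub = v, sub
        assignment.update(best_sub)          # keep the maximizing branch
        return best_val
    # ('prod', children): children cover disjoint variables, so merge.
    _, children = node
    val = 1.0
    for c in children:
        val *= max_product(c, assignment)
    return val

spn = ('sum', [0.6, 0.4],
       [('prod', [('leaf', 'A', 0.9), ('leaf', 'B', 0.2)]),
        ('prod', [('leaf', 'A', 0.1), ('leaf', 'B', 0.7)])])

a = {}
print(max_product(spn, a), a)   # heuristic MAP value and assignment
```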

    Bayesian Structure and Parameter Learning of Sum-Product Networks

    Sum-product networks (SPNs) are graphical models capable of handling large amounts of multidimensional data. Unlike many other graphical models, SPNs are tractable if certain structural requirements are fulfilled; a model is called tractable if probabilistic inference can be performed in polynomial time with respect to the size of the model. Learning SPNs can be separated into two modes, parameter learning and structure learning. Many earlier approaches have treated the two modes as separate, but it has been found that alternating between them can achieve good results. One example of this kind of algorithm was presented by Trapp et al. in the article Bayesian Learning of Sum-Product Networks (NeurIPS, 2019). This thesis discusses SPNs and a Bayesian learning algorithm developed based on the aforementioned algorithm, differing in some of the methods used. The algorithm by Trapp et al. uses Gibbs sampling in the parameter learning phase, whereas here Metropolis-Hastings MCMC is used. The algorithm developed for this thesis was used in two experiments: one with a small and simple SPN, and one with a larger and more complex SPN. The effects of data set size and data complexity were also explored. The results were compared to those obtained by running the original algorithm by Trapp et al. The results show that having more data in the learning phase makes the results more accurate, as it is easier for the model to spot patterns in a larger data set. It was also shown that the model was able to learn the parameters in the experiments if the data were simple enough, that is, if each dimension of the data contained only one distribution. For more complex data, where there were multiple distributions per dimension, the results showed that the computation struggled.
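
    The substitution the thesis makes, Metropolis-Hastings in place of Gibbs sampling for the parameter phase, can be illustrated on a single weight. This is a hedged sketch, not the thesis's implementation: the Beta(1, 1) prior, the Gaussian random-walk proposal, and the toy data are all assumptions for the example.

```python
import math, random

# Sketch of Metropolis-Hastings for one SPN parameter w in (0, 1), with a
# flat Beta(1, 1) prior and a symmetric Gaussian random-walk proposal.
# Illustrative assumptions only; not the thesis's implementation.

random.seed(0)
data = [1, 1, 0, 1, 1, 1, 0, 1]    # toy binary observations

def log_post(w):
    """Log posterior of a Bernoulli parameter under a flat prior."""
    if not 0.0 < w < 1.0:
        return -math.inf
    return sum(math.log(w) if x else math.log(1.0 - w) for x in data)

w, samples = 0.5, []
for _ in range(5000):
    prop = w + random.gauss(0.0, 0.1)            # symmetric proposal
    delta = log_post(prop) - log_post(w)
    if delta >= 0 or random.random() < math.exp(delta):
        w = prop                                 # accept the move
    samples.append(w)

burned = samples[1000:]                          # discard burn-in
print(sum(burned) / len(burned))                 # posterior mean, ~0.7
```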

    Exchangeable Variable Models

    A sequence of random variables is exchangeable if its joint distribution is invariant under variable permutations. We introduce exchangeable variable models (EVMs) as a novel class of probabilistic models whose basic building blocks are partially exchangeable sequences, a generalization of exchangeable sequences. We prove that a family of tractable EVMs is optimal under zero-one loss for a large class of functions, including parity and threshold functions, and strictly subsumes existing tractable independence-based model families. Extensive experiments show that EVMs outperform state-of-the-art classifiers such as SVMs and probabilistic models which are solely based on independence assumptions. Comment: ICML 2014
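
    The tractability of EVMs rests on a simple observation: a fully exchangeable distribution over n binary variables depends on a configuration only through its count of ones, so it needs just n + 1 parameters, and symmetric queries such as parity reduce to sums over counts. A toy sketch of that idea follows; the parameterization below is ours, not the paper's model family.

```python
from math import comb

# Sketch of the exchangeability idea behind EVMs: if the joint distribution
# is permutation-invariant, P(x) depends only on k = sum(x), so a block of
# n exchangeable binary variables needs just n + 1 parameters.
# Toy parameterization, not the paper's model family.

n = 4
# P(exactly k ones) for k = 0..n; any non-negative vector summing to 1.
count_prob = [0.05, 0.15, 0.40, 0.30, 0.10]

def p(x):
    """Probability of one configuration: split the count mass evenly
    among the comb(n, k) permutation-equivalent configurations."""
    k = sum(x)
    return count_prob[k] / comb(n, k)

print(p([1, 1, 0, 0]))          # same for every configuration with k = 2
print(p([0, 1, 0, 1]))
# Symmetric queries, e.g. parity, reduce to sums over counts:
print(sum(count_prob[k] for k in range(n + 1) if k % 2 == 0))  # P(even parity)
```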