362 research outputs found

    Stan: A Probabilistic Programming Language

    Get PDF
    Stan is a probabilistic programming language for specifying statistical models. A Stan program imperatively defines a log probability function over parameters conditioned on specified data and constants. As of version 2.14.0, Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan is also a platform for computing log densities and their gradients and Hessians, which can be used in alternative algorithms such as variational Bayes, expectation propagation, and marginal inference using approximate integration. To this end, Stan is set up so that the densities, gradients, and Hessians, along with intermediate quantities of the algorithm such as acceptance probabilities, are easily accessible. Stan can be called from the command line using the cmdstan package, through R using the rstan package, and through Python using the pystan package. All three interfaces support sampling and optimization-based inference with diagnostics and posterior analysis. rstan and pystan also provide access to log probabilities, gradients, Hessians, parameter transforms, and specialized plotting

    Modelling the structure of complex networks

    Get PDF

    Natively probabilistic computation

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences, 2009.Includes bibliographical references (leaves 129-135).I introduce a new set of natively probabilistic computing abstractions, including probabilistic generalizations of Boolean circuits, backtracking search and pure Lisp. I show how these tools let one compactly specify probabilistic generative models, generalize and parallelize widely used sampling algorithms like rejection sampling and Markov chain Monte Carlo, and solve difficult Bayesian inference problems. I first introduce Church, a probabilistic programming language for describing probabilistic generative processes that induce distributions, which generalizes Lisp, a language for describing deterministic procedures that induce functions. I highlight the ways randomness meshes with the reflectiveness of Lisp to support the representation of structured, uncertain knowledge, including nonparametric Bayesian models from the current literature, programs for decision making under uncertainty, and programs that learn very simple programs from data. I then introduce systematic stochastic search, a recursive algorithm for exact and approximate sampling that generalizes a popular form of backtracking search to the broader setting of stochastic simulation and recovers widely used particle filters as a special case. I use it to solve probabilistic reasoning problems from statistical physics, causal reasoning and stereo vision. Finally, I introduce stochastic digital circuits that model the probability algebra just as traditional Boolean circuits model the Boolean algebra.(cont.) I show how these circuits can be used to build massively parallel, fault-tolerant machines for sampling and allow one to efficiently run Markov chain Monte Carlo methods on models with hundreds of thousands of variables in real time. I emphasize the ways in which these ideas fit together into a coherent software and hardware stack for natively probabilistic computing, organized around distributions and samplers rather than deterministic functions. I argue that by building uncertainty and randomness into the foundations of our programming languages and computing machines, we may arrive at ones that are more powerful, flexible and efficient than deterministic designs, and are in better alignment with the needs of computational science, statistics and artificial intelligence.by Vikash Kumar Mansinghka.Ph.D

    Blang: Bayesian declarative modelling of general data structures and inference via algorithms based on distribution continua

    Get PDF
    Consider a Bayesian inference problem where a variable of interest does not take values in a Euclidean space. These "non-standard" data structures are in reality fairly common. They are frequently used in problems involving latent discrete factor models, networks, and domain specific problems such as sequence alignments and reconstructions, pedigrees, and phylogenies. In principle, Bayesian inference should be particularly well-suited in such scenarios, as the Bayesian paradigm provides a principled way to obtain confidence assessment for random variables of any type. However, much of the recent work on making Bayesian analysis more accessible and computationally efficient has focused on inference in Euclidean spaces. In this paper, we introduce Blang, a domain specific language and library aimed at bridging this gap. Blang allows users to perform Bayesian analysis on arbitrary data types while using a declarative syntax similar to BUGS. Blang is augmented with intuitive language additions to create data types of the user's choosing. To perform inference at scale on such arbitrary state spaces, Blang leverages recent advances in sequential Monte Carlo and non-reversible Markov chain Monte Carlo methods

    Towards derandomising Markov chain Monte Carlo

    Full text link
    We present a new framework to derandomise certain Markov chain Monte Carlo (MCMC) algorithms. As in MCMC, we first reduce counting problems to sampling from a sequence of marginal distributions. For the latter task, we introduce a method called coupling towards the past that can, in logarithmic time, evaluate one or a constant number of variables from a stationary Markov chain state. Since there are at most logarithmic random choices, this leads to very simple derandomisation. We provide two applications of this framework, namely efficient deterministic approximate counting algorithms for hypergraph independent sets and hypergraph colourings, under local lemma type conditions matching, up to lower order factors, their state-of-the-art randomised counterparts.Comment: 57 page

    Solving Satisfiability Modulo Counting for Symbolic and Statistical AI Integration With Provable Guarantees

    Full text link
    Satisfiability Modulo Counting (SMC) encompasses problems that require both symbolic decision-making and statistical reasoning. Its general formulation captures many real-world problems at the intersection of symbolic and statistical Artificial Intelligence. SMC searches for policy interventions to control probabilistic outcomes. Solving SMC is challenging because of its highly intractable nature(NPPP\text{NP}^{\text{PP}}-complete), incorporating statistical inference and symbolic reasoning. Previous research on SMC solving lacks provable guarantees and/or suffers from sub-optimal empirical performance, especially when combinatorial constraints are present. We propose XOR-SMC, a polynomial algorithm with access to NP-oracles, to solve highly intractable SMC problems with constant approximation guarantees. XOR-SMC transforms the highly intractable SMC into satisfiability problems, by replacing the model counting in SMC with SAT formulae subject to randomized XOR constraints. Experiments on solving important SMC problems in AI for social good demonstrate that XOR-SMC finds solutions close to the true optimum, outperforming several baselines which struggle to find good approximations for the intractable model counting in SMC

    The Complexity of Approximately Counting Tree Homomorphisms

    Full text link
    We study two computational problems, parameterised by a fixed tree H. #HomsTo(H) is the problem of counting homomorphisms from an input graph G to H. #WHomsTo(H) is the problem of counting weighted homomorphisms to H, given an input graph G and a weight function for each vertex v of G. Even though H is a tree, these problems turn out to be sufficiently rich to capture all of the known approximation behaviour in #P. We give a complete trichotomy for #WHomsTo(H). If H is a star then #WHomsTo(H) is in FP. If H is not a star but it does not contain a certain induced subgraph J_3 then #WHomsTo(H) is equivalent under approximation-preserving (AP) reductions to #BIS, the problem of counting independent sets in a bipartite graph. This problem is complete for the class #RHPi_1 under AP-reductions. Finally, if H contains an induced J_3 then #WHomsTo(H) is equivalent under AP-reductions to #SAT, the problem of counting satisfying assignments to a CNF Boolean formula. Thus, #WHomsTo(H) is complete for #P under AP-reductions. The results are similar for #HomsTo(H) except that a rich structure emerges if H contains an induced J_3. We show that there are trees H for which #HomsTo(H) is #SAT-equivalent (disproving a plausible conjecture of Kelk). There is an interesting connection between these homomorphism-counting problems and the problem of approximating the partition function of the ferromagnetic Potts model. In particular, we show that for a family of graphs J_q, parameterised by a positive integer q, the problem #HomsTo(H) is AP-interreducible with the problem of approximating the partition function of the q-state Potts model. It was not previously known that the Potts model had a homomorphism-counting interpretation. We use this connection to obtain some additional upper bounds for the approximation complexity of #HomsTo(J_q)
    • …
    corecore