
    Patterns of Scalable Bayesian Inference

    Datasets are growing not just in size but in complexity, creating a demand for rich models and quantification of uncertainty. Bayesian methods are an excellent fit for this demand, but scaling Bayesian inference is a challenge. In response to this challenge, there has been considerable recent work based on varying assumptions about model structure, underlying computational resources, and the importance of asymptotic correctness. As a result, there is a zoo of ideas with few clear overarching principles. In this paper, we seek to identify unifying principles, patterns, and intuitions for scaling Bayesian inference. We review existing work on utilizing modern computing resources with both MCMC and variational approximation techniques. From this taxonomy of ideas, we characterize the general principles that have proven successful for designing scalable inference procedures and comment on the path forward.
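
    As a concrete instance of the subsampling pattern this survey covers, below is a minimal sketch of stochastic gradient Langevin dynamics (SGLD), an MCMC method that scales by evaluating gradients on minibatches. The toy model (a Gaussian mean with a Gaussian prior), the step size, and all other constants are illustrative assumptions, not anything prescribed by the paper.

```python
# Minimal SGLD sketch: minibatch gradients plus injected Gaussian noise.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000                          # full dataset size
data = rng.normal(loc=2.0, scale=1.0, size=N)

def grad_log_prior(theta):           # N(0, 10^2) prior on the mean
    return -theta / 100.0

def grad_log_lik(theta, batch):      # unit-variance Gaussian likelihood
    return np.sum(batch - theta)

theta, samples = 0.0, []
batch_size, step = 500, 1e-5
for t in range(2_000):
    batch = data[rng.integers(0, N, size=batch_size)]
    # Rescale the minibatch gradient by N / batch_size to keep it unbiased.
    grad = grad_log_prior(theta) + (N / batch_size) * grad_log_lik(theta, batch)
    theta += 0.5 * step * grad + np.sqrt(step) * rng.normal()
    samples.append(theta)

print(np.mean(samples[500:]))        # should sit near the data mean, ~2.0
```

    Each update touches only a minibatch, yet (for a suitable step-size schedule) the iterates approximate draws from the full-data posterior; this trade between per-step cost and asymptotic correctness is one of the recurring patterns the survey discusses.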

    Less is More: Nyström Computational Regularization

    We study Nyström-type subsampling approaches to large-scale kernel methods, and prove learning bounds in the statistical learning setting, where random sampling and high-probability estimates are considered. In particular, we prove that these approaches can achieve optimal learning bounds, provided the subsampling level is suitably chosen. These results suggest a simple incremental variant of Nyström Kernel Regularized Least Squares, where the subsampling level implements a form of computational regularization, in the sense that it simultaneously controls regularization and computation. Extensive experimental analysis shows that the considered approach achieves state-of-the-art performance on benchmark large-scale datasets. Comment: updated version of NIPS 2015 (oral).
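
    The incremental variant itself is not reproduced here, but the core Nyström computation the abstract describes can be sketched as follows. The Gaussian kernel, the subsampling level m, the regularization parameter, and the synthetic data are assumptions for illustration.

```python
# Nystrom kernel regularized least squares: solve an m x m reduced problem
# built from m sampled landmark points instead of the full n x n one.
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * sigma**2))

rng = np.random.default_rng(0)
n, m, lam = 2_000, 100, 1e-3               # m is the subsampling level
X = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

landmarks = X[rng.choice(n, size=m, replace=False)]   # uniform subsampling
K_nm = gaussian_kernel(X, landmarks)
K_mm = gaussian_kernel(landmarks, landmarks)
alpha = np.linalg.solve(K_nm.T @ K_nm + n * lam * K_mm, K_nm.T @ y)

X_test = np.linspace(-3, 3, 200)[:, None]
y_hat = gaussian_kernel(X_test, landmarks) @ alpha
print(np.mean((y_hat - np.sin(X_test[:, 0]))**2))     # small test error
```

    The point of the paper is visible in the dimensions: m, not n, drives both the statistical regularization and the cost of the linear solve.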

    Testing for threshold effects in regression models

    In this article, we develop a general method for testing threshold effects in regression models, using sup-likelihood-ratio (LR)-type statistics. Although the sup-LR-type test statistic has been considered in the literature, our method for establishing the asymptotic null distribution is new and nonstandard. The standard approach in the literature for obtaining the asymptotic null distribution requires that there exist a certain quadratic approximation to the objective function. The article provides an alternative, novel method that can be used to establish the asymptotic null distribution, even when the usual quadratic approximation is intractable. We illustrate the usefulness of our approach in the examples of maximum score estimation, maximum likelihood estimation, quantile regression, and maximum rank correlation estimation. We establish consistency and local power properties of the test. We provide some simulation results and also an empirical application to tipping in racial segregation. This article has supplementary materials online.
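
    To make the sup-LR construction concrete, here is a hedged sketch that scans a trimmed grid of candidate thresholds and takes the supremum of the LR statistic. The simulated design, the trimming fraction, and the grid size are assumptions; actual inference would still require the nonstandard critical values derived in the article, or a bootstrap, which this sketch omits.

```python
# Sup-LR scan for a threshold effect in a linear regression.
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
q = rng.uniform(size=n)                     # threshold variable
y = 1.0 + 0.5 * x + 1.5 * x * (q > 0.6) + rng.normal(size=n)

def ssr(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

X0 = np.column_stack([np.ones(n), x])
ssr0 = ssr(X0, y)                           # null model: no threshold

grid = np.quantile(q, np.linspace(0.15, 0.85, 71))   # trimmed grid
stat = np.empty_like(grid)
for j, g in enumerate(grid):
    d = (q > g).astype(float)
    X1 = np.column_stack([X0, d, x * d])    # regime shift above g
    stat[j] = n * np.log(ssr0 / ssr(X1, y)) # LR statistic at this threshold

print(grid[np.argmax(stat)], stat.max())    # argmax near the true threshold 0.6
```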

    Signatures of criticality arise in simple neural population models with correlations

    Large-scale recordings of neuronal activity make it possible to gain insights into the collective activity of neural ensembles. It has been hypothesized that neural populations might be optimized to operate at a 'thermodynamic critical point', and that this property has implications for information processing. Support for this notion has come from a series of studies which identified statistical signatures of criticality in the ensemble activity of retinal ganglion cells. What are the underlying mechanisms that give rise to these observations? Here we show that signatures of criticality arise even in simple feed-forward models of retinal population activity. In particular, they occur whenever neural population data exhibit correlations and are randomly sub-sampled during data analysis. These results show that signatures of criticality are not necessarily indicative of an optimized coding strategy, and challenge the utility of analysis approaches based on equilibrium thermodynamics for understanding partially observed biological systems. Comment: 36 pages, LaTeX; added journal reference on page 1, added link to code repository.
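
    A loose numerical illustration of the claim: the sketch below generates correlated binary "spikes" from a shared input, randomly subsamples neurons, and estimates a specific-heat proxy c(N) = Var[log P(x)] / N under an exchangeable (population-count) model. Every modeling choice here is an assumption for illustration, not the paper's analysis pipeline; the growth of c(N) with N is the kind of criticality signature the abstract refers to.

```python
# Correlations + random subsampling produce a growing specific-heat proxy.
from math import lgamma
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_samples = 200, 20_000
common = rng.normal(size=n_samples)              # shared input -> correlations
spikes = (0.6 * common[:, None]
          + rng.normal(size=(n_samples, n_neurons)) > 1.0).astype(int)

def specific_heat(sub):
    N = sub.shape[1]
    K = sub.sum(1)                               # population spike counts
    pK = np.bincount(K, minlength=N + 1) / len(K)
    # Exchangeable model: P(pattern) = P(K) / binom(N, K)
    log_binom = np.array([lgamma(N + 1) - lgamma(k + 1) - lgamma(N - k + 1)
                          for k in range(N + 1)])
    with np.errstate(divide="ignore"):           # unobserved counts give log 0
        logp = np.log(pK)[K] - log_binom[K]
    return np.var(logp) / N

for N in (20, 50, 100, 200):
    idx = rng.choice(n_neurons, size=N, replace=False)  # random subsampling
    print(N, round(specific_heat(spikes[:, idx]), 3))   # c(N) keeps growing
```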

    Automatic Differentiation Variational Inference

    Probabilistic modeling is iterative. A scientist posits a simple model, fits it to her data, refines it according to her analysis, and repeats. However, fitting complex models to large data is a bottleneck in this process. Deriving algorithms for new models can be both mathematically and computationally challenging, which makes it difficult to efficiently cycle through the steps. To this end, we develop automatic differentiation variational inference (ADVI). Using our method, the scientist only provides a probabilistic model and a dataset, nothing else. ADVI automatically derives an efficient variational inference algorithm, freeing the scientist to refine and explore many models. ADVI supports a broad class of models; no conjugacy assumptions are required. We study ADVI across ten different models and apply it to a dataset with millions of observations. ADVI is integrated into Stan, a probabilistic programming system; it is available for immediate use.
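
    ADVI itself ships with Stan, but its recipe (transform parameters to the unconstrained space, posit a mean-field Gaussian, and follow Monte Carlo reparameterization gradients of the ELBO) can be sketched by hand. The toy model below, Gaussian data with unknown mean and scale, the hand-written gradient, and the Adagrad-style step size are all assumptions standing in for Stan's automatic differentiation.

```python
# Hand-rolled mean-field ADVI sketch for y ~ N(mu, sigma), sigma = exp(s).
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(3.0, 2.0, size=1_000)
n = len(y)

def grad_log_joint(z):
    # Gradient of log p(y, z) on the unconstrained space z = (mu, log sigma),
    # with flat priors; the +1.0 comes from the log-Jacobian of sigma = exp(s).
    mu, s = z
    inv_var = np.exp(-2 * s)
    g_mu = np.sum(y - mu) * inv_var
    g_s = -n + np.sum((y - mu) ** 2) * inv_var + 1.0
    return np.array([g_mu, g_s])

m = np.zeros(2)                      # variational means
w = np.zeros(2)                      # variational log-stddevs
Gm, Gw = np.zeros(2), np.zeros(2)    # Adagrad-style accumulators
lr = 0.1
for t in range(5_000):
    eta = rng.normal(size=2)
    z = m + np.exp(w) * eta          # reparameterization trick
    g = grad_log_joint(z)
    gm = g                           # d ELBO / d m = E[grad log p]
    gw = g * eta * np.exp(w) + 1.0   # +1 from the Gaussian entropy term
    Gm += gm**2; Gw += gw**2
    m += lr * gm / np.sqrt(Gm + 1e-8)
    w += lr * gw / np.sqrt(Gw + 1e-8)

print(m[0], np.exp(m[1]))            # approximate posterior: mean ~3.0, sigma ~2.0
```

    The appeal of ADVI is that everything model-specific in this loop, the transform, its Jacobian, and the gradient, is derived automatically, so the user only writes down the model.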

    Restricted Isometries for Partial Random Circulant Matrices

    In the theory of compressed sensing, restricted isometry analysis has become a standard tool for studying how efficiently a measurement matrix acquires information about sparse and compressible signals. Many recovery algorithms are known to succeed when the restricted isometry constants of the sampling matrix are small. Many potential applications of compressed sensing involve a data-acquisition process that proceeds by convolution with a random pulse followed by (nonrandom) subsampling. At present, the theoretical analysis of this measurement technique is lacking. This paper demonstrates that the $s$th-order restricted isometry constant is small when the number $m$ of samples satisfies $m \gtrsim (s \log n)^{3/2}$, where $n$ is the length of the pulse. This bound improves on previous estimates, which exhibit quadratic scaling.
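
    The measurement process analyzed above, circular convolution with a random pulse followed by subsampling, is easy to simulate; the sketch below checks empirically that such a matrix roughly preserves the norm of sparse vectors. The dimensions, the Rademacher pulse, and the random (rather than nonrandom) choice of retained samples are assumptions for illustration and do not reproduce the paper's bound.

```python
# Partial random circulant measurements via FFT-based circular convolution.
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 1024, 256, 20
pulse = rng.choice([-1.0, 1.0], size=n)          # random (Rademacher) pulse
rows = rng.choice(n, size=m, replace=False)      # retained sample positions

def measure(x):
    conv = np.fft.ifft(np.fft.fft(pulse) * np.fft.fft(x)).real  # circular conv
    return conv[rows] / np.sqrt(m)               # keep m samples, normalize

# Empirical check: ||Phi x|| stays close to ||x|| for s-sparse x.
ratios = []
for _ in range(200):
    x = np.zeros(n)
    support = rng.choice(n, size=s, replace=False)
    x[support] = rng.normal(size=s)
    ratios.append(np.linalg.norm(measure(x)) / np.linalg.norm(x))
print(min(ratios), max(ratios))                  # both near 1 for these sizes
```

    The FFT implementation is the practical draw of this design: applying the measurement matrix costs O(n log n) and needs no stored matrix.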