1,279 research outputs found

    Methodological and Computational Advances for High–Dimensional Bayesian Regression with Binary and Categorical Responses

    Get PDF
    Probit and logistic regressions are among the most popular and well-established formulations for modelling binary observations, thanks to their simple structure and high interpretability. Despite this simplicity, their use poses non-trivial challenges for inference, particularly from a computational perspective and in high-dimensional scenarios. This continues to motivate active research on probit, logit, and a number of their generalizations, especially within the Bayesian community. Conjugacy results for standard probit regression under normal and unified skew-normal (SUN) priors appeared only recently in the literature. Such findings were rapidly extended to several generalizations of probit regression, including multinomial probit, dynamic multivariate probit and skewed Gaussian processes, among others. Nonetheless, these recent developments focus on specific subclasses of models, which can all be regarded as instances of a potentially broader family of formulations relying on partially or fully discretized Gaussian latent utilities. We therefore develop a unified and comprehensive framework that encompasses all the above constructions and many others, such as tobit regression and its extensions, for which conjugacy results are still missing. We show that the SUN family of distributions is conjugate for all models within the broad class considered, which notably encompasses all formulations whose likelihoods are given by products of multivariate Gaussian densities and cumulative distribution functions evaluated at linear combinations of the parameter of interest. Such a unifying framework is practically and conceptually useful for studying general theoretical properties and developing future extensions. This includes new avenues for improved posterior inference exploiting i.i.d. samplers from the exact SUN posteriors, as well as recent accurate and scalable variational Bayes (VB) and expectation-propagation approximations, for which we derive a novel efficient implementation.

    Along a parallel research line, we focus on binary regression under the logit mapping, for which computations in high dimensions still pose open challenges. To overcome such difficulties, several contributions solve iteratively a sequence of surrogate problems, entailing the sequential refinement of tangent lower bounds for the logistic log-likelihoods. For instance, tractable quadratic minorizers can be exploited to obtain maximum likelihood (ML) and maximum a posteriori estimates via minorize-maximize and expectation-maximization schemes with desirable convergence guarantees. Likewise, quadratic surrogates can be used to construct Gaussian approximations of the posterior distribution in mean-field VB routines, which may, however, suffer from low accuracy in high dimensions. This issue can be mitigated by resorting to more flexible but involved piece-wise quadratic bounds, which are typically defined only implicitly and become less tractable as the number of pieces increases. For this reason, we derive a novel tangent minorizer for logistic log-likelihoods that combines the quadratic term with a single piece-wise linear contribution for each observation, proportional to the absolute value of the corresponding linear predictor. The proposed bound is guaranteed to improve the accuracy over the sharpest quadratic minorizer, while minimizing the loss of tractability compared to general piece-wise quadratic bounds. In contrast with the latter, its explicit analytical expression allows computations to be simplified by exploiting the well-known scale-mixture representation of Laplace random variables. We investigate the benefits of the proposed methodology both in the context of penalized ML estimation, where it leads to a faster convergence rate of the optimization procedure, and of VB approximation, where the resulting accuracy improvement over mean-field strategies can be substantial in skewed and high-dimensional scenarios.
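    To make the surrogate-bound idea concrete, the sketch below applies the classical quadratic tangent minorizer of the logistic log-likelihood (the sharpest quadratic bound referred to above, with lambda(xi) = tanh(xi/2)/(4*xi)) inside a minorize-maximize loop for MAP estimation under a Gaussian prior. It is only a minimal illustration of the quadratic baseline, not the proposed quadratic-plus-piecewise-linear bound or its Laplace scale-mixture implementation; the problem sizes, prior variance tau2 and synthetic data are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy logistic-regression problem with labels coded in {-1, +1}
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = np.where(rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta_true)), 1.0, -1.0)

tau2 = 10.0  # prior variance: beta ~ N(0, tau2 * I)

def lam(xi):
    """lambda(xi) = tanh(xi/2) / (4*xi), with the limiting value 1/8 at xi = 0."""
    out = np.full_like(xi, 0.125)
    nz = np.abs(xi) > 1e-8
    out[nz] = np.tanh(xi[nz] / 2.0) / (4.0 * xi[nz])
    return out

# Minorize-maximize: each step maximizes the quadratic tangent lower bound on
# the log posterior, then re-tightens the bound at the new linear predictor,
# xi_i = |x_i^T beta|.
beta = np.zeros(p)
xi = np.ones(n)
for _ in range(100):
    L = lam(xi)
    H = 2.0 * X.T @ (L[:, None] * X) + np.eye(p) / tau2
    beta = np.linalg.solve(H, 0.5 * X.T @ y)
    xi = np.abs(X @ beta)

print(beta)
```

    Each iteration reduces to a ridge-like linear system, which is what makes quadratic surrogates attractive; the bound proposed in the abstract tightens this approximation at the cost of one additional piece-wise linear term per observation.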

    Sparse Probit Linear Mixed Model

    Full text link
    Linear Mixed Models (LMMs) are important tools in statistical genetics. When used for feature selection, they make it possible to find a sparse set of genetic traits that best predict a continuous phenotype of interest, while simultaneously correcting for various confounding factors such as age, ethnicity and population structure. Formulated as models for linear regression, LMMs have been restricted to continuous phenotypes. We introduce the Sparse Probit Linear Mixed Model (Probit-LMM), which generalizes the LMM modeling paradigm to binary phenotypes. As a technical challenge, the model no longer possesses a closed-form likelihood function. In this paper, we present a scalable approximate inference algorithm that lets us fit the model to high-dimensional data sets. We show on three real-world examples from different domains that, in the setting of binary labels, our algorithm leads to better prediction accuracies and also selects features that show less correlation with the confounding factors.
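    As a rough illustration of why the likelihood loses its closed form, the sketch below writes the binary phenotype as a probit observation of a fixed effect plus a correlated random effect whose covariance is proportional to a confounder similarity matrix K, and estimates the marginal log-likelihood by naive Monte Carlo over that random effect. The dimensions, the low-rank K, sigma_g and the sparse weights are assumptions made up for illustration; this is not the scalable approximate inference algorithm developed in the paper.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Toy data: n samples, d features, plus a sample-by-sample similarity matrix K
# (e.g. population structure) acting as a correlated random effect.
n, d = 80, 200
X = rng.standard_normal((n, d))
Z = rng.standard_normal((n, 5))
K = Z @ Z.T / 5 + 1e-6 * np.eye(n)       # low-rank confounder covariance
w_true = np.zeros(d)
w_true[:5] = 1.0                          # sparse fixed effects
sigma_g = 0.5
u = rng.multivariate_normal(np.zeros(n), sigma_g**2 * K)
y = np.where(X @ w_true + u + rng.standard_normal(n) > 0, 1.0, -1.0)

def mc_log_likelihood(w, n_mc=2000):
    """Naive Monte Carlo estimate of log p(y | X, K, w).

    The random effect u is integrated out by sampling: the integral of a
    product of probit terms against a correlated Gaussian has no closed form.
    """
    L = np.linalg.cholesky(sigma_g**2 * K)
    eta = X @ w                                 # fixed-effect linear predictor
    us = rng.standard_normal((n_mc, n)) @ L.T   # draws of u ~ N(0, sigma_g^2 K)
    logp = norm.logcdf(y * (eta + us)).sum(axis=1)
    m = logp.max()                              # log-mean-exp for stability
    return m + np.log(np.mean(np.exp(logp - m)))

print(mc_log_likelihood(w_true))
print(mc_log_likelihood(np.zeros(d)))
```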

    Fast matrix computations for functional additive models

    Full text link
    It is common in functional data analysis to look at a set of related functions: a set of learning curves, a set of brain signals, a set of spatial maps, etc. One way to express relatedness is through an additive model, whereby each individual function g_i(x) is assumed to be a variation around some shared mean f(x). Gaussian processes provide an elegant way of constructing such additive models, but suffer from computational difficulties arising from the matrix operations that need to be performed. Recently, Heersink & Furrer have shown that functional additive models give rise to covariance matrices with a specific form they called quasi-Kronecker (QK), whose inverses are relatively tractable. We show that, under additional assumptions, the two-level additive model leads to a class of matrices we call restricted quasi-Kronecker (rQK), which enjoy many interesting properties. In particular, we formulate matrix factorisations whose complexity scales only linearly in the number of functions in the latent field, an enormous improvement over the cubic scaling of naïve approaches. We describe how to leverage the properties of rQK matrices for inference in Latent Gaussian Models.
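    The flavour of such structured computations can be illustrated with the two-level covariance I_n ⊗ (K_g + σ²I) + 1_n 1_n^T ⊗ K_f, i.e. a shared-mean process plus independent per-function deviations: linear systems with this matrix can be solved via the Woodbury identity at a cost that grows only linearly in the number of functions n. The sketch below verifies such a structured solve against a dense solve; the specific covariance form and kernels are assumptions chosen for illustration and may differ from the exact rQK definition in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_cov(x, ell, var):
    d = x[:, None] - x[None, :]
    return var * np.exp(-0.5 * (d / ell) ** 2)

n, m = 6, 30                                   # n functions, m grid points each
x = np.linspace(0.0, 1.0, m)
K_f = rbf_cov(x, 0.3, 1.0) + 1e-6 * np.eye(m)  # shared-mean covariance
K_g = rbf_cov(x, 0.1, 0.5)                     # individual-deviation covariance
sigma2 = 0.1
A = K_g + sigma2 * np.eye(m)                   # per-function block

# Full (nm x nm) covariance: I_n (x) A  +  1_n 1_n^T (x) K_f
M = np.kron(np.eye(n), A) + np.kron(np.ones((n, n)), K_f)
v = rng.standard_normal(n * m)

# Structured solve via the Woodbury identity, cost O(m^3 + n m^2)
V = v.reshape(n, m)                    # block view of the right-hand side
S = np.linalg.solve(A, V.T).T          # s_i = A^{-1} v_i for every block
t = S.sum(axis=0)                      # U^T (I (x) A^{-1}) v with U = 1_n (x) I_m
A_inv = np.linalg.solve(A, np.eye(m))
# Capacitance solve, written as (I + n K_f A^{-1}) c = K_f t to avoid K_f^{-1}
c = np.linalg.solve(np.eye(m) + n * K_f @ A_inv, K_f @ t)
Z = S - np.linalg.solve(A, c)[None, :]  # z_i = s_i - A^{-1} c
z_struct = Z.reshape(-1)

# Check against the dense O((nm)^3) solve
z_dense = np.linalg.solve(M, v)
print(np.allclose(z_struct, z_dense))
```

    The key point is that the only factorizations involve the m x m blocks A and the capacitance matrix, never the full nm x nm covariance, so the cost of adding more functions grows linearly rather than cubically.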

    Expectation Consistent Approximate Inference

    Get PDF
    We propose a novel framework for approximations to intractable probabilistic models, based on a free energy formulation. The approximation can be understood as replacing an average over the original intractable distribution with one over a tractable distribution. It requires two tractable probability distributions that are made consistent on a set of moments and encode different features of the original intractable distribution. In this way we are able to use Gaussian approximations for models with discrete or bounded variables, which allows us to include non-trivial correlations that are neglected by many other methods. We test the framework on toy benchmark problems for binary variables on fully connected graphs and 2D grids and compare with other methods, such as loopy belief propagation. Good performance is already achieved by using single nodes as tractable substructures; significant improvements are obtained when a spanning tree is used instead.
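    The sketch below does not implement the expectation consistent free energy itself; on a tiny fully connected binary model it only shows the exact means and correlations that the two tractable distributions are constrained to agree on, alongside a naive mean-field fixed point that, being fully factorized, discards those correlations. The model size and coupling strengths are arbitrary assumptions.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

# Small fully connected binary model: p(s) ∝ exp(s^T J s / 2 + h^T s), s_i in {-1, +1}
N = 4
J = rng.standard_normal((N, N)) * 0.5
J = (J + J.T) / 2.0
np.fill_diagonal(J, 0.0)
h = rng.standard_normal(N) * 0.2

# Exact means and second moments by enumeration (feasible only for tiny N)
states = np.array(list(product([-1.0, 1.0], repeat=N)))
logw = 0.5 * np.einsum('ki,ij,kj->k', states, J, states) + states @ h
w = np.exp(logw - logw.max())
w /= w.sum()
exact_mean = states.T @ w
exact_corr = (states * w[:, None]).T @ states

# Naive mean-field fixed point m_i = tanh(h_i + sum_j J_ij m_j), with damping
m = np.zeros(N)
for _ in range(500):
    m = 0.7 * np.tanh(h + J @ m) + 0.3 * m

print("exact means:     ", np.round(exact_mean, 3))
print("mean-field means:", np.round(m, 3))
print("exact second moments:\n", np.round(exact_corr, 3))
```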

    Dynamic Compressive Sensing of Time-Varying Signals via Approximate Message Passing

    Full text link
    In this work the dynamic compressive sensing (CS) problem of recovering sparse, correlated, time-varying signals from sub-Nyquist, non-adaptive, linear measurements is explored from a Bayesian perspective. While a handful of Bayesian dynamic CS algorithms have previously been proposed in the literature, the ability to perform inference on high-dimensional problems in a computationally efficient manner remains elusive. In response, we propose a probabilistic dynamic CS signal model that captures both amplitude and support correlation structure, and describe an approximate message passing algorithm that performs soft signal estimation and support detection with a computational complexity that is linear in all problem dimensions. The algorithm, DCS-AMP, can perform either causal filtering or non-causal smoothing, and is capable of learning model parameters adaptively from the data through an expectation-maximization learning procedure. We provide numerical evidence that DCS-AMP performs within 3 dB of oracle bounds on synthetic data under a variety of operating conditions. We further describe the results of applying DCS-AMP to two real dynamic CS datasets, as well as a frequency estimation task, to bolster our claim that DCS-AMP is capable of offering state-of-the-art performance and speed on real-world high-dimensional problems.
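    For readers unfamiliar with the approximate message passing building block, the sketch below runs the basic AMP recursion with a soft-thresholding denoiser on a static sparse-recovery problem, including the Onsager correction term. It is not DCS-AMP itself, which adds the amplitude/support dynamics, soft support detection and EM parameter learning described above; the threshold constant alpha and the problem sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Static sparse-recovery toy problem: y = A x0 + w, with x0 K-sparse
M, N, K = 250, 500, 25
A = rng.standard_normal((M, N)) / np.sqrt(M)    # columns of roughly unit norm
x0 = np.zeros(N)
x0[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
y = A @ x0 + 0.01 * rng.standard_normal(M)

def soft(v, t):
    """Soft-thresholding denoiser eta(v; t)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

x = np.zeros(N)
z = y.copy()
alpha = 1.4                                     # threshold tuning constant (assumption)
for it in range(30):
    theta = alpha * np.linalg.norm(z) / np.sqrt(M)  # effective-noise estimate
    r = x + A.T @ z                                 # pseudo-data
    x_new = soft(r, theta)
    # Onsager correction: residual times the average derivative of the
    # denoiser, which for soft thresholding is the fraction of components
    # kept above the threshold.
    z = y - A @ x_new + (z / M) * np.count_nonzero(x_new)
    x = x_new

print("relative error:", np.linalg.norm(x - x0) / np.linalg.norm(x0))
```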

    Probabilistic models for structured sparsity

    Get PDF
    • …