286 research outputs found

    Variational Bayesian multinomial probit regression with Gaussian process priors

    It is well known in the statistics literature that augmenting binary and polychotomous response models with Gaussian latent variables enables exact Bayesian analysis via Gibbs sampling from the parameter posterior. By adopting such a data augmentation strategy, dispensing with priors over regression coefficients in favour of Gaussian Process (GP) priors over functions, and employing variational approximations to the full posterior, we obtain efficient computational methods for Gaussian Process classification in the multi-class setting. The model augmentation with additional latent variables ensures full a posteriori class coupling whilst retaining the simple a priori independent GP covariance structure from which sparse approximations, such as multi-class Informative Vector Machines (IVM), emerge in a very natural and straightforward manner. This is the first time that a fully Variational Bayesian treatment for multi-class GP classification has been developed without having to resort to additional explicit approximations to the non-Gaussian likelihood term. Empirical comparisons with exact analysis via MCMC and Laplace approximations illustrate the utility of the variational approximation as a computationally economical alternative to full MCMC, and it is shown to be more accurate than the Laplace approximation.
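The data augmentation idea mentioned in this abstract can be illustrated with a minimal sketch for the simplest case: binary probit regression with a single scalar coefficient and a Gaussian prior. This is not the paper's variational multi-class GP method; it only shows the Gibbs sampler that Gaussian latent variables make possible. The function names, the toy data, and the prior variance are all assumptions made for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def sample_truncated_normal(mean, positive, rng):
    """Draw z ~ N(mean, 1) restricted to (0, inf) if positive, else to (-inf, 0].

    Uses inverse-CDF sampling. The clamp is only a numerical guard for
    inv_cdf and makes the draw inaccurate in extreme tails (assumption:
    linear predictors stay moderate in size).
    """
    p_nonpositive = STD_NORMAL.cdf(-mean)  # P(z <= 0)
    if positive:
        u = rng.uniform(p_nonpositive, 1.0)
    else:
        u = rng.uniform(0.0, p_nonpositive)
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return mean + STD_NORMAL.inv_cdf(u)


def gibbs_probit(x, y, n_iter=2000, burn_in=500, prior_var=10.0, seed=0):
    """Gibbs sampler for y_i = 1[z_i > 0], z_i ~ N(x_i * beta, 1), beta ~ N(0, prior_var)."""
    rng = random.Random(seed)
    beta = 0.0
    # The conditional variance of beta given z does not depend on z.
    post_var = 1.0 / (1.0 / prior_var + sum(xi * xi for xi in x))
    draws = []
    for it in range(n_iter):
        # Step 1: latent utilities given beta and the observed labels.
        z = [sample_truncated_normal(xi * beta, yi == 1, rng) for xi, yi in zip(x, y)]
        # Step 2: beta given the latent utilities (conjugate Gaussian update).
        post_mean = post_var * sum(xi * zi for xi, zi in zip(x, z))
        beta = rng.gauss(post_mean, post_var ** 0.5)
        if it >= burn_in:
            draws.append(beta)
    return draws


# Toy data (hypothetical): larger x tends to go with y = 1.
x_toy = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0, -1.0, 1.0]
y_toy = [0, 0, 1, 0, 1, 1, 0, 1]
beta_draws = gibbs_probit(x_toy, y_toy)
posterior_mean = sum(beta_draws) / len(beta_draws)
```

Both conditional distributions are available in closed form, which is why the augmented sampler needs no tuning. Replacing the scalar update with a multivariate Gaussian one would give the usual vector-coefficient version.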

    A sparse multinomial probit model for classification

    A recent development in penalized probit modelling using a hierarchical Bayesian approach has led to a sparse binomial (two-class) probit classifier that can be trained via an EM algorithm. A key advantage of the formulation is that no tuning of hyperparameters relating to the penalty is needed, thus simplifying the model selection process. The resulting model demonstrates excellent classification performance and a high degree of sparsity when used as a kernel machine. It is, however, restricted to the binary classification problem and can only be used in the multinomial situation via a one-against-all or one-against-many strategy. To overcome this, we apply the idea to the multinomial probit model. This leads to a direct multi-class classification approach and is shown to give a sparse solution with accuracy and sparsity comparable with the current state-of-the-art. Comparative numerical benchmark examples are used to demonstrate the method.

    A Class of Conjugate Priors for Multinomial Probit Models which Includes the Multivariate Normal One

    Multinomial probit models are widely implemented representations which allow both classification and inference by learning changes in vectors of class probabilities with a set of p observed predictors. Although various frequentist methods have been developed for estimation, inference and classification within such a class of models, Bayesian inference is still lagging behind. This is due to the apparent absence of a tractable class of conjugate priors that may facilitate posterior inference on the multinomial probit coefficients. Such an issue has motivated increasing efforts toward the development of effective Markov chain Monte Carlo methods, but state-of-the-art solutions still face severe computational bottlenecks, especially in large p settings. In this article, we prove that the entire class of unified skew-normal (SUN) distributions is conjugate to a wide variety of multinomial probit models, and we exploit the SUN properties to improve upon state-of-the-art solutions for posterior inference and classification both in terms of closed-form results for key functionals of interest, and also by developing novel computational methods relying either on independent and identically distributed samples from the exact posterior or on scalable and accurate variational approximations based on blocked partially-factorized representations. As illustrated in a gastrointestinal lesions application, the magnitude of the improvements relative to current methods is particularly evident, in practice, when the focus is on large p applications.

    Advances in Bayesian Inference for Binary and Categorical Data

    Bayesian binary probit regression and its extensions to time-dependent observations and multi-class responses are popular tools in binary and categorical data regression due to their high interpretability and non-restrictive assumptions. Although the theory is well established in the frequentist literature, such models are still the subject of active research in the Bayesian framework. This is mostly due to the fact that state-of-the-art methods for Bayesian inference in such settings are either computationally impractical or inaccurate in high dimensions, and in many cases a closed-form expression for the posterior distribution of the model parameters is, apparently, lacking. The development of improved computational methods and theoretical results to perform inference with this vast class of models is then of utmost importance. In order to overcome the above-mentioned computational issues, we develop a novel variational approximation for the posterior of the coefficients in high-dimensional probit regression with binary responses and Gaussian priors, resulting in a unified skew-normal (SUN) approximating distribution that converges to the exact posterior as the number of predictors p increases. Moreover, we show that closed-form expressions are actually available for posterior distributions arising from models that account for correlated binary time-series and multi-class responses. In the former case, we prove that the filtering, predictive and smoothing distributions in dynamic probit models with Gaussian state variables are, in fact, available and belong to a class of SUN distributions whose parameters can be updated recursively in time via analytical expressions, which allows the development of an i.i.d. sampler together with an optimal sequential Monte Carlo procedure. As for the latter case, i.e. multi-class probit models, we show that many different formulations developed in the literature in separate ways admit a unified view and a closed-form SUN posterior distribution under a SUN prior distribution (thus including the Gaussian case). This allows the implementation of computational methods which outperform state-of-the-art routines in high-dimensional settings by leveraging SUN properties and the variational methods introduced for the binary probit. Finally, motivated also by the possible linkage of some of the above-mentioned models to the Bayesian nonparametrics literature, a novel species-sampling model for partially-exchangeable observations is introduced, with the double goal of both predicting the class (or species) of the future observations and testing for homogeneity among the different available populations. Such a model arises from a combination of Pitman-Yor processes and leverages the appealing features of both hierarchical and nested structures developed in the Bayesian nonparametrics literature. Posterior inference is feasible thanks to the implementation of a marginal Gibbs sampler, whose pseudo-code is given in full detail.

    Methodological and Computational Advances for High–Dimensional Bayesian Regression with Binary and Categorical Responses

    Probit and logistic regressions are among the most popular and well-established formulations to model binary observations, thanks to their plain structure and high interpretability. Despite their simplicity, their use poses non-trivial hindrances to the inferential procedure, particularly from a computational perspective and in high-dimensional scenarios. This still motivates active research for probit, logit, and a number of their generalizations, especially within the Bayesian community. Conjugacy results for standard probit regression under normal and unified skew-normal (SUN) priors appeared only recently in the literature. Such findings were rapidly extended to different generalizations of probit regression, including multinomial probit, dynamic multivariate probit and skewed Gaussian processes among others. Nonetheless, these recent developments focus on specific subclasses of models, which can all be regarded as instances of a potentially broader family of formulations that rely on partially or fully discretized Gaussian latent utilities. As such, we develop a unified comprehensive framework that encompasses all the above constructions and many others, such as tobit regression and its extensions, for which conjugacy results are still missing. We show that the SUN family of distributions is conjugate for all models within the broad class considered, which notably encompasses all formulations relying on likelihoods given by the product of multivariate Gaussian densities and cumulative distributions, evaluated at a linear combination of the parameter of interest. Such a unifying framework is practically and conceptually useful for studying general theoretical properties and developing future extensions. This includes new avenues for improved posterior inference exploiting i.i.d. samplers from the exact SUN posteriors and recent accurate and scalable variational Bayes (VB) approximations and expectation-propagation, for which we derive a novel efficient implementation. Along a parallel research line, we focus on binary regression under logit mapping, for which computations in high dimensions still pose open challenges. To overcome such difficulties, several contributions focus on iteratively solving a series of surrogate problems, entailing the sequential refinement of tangent lower bounds for the logistic log-likelihoods. For instance, tractable quadratic minorizers can be exploited to obtain maximum likelihood (ML) and maximum a posteriori estimates via minorize-maximize and expectation-maximization schemes, with desirable convergence guarantees. Likewise, quadratic surrogates can be used to construct Gaussian approximations of the posterior distribution in mean-field VB routines, which might however suffer from low accuracy in high dimensions. This issue can be mitigated by resorting to more flexible but involved piece-wise quadratic bounds, which however are typically defined in an implicit way and entail reduced tractability as the number of pieces increases. For this reason, we derive a novel tangent minorizer for logistic log-likelihoods that combines the quadratic term with a single piece-wise linear contribution for each observation, proportional to the absolute value of the corresponding linear predictor. The proposed bound is guaranteed to improve the accuracy over the sharpest among quadratic minorizers, while minimizing the reduction in tractability compared to general piece-wise quadratic bounds. As opposed to the latter, its explicit analytical expression allows computations to be simplified by exploiting a renowned scale-mixture representation of Laplace random variables. We investigate the benefit of the proposed methodology both in the context of penalized ML estimation, where it leads to a faster convergence rate of the optimization procedure, and of VB approximation, as the resulting accuracy improvement over mean-field strategies can be substantial in skewed and high-dimensional scenarios.
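To make the notion of a tangent quadratic minorizer for the logistic log-likelihood concrete, here is a minimal sketch using one well-known bound of this type (usually attributed to Jaakkola and Jordan). It is chosen here as an assumed example of the quadratic baseline; it is not the novel piece-wise linear bound the abstract proposes. The function names are hypothetical.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def curvature(xi):
    """Coefficient tanh(xi / 2) / (4 * xi), with its limit 1/8 at xi = 0."""
    if abs(xi) < 1e-8:
        return 0.125
    return math.tanh(xi / 2.0) / (4.0 * xi)


def quadratic_minorizer(x, xi):
    """Quadratic lower bound on log_sigmoid(x), tangent at x = xi and x = -xi."""
    return log_sigmoid(xi) + 0.5 * (x - xi) - curvature(xi) * (x * x - xi * xi)


# Check the bound on a grid of linear predictors for a few tangent points.
grid = [i / 10.0 for i in range(-80, 81)]
tangent_points = [0.0, 0.5, 2.0, 5.0]
max_violation = max(
    quadratic_minorizer(x, xi) - log_sigmoid(x) for xi in tangent_points for x in grid
)
```

Because the bound is quadratic in the linear predictor, maximizing it (or combining it with a Gaussian prior) reduces each iteration to a weighted least-squares style update, which is what makes minorize-maximize and mean-field VB schemes tractable.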

    Scalable computation of predictive probabilities in probit models with Gaussian process priors

    Predictive models for binary data are fundamental in various fields, and the growing complexity of modern applications has motivated several flexible specifications for modeling the relationship between the observed predictors and the binary responses. A widely implemented solution is to express the probability parameter via a probit mapping of a Gaussian process indexed by predictors. However, unlike for continuous settings, there is a lack of closed-form results for predictive distributions in binary models with Gaussian process priors. Markov chain Monte Carlo methods and approximation strategies provide common solutions to this problem, but state-of-the-art algorithms are either computationally intractable or inaccurate in moderate-to-high dimensions. In this article, we aim to fill this gap by deriving closed-form expressions for the predictive probabilities in probit Gaussian processes that rely either on cumulative distribution functions of multivariate Gaussians or on functionals of multivariate truncated normals. To evaluate these quantities we develop novel scalable solutions based on tile-low-rank Monte Carlo methods for computing multivariate Gaussian probabilities, and on mean-field variational approximations of multivariate truncated normals. Closed-form expressions for the marginal likelihood and for the posterior distribution of the Gaussian process are also discussed. As shown in simulated and real-world empirical studies, the proposed methods scale to dimensions where state-of-the-art solutions are impractical. Comment: 21 pages, 4 figures
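As a rough illustration of how predictive probabilities can reduce to multivariate Gaussian cumulative distribution functions, the following sketch works under stated assumptions: a zero-mean Gaussian process with covariance matrix $K$ at the $n$ training inputs, latent utilities $z_i = f(x_i) + \varepsilon_i$ with $\varepsilon_i \sim N(0,1)$, and $y_i = 1[z_i > 0]$. The notation ($D$, $K_*$, $\Phi_n$) is introduced here for illustration and may differ from the paper's.

```latex
% Assumed notation: D = diag(2y_1 - 1, ..., 2y_n - 1), and \Phi_n(a; \Sigma) is the
% CDF of N_n(0, \Sigma) evaluated at a.
\begin{align*}
p(y) &= \Pr\{ D z > 0 \} = \Phi_n\bigl(0;\; D (K + I_n) D\bigr), \\
\Pr(y_{\mathrm{new}} = 1 \mid y)
  &= \frac{p(y,\, y_{\mathrm{new}} = 1)}{p(y)}
   = \frac{\Phi_{n+1}\bigl(0;\; D_* (K_* + I_{n+1}) D_*\bigr)}
          {\Phi_n\bigl(0;\; D (K + I_n) D\bigr)},
\end{align*}
% where K_* is the (n+1) x (n+1) covariance matrix that also includes the new input,
% and D_* = diag(D, 1).
```

The first line follows because $-Dz \sim N_n(0, D(K + I_n)D)$, and the second is simply the ratio of a joint to a marginal probability. The computational difficulty is then evaluating Gaussian orthant probabilities in dimension $n$ and $n+1$, which is the problem the scalable methods in the abstract address.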