
    Multiplying a Gaussian Matrix by a Gaussian Vector

    We provide a new and simple characterization of the multivariate generalized Laplace distribution. In particular, this result implies that multiplying a Gaussian matrix with independent and identically distributed columns by an independent isotropic Gaussian vector yields a symmetric multivariate generalized Laplace distribution.
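    A quick numerical check of this characterization (an illustration written for this summary, not code from the paper) is to compare the product construction with the equivalent Gaussian scale-mixture representation, in which the mixing variable is chi-squared with as many degrees of freedom as the matrix has columns:

```python
# Empirical check: A @ w, with A having i.i.d. Gaussian columns and w an
# independent isotropic Gaussian vector, matches a Gaussian scale mixture with
# chi-squared mixing, i.e. a symmetric multivariate generalized Laplace law.
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 3, 5, 200_000
Sigma = np.array([[1.0, 0.3, 0.0], [0.3, 1.0, 0.2], [0.0, 0.2, 1.0]])
L = np.linalg.cholesky(Sigma)

# Direct construction: columns of A are i.i.d. N(0, Sigma), w ~ N(0, I_k).
A = np.einsum('ij,njk->nik', L, rng.standard_normal((n, d, k)))
w = rng.standard_normal((n, k))
y_direct = np.einsum('nik,nk->ni', A, w)

# Scale-mixture construction: y = sqrt(G) * L z with G ~ chi^2_k, z ~ N(0, I_d).
G = rng.chisquare(df=k, size=n)
z = rng.standard_normal((n, d))
y_mixture = np.sqrt(G)[:, None] * (z @ L.T)

# Quantiles of a fixed 1-D projection should agree up to Monte Carlo error.
u = np.array([0.5, -1.0, 2.0])
q = [0.05, 0.25, 0.5, 0.75, 0.95]
print(np.quantile(y_direct @ u, q))
print(np.quantile(y_mixture @ u, q))
```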

    Exact Dimensionality Selection for Bayesian PCA

    We present a Bayesian model selection approach to estimate the intrinsic dimensionality of a high-dimensional dataset. To this end, we introduce a novel formulation of the probabilistic principal component analysis model based on a normal-gamma prior distribution. In this context, we exhibit a closed-form expression of the marginal likelihood, which allows us to infer an optimal number of components. We also propose a heuristic based on the expected shape of the marginal likelihood curve in order to choose the hyperparameters. In non-asymptotic frameworks, we show on simulated data that this exact dimensionality selection approach is competitive with both Bayesian and frequentist state-of-the-art methods.
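    The paper's exact normal-gamma marginal likelihood is not reproduced here; as a rough stand-in, the sketch below uses scikit-learn's PCA with Minka's evidence-based criterion (a different, approximate Bayesian selector) purely to illustrate the generic recipe of scoring each candidate dimension and keeping the best one:

```python
# Illustrative stand-in, not the paper's criterion: Minka's evidence-based
# dimensionality estimate in scikit-learn, applied to data with a known
# low-dimensional structure plus noise.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
n, d, d_true = 500, 20, 3
W = rng.standard_normal((d, d_true))                     # low-rank loading matrix
X = rng.standard_normal((n, d_true)) @ W.T + 0.1 * rng.standard_normal((n, d))

pca = PCA(n_components='mle', svd_solver='full').fit(X)  # score each dimension, keep the argmax
print("estimated intrinsic dimension:", pca.n_components_)
```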

    A Parsimonious Tour of Bayesian Model Uncertainty

    Modern statistical software and machine learning libraries are enabling semi-automated statistical inference. Within this context, it appears easier and easier to try to fit many models to the data at hand, thereby reversing the Fisherian way of conducting science by collecting data after the scientific hypothesis (and hence the model) has been determined. The renewed goal of the statistician becomes to help the practitioner choose within such large and heterogeneous families of models, a task known as model selection. The Bayesian paradigm offers a systematized way of assessing this problem. This approach, launched by Harold Jeffreys in his 1939 book Theory of Probability, has witnessed a remarkable evolution in the last decades, which has brought about several new theoretical and methodological advances. Some of these recent developments are the focus of this survey, which tries to present a unifying perspective on work carried out by different communities. In particular, we focus on non-asymptotic out-of-sample performance of Bayesian model selection and averaging techniques, and draw connections with penalized maximum likelihood. We also describe recent extensions to wider classes of probabilistic frameworks, including high-dimensional, unidentifiable, or likelihood-free models.

    Leveraging the Exact Likelihood of Deep Latent Variable Models

    Deep latent variable models (DLVMs) combine the approximation abilities of deep neural networks and the statistical foundations of generative models. Variational methods are commonly used for inference; however, the exact likelihood of these models has been largely overlooked. The purpose of this work is to study the general properties of this quantity and to show how they can be leveraged in practice. We focus on important inferential problems that rely on the likelihood: estimation and missing data imputation. First, we investigate maximum likelihood estimation for DLVMs: in particular, we show that most unconstrained models used for continuous data have an unbounded likelihood function. This problematic behaviour is demonstrated to be a source of mode collapse. We also show how to ensure the existence of maximum likelihood estimates, and draw useful connections with nonparametric mixture models. Finally, we describe an algorithm for missing data imputation using the exact conditional likelihood of a deep latent variable model. On several data sets, our algorithm consistently and significantly outperforms the usual imputation scheme used for DLVMs.
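    The unboundedness can be illustrated on a toy Gaussian mixture (a stand-in for an unconstrained Gaussian decoder, not the paper's DLVMs): placing one component's mean on a training point and shrinking its variance drives the log-likelihood to infinity.

```python
# Toy numerical illustration of the unbounded-likelihood mechanism: a "spike"
# component centred on a data point makes the log-likelihood grow without bound
# as its variance shrinks, while the broad component keeps all other terms finite.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
x = rng.standard_normal(100)          # observed 1-D data

def loglik(sigma_spike, x):
    comp_broad = norm.pdf(x, loc=0.0, scale=1.0)
    comp_spike = norm.pdf(x, loc=x[0], scale=sigma_spike)   # spike on x[0]
    return np.sum(np.log(0.5 * comp_broad + 0.5 * comp_spike))

for s in [1.0, 1e-2, 1e-4, 1e-8]:
    print(f"sigma = {s:.0e}  log-likelihood = {loglik(s, x):.1f}")
```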

    missIWAE: Deep Generative Modelling and Imputation of Incomplete Data

    We consider the problem of handling missing data with deep latent variable models (DLVMs). First, we present a simple technique to train DLVMs when the training set contains missing-at-random data. Our approach, called MIWAE, is based on the importance-weighted autoencoder (IWAE) and maximises a potentially tight lower bound of the log-likelihood of the observed data. Compared to the original IWAE, our algorithm does not induce any additional computational overhead due to the missing data. We also develop Monte Carlo techniques for single and multiple imputation using a DLVM trained on an incomplete data set. We illustrate our approach by training a convolutional DLVM on a static binarisation of MNIST in which 50% of the pixels are missing. Leveraging multiple imputation, a convolutional network trained on these incomplete digits has a test performance similar to one trained on complete data. On various continuous and binary data sets, we also show that MIWAE provides accurate single imputations and is highly competitive with state-of-the-art methods. Comment: A short version of this paper was presented at the 3rd NeurIPS workshop on Bayesian Deep Learning.
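    A minimal sketch of the single-imputation step described here, under strong simplifying assumptions (a fixed linear-Gaussian "decoder" and the prior used as proposal in place of a trained encoder), imputes missing entries by self-normalised importance sampling:

```python
# Sketch only: E[x_mis | x_obs] is approximated by sum_k w_k * E[x_mis | z_k],
# with self-normalised weights w_k proportional to p(x_obs | z_k) p(z_k) / q(z_k).
# Here q = p(z) = N(0, I), so w_k is proportional to p(x_obs | z_k).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
d, dz, K = 4, 2, 20_000
W, s = rng.standard_normal((d, dz)), 0.25

x = W @ rng.standard_normal(dz) + s * rng.standard_normal(d)   # one data point
obs = np.array([True, True, False, False])                      # observed-entry mask

z = rng.standard_normal((K, dz))                                 # samples from the proposal
mean_x = z @ W.T                                                 # decoder means, shape (K, d)
logw = norm.logpdf(x[obs], loc=mean_x[:, obs], scale=s).sum(axis=1)   # log p(x_obs | z_k)
w = np.exp(logw - logw.max())
w /= w.sum()

x_imputed = x.copy()
x_imputed[~obs] = w @ mean_x[:, ~obs]                            # SNIS conditional-mean imputation
print("truth  :", x[~obs])
print("imputed:", x_imputed[~obs])
```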

    Partially Exchangeable Networks and Architectures for Learning Summary Statistics in Approximate Bayesian Computation

    We present a novel family of deep neural architectures, named partially exchangeable networks (PENs), that leverage probabilistic symmetries. By design, PENs are invariant to block-switch transformations, which characterize the partial exchangeability properties of conditionally Markovian processes. Moreover, we show that any block-switch invariant function has a PEN-like representation. The DeepSets architecture is a special case of PEN, and we can therefore also target fully exchangeable data. We employ PENs to learn summary statistics in approximate Bayesian computation (ABC). When comparing PENs to previous deep learning methods for learning summary statistics, our results are highly competitive for both time series and static models. Indeed, PENs provide more reliable posterior samples even when using less training data. Comment: Forthcoming in the Proceedings of ICML 2019. New comparisons with several different networks. We now use the Wasserstein distance to produce comparisons. Code available on GitHub. 16 pages, 5 figures, 21 tables.
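    A toy sketch of the PEN representation for Markov order one (random weights standing in for the trained inner and outer networks) shows the functional form f(x) = rho(x_1, sum_t phi(x_t, x_{t+1})), which is block-switch invariant by construction:

```python
# Sketch of the PEN functional form for Markov order d = 1, not the paper's
# trained architecture. With phi acting on single elements instead of pairs,
# the same form reduces to a DeepSets-style fully exchangeable network.
import numpy as np

rng = np.random.default_rng(4)
W_phi = rng.standard_normal((8, 2))      # toy inner map phi: R^2 -> R^8
W_rho = rng.standard_normal((3, 9))      # toy outer map rho: R^(1+8) -> R^3

def pen_summary(x):
    pairs = np.stack([x[:-1], x[1:]], axis=1)       # consecutive pairs (x_t, x_{t+1})
    phi = np.tanh(pairs @ W_phi.T).sum(axis=0)      # order-invariant pooling over the pairs
    return np.tanh(np.concatenate([x[:1], phi]) @ W_rho.T)

x = rng.standard_normal(50)              # a univariate time series
print(pen_summary(x))
```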

    Asteroid Taxonomy from Cluster Analysis of Spectrometry and Albedo

    The classification of the minor bodies of the Solar System based on observables has been continuously developed and iterated over the past 40 years. While prior iterations followed either the availability of large observational campaigns or new instrumental capabilities opening new observational dimensions, we see the opportunity to improve primarily upon the established methodology. We developed an iteration of the asteroid taxonomy which allows the classification of partial and complete observations (i.e. visible, near-infrared, and visible-near-infrared spectrometry) and which reintroduces the visual albedo into the classification observables. The resulting class assignments are given probabilistically, enabling the uncertainty of a classification to be quantified. We built the taxonomy based on 2983 observations of 2125 individual asteroids, representing an almost tenfold increase in sample size compared with the previous taxonomy. The asteroid classes are identified in a lower-dimensional representation of the observations using a mixture of common factor analysers model. We identify 17 classes split into the three complexes C, M, and S, including the new Z-class for extremely red objects in the main belt. The visual albedo information resolves the spectral degeneracy of the X-complex and establishes the P-class as part of the C-complex. We present a classification tool which computes probabilistic class assignments within this taxonomic scheme from asteroid observations. The taxonomic classifications of 6038 observations of 4526 individual asteroids are published. The ability to classify partial observations and the reintroduction of the visual albedo provide a taxonomy which is well suited to current and future datasets of asteroid observations, in particular those provided by the Gaia, MITHNEOS, NEO Surveyor, and SPHEREx surveys. Comment: Published in Astronomy and Astrophysics. The table of asteroid classifications and the templates of the defined taxonomic classes are available in electronic form at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsweb.u-strasbg.fr/cgi-bin/qcat?J/A+A/665/A2
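    The mixture of common factor analysers used in the paper is not available off the shelf; as a loose stand-in, the sketch below chains a factor-analysis projection with a Gaussian mixture on synthetic placeholder data, only to show how probabilistic class assignments arise from this kind of model:

```python
# Rough stand-in, explicitly not the paper's model: the mixture of common factor
# analysers fits the latent factors and the clusters jointly, whereas this sketch
# projects first and clusters second, then reads off probabilistic assignments.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
# Placeholder features standing in for spectral bins plus visual albedo.
X = np.vstack([rng.normal(m, 0.3, size=(100, 10)) for m in (-1.0, 0.0, 1.0)])

Z = FactorAnalysis(n_components=4, random_state=0).fit_transform(X)
gm = GaussianMixture(n_components=3, random_state=0).fit(Z)
proba = gm.predict_proba(Z)               # probabilistic class assignments
print(proba[:3].round(3))
```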

    not-MIWAE: Deep Generative Modelling with Missing not at Random Data

    When a missing process depends on the missing values themselves, it needs to be explicitly modelled and taken into account while doing likelihood-based inference. We present an approach for building and fitting deep latent variable models (DLVMs) in cases where the missing process is dependent on the missing data. Specifically, a deep neural network enables us to flexibly model the conditional distribution of the missingness pattern given the data. This allows for incorporating prior information about the type of missingness (e.g. self-censoring) into the model. Our inference technique, based on importance-weighted variational inference, involves maximising a lower bound of the joint likelihood. Stochastic gradients of the bound are obtained by using the reparameterisation trick both in latent space and data space. We show on various kinds of data sets and missingness patterns that explicitly modelling the missing process can be invaluable. Comment: Camera-ready version for ICLR 2021.
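    A minimal sketch of the kind of bound described here, under heavy simplifications made for this illustration (a fixed linear-Gaussian "decoder", the prior as proposal, and a logistic self-masking missingness model):

```python
# Sketch of an importance-weighted estimate of a lower bound on the joint
# likelihood log p(x_obs, m), where the missingness mask m depends on the data
# through a self-masking model p(m_j = 1 | x_j) = sigmoid(a * (x_j - b)).
# Missing entries are sampled from the decoder, so their density cancels in the weight.
import numpy as np
from scipy.stats import norm
from scipy.special import expit

rng = np.random.default_rng(6)
d, dz, K = 4, 2, 5_000
W, s, a, b = rng.standard_normal((d, dz)), 0.25, 4.0, 0.5

x_full = W @ rng.standard_normal(dz) + s * rng.standard_normal(d)
m = rng.random(d) < expit(a * (x_full - b))          # m_j = 1 -> entry is missing
x_obs = x_full[~m]

z = rng.standard_normal((K, dz))                     # proposal = prior, so p(z)/q(z) = 1
mean_x = z @ W.T
x_mis = mean_x[:, m] + s * rng.standard_normal((K, m.sum()))   # x_mis ~ p(x_mis | z_k)
x_k = np.empty((K, d))
x_k[:, ~m] = x_obs
x_k[:, m] = x_mis

log_p_xobs = norm.logpdf(x_obs, loc=mean_x[:, ~m], scale=s).sum(axis=1)
p_miss = expit(a * (x_k - b))
log_p_mask = np.where(m, np.log(p_miss), np.log1p(-p_miss)).sum(axis=1)

logw = log_p_xobs + log_p_mask
bound = np.log(np.mean(np.exp(logw - logw.max()))) + logw.max()
print("importance-weighted bound estimate for log p(x_obs, m):", round(bound, 3))
```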

    Negative Dependence Tightens Variational Bounds

    Importance weighted variational inference (IWVI) is a promising strategy for learning latent variable models. IWVI uses new variational bounds, known as Monte Carlo objectives (MCOs), obtained by replacing intractable integrals by Monte Carlo estimates, usually obtained via importance sampling. Burda et al. (2016) showed that increasing the number of importance samples provably tightens the gap between the bound and the likelihood. We show that, in a somewhat similar fashion, increasing the negative dependence of the importance weights monotonically increases the bound. To this end, we use the supermodular order as a measure of dependence. Our simple result provides theoretical support for several different approaches that have leveraged negative dependence to perform efficient variational inference of deep generative models.
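    A sketch of the key monotonicity step, with the convention (assumed here) that a smaller weight vector in the supermodular order corresponds to more negative dependence; the paper's precise statement and conditions are not reproduced:

```latex
% The MCO is $\phi(w) = \log\big(\tfrac{1}{K}\sum_{k=1}^K w_k\big)$ applied to the
% importance weights $w = (w_1,\dots,w_K)$. Its mixed second derivatives satisfy
\[
\frac{\partial^2 \phi}{\partial w_i \, \partial w_j}(w)
  = -\frac{1}{\big(\sum_{k=1}^K w_k\big)^2} < 0
  \qquad (i \neq j),
\]
% so $\phi$ is submodular. If $w \preceq_{\mathrm{sm}} w'$, i.e. the weights $w$ are
% more negatively dependent than $w'$ in the supermodular order, then
% $\mathbb{E}[\psi(w)] \le \mathbb{E}[\psi(w')]$ for every supermodular $\psi$;
% applying this to $\psi = -\phi$ gives
\[
\mathbb{E}\Big[\log\tfrac{1}{K}\textstyle\sum_k w_k\Big]
  \;\ge\;
\mathbb{E}\Big[\log\tfrac{1}{K}\textstyle\sum_k w'_k\Big],
\]
% so increasing the negative dependence of the importance weights tightens the bound.
```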

    Unobserved classes and extra variables in high-dimensional discriminant analysis

    In supervised classification problems, the test set may contain data points belonging to classes not observed in the learning phase. Moreover, the same units in the test data may be measured on a set of additional variables, recorded at a later stage than the learning sample. In this situation, the classifier built in the learning phase needs to adapt to handle potential unknown classes and the extra dimensions. We introduce a model-based discriminant approach, Dimension-Adaptive Mixture Discriminant Analysis (D-AMDA), which can detect unobserved classes and adapt to the increasing dimensionality. Model estimation is carried out via a fully inductive approach based on an EM algorithm. The method is then embedded in a more general framework for adaptive variable selection and classification suitable for data of large dimensions. A simulation study and an artificial experiment related to the classification of adulterated honey samples are used to validate the ability of the proposed framework to deal with complex situations. Comment: 29 pages, 29 figures.
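    A loose stand-in for the adaptive idea (not D-AMDA itself, and omitting the inductive estimation and variable-selection steps): fit per-class Gaussians on the labelled data, then model the test data with those components plus one extra component learned by EM, so that points poorly explained by the known classes can surface as a possible unobserved class.

```python
# Sketch only: known classes initialise two mixture components, a third component
# is free to capture a class unseen at training time, and predict_proba gives the
# posterior class probabilities for each test point.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(7)
train = np.vstack([rng.normal(m, 0.5, size=(200, 2)) for m in (0.0, 3.0)])
labels = np.repeat([0, 1], 200)
test = np.vstack([rng.normal(m, 0.5, size=(100, 2)) for m in (0.0, 3.0, -4.0)])  # -4.0: unseen class

known_means = np.array([train[labels == c].mean(axis=0) for c in (0, 1)])
init_means = np.vstack([known_means, test.mean(axis=0)])          # crude init for the extra component
gm = GaussianMixture(n_components=3, means_init=init_means, random_state=0).fit(test)
print(gm.predict_proba(test[[0, -1]]).round(3))                   # first vs last test point
```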