
    Scalable Data Augmentation for Deep Learning

    Scalable Data Augmentation (SDA) provides a framework for training deep learning models using auxiliary hidden layers. Scalable MCMC is available for network training and inference. SDA provides a number of computational advantages over traditional algorithms, such as avoiding backtracking and local modes, and it can perform optimization with stochastic gradient descent (SGD) in TensorFlow. Standard deep neural networks with logit, ReLU and SVM activation functions are straightforward to implement. To illustrate our architectures and methodology, we use P\'{o}lya-Gamma logit data augmentation for a number of standard datasets. Finally, we conclude with directions for future research.
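    The Pólya-Gamma logit augmentation mentioned above (Polson, Scott and Windle's scheme) can be sketched for plain Bayesian logistic regression. The snippet below is not the paper's SDA architecture, only the underlying augmentation idea; the PG draw is approximated by truncating its infinite sum-of-gammas representation, and the truncation level `K`, prior variance, and synthetic data are all illustrative choices:

```python
import numpy as np

def sample_pg(b, c, rng, K=100):
    """Approximate draw from the Polya-Gamma PG(b, c) distribution via its
    infinite sum-of-gammas representation, truncated at K terms."""
    k = np.arange(1, K + 1)
    g = rng.gamma(shape=b, scale=1.0, size=K)
    return np.sum(g / ((k - 0.5) ** 2 + c ** 2 / (4.0 * np.pi ** 2))) / (2.0 * np.pi ** 2)

def pg_gibbs_logit(X, y, n_iter=300, prior_var=10.0, seed=0):
    """Gibbs sampler for Bayesian logistic regression with Polya-Gamma
    augmentation: omega_i | beta ~ PG(1, x_i beta), then beta | omega is Gaussian."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    kappa = y - 0.5                    # y must be coded in {0, 1}
    B_inv = np.eye(p) / prior_var      # prior precision: beta ~ N(0, prior_var * I)
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        omega = np.array([sample_pg(1.0, c, rng) for c in X @ beta])
        V = np.linalg.inv(X.T @ (omega[:, None] * X) + B_inv)
        beta = rng.multivariate_normal(V @ (X.T @ kappa), V)
        draws[t] = beta
    return draws

# Synthetic data with a strong positive slope the sampler should recover.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-(X @ np.array([-0.5, 2.0]))))).astype(float)
draws = pg_gibbs_logit(X, y)
post_mean = draws[150:].mean(axis=0)   # discard burn-in
```

The conditional conjugacy is the whole point: given the PG variables, the logistic likelihood becomes Gaussian in beta, so no Metropolis correction is needed.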

    Hands-on Bayesian Neural Networks -- a Tutorial for Deep Learning Users

    Modern deep learning methods constitute incredibly powerful tools to tackle a myriad of challenging problems. However, since deep learning methods operate as black boxes, the uncertainty associated with their predictions is often challenging to quantify. Bayesian statistics offers a formalism to understand and quantify the uncertainty associated with deep neural network predictions. This tutorial provides an overview of the relevant literature and a complete toolset to design, implement, train, use and evaluate Bayesian Neural Networks, i.e. Stochastic Artificial Neural Networks trained using Bayesian methods. Comment: 35 pages, 15 figures
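    One of the approximate-inference schemes such tutorials typically cover, MC dropout (Gal and Ghahramani), is simple enough to sketch: keep dropout active at prediction time and treat the spread of repeated stochastic forward passes as predictive uncertainty. The toy network, weights, and sample counts below are illustrative, not taken from the tutorial:

```python
import numpy as np

rng = np.random.default_rng(0)
# A toy one-hidden-layer regression net with fixed, pretend-trained weights.
W1 = rng.normal(scale=0.5, size=(1, 32))
b1 = np.zeros(32)
W2 = rng.normal(scale=0.5, size=(32, 1))

def forward(x, rng, p_drop=0.5):
    """One stochastic forward pass: dropout stays ON at prediction time."""
    h = np.maximum(x @ W1 + b1, 0.0)         # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop      # Bernoulli dropout mask
    return (h * mask / (1.0 - p_drop)) @ W2  # inverted-dropout scaling

def mc_dropout_predict(x, n_samples=200, seed=1):
    """Predictive mean and spread from repeated stochastic passes."""
    rng = np.random.default_rng(seed)
    preds = np.stack([forward(x, rng) for _ in range(n_samples)])
    return preds.mean(axis=0), preds.std(axis=0)

mean, std = mc_dropout_predict(np.array([[0.3]]))
```

Under Gal and Ghahramani's interpretation, these stochastic passes approximate samples from a variational posterior over the weights, which is what makes such a network a (stochastic) BNN in the tutorial's sense.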

    MCMC to address model misspecification in Deep Learning classification of Radio Galaxies

    The radio astronomy community is adopting deep learning techniques to deal with the huge data volumes expected from the next generation of radio observatories. Bayesian neural networks (BNNs) provide a principled way to model uncertainty in the predictions made by deep learning models and will play an important role in extracting well-calibrated uncertainty estimates from the outputs of these models. However, most commonly used approximate Bayesian inference techniques such as variational inference and MCMC-based algorithms experience a "cold posterior effect (CPE)", according to which the posterior must be down-weighted in order to get good predictive performance. The CPE has been linked to several factors such as data augmentation or dataset curation leading to a misspecified likelihood and prior misspecification. In this work we use MCMC sampling to show that a Gaussian parametric family is a poor variational approximation to the true posterior and gives rise to the CPE previously observed in morphological classification of radio galaxies using variational-inference-based BNNs. Comment: Accepted at the Machine Learning and the Physical Sciences Workshop at NeurIPS 2023; 6 pages, 1 figure, 1 table
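    The tempering behind the cold posterior effect targets [p(D|θ) p(θ)]^(1/T) with a temperature T < 1. A toy conjugate Gaussian example with random-walk Metropolis (everything below is illustrative, not the paper's radio-galaxy setup) shows the characteristic behaviour: cooling concentrates the posterior, which is what "down-weighting" trades calibration for predictive sharpness:

```python
import numpy as np

def tempered_metropolis(data, T=1.0, n_iter=20000, step=0.3, seed=0):
    """Random-walk Metropolis targeting the tempered posterior
    [p(data | theta) p(theta)]^(1/T) for a Gaussian mean with unit
    observation variance and a N(0, 10^2) prior."""
    rng = np.random.default_rng(seed)
    def log_target(theta):
        loglik = -0.5 * np.sum((data - theta) ** 2)
        logprior = -0.5 * theta ** 2 / 100.0
        return (loglik + logprior) / T
    theta, lp = 0.0, log_target(0.0)
    chain = np.empty(n_iter)
    for t in range(n_iter):
        prop = theta + step * rng.normal()
        lp_prop = log_target(prop)
        if np.log(rng.random()) < lp_prop - lp:   # Metropolis accept/reject
            theta, lp = prop, lp_prop
        chain[t] = theta
    return chain

data = np.random.default_rng(1).normal(loc=2.0, size=50)
warm = tempered_metropolis(data, T=1.0)[5000:]    # ordinary posterior
cold = tempered_metropolis(data, T=0.2)[5000:]    # "cold" posterior
# Cooling concentrates the posterior: cold.std() < warm.std().
```

In this conjugate toy model the tempered posterior is still Gaussian, just with its precision scaled by 1/T; in a BNN the same mechanism narrows the weight posterior around the MAP solution.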

    Subsampling MCMC - An introduction for the survey statistician

    The rapid development of computing power and efficient Markov Chain Monte Carlo (MCMC) simulation algorithms has revolutionized Bayesian statistics, making it a highly practical inference method in applied work. However, MCMC algorithms tend to be computationally demanding, and are particularly slow for large datasets. Data subsampling has recently been suggested as a way to make MCMC methods scalable to very large data sets, utilizing efficient sampling schemes and estimators from the survey sampling literature. These developments are largely unknown to many survey statisticians, who traditionally work with non-Bayesian methods and rarely use MCMC. Our article explains the idea of data subsampling in MCMC by reviewing one strand of work, Subsampling MCMC, a so-called pseudo-marginal MCMC approach that speeds up MCMC through data subsampling. The review is written for a survey statistician without previous knowledge of MCMC methods, since our aim is to motivate survey sampling experts to contribute to the growing Subsampling MCMC literature. Comment: Accepted for publication in Sankhya A. The previously uploaded version contained a bug in generating the figures and references.
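    The survey-sampling idea at the heart of Subsampling MCMC is an unbiased estimator of the full-data log-likelihood sum built from a small simple random sample, with control variates (a Taylor expansion around a fixed reference point) absorbing most of the variance. Below is a minimal sketch for a one-parameter logit model; all settings are illustrative, and the actual pseudo-marginal construction in the literature additionally bias-corrects the resulting likelihood estimate before it enters the acceptance ratio:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-x))).astype(float)  # true theta = 1

def loglik_terms(theta, idx):
    """Per-observation log-likelihoods l_i(theta) of a one-parameter logit model."""
    psi = theta * x[idx]
    return y[idx] * psi - np.log1p(np.exp(psi))

def grad_terms(theta, idx):
    """Per-observation derivatives l_i'(theta)."""
    psi = theta * x[idx]
    return (y[idx] - 1.0 / (1.0 + np.exp(-psi))) * x[idx]

all_i = np.arange(n)
theta_bar = 1.0                                     # reference point, e.g. a MAP estimate
sum_l_bar = loglik_terms(theta_bar, all_i).sum()    # O(n) sums, computed once up front
sum_g_bar = grad_terms(theta_bar, all_i).sum()

def estimate_loglik(theta, m=1000):
    """Unbiased subsample (difference) estimator of sum_i l_i(theta):
    the first-order Taylor expansion around theta_bar is computed exactly
    from the precomputed sums, and only the small residual is estimated
    from a random subsample of size m."""
    idx = rng.choice(n, size=m)                     # simple random sample, with replacement
    resid = (loglik_terms(theta, idx) - loglik_terms(theta_bar, idx)
             - grad_terms(theta_bar, idx) * (theta - theta_bar))
    return sum_l_bar + sum_g_bar * (theta - theta_bar) + (n / m) * resid.sum()

theta = 1.2
exact = loglik_terms(theta, all_i).sum()
est = estimate_loglik(theta)
```

Because the residual terms are O((theta - theta_bar)^2), the estimator's variance is orders of magnitude smaller than naively scaling up the subsample's raw log-likelihoods, which is exactly the role of the control variates in the Subsampling MCMC literature.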

    BayesDLL: Bayesian Deep Learning Library

    We release a new Bayesian neural network library for PyTorch for large-scale deep networks. Our library implements mainstream approximate Bayesian inference algorithms: variational inference, MC-dropout, stochastic-gradient MCMC, and Laplace approximation. The main differences from other existing Bayesian neural network libraries are as follows: 1) Our library can deal with very large-scale deep networks including Vision Transformers (ViTs). 2) It requires virtually zero code modification from users (e.g., the backbone network definition code does not need to be modified at all). 3) Our library also allows the pre-trained model weights to serve as a prior mean, which is very useful for performing Bayesian inference with large-scale foundation models like ViTs that are hard to optimise from scratch with the downstream data alone. Our code is publicly available at: \url{https://github.com/SamsungLabs/BayesDLL}\footnote{A mirror repository is also available at: \url{https://github.com/minyoungkim21/BayesDLL}.}
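    The third feature, a Gaussian prior centred at pre-trained weights, amounts in MAP form to an L2 penalty that pulls parameters toward the pre-trained values rather than toward zero. A scalar sketch of that objective (a generic illustration, not BayesDLL's actual API):

```python
import numpy as np

def map_loss(theta, theta_pre, a=1.0, prior_var=1.0):
    """MAP objective: a quadratic stand-in for the negative log-likelihood,
    (theta - a)^2, plus a Gaussian prior N(theta_pre, prior_var) that pulls
    the weight toward the pre-trained value theta_pre instead of toward zero."""
    return (theta - a) ** 2 + (theta - theta_pre) ** 2 / (2.0 * prior_var)

grid = np.linspace(-2.0, 2.0, 4001)
theta_map = grid[np.argmin(map_loss(grid, theta_pre=0.5))]
# With both terms quadratic, the optimum is a precision-weighted average,
# strictly between the data optimum (1.0) and the prior mean (0.5).
```

The same idea carries over to the sampling-based algorithms: the prior term in the log posterior is evaluated at theta - theta_pre, so posterior mass stays near the foundation-model weights unless the downstream data pulls it away.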

    Stochastic partial differential equation based modelling of large space-time data sets

    Increasingly large data sets of processes in space and time call for statistical models and methods that can cope with such data. We show that the solution of a stochastic advection-diffusion partial differential equation provides a flexible model class for spatio-temporal processes which is computationally feasible even for large data sets. The Gaussian process defined through the stochastic partial differential equation has in general a nonseparable covariance structure. Furthermore, its parameters can be physically interpreted as explicitly modeling phenomena such as transport and diffusion that occur in many natural processes in diverse fields ranging from environmental sciences to ecology. In order to obtain computationally efficient statistical algorithms we use spectral methods to solve the stochastic partial differential equation. This has the advantage that approximation errors do not accumulate over time, and that in the spectral space the computational cost grows linearly with the dimension, with the total computational cost of Bayesian or frequentist inference dominated by the fast Fourier transform. The proposed model is applied to postprocessing of precipitation forecasts from a numerical weather prediction model for northern Switzerland. In contrast to the raw forecasts from the numerical model, the postprocessed forecasts are calibrated and quantify prediction uncertainty. Moreover, they outperform the raw forecasts, in the sense that they have a lower mean absolute error.
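    The spectral idea can be sketched in one dimension: in Fourier space the advection-diffusion operator is diagonal, so each mode is advanced independently by an exact multiplier and the per-step cost is dominated by the FFT, with no accumulating discretization error. The 1-D toy below (all parameters illustrative) is a much-reduced stand-in for the paper's space-time model:

```python
import numpy as np

def spectral_step(u, dt, nu=0.1, a=1.0, noise=0.0, rng=None, L=2 * np.pi):
    """One step of the stochastic advection-diffusion equation
        du/dt = -a du/dx + nu d^2u/dx^2 + forcing
    on a periodic 1-D domain, advanced in Fourier space. Each mode gets the
    exact multiplier exp((-nu k^2 - i a k) dt), so time-stepping errors do
    not accumulate; the cost per step is dominated by the FFT."""
    n = len(u)
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=L / n)   # integer wavenumbers when L = 2 pi
    u_hat = np.fft.fft(u) * np.exp((-nu * k ** 2 - 1j * a * k) * dt)
    if rng is not None and noise > 0.0:
        u_hat += np.fft.fft(noise * rng.normal(size=n))   # additive stochastic forcing
    return np.real(np.fft.ifft(u_hat))

n = 256
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
u = np.sin(x)
for _ in range(100):                     # deterministic run: noise switched off
    u = spectral_step(u, dt=0.01)
# After t = 1 the exact solution is exp(-nu t) * sin(x - a t): the initial
# wave has been transported by a*t and damped by diffusion.
```

Because the multiplier is the exact solution operator for each mode, the numerical answer matches the analytic one to machine precision for band-limited initial data, which is the "errors do not accumulate over time" property the abstract emphasises.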

    Quasar Black Hole Mass Estimates in the Era of Time Domain Astronomy

    We investigate the dependence of the normalization of the high-frequency part of the X-ray and optical power spectral densities (PSD) on black hole mass for a sample of 39 active galactic nuclei (AGN) with black hole masses estimated from reverberation mapping or dynamical modeling. We obtained new Swift observations of PG 1426+015, which has the largest estimated black hole mass of the AGN in our sample. We develop a novel statistical method to estimate the PSD from a lightcurve of photon counts with arbitrary sampling, eliminating the need to bin a lightcurve to achieve Gaussian statistics, and we use this technique to estimate the X-ray variability parameters for the faint AGN in our sample. We find that the normalization of the high-frequency X-ray PSD is inversely proportional to black hole mass. We discuss how to use this scaling relationship to obtain black hole mass estimates from the short time-scale X-ray variability amplitude with precision ~ 0.38 dex. The amplitude of optical variability on time scales of days is also anti-correlated with black hole mass, but with larger scatter. Instead, the optical variability amplitude exhibits the strongest anti-correlation with luminosity. We conclude with a discussion of the implications of our results for estimating black hole mass from the amplitude of AGN variability. Comment: 19 pages, 10 figures, emulateapj format, submitted to ApJ
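    The paper's estimator is built for arbitrarily sampled photon counts; the textbook special case of an evenly sampled light curve reduces to a periodogram, sketched below. The normalization is chosen so that white noise of variance sigma^2 gives a flat PSD at level 2*dt*sigma^2; the data and settings are illustrative, not from the paper:

```python
import numpy as np

def periodogram(lc, dt):
    """One-sided periodogram of an evenly sampled, mean-subtracted light curve.
    Normalized so that white noise of variance sigma^2 has a flat power
    spectrum with mean level 2 * dt * sigma^2."""
    n = len(lc)
    freqs = np.fft.rfftfreq(n, d=dt)[1:]                            # drop f = 0
    power = (2.0 * dt / n) * np.abs(np.fft.rfft(lc - lc.mean()))[1:] ** 2
    return freqs, power

# White-noise toy light curve: the estimated PSD should scatter around 2.
rng = np.random.default_rng(0)
dt, n = 1.0, 4096
freqs, power = periodogram(rng.normal(size=n), dt)
```

With an estimate of the high-frequency PSD normalization in hand, the scaling relation reported in the abstract (normalization inversely proportional to black hole mass) would then be inverted to turn a variability amplitude into a mass estimate.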