8 research outputs found
Bayesian Quantile Regression for Longitudinal Count Data
This work introduces a Bayesian quantile regression modeling framework for the
analysis of longitudinal count data. In this model, the response variable is
not continuous and hence an artificial smoothing of counts is incorporated. The
Bayesian implementation utilizes the normal-exponential mixture representation
of the asymmetric Laplace distribution for the response variable. An efficient
Gibbs sampling algorithm is derived for fitting the model to the data. The
model is illustrated through simulation studies and implemented in an
application drawn from neurology. Model comparison demonstrates the practical
utility of the proposed model.
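The normal-exponential mixture representation mentioned above can be illustrated with a small sampler. This is a minimal sketch assuming a commonly used parameterization of the asymmetric Laplace distribution (location `mu`, scale `sigma`, quantile level `p`); the function names are hypothetical and this is not the paper's Gibbs sampler, only the mixture identity it relies on.

```python
import math
import random

def sample_ald(mu, sigma, p, rng):
    """Draw one asymmetric Laplace variate via a normal-exponential mixture."""
    theta = (1.0 - 2.0 * p) / (p * (1.0 - p))
    tau = math.sqrt(2.0 / (p * (1.0 - p)))
    nu = rng.expovariate(1.0)   # latent exponential mixing variable (mean 1)
    u = rng.gauss(0.0, 1.0)     # standard normal draw
    # Conditional on nu, the variate is normal with mean mu + sigma*theta*nu
    return mu + sigma * (theta * nu + tau * math.sqrt(nu) * u)

def fraction_below(mu, sigma, p, n, seed=0):
    """Empirical share of draws at or below mu; should be close to p."""
    rng = random.Random(seed)
    draws = [sample_ald(mu, sigma, p, rng) for _ in range(n)]
    return sum(d <= mu for d in draws) / n
```

Because `mu` is the `p`-th quantile of this distribution, `fraction_below(0.0, 1.0, 0.25, 20000)` should land near 0.25. Conditioning on the latent `nu` makes the likelihood Gaussian, which is what makes Gibbs updates convenient.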
A comprehensive study of spike and slab shrinkage priors for structurally sparse Bayesian neural networks
Network complexity and computational efficiency have become increasingly
significant aspects of deep learning. Sparse deep learning addresses these
challenges by recovering a sparse representation of the underlying target
function by reducing heavily over-parameterized deep neural networks.
Specifically, deep neural architectures compressed via structured sparsity
(e.g. node sparsity) provide low latency inference, higher data throughput, and
reduced energy consumption. In this paper, we explore two well-established
shrinkage techniques, Lasso and Horseshoe, for model compression in Bayesian
neural networks. To this end, we propose structurally sparse Bayesian neural
networks which systematically prune excessive nodes with (i) Spike-and-Slab
Group Lasso (SS-GL), and (ii) Spike-and-Slab Group Horseshoe (SS-GHS) priors,
and develop computationally tractable variational inference including
continuous relaxation of Bernoulli variables. We establish the contraction
rates of the variational posterior of our proposed models as a function of the
network topology, layer-wise node cardinalities, and bounds on the network
weights. We empirically demonstrate the competitive performance of our models
compared to the baseline models in prediction accuracy, model compression, and
inference latency.
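The abstract does not say which continuous relaxation of the Bernoulli variables is used. The sketch below assumes one common choice, the binary Concrete (Gumbel-softmax) relaxation, purely for illustration; `relaxed_bernoulli`, `pi`, and `temperature` are hypothetical names.

```python
import math
import random

def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def relaxed_bernoulli(pi, temperature, rng):
    """Sample a value in [0, 1] that approximates a Bernoulli(pi) draw.

    Lower temperatures push samples toward 0 or 1; higher temperatures
    give smoother values that are easier to differentiate through.
    """
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)   # guard against log(0)
    logit_pi = math.log(pi) - math.log(1.0 - pi)
    logistic_noise = math.log(u) - math.log(1.0 - u)
    return _sigmoid((logit_pi + logistic_noise) / temperature)
```

Thresholding a relaxed sample at 0.5 recovers an exact Bernoulli(`pi`) draw, so the relaxation keeps the inclusion probability while making the node indicator differentiable for variational inference.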
Unified Probabilistic Neural Architecture and Weight Ensembling Improves Model Robustness
Robust machine learning models with accurately calibrated uncertainties are
crucial for safety-critical applications. Probabilistic machine learning and
especially the Bayesian formalism provide a systematic framework to incorporate
robustness through the distributional estimates and reason about uncertainty.
Recent works have shown that approximate inference approaches that use the
weight-space uncertainty of neural networks to generate ensemble predictions are
state of the art. However, architecture choices have mostly been ad hoc,
which essentially ignores the epistemic uncertainty from the architecture
space. To this end, we propose a Unified probabilistic architecture and weight
ensembling Neural Architecture Search (UraeNAS) that leverages advances in
probabilistic neural architecture search and approximate Bayesian inference to
generate ensembles from the joint distribution of neural network architectures
and weights. The proposed approach showed a significant improvement on both
in-distribution (0.86% in accuracy, 42% in ECE) CIFAR-10 and
out-of-distribution (2.43% in accuracy, 30% in ECE) CIFAR-10-C compared to the
baseline deterministic approach.
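For readers unfamiliar with ECE (expected calibration error), the metric can be sketched as follows. This assumes the usual equal-width binning of predicted confidences; the bin count and the function name are illustrative choices, not details taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin.

    confidences: predicted probability of the chosen class, each in [0, 1]
    correct:     1 if the prediction was right, else 0
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)   # conf == 1.0 goes in last bin
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece
```

A model that reports 80% confidence but is right only half the time in that bin contributes a gap of 0.3, so lower ECE means confidences track observed accuracy more closely.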
Learning Active Subspaces for Effective and Scalable Uncertainty Quantification in Deep Neural Networks
Bayesian inference for neural networks, or Bayesian deep learning, has the
potential to provide well-calibrated predictions with quantified uncertainty
and robustness. However, the main hurdle for Bayesian deep learning is its
computational complexity due to the high dimensionality of the parameter space.
In this work, we propose a novel scheme that addresses this limitation by
constructing a low-dimensional subspace of the neural network
parameters, referred to as an active subspace, by identifying the parameter
directions that have the most significant influence on the output of the neural
network. We demonstrate that the significantly reduced active subspace enables
effective and scalable Bayesian inference via either Monte Carlo (MC) sampling
methods, otherwise computationally intractable, or variational inference.
Empirically, our approach provides reliable predictions with robust uncertainty
estimates for various regression tasks.
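The idea of finding the most influential parameter directions can be sketched with the classical active-subspace construction: average the outer products of output gradients and take the dominant eigenvector. The code below is a toy illustration under that assumption, using finite differences and power iteration on a generic scalar function `f`; it is not the paper's implementation, and all names are hypothetical.

```python
import math
import random

def gradient(f, theta, eps=1e-5):
    """Central finite-difference gradient of a scalar function f at theta."""
    grad = []
    for i in range(len(theta)):
        up = theta[:i] + [theta[i] + eps] + theta[i + 1:]
        down = theta[:i] + [theta[i] - eps] + theta[i + 1:]
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad

def leading_active_direction(f, dim, n_samples=200, n_iter=100, seed=0):
    """Estimate the dominant eigenvector of C = E[grad f(theta) grad f(theta)^T]."""
    rng = random.Random(seed)
    C = [[0.0] * dim for _ in range(dim)]
    for _ in range(n_samples):
        theta = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # random parameter draw
        g = gradient(f, theta)
        for i in range(dim):
            for j in range(dim):
                C[i][j] += g[i] * g[j] / n_samples
    # Power iteration from a random start to find the top eigenvector of C
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(n_iter):
        v = [sum(C[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v
```

For a function such as `lambda t: math.sin(3.0 * t[0] - 4.0 * t[1])` in four dimensions, the output changes only along (3, -4, 0, 0), so the returned direction should align with that vector up to sign. Inference can then be carried out over coordinates in the low-dimensional subspace rather than over all parameters.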
ClimSim: A large multi-scale dataset for hybrid physics-ML climate emulation
Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of a lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever dataset designed for hybrid ML-physics research. It comprises multi-scale climate simulations, developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally-nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state. The dataset is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and their scoring. The data (https://huggingface.co/datasets/LEAP/ClimSim_high-res) and code (https://leap-stc.github.io/ClimSim) are released openly to support the development of hybrid ML-physics and high-fidelity climate simulations for the benefit of science and society.
Layer Adaptive Node Selection in Bayesian Neural Networks: Statistical Guarantees and Implementation Details
Sparse deep neural networks have proven to be efficient for predictive model
building in large-scale studies. Although several works have studied
theoretical and numerical properties of sparse neural architectures, they have
primarily focused on the edge selection. Sparsity through edge selection might
be intuitively appealing; however, it does not necessarily reduce the
structural complexity of a network. Instead, pruning excessive nodes leads to a
structurally sparse network with significant computational speedup during
inference. To this end, we propose a Bayesian sparse solution using
spike-and-slab Gaussian priors to allow for automatic node selection during
training. The use of a spike-and-slab prior alleviates the need for an ad hoc
thresholding rule for pruning. In addition, we adopt a variational Bayes
approach to circumvent the computational challenges of traditional Markov Chain
Monte Carlo (MCMC) implementation. In the context of node selection, we
establish the fundamental result of variational posterior consistency together
with the characterization of prior parameters. In contrast to previous
works, our theoretical development relaxes the assumptions of an equal number
of nodes and uniform bounds on all network weights, thereby accommodating
sparse networks with layer-dependent node structures or coefficient bounds.
With a layer-wise characterization of prior inclusion probabilities, we discuss
the optimal contraction rates of the variational posterior. We empirically
demonstrate that our proposed approach outperforms the edge selection method in
computational complexity with similar or better predictive performance. Our
experimental evidence further substantiates that our theoretical work
facilitates layer-wise optimal node recovery.
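The structural benefit of node selection over edge selection can be seen in a toy example: dropping a hidden node removes a whole row and column from the adjacent weight matrices, whereas zeroing individual edges leaves the matrix shapes unchanged. The sketch below assumes a single hidden layer with ReLU activations and no biases; the helper names and the 0/1 `node_mask` are hypothetical stand-ins for learned inclusion indicators.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def forward(W1, W2, x, node_mask=None):
    """One-hidden-layer ReLU network; node_mask zeroes excluded hidden nodes."""
    h = relu(matvec(W1, x))
    if node_mask is not None:
        h = [a * m for a, m in zip(h, node_mask)]
    return matvec(W2, h)

def prune_nodes(W1, W2, node_mask):
    """Physically drop hidden nodes whose mask is 0, shrinking both matrices."""
    keep = [i for i, m in enumerate(node_mask) if m == 1]
    W1_small = [W1[i] for i in keep]                    # fewer rows
    W2_small = [[row[i] for i in keep] for row in W2]   # fewer columns
    return W1_small, W2_small
```

Calling `forward` on the pruned matrices gives the same output as the masked full network while doing fewer multiplications, which is the source of the inference speedup described above.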
ClimSim: An open large-scale dataset for training high-resolution physics emulators in hybrid multi-scale climate simulators
Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of a lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever dataset designed for hybrid ML-physics research. It comprises multi-scale climate simulations, developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally-nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state. The dataset is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and their scoring. The data (https://huggingface.co/datasets/LEAP/ClimSim_high-res, https://huggingface.co/datasets/LEAP/ClimSim_low-res, and https://huggingface.co/datasets/LEAP/ClimSim_low-res_aqua-planet) and code (https://leap-stc.github.io/ClimSim) are released openly to support the development of hybrid ML-physics and high-fidelity climate simulations for the benefit of science and society.