A Primer on Bayesian Neural Networks: Review and Debates
Neural networks have achieved remarkable performance across various problem
domains, but their widespread applicability is hindered by inherent limitations
such as overconfidence in predictions, lack of interpretability, and
vulnerability to adversarial attacks. To address these challenges, Bayesian
neural networks (BNNs) have emerged as a compelling extension of conventional
neural networks, integrating uncertainty estimation into their predictive
capabilities.
This comprehensive primer presents a systematic introduction to the
fundamental concepts of neural networks and Bayesian inference, elucidating
their synergistic integration for the development of BNNs. The target audience
comprises statisticians with a potential background in Bayesian methods but
lacking deep learning expertise, as well as machine learners proficient in deep
neural networks but with limited exposure to Bayesian statistics. We provide an
overview of commonly employed priors, examining their impact on model behavior
and performance. Additionally, we delve into the practical considerations
associated with training and inference in BNNs.
Furthermore, we explore advanced topics within the realm of BNN research,
acknowledging the existence of ongoing debates and controversies. By offering
insights into cutting-edge developments, this primer not only equips
researchers and practitioners with a solid foundation in BNNs, but also
illuminates the potential applications of this dynamic field. As a valuable
resource, it fosters an understanding of BNNs and their promising prospects,
facilitating further advancements in the pursuit of knowledge and innovation.
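The uncertainty-aware prediction this abstract attributes to BNNs can be illustrated in its simplest closed-form case. The sketch below is not code from the primer: it is a toy Bayesian linear regression, the one-weight-layer analogue of a BNN, with a Gaussian prior and likelihood chosen so the posterior and predictive distributions are exact.

```python
import numpy as np

# Toy illustration (assumed, not from the primer): Bayesian linear
# regression with prior N(0, alpha^-1 I) on the weights and Gaussian
# noise with precision beta. The predictive variance combines noise
# and weight uncertainty -- the core idea behind BNN uncertainty.

rng = np.random.default_rng(0)
alpha, beta = 1.0, 25.0                 # prior precision, noise precision

# Synthetic data: y = -0.3 + 0.5 x + noise
X = rng.uniform(-1, 1, size=(20, 1))
Phi = np.hstack([np.ones_like(X), X])   # bias feature + linear feature
y = -0.3 + 0.5 * X[:, 0] + rng.normal(0, 0.2, 20)

# Gaussian posterior over weights:
#   S_N = (alpha I + beta Phi^T Phi)^-1,   m_N = beta S_N Phi^T y
S_N = np.linalg.inv(alpha * np.eye(2) + beta * Phi.T @ Phi)
m_N = beta * S_N @ Phi.T @ y

def predict(x_star):
    """Predictive mean and variance at a new input x_star."""
    phi = np.array([1.0, x_star])
    mean = phi @ m_N
    var = 1.0 / beta + phi @ S_N @ phi  # noise + weight uncertainty
    return mean, var

mean, var = predict(0.5)
```

In a full BNN the posterior over weights is intractable and must be approximated (e.g., by variational inference or MCMC, as the primer discusses), but the predictive variance retains this same two-part structure.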
Nonparametric Bayesian Deep Learning for Scientific Data Analysis
Deep learning (DL) has emerged as the leading paradigm for predictive modeling in a variety of domains, especially those involving large volumes of high-dimensional spatio-temporal data such as images and text. With the rise of big data in scientific and engineering problems, there is now considerable interest in the research and development of DL for scientific applications. The scientific domain, however, poses unique challenges for DL, including special emphasis on interpretability and robustness. In particular, a priority of the Department of Energy (DOE) is the research and development of probabilistic ML methods that are robust to overfitting and offer reliable uncertainty quantification (UQ) on high-dimensional noisy data that is limited in size relative to its complexity. Gaussian processes (GPs) are nonparametric Bayesian models that are naturally robust to overfitting and offer UQ out-of-the-box. Unfortunately, traditional GP methods lack the balance of expressivity and domain-specific inductive bias that is key to the success of DL. Recently, however, a number of approaches have emerged to incorporate the DL paradigm into GP methods, including deep kernel learning (DKL), deep Gaussian processes (DGPs), and neural network Gaussian processes (NNGPs). In this work, we investigate DKL, DGPs, and NNGPs as paradigms for developing robust models for scientific applications. First, we develop DKL for text classification, and apply both DKL and Bayesian neural networks (BNNs) to the problem of classifying cancer pathology reports, with BNNs attaining new state-of-the-art results. Next, we introduce the deep ensemble kernel learning (DEKL) method, which is just as powerful as DKL while admitting easier model parallelism. Finally, we derive a new model called a "bottleneck NNGP" by unifying the DGP and NNGP paradigms, thus laying the groundwork for a new class of methods for future applications.
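The "UQ out-of-the-box" property the abstract ascribes to GPs can be shown in a few lines. The sketch below is not the paper's method: it is plain exact GP regression with a fixed RBF kernel, where predictive variance grows automatically as the query point moves away from the training data.

```python
import numpy as np

# Minimal sketch (assumed, not from the paper): exact GP regression.
# Hyperparameters are fixed for illustration rather than learned.

def rbf(A, B, lengthscale=0.5, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays A and B."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, 15)                    # training inputs
y = np.sin(X) + rng.normal(0, 0.1, 15)        # noisy observations
noise = 0.1**2

# Cholesky-based GP posterior (standard formulation)
K = rbf(X, X) + noise * np.eye(len(X))
L = np.linalg.cholesky(K)
a = np.linalg.solve(L.T, np.linalg.solve(L, y))

def gp_predict(x_star):
    """Posterior mean and variance at a scalar test input."""
    k_star = rbf(np.atleast_1d(x_star), X)[0]
    mean = k_star @ a
    v = np.linalg.solve(L, k_star)
    var = rbf(np.atleast_1d(x_star), np.atleast_1d(x_star))[0, 0] - v @ v
    return mean, var

m_in, v_in = gp_predict(0.0)    # inside the training range
m_out, v_out = gp_predict(10.0) # far outside: variance reverts to prior
```

Deep kernel learning, as surveyed in the work above, replaces the fixed RBF inputs with features produced by a neural network, learning the kernel's inductive bias while keeping this same Bayesian predictive machinery.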