95,658 research outputs found

    Bayesian Neural Tree Models for Nonparametric Regression

    Full text link
    Frequentist and Bayesian methods differ in many aspects, but share some basic optimal properties. In real-life classification and regression problems, situations exist in which a model based on one of the methods is preferable based on some subjective criterion. Nonparametric classification and regression techniques, such as decision trees and neural networks, have frequentist (classification and regression trees (CART) and artificial neural networks) as well as Bayesian (Bayesian CART and Bayesian neural networks) approaches to learning from data. In this work, we present two hybrid models combining the Bayesian and frequentist versions of CART and neural networks, which we call the Bayesian neural tree (BNT) models. Both models exploit the architecture of decision trees and have lesser number of parameters to tune than advanced neural networks. Such models can simultaneously perform feature selection and prediction, are highly flexible, and generalize well in settings with a limited number of training observations. We study the consistency of the proposed models, and derive the optimal value of an important model parameter. We also provide illustrative examples using a wide variety of real-life regression data sets

    Bayesian Augmentation of Deep Learning to Improve Video Classification

    Get PDF
    Traditional automated video classification methods lack measures of uncertainty, meaning the network is unable to identify those cases in which its predictions are made with significant uncertainty. This leads to misclassification, as the traditional network classifies each observation with same amount of certainty, no matter what the observation is. Bayesian neural networks are a remedy to this issue by leveraging Bayesian inference to construct uncertainty measures for each prediction. Because exact Bayesian inference is typically intractable due to the large number of parameters in a neural network, Bayesian inference is approximated by utilizing dropout in a convolutional neural network. This research compared a traditional video classification neural network to its Bayesian equivalent based on performance and capabilities. The Bayesian network achieves higher accuracy than a comparable non-Bayesian video network and it further provides uncertainty measures for each classification

    Mean Field Bayes Backpropagation: scalable training of multilayer neural networks with binary weights

    Full text link
    Significant success has been reported recently using deep neural networks for classification. Such large networks can be computationally intensive, even after training is over. Implementing these trained networks in hardware chips with a limited precision of synaptic weights may improve their speed and energy efficiency by several orders of magnitude, thus enabling their integration into small and low-power electronic devices. With this motivation, we develop a computationally efficient learning algorithm for multilayer neural networks with binary weights, assuming all the hidden neurons have a fan-out of one. This algorithm, derived within a Bayesian probabilistic online setting, is shown to work well for both synthetic and real-world problems, performing comparably to algorithms with real-valued weights, while retaining computational tractability

    Random deep neural networks are biased towards simple functions

    Full text link
    We prove that the binary classifiers of bit strings generated by random wide deep neural networks with ReLU activation function are biased towards simple functions. The simplicity is captured by the following two properties. For any given input bit string, the average Hamming distance of the closest input bit string with a different classification is at least sqrt(n / (2{\pi} log n)), where n is the length of the string. Moreover, if the bits of the initial string are flipped randomly, the average number of flips required to change the classification grows linearly with n. These results are confirmed by numerical experiments on deep neural networks with two hidden layers, and settle the conjecture stating that random deep neural networks are biased towards simple functions. This conjecture was proposed and numerically explored in [Valle P\'erez et al., ICLR 2019] to explain the unreasonably good generalization properties of deep learning algorithms. The probability distribution of the functions generated by random deep neural networks is a good choice for the prior probability distribution in the PAC-Bayesian generalization bounds. Our results constitute a fundamental step forward in the characterization of this distribution, therefore contributing to the understanding of the generalization properties of deep learning algorithms

    Bayesian Deep Net GLM and GLMM

    Full text link
    Deep feedforward neural networks (DFNNs) are a powerful tool for functional approximation. We describe flexible versions of generalized linear and generalized linear mixed models incorporating basis functions formed by a DFNN. The consideration of neural networks with random effects is not widely used in the literature, perhaps because of the computational challenges of incorporating subject specific parameters into already complex models. Efficient computational methods for high-dimensional Bayesian inference are developed using Gaussian variational approximation, with a parsimonious but flexible factor parametrization of the covariance matrix. We implement natural gradient methods for the optimization, exploiting the factor structure of the variational covariance matrix in computation of the natural gradient. Our flexible DFNN models and Bayesian inference approach lead to a regression and classification method that has a high prediction accuracy, and is able to quantify the prediction uncertainty in a principled and convenient way. We also describe how to perform variable selection in our deep learning method. The proposed methods are illustrated in a wide range of simulated and real-data examples, and the results compare favourably to a state of the art flexible regression and classification method in the statistical literature, the Bayesian additive regression trees (BART) method. User-friendly software packages in Matlab, R and Python implementing the proposed methods are available at https://github.com/VBayesLabComment: 35 pages, 7 figure, 10 table
    corecore