Bayesian variable selection for high dimensional generalized linear models: convergence rates of the fitted densities
Bayesian variable selection has gained much empirical success recently in a
variety of applications when the number K of explanatory variables x_1, ..., x_K
is possibly much larger than the sample size n. For
generalized linear models, if most of the x_j's have very small effects on
the response y, we show that it is possible to use Bayesian variable
selection to reduce overfitting caused by the curse of dimensionality K >> n.
In this approach a suitable prior can be used to choose a few out of the many
x_j's to model y, so that the posterior will propose probability densities f
that are ``often close'' to the true density f* in some sense. The
closeness can be described by a Hellinger distance between f and f* that
scales at a power very close to n^{-1/2}, which is the ``finite-dimensional
rate'' corresponding to a low-dimensional situation. These findings extend some
recent work of Jiang [Technical Report 05-02 (2005) Dept. Statistics,
Northwestern Univ.] on consistency of Bayesian variable selection for binary
classification.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org); DOI: http://dx.doi.org/10.1214/009053607000000019
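For readers unfamiliar with the metric, a minimal LaTeX restatement of the convergence notion, assuming the standard definition of the Hellinger distance (the abstract itself does not spell it out, and the epsilon formulation below is our paraphrase of "a power very close to n^{-1/2}"):

    % Hellinger distance between a fitted density f and the true density f*
    d_H(f, f^*) = \left( \tfrac{1}{2} \int \bigl( \sqrt{f} - \sqrt{f^*} \bigr)^2 \, d\mu \right)^{1/2}
    % Claimed contraction: for any fixed \epsilon > 0, the posterior
    % concentrates on densities satisfying
    d_H(f, f^*) \lesssim n^{-1/2 + \epsilon}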
Gaussian Mixture Regression model with logistic weights, a penalized maximum likelihood approach
We wish to estimate a conditional density using a Gaussian Mixture Regression
model with logistic weights and means depending on the covariate. We aim at
selecting the number of components of this model, as well as the other
parameters, by a penalized maximum likelihood approach. We provide a lower bound
on the penalty, proportional up to a logarithmic term to the dimension of each
model, that ensures an oracle inequality for our estimator. Our theoretical
analysis is supported by numerical experiments.
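A schematic form of the model and selection criterion may help; the notation below is our own, since the abstract names no symbols, and the penalty constant is left abstract:

    % K-component Gaussian mixture regression: logistic weights and means
    % both depend on the covariate x
    s_{K,\theta}(y \mid x) = \sum_{k=1}^{K}
        \frac{e^{\beta_k^\top x + b_k}}{\sum_{l=1}^{K} e^{\beta_l^\top x + b_l}}
        \, \varphi\bigl( y;\, \mu_k(x),\, \Sigma_k \bigr)
    % Penalized maximum likelihood over a collection of models m, where D_m
    % is the dimension of model m and the log factor matches the abstract's
    % "up to a logarithmic term":
    \widehat{m} = \arg\min_m \Bigl\{ -\textstyle\sum_{i=1}^{n} \log \widehat{s}_m(y_i \mid x_i) + \mathrm{pen}(m) \Bigr\},
    \qquad \mathrm{pen}(m) \propto D_m \log n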
Bayesian Neural Tree Models for Nonparametric Regression
Frequentist and Bayesian methods differ in many aspects, but share some basic
optimal properties. In real-life classification and regression problems,
situations exist in which a model based on one of the methods is preferable
based on some subjective criterion. Nonparametric classification and regression
techniques, such as decision trees and neural networks, have frequentist
(classification and regression trees (CART) and artificial neural networks) as
well as Bayesian (Bayesian CART and Bayesian neural networks) approaches to
learning from data. In this work, we present two hybrid models combining the
Bayesian and frequentist versions of CART and neural networks, which we call
the Bayesian neural tree (BNT) models. Both models exploit the architecture of
decision trees and have fewer parameters to tune than advanced
neural networks. Such models can simultaneously perform feature selection and
prediction, are highly flexible, and generalize well in settings with a limited
number of training observations. We study the consistency of the proposed
models, and derive the optimal value of an important model parameter. We also
provide illustrative examples using a wide variety of real-life regression data
sets.
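The abstract does not give the BNT construction itself; as a loose, hypothetical illustration of how a tree/network hybrid can couple feature selection with prediction, here is a scikit-learn sketch (our pipeline, not the authors' model):

    # Hypothetical tree/neural-net hybrid sketch (NOT the BNT model itself):
    # a shallow tree ranks features, then a small MLP predicts from the top ones.
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.neural_network import MLPRegressor

    X, y = make_regression(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Step 1: shallow tree selects features via impurity-based importances.
    tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_tr, y_tr)
    keep = np.argsort(tree.feature_importances_)[-5:]  # keep the top 5 features

    # Step 2: small network (few parameters to tune) on the selected features.
    net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    net.fit(X_tr[:, keep], y_tr)
    print("held-out R^2:", net.score(X_te[:, keep], y_te))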
Challenges in Markov chain Monte Carlo for Bayesian neural networks
Markov chain Monte Carlo (MCMC) methods have not been broadly adopted in
Bayesian neural networks (BNNs). This paper initially reviews the main
challenges in sampling from the parameter posterior of a neural network via
MCMC. Such challenges culminate in a lack of convergence to the parameter
posterior. Nevertheless, this paper shows that a non-converged Markov chain,
generated via MCMC sampling from the parameter space of a neural network, can
yield via Bayesian marginalization a valuable predictive posterior of the
output of the neural network. Classification examples based on multilayer
perceptrons showcase highly accurate predictive posteriors. The postulate of
limited scope for MCMC developments in BNNs is partially valid; an
asymptotically exact parameter posterior seems less plausible, yet an accurate
predictive posterior is a tenable research avenue.
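A minimal sketch of the marginalization step the paper leans on: average the per-draw class probabilities over the chain, converged or not. The network and the `samples` container are placeholders of ours, not the paper's code:

    # Monte Carlo approximation of the predictive posterior
    # p(y | x, D) ~= (1/T) * sum_t p(y | x, theta_t),
    # where theta_t are weight samples from some MCMC chain.
    import numpy as np

    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def mlp_logits(x, params):
        # One-hidden-layer MLP; params = (W1, b1, W2, b2).
        W1, b1, W2, b2 = params
        h = np.tanh(x @ W1 + b1)
        return h @ W2 + b2

    def predictive_posterior(x, samples):
        # Average probabilities (not logits): this is the Bayesian model average.
        probs = [softmax(mlp_logits(x, theta)) for theta in samples]
        return np.mean(probs, axis=0)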
Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent
We prove that two-layer (Leaky)ReLU networks initialized by e.g. the widely
used method proposed by He et al. (2015) and trained using gradient descent on
a least-squares loss are not universally consistent. Specifically, we describe
a large class of one-dimensional data-generating distributions for which, with
high probability, gradient descent only finds a bad local minimum of the
optimization landscape. It turns out that in these cases, the found network
essentially performs linear regression even if the target function is
non-linear. We further provide numerical evidence that this happens in
practical situations for some multi-dimensional distributions, and that
stochastic gradient descent exhibits similar behavior.
Comment: Changes in v2: single-column layout, NTK discussion, new experiment, updated introduction, improved explanations. 20 pages + 33 pages appendix. Code available at https://github.com/dholzmueller/nn_inconsistenc
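A compact sketch of the setting the paper studies, under our own toy choices of data and hyperparameters: a two-layer ReLU network with He-style initialization, trained by full-batch gradient descent on a least-squares loss. This reproduces the setup only; the inconsistency result concerns what such runs converge to, which the sketch does not claim to demonstrate.

    # Two-layer ReLU network, He-style init, full-batch GD on squared loss.
    import numpy as np

    rng = np.random.default_rng(0)
    n, d, m = 256, 1, 64                    # samples, input dim, hidden width
    X = rng.uniform(-1.0, 1.0, size=(n, d))
    y = np.sin(3.0 * X[:, 0])               # a non-linear 1-D target

    W1 = rng.normal(0.0, np.sqrt(2.0 / d), size=(d, m))  # He et al. (2015) scale
    b1 = np.zeros(m)
    w2 = rng.normal(0.0, np.sqrt(2.0 / m), size=m)

    lr = 1e-2
    for _ in range(5000):
        z = X @ W1 + b1                     # pre-activations
        h = np.maximum(z, 0.0)              # ReLU
        r = h @ w2 - y                      # residuals of the squared loss
        grad_w2 = h.T @ r / n
        dh = np.outer(r, w2) * (z > 0)      # backprop through the ReLU
        W1 -= lr * (X.T @ dh / n)
        b1 -= lr * dh.mean(axis=0)
        w2 -= lr * grad_w2

    print("final MSE:", float(np.mean((np.maximum(X @ W1 + b1, 0) @ w2 - y) ** 2)))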
A review of probabilistic forecasting and prediction with machine learning
Predictions and forecasts of machine learning models should take the form of
probability distributions, aiming to increase the quantity of information
communicated to end users. Although applications of probabilistic prediction
and forecasting with machine learning models in academia and industry are
becoming more frequent, related concepts and methods have not been formalized
and structured under a holistic view of the entire field. Here, we review the
topic of predictive uncertainty estimation with machine learning algorithms, as
well as the related metrics (consistent scoring functions and proper scoring
rules) for assessing probabilistic predictions. The review covers a time period
spanning from the introduction of early statistical algorithms (linear
regression and time series models based on Bayesian statistics or quantile
regression) to recent machine learning algorithms (including generalized
additive models for location, scale and shape, random forests, boosting and
deep learning algorithms) that are more flexible by nature. The review of the
progress in the field expedites our understanding of how to develop new
algorithms tailored to users' needs, since the latest advancements are based on
some fundamental concepts applied to more complex algorithms. We conclude by
classifying the material and discussing challenges that are becoming a hot
topic of research.
Comment: 83 pages, 5 figures
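As a concrete instance of one consistent scoring function within the review's scope, here is a pinball (quantile) loss sketch; the loss is consistent for the tau-quantile of the predictive distribution, and the example numbers below are ours:

    # Pinball (quantile) loss: a consistent scoring function for quantiles.
    import numpy as np

    def pinball_loss(y_true, q_pred, tau):
        """Mean pinball loss of quantile forecasts q_pred at level tau."""
        diff = y_true - q_pred
        return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

    # Example: scoring a 0.9-quantile forecast against three observations.
    y = np.array([1.0, 2.0, 3.0])
    q = np.array([1.5, 1.5, 2.5])
    print(pinball_loss(y, q, tau=0.9))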
Tree models: a Bayesian perspective
Submitted in partial fulfilment of the requirements for the degree of Master of Philosophy at Queen Mary, University of London, November 2006.
Classical tree models represent an attempt to create nonparametric models which
have good predictive power as well as a simple structure readily comprehensible
by non-experts. Bayesian tree models have been created by a team consisting of
Chipman, George and McCulloch and a second team consisting of Denison, Mallick
and Smith. Both approaches employ Green's Reversible Jump Markov Chain Monte
Carlo technique to carry out a more effective search than the `greedy' methods
used classically. The aim of this work is to evaluate both types of Bayesian
tree models from a Bayesian perspective and to compare them.
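For context, the general acceptance probability of Green's reversible jump move, as it would apply to grow/prune/change proposals on trees; this is the standard statement, not quoted from the thesis. A move from model M with parameters theta to model M' with parameters theta', using auxiliary draws u ~ g and u' ~ g' to match dimensions, is accepted with probability

    \alpha = \min\left\{ 1,\;
      \frac{p(y \mid \theta', M')\, p(\theta' \mid M')\, p(M')\,
            j(M \mid M')\, g'(u')}
           {p(y \mid \theta, M)\, p(\theta \mid M)\, p(M)\,
            j(M' \mid M)\, g(u)}
      \left| \frac{\partial(\theta', u')}{\partial(\theta, u)} \right|
    \right\}

where j(. | .) denotes the probability of choosing the move and the last factor is the Jacobian of the dimension-matching map.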