Bayesian Deep Net GLM and GLMM

Author: Kohn Robert
Nguyen Nghia
Nott David
Tran Minh-Ngoc
Publication date: 25/05/2018

Deep feedforward neural networks (DFNNs) are a powerful tool for functional approximation. We describe flexible versions of generalized linear and generalized linear mixed models incorporating basis functions formed by a DFNN. Neural networks with random effects are not widely used in the literature, perhaps because of the computational challenges of incorporating subject-specific parameters into already complex models. Efficient computational methods for high-dimensional Bayesian inference are developed using Gaussian variational approximation, with a parsimonious but flexible factor parametrization of the covariance matrix. We implement natural gradient methods for the optimization, exploiting the factor structure of the variational covariance matrix in computation of the natural gradient. Our flexible DFNN models and Bayesian inference approach lead to a regression and classification method that has high prediction accuracy and is able to quantify the prediction uncertainty in a principled and convenient way. We also describe how to perform variable selection in our deep learning method. The proposed methods are illustrated in a wide range of simulated and real-data examples, and the results compare favourably to a state-of-the-art flexible regression and classification method in the statistical literature, the Bayesian additive regression trees (BART) method. User-friendly software packages in Matlab, R and Python implementing the proposed methods are available at https://github.com/VBayesLab

Comment: 35 pages, 7 figures, 10 tables
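As a rough illustration of what a factor parametrization of a Gaussian variational covariance can look like, the sketch below assumes the common form Sigma = B B^T + diag(d)^2, with B a p-by-k loading matrix and k much smaller than p. The abstract does not spell out the exact form, so this structure, the function names and the toy sizes are all assumptions, not the authors' implementation.

```python
import random


def sample_factor_gaussian(mu, B, d, rng):
    """Draw theta = mu + B z + d * eps with z, eps standard normal.

    Under the assumed factor form this gives Cov(theta) = B B^T + diag(d)^2
    without ever building the p-by-p covariance matrix.
    """
    p, k = len(mu), len(B[0])
    z = [rng.gauss(0.0, 1.0) for _ in range(k)]
    eps = [rng.gauss(0.0, 1.0) for _ in range(p)]
    return [
        mu[i] + sum(B[i][j] * z[j] for j in range(k)) + d[i] * eps[i]
        for i in range(p)
    ]


def factor_covariance(B, d):
    """Explicit covariance B B^T + diag(d)^2 (only sensible for small p)."""
    p, k = len(B), len(B[0])
    return [
        [
            sum(B[i][j] * B[l][j] for j in range(k)) + (d[i] ** 2 if i == l else 0.0)
            for l in range(p)
        ]
        for i in range(p)
    ]


def n_params_full(p):
    """Mean plus a full symmetric covariance."""
    return p + p * (p + 1) // 2


def n_params_factor(p, k):
    """Mean plus loadings B (p*k entries) plus diagonal d (p entries)."""
    return p + p * k + p
```

For a model with 1000 weights and 4 factors, `n_params_factor(1000, 4)` is far smaller than `n_params_full(1000)`, which is the sense in which such a parametrization is parsimonious.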

arXiv.org e-Print Archive

ScholarBank@NUS

Comparative performance of some popular ANN algorithms on benchmark and function approximation problems

Author: A. Bora
A. K. Tickoo
A. S. Miller
B. P. Dubey
C. L. Giles
C. Lanczos
C. T. Lin
D. Barry
D. E. Rumelhart
E. W. M. Lee
G. A. Carpenter
H. P. Singh
J. C. Jia
J. Hertz
J. Lee
L. Breiman
L. Canal
M. Abramowitz
M. Bazarghan
M. Caudill
M. F. Moller
N. Metropolis
R. A. Fisher
R. Gupta
R. K. Bock
R. Koul
R. Sinkus
R. Tagliaferri
S. Chen
S. Kirkpatrick
T. Denoeux
T. Hastie
V. K. Dhar
W. Bryc
X. Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 06/11/2009

We report an inter-comparison of some popular algorithms within the artificial neural network domain (viz., local search algorithms, global search algorithms, higher order algorithms and hybrid algorithms) by applying them to standard benchmark problems such as the IRIS data, XOR/N-bit parity and Two Spiral. Apart from giving a brief description of these algorithms, the paper presents the results obtained for these benchmark problems. The results suggest that while the Levenberg-Marquardt algorithm yields the lowest RMS error for the N-bit parity and Two Spiral problems, the Higher Order Neurons algorithm gives the best results for the IRIS data problem. The best results for the XOR problem are obtained with the Neuro Fuzzy algorithm. The same algorithms were also applied to several regression problems, such as cos(x) and a few special functions like the Gamma function, the complementary error function and the upper tail cumulative $\chi^2$-distribution function. The results of these regression problems indicate that, among all the ANN algorithms used in the present study, the Levenberg-Marquardt algorithm yields the best results. Keeping in view the highly non-linear behaviour and the wide dynamic range of these functions, it is suggested that they can also be considered as standard benchmark problems for function approximation using artificial neural networks.

Comment: 18 pages, 5 figures. Accepted in Pramana - Journal of Physics
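Some of the benchmarks named in this abstract are easy to generate. The sketch below builds the N-bit parity data set (XOR is the 2-bit case) and samples three of the regression targets that the Python standard library happens to provide. The function names, sampling grid and dictionary of targets are illustrative assumptions, not the setup used in the paper. The upper-tail $\chi^2$ function is left out because the standard library does not provide it.

```python
import itertools
import math


def parity_dataset(n_bits):
    """All 2**n_bits binary inputs, labelled 1 when the number of ones is odd."""
    return [
        (bits, sum(bits) % 2)
        for bits in itertools.product((0, 1), repeat=n_bits)
    ]


# Regression targets from the abstract that exist in the standard library.
TARGETS = {"cos": math.cos, "gamma": math.gamma, "erfc": math.erfc}


def regression_samples(name, lo, hi, n_points):
    """Evenly spaced (x, f(x)) pairs on [lo, hi]; requires n_points >= 2."""
    f = TARGETS[name]
    step = (hi - lo) / (n_points - 1)
    return [(lo + i * step, f(lo + i * step)) for i in range(n_points)]
```

A network trained on `parity_dataset(n)` or `regression_samples(...)` can then be scored by RMS error, the metric the abstract reports.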

arXiv.org e-Print Archive

Crossref

Why and When Can Deep -- but Not Shallow -- Networks Avoid the Curse of Dimensionality: a Review

Author: Liao Qianli
Mhaskar Hrushikesh
Miranda Brando
Poggio Tomaso
Rosasco Lorenzo
Publication date: 01/01/2017

The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning. Deep convolutional networks are a special case of these conditions, though weight sharing is not the main reason for their exponential advantage

arXiv.org e-Print Archive

DSpace@MIT

Caltech Authors

Archivio istituzionale della ricerca - Università di Genova

Random deep neural networks are biased towards simple functions

Author: De Palma Giacomo
Kiani Bobak Toussi
Lloyd Seth
Publication date: 01/01/2019

We prove that the binary classifiers of bit strings generated by random wide deep neural networks with ReLU activation function are biased towards simple functions. The simplicity is captured by the following two properties. For any given input bit string, the average Hamming distance of the closest input bit string with a different classification is at least $\sqrt{n / (2\pi \log n)}$, where n is the length of the string. Moreover, if the bits of the initial string are flipped randomly, the average number of flips required to change the classification grows linearly with n. These results are confirmed by numerical experiments on deep neural networks with two hidden layers, and settle the conjecture stating that random deep neural networks are biased towards simple functions. This conjecture was proposed and numerically explored in [Valle Pérez et al., ICLR 2019] to explain the unreasonably good generalization properties of deep learning algorithms. The probability distribution of the functions generated by random deep neural networks is a good choice for the prior probability distribution in the PAC-Bayesian generalization bounds. Our results constitute a fundamental step forward in the characterization of this distribution, therefore contributing to the understanding of the generalization properties of deep learning algorithms.
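To get a feel for the size of the Hamming-distance lower bound quoted in this abstract, the short sketch below evaluates it for a given string length. It assumes that "log" means the natural logarithm, which the abstract does not state explicitly; the function name is hypothetical.

```python
import math


def hamming_lower_bound(n):
    """Evaluate sqrt(n / (2 * pi * log n)) for a bit string of length n > 1.

    Assumption: log is the natural logarithm.
    """
    if n <= 1:
        raise ValueError("the bound is only defined for n > 1")
    return math.sqrt(n / (2.0 * math.pi * math.log(n)))


if __name__ == "__main__":
    for n in (16, 100, 784, 10_000):
        print(n, round(hamming_lower_bound(n), 3))
```

The bound grows slightly slower than the square root of n, whereas the abstract states that the number of random flips needed to change the classification grows linearly with n.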

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Scuola Normale Superiore

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Hamiltonian Monte Carlo Acceleration Using Surrogate Functions with Random Bases

Author: Shahbaba Babak
Zhang Cheng
Zhao Hongkai
Publication date: 17/04/2017

For big data analysis, the high computational cost of Bayesian methods often limits their application in practice. In recent years, there have been many attempts to improve the computational efficiency of Bayesian inference. Here we propose an efficient and scalable computational technique for a state-of-the-art Markov chain Monte Carlo (MCMC) method, namely Hamiltonian Monte Carlo (HMC). The key idea is to explore and exploit the structure and regularity in parameter space for the underlying probabilistic model to construct an effective approximation of its geometric properties. To this end, we build a surrogate function to approximate the target distribution using properly chosen random bases and an efficient optimization process. The resulting method provides a flexible, scalable, and efficient sampling algorithm, which converges to the correct target distribution. We show that by choosing the basis functions and optimization process differently, our method can be related to other approaches for the construction of surrogate functions such as generalized additive models or Gaussian process models. Experiments based on simulated and real data show that our approach leads to substantially more efficient sampling algorithms compared to existing state-of-the-art methods.
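One way a surrogate can speed up HMC while still targeting the correct distribution is to drive the leapfrog integrator with the cheap surrogate gradient and keep the exact Hamiltonian in the Metropolis accept/reject step. The one-dimensional sketch below illustrates only that mechanism, and it is an assumption about the general idea rather than the authors' algorithm. The target is a toy standard normal. `surrogate_grad` is a deliberately imperfect stand-in, not a surrogate fitted with random bases as in the paper.

```python
import math
import random


def exact_potential(x):
    """Negative log density (up to a constant) of the toy standard normal target."""
    return 0.5 * x * x


def surrogate_grad(x):
    """Hypothetical cheap surrogate gradient; the exact gradient would be x."""
    return 0.9 * x


def hmc_with_surrogate(n_samples, step=0.3, n_leapfrog=10, seed=0):
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample momentum
        x_new, p_new = x, p
        # Leapfrog trajectory driven by the surrogate gradient only.
        p_new -= 0.5 * step * surrogate_grad(x_new)
        for i in range(n_leapfrog):
            x_new += step * p_new
            if i < n_leapfrog - 1:
                p_new -= step * surrogate_grad(x_new)
        p_new -= 0.5 * step * surrogate_grad(x_new)
        # Accept/reject with the exact Hamiltonian, so the chain still
        # targets the true distribution despite the surrogate error.
        h_old = exact_potential(x) + 0.5 * p * p
        h_new = exact_potential(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples
```

Because the leapfrog map stays reversible and volume preserving for any position-dependent force, surrogate error in this sketch only lowers the acceptance rate and does not bias the samples.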

arXiv.org e-Print Archive

eScholarship - University of California