Bayesian Compression for Deep Learning
Compression and computational efficiency in deep learning have become a
problem of great significance. In this work, we argue that the most principled
and effective way to attack this problem is by adopting a Bayesian point of
view, where through sparsity inducing priors we prune large parts of the
network. We introduce two novelties in this paper: 1) we use hierarchical
priors to prune nodes instead of individual weights, and 2) we use the
posterior uncertainties to determine the optimal fixed point precision to
encode the weights. Both factors significantly contribute to achieving the
state of the art in terms of compression rates, while still staying competitive
with methods designed to optimize for speed or energy efficiency.
Comment: Published as a conference paper at NIPS 201
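The two ingredients above are group-level pruning and posterior-variance-driven quantization. A minimal sketch of both ideas in Python, assuming per-unit scale-posterior log-alphas and per-weight posterior moments are already available; the function names, the threshold of 3.0, and the bit-width rule are illustrative stand-ins rather than the paper's exact procedure:

    import numpy as np

    def prune_units(unit_log_alpha, threshold=3.0):
        # Keep a unit only while its (assumed) scale-posterior log-alpha stays below
        # the threshold; a large log-alpha says the whole unit is effectively noise.
        return unit_log_alpha < threshold            # boolean keep-mask over units

    def fixed_point_bits(post_mean, post_std, eps=1e-12):
        # Tie the quantization step to the smallest posterior standard deviation
        # (finer steps would only encode noise) and the range to the largest |mean|.
        step = max(post_std.min(), eps)
        span = 2.0 * np.abs(post_mean).max() + step
        return int(np.ceil(np.log2(span / step)))

    # toy usage on one layer
    rng = np.random.default_rng(0)
    keep = prune_units(rng.normal(0.0, 2.0, size=64))
    bits = fixed_point_bits(rng.normal(0.0, 0.3, size=(64, 32)),
                            rng.uniform(0.01, 0.1, size=(64, 32)))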
Dual-Space Analysis of the Sparse Linear Model
Sparse linear (or generalized linear) models combine a standard likelihood
function with a sparse prior on the unknown coefficients. These priors can
conveniently be expressed as a maximization over zero-mean Gaussians with
different variance hyperparameters. Standard MAP estimation (Type I) involves
maximizing over both the hyperparameters and coefficients, while an empirical
Bayesian alternative (Type II) first marginalizes the coefficients and then
maximizes over the hyperparameters, leading to a tractable posterior
approximation. The underlying cost functions can be related via a dual-space
framework from Wipf et al. (2011), which allows both the Type I and Type II
objectives to be expressed in either coefficient or hyperparameter space. This
perspective is useful because some analyses or extensions are more amenable to
development in one space than in the other. Herein we consider the estimation of a
trade-off parameter balancing sparsity and data fit. As this parameter is
effectively a variance, natural estimators arise from working in
hyperparameter (variance) space, carrying over ideas that are natural for Type II to
solve what is much less intuitive for Type I. In contrast, for analyses of
update rules and sparsity properties of local and global solutions, as well as
extensions to more general likelihood models, we can leverage coefficient-space
techniques developed for Type I and apply them to Type II. For example, this
allows us to prove that Type II-inspired techniques can succeed in
recovering sparse coefficients when unfavorable restricted isometry properties
(RIP) cause popular L1 reconstructions to fail. It also facilitates the
analysis of Type II when non-Gaussian likelihood models lead to intractable
integrations.
Comment: 9 pages, 2 figures, submission to NIPS 201
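For reference, with the usual Gaussian likelihood $y=\Phi x+\varepsilon$, $\varepsilon\sim\mathcal{N}(0,\lambda I)$, and a separable prior expressed variationally as $p(x_i)=\max_{\gamma_i\ge 0}\mathcal{N}(x_i;0,\gamma_i)\,\varphi(\gamma_i)$, the two cost functions referred to above take the standard forms (notation assumed here, not taken from the abstract):

\[
\text{Type I:}\quad \min_{x}\;\tfrac{1}{\lambda}\,\|y-\Phi x\|_2^2+\sum_i g(x_i),
\qquad g(x_i)=\min_{\gamma_i\ge 0}\Big[\tfrac{x_i^2}{\gamma_i}+f(\gamma_i)\Big],
\]
\[
\text{Type II:}\quad \min_{\gamma\ge 0}\; y^\top\Sigma_y^{-1}y+\log|\Sigma_y|+\sum_i f(\gamma_i),
\qquad \Sigma_y=\lambda I+\Phi\,\Gamma\,\Phi^\top,\;\;\Gamma=\operatorname{diag}(\gamma).
\]

The trade-off parameter discussed in the abstract plays the role of $\lambda$, i.e. a noise variance, which is why estimating it is natural in hyperparameter (variance) space.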
A general approach to simultaneous model fitting and variable elimination in response models for biological data with many more variables than observations
Background
With the advent of high throughput biotechnology data acquisition platforms such as microarrays, SNP chips and mass spectrometers, data sets with many more variables than observations are now routinely being collected. Finding relationships between response variables of interest and variables in such data sets is an important problem, akin to finding needles in a haystack. Whilst methods for a number of response types have been developed, a general approach has been lacking.

Results
The major contribution of this paper is to present a unified methodology which allows many common (statistical) response models to be fitted to such data sets. The class of models includes virtually any model with a linear predictor in it, for example (but not limited to) multiclass logistic regression (classification), generalised linear models (regression) and survival models. A fast algorithm for finding sparse, well-fitting models is presented. The ideas are illustrated on real data sets with numbers of variables ranging from thousands to millions. R code implementing the ideas is available for download.

Conclusion
The method described in this paper enables existing work on response models for the case of fewer variables than observations to be leveraged to the situation where there are many more variables than observations. It is a powerful approach to finding parsimonious models for such datasets. The method is capable of handling problems with millions of variables and a large variety of response types within the one framework. The method compares favourably to existing methods such as support vector machines and random forests, but has the advantage of not requiring separate variable selection steps. It also works for data types which these methods were not designed to handle. The method usually produces very sparse models which make biological interpretation simpler and more focused.
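The unified framework itself is not reproduced here, but the setting it targets is easy to sketch. Below, an L1-penalised logistic regression stands in for the paper's sparsity-prior algorithm, purely to illustrate the "linear predictor plus sparsity, many more variables than observations" setup on synthetic data (all names and constants are illustrative):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n, p, k = 100, 5000, 10                    # far more variables than observations
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:k] = 2.0                             # only k variables truly matter
    y = (X @ beta + rng.standard_normal(n) > 0).astype(int)

    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
    selected = np.flatnonzero(clf.coef_[0])    # variables surviving the shrinkage
    print(f"{selected.size} of {p} variables kept")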
Binary Linear Classification and Feature Selection via Generalized Approximate Message Passing
For the problem of binary linear classification and feature selection, we
propose algorithmic approaches to classifier design based on the generalized
approximate message passing (GAMP) algorithm, recently proposed in the context
of compressive sensing. We are particularly motivated by problems where the
number of features greatly exceeds the number of training examples, but where
only a few features suffice for accurate classification. We show that
sum-product GAMP can be used to (approximately) minimize the classification
error rate and max-sum GAMP can be used to minimize a wide variety of
regularized loss functions. Furthermore, we describe an
expectation-maximization (EM)-based scheme to learn the associated model
parameters online, as an alternative to cross-validation, and we show that
GAMP's state-evolution framework can be used to accurately predict the
misclassification rate. Finally, we present a detailed numerical study to
confirm the accuracy, speed, and flexibility afforded by our GAMP-based
approaches to binary linear classification and feature selection.
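In notation assumed here (weight vector $w$, feature vectors $x_n$, labels $y_n$), the two GAMP modes mentioned above target, respectively, approximate inference under the posterior

\[
p(w\mid y)\;\propto\;\prod_{n} p\big(y_n \mid x_n^\top w\big)\,\prod_{i} p(w_i)
\qquad\text{(sum-product GAMP)},
\]

whose marginals yield the (approximately) error-rate-minimizing classifier, and the regularized-loss minimization

\[
\widehat{w}\;=\;\arg\min_{w}\;\sum_{n}\ell\big(y_n,\,x_n^\top w\big)+\sum_{i} f(w_i)
\qquad\text{(max-sum GAMP)},
\]

for a loss $\ell$ such as the logistic or hinge loss and a separable regularizer $f$ such as the $\ell_1$ penalty.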
Using prototypes to improve convolutional networks interpretability
We propose a method that allows the interpretation of the data representation obtained by a CNN by introducing prototypes in the feature space, which are later classified into a certain category. This way we can see how the feature space is structured in relation to the categories and the related task.
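A minimal sketch of the prototypes-in-feature-space idea (PyTorch; the class name, dimensions, and the distance-to-logit mapping are assumptions, not the authors' implementation):

    import torch
    import torch.nn as nn

    class PrototypeHead(nn.Module):
        # Learnable prototypes live in the CNN feature space; each prototype is then
        # mapped to a category, so an input can be interpreted through the prototypes
        # it lands closest to.
        def __init__(self, feature_dim, num_prototypes, num_classes):
            super().__init__()
            self.prototypes = nn.Parameter(torch.randn(num_prototypes, feature_dim))
            self.to_class = nn.Linear(num_prototypes, num_classes)  # prototype -> category

        def forward(self, features):                       # features: (batch, feature_dim)
            d2 = torch.cdist(features, self.prototypes) ** 2   # squared distances to prototypes
            return self.to_class(-d2)                      # nearer prototypes weigh more

    # usage with pooled features from any CNN backbone
    head = PrototypeHead(feature_dim=512, num_prototypes=20, num_classes=10)
    logits = head(torch.randn(4, 512))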