Layer-wise learning of deep generative models

Authors: Arnold Ludovic
Ollivier Yann
Publication date: 16/02/2013

When using deep, multi-layered architectures to build generative models of data, it is difficult to train all layers at once. We propose a layer-wise training procedure admitting a performance guarantee compared to the global optimum. It is based on an optimistic proxy of future performance, the best latent marginal. We interpret auto-encoders in this setting as generative models, by showing that they train a lower bound of this criterion. We test the new learning procedure against a state-of-the-art method (stacked RBMs) and find that it improves performance. Both theory and experiments highlight the importance, when training deep architectures, of using an inference model (from data to hidden variables) richer than the generative model (from hidden variables to data).

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

OnionNet: Sharing Features in Cascaded Deep Classifiers

Authors: Komodakis Nikos
Simonovsky Martin
Publication date: 01/01/2016

The focus of our work is speeding up the evaluation of deep neural networks in retrieval scenarios, where conventional architectures may spend too much time on negative examples. We propose to replace a monolithic network with our novel cascade of feature-sharing deep classifiers, called OnionNet, where subsequent stages may add both new layers and new feature channels to the previous ones. Importantly, intermediate feature maps are shared among classifiers, so they do not need to be recomputed. To accomplish this, the model is trained end-to-end in a principled way under a joint loss. We validate our approach in theory and on a synthetic benchmark. As demonstrated in three applications (patch matching, object detection, and image retrieval), our cascade can operate significantly faster than both monolithic networks and traditional cascades without sharing, at the cost of a marginal decrease in precision. Comment: Accepted to BMVC 201
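The early-rejection idea in this abstract can be illustrated with a toy sketch. It is not the OnionNet architecture or its training procedure: the `Stage` and `Scorer` interfaces, the thresholds, and the hand-written feature functions are all assumptions made for illustration. It only shows the control flow in which each stage adds features to a shared cache and a negative example can be rejected before the later, more expensive stages run.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interfaces, chosen for this sketch only.
# A stage sees the input plus every feature cached by earlier stages and
# returns only the NEW features it contributes.
Stage = Callable[[Sequence[float], List[float]], List[float]]
# A scorer maps the accumulated features to a confidence value.
Scorer = Callable[[List[float]], float]


def cascade_predict(
    x: Sequence[float],
    stages: Sequence[Stage],
    scorers: Sequence[Scorer],
    thresholds: Sequence[float],
) -> Tuple[bool, int]:
    """Run stages in order; return (accepted, number_of_stages_run)."""
    features: List[float] = []  # shared cache, never recomputed
    for i, (stage, scorer, thr) in enumerate(zip(stages, scorers, thresholds), 1):
        features.extend(stage(x, features))  # add new features only
        if scorer(features) < thr:
            return False, i  # early rejection: later stages are skipped
    return True, len(stages)


# Toy stages: stage 1 computes the mean, stage 2 adds the maximum.
stages = [
    lambda x, feats: [sum(x) / len(x)],
    lambda x, feats: [max(x)],
]
# Toy scorers: the second one reuses the stage-1 feature from the cache.
scorers = [
    lambda feats: feats[0],
    lambda feats: (feats[0] + feats[1]) / 2,
]
thresholds = [0.3, 0.6]

negative = cascade_predict([0.1, 0.1, 0.2], stages, scorers, thresholds)
positive = cascade_predict([0.6, 0.8, 1.0], stages, scorers, thresholds)
print(negative, positive)
```

In this toy setup the low-valued input is rejected after the first stage, while the high-valued input passes both stages; in a real cascade the stages would be network layers and the thresholds would be tuned on validation data.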

arXiv.org e-Print Archive

Crossref

Deep Self-Taught Learning for Handwritten Character Recognition

Authors: Bastien Frédéric
Bengio Yoshua
Bergeron Arnaud
Boulanger-Lewandowski Nicolas
Breuel Thomas
Chherawala Youssouf
Cisse Moustapha
Côté Myriam
Erhan Dumitru
Eustache Jeremy
Glorot Xavier
Lebeuf Sylvain Pannetier
Muller Xavier
Pascanu Razvan
Rifai Salah
Savard Francois
Sicard Guillaume
Publication date: 01/01/2010

Recent theoretical and empirical work in statistical machine learning has demonstrated the importance of learning algorithms for deep architectures, i.e., function classes obtained by composing multiple non-linear transformations. Self-taught learning (exploiting unlabeled examples or examples from other distributions) has already been applied to deep learners, but mostly to show the advantage of unlabeled examples. Here we explore the advantage brought by out-of-distribution examples. For this purpose we developed a powerful generator of stochastic variations and noise processes for character images, including not only affine transformations but also slant, local elastic deformations, changes in thickness, background images, grey level changes, contrast, occlusion, and various types of noise. The out-of-distribution examples are obtained from these highly distorted images or by including examples of object classes different from those in the target test set. We show that deep learners benefit more from out-of-distribution examples than a corresponding shallow learner, at least in the area of handwritten character recognition. In fact, we show that they beat previously published results and reach human-level performance on both handwritten digit classification and 62-class handwritten character recognition.
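As a rough illustration of what a stochastic distortion generator for character images might look like, the sketch below chains three of the listed perturbations (slant, noise, occlusion) on a tiny grey-level image. It is not the authors' generator: the function names, the parameter ranges, and the choice of salt-and-pepper noise are assumptions made for this example.

```python
import random
from typing import List

Image = List[List[float]]  # grey levels in [0, 1], row-major


def slant(img: Image, shear: float) -> Image:
    """Shift each row sideways in proportion to its distance from the middle row."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        shift = round(shear * (r - (h - 1) / 2))
        for c in range(w):
            src = c - shift
            if 0 <= src < w:  # pixels shifted in from outside stay black
                out[r][c] = img[r][src]
    return out


def salt_and_pepper(img: Image, prob: float, rng: random.Random) -> Image:
    """Replace each pixel by pure black or white with probability `prob`."""
    return [
        [rng.choice((0.0, 1.0)) if rng.random() < prob else v for v in row]
        for row in img
    ]


def occlude(img: Image, top: int, left: int, size: int) -> Image:
    """Black out a square patch, clipped to the image borders."""
    out = [row[:] for row in img]
    for r in range(top, min(top + size, len(img))):
        for c in range(left, min(left + size, len(img[0]))):
            out[r][c] = 0.0
    return out


def distort(img: Image, rng: random.Random) -> Image:
    """Apply a random slant, then noise, then a small occlusion (assumed ranges)."""
    out = slant(img, rng.uniform(-0.5, 0.5))
    out = salt_and_pepper(out, 0.05, rng)
    return occlude(out, rng.randrange(len(out)), rng.randrange(len(out[0])), 2)


# A 5x5 image containing a single vertical stroke in the middle column.
stroke = [[1.0 if c == 2 else 0.0 for c in range(5)] for _ in range(5)]
slanted = slant(stroke, 1.0)
sample = distort(stroke, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the distortions reproducible, which is convenient when comparing learners on the same perturbed data.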

arXiv.org e-Print Archive

CiteSeerX