Layer-wise learning of deep generative models
When using deep, multi-layered architectures to build generative models of
data, it is difficult to train all layers at once. We propose a layer-wise
training procedure admitting a performance guarantee compared to the global
optimum. It is based on an optimistic proxy of future performance, the best
latent marginal. We interpret auto-encoders in this setting as generative
models, by showing that they train a lower bound of this criterion. We test the
new learning procedure against a state-of-the-art method (stacked RBMs), and
find it to improve performance. Both theory and experiments highlight the
importance, when training deep architectures, of using an inference model (from
data to hidden variables) richer than the generative model (from hidden
variables to data).
OnionNet: Sharing Features in Cascaded Deep Classifiers
The focus of our work is speeding up evaluation of deep neural networks in
retrieval scenarios, where conventional architectures may spend too much time
on negative examples. We propose to replace a monolithic network with our novel
cascade of feature-sharing deep classifiers, called OnionNet, where subsequent
stages may add both new layers and new feature channels to the previous
ones. Importantly, intermediate feature maps are shared among classifiers,
so that they need not be recomputed. To accomplish this, the
model is trained end-to-end in a principled way under a joint loss. We validate
our approach in theory and on a synthetic benchmark. As demonstrated
in three applications (patch matching, object detection, and image retrieval),
our cascade can operate significantly faster than both monolithic networks and
traditional cascades without sharing, at the cost of a marginal decrease in
precision.
Comment: Accepted to BMVC 201
Deep Self-Taught Learning for Handwritten Character Recognition
Recent theoretical and empirical work in statistical machine learning has
demonstrated the importance of learning algorithms for deep architectures,
i.e., function classes obtained by composing multiple non-linear
transformations. Self-taught learning (exploiting unlabeled examples or
examples from other distributions) has already been applied to deep learners,
but mostly to show the advantage of unlabeled examples. Here we explore the
advantage brought by out-of-distribution examples. For this purpose we
developed a powerful generator of stochastic variations and noise processes for
character images, including not only affine transformations but also slant,
local elastic deformations, changes in thickness, background images, grey level
changes, contrast, occlusion, and various types of noise. The
out-of-distribution examples are obtained from these highly distorted images or
by including examples of object classes different from those in the target test
set. We show that deep learners benefit more from out-of-distribution
examples than a corresponding shallow learner, at least in the area of
handwritten character recognition. In fact, we show that they beat previously
published results and reach human-level performance on both handwritten digit
classification and 62-class handwritten character recognition.