Hierarchical Data Representation Model - Multi-layer NMF
In this paper, we propose a data representation model that demonstrates
hierarchical feature learning using nonsmooth NMF (nsNMF). We extend the unit
algorithm to several layers. Experiments with document and image data
successfully discovered feature hierarchies. We also prove that the proposed
method results in much better classification and reconstruction performance,
especially for a small number of features.
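To make the layering idea concrete, here is a minimal sketch of stacking NMF layers, where each layer factorizes the coefficient matrix produced by the previous one. It is an illustration only: it uses plain multiplicative-update NMF rather than nsNMF, applies no transformation between layers, and the names `nmf` and `multilayer_nmf` are hypothetical, not taken from the paper.

```python
import numpy as np


def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Plain NMF with multiplicative updates: X ~= W @ H, all entries >= 0.

    Note: this is standard NMF, used here as a stand-in for nsNMF.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(iters):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def multilayer_nmf(X, ranks, **kwargs):
    """Factorize X layer by layer: X ~= W1 @ W2 @ ... @ H_last.

    Each layer factorizes the coefficients H of the layer before it
    (assumed layering scheme, for illustration).
    """
    factors = []
    H = X
    for k in ranks:
        W, H = nmf(H, k, **kwargs)
        factors.append(W)
    return factors, H
```

With `ranks=[8, 4]`, for example, the first layer learns 8 features of the data and the second layer expresses those 8 features as combinations of 4 higher-level ones.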
Deep Self-Taught Learning for Handwritten Character Recognition
Recent theoretical and empirical work in statistical machine learning has
demonstrated the importance of learning algorithms for deep architectures,
i.e., function classes obtained by composing multiple non-linear
transformations. Self-taught learning (exploiting unlabeled examples or
examples from other distributions) has already been applied to deep learners,
but mostly to show the advantage of unlabeled examples. Here we explore the
advantage brought by out-of-distribution examples. For this purpose we
developed a powerful generator of stochastic variations and noise processes for
character images, including not only affine transformations but also slant,
local elastic deformations, changes in thickness, background images, grey level
changes, contrast, occlusion, and various types of noise. The
out-of-distribution examples are obtained from these highly distorted images or
by including examples of object classes different from those in the target test
set. We show that deep learners benefit more from out-of-distribution
examples than a corresponding shallow learner, at least in the area of
handwritten character recognition. In fact, we show that they beat previously
published results and reach human-level performance on both handwritten digit
classification and 62-class handwritten character recognition.
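As a rough illustration of what a stochastic distortion generator can look like, the sketch below applies three of the listed perturbations (contrast change, occlusion, additive noise) to a grey-level character image with values in [0, 1]. The function name `distort`, its parameters, and the parameter ranges are assumptions made for this example; the authors' generator covers many more transformations and is not reproduced here.

```python
import numpy as np


def distort(img, rng, noise_std=0.1, max_occlusion=8):
    """Return a randomly distorted copy of a 2-D image with values in [0, 1].

    Hypothetical sketch: contrast reduction, one rectangular occlusion,
    and Gaussian pixel noise, applied in that order.
    """
    out = img.astype(float).copy()
    h, w = out.shape

    # Contrast: shrink pixel values toward mid-grey by a random factor.
    contrast = rng.uniform(0.5, 1.0)
    out = 0.5 + contrast * (out - 0.5)

    # Occlusion: blank out a random rectangle (possibly of size zero).
    occ_h = int(rng.integers(0, max_occlusion + 1))
    occ_w = int(rng.integers(0, max_occlusion + 1))
    top = int(rng.integers(0, h - occ_h + 1))
    left = int(rng.integers(0, w - occ_w + 1))
    out[top:top + occ_h, left:left + occ_w] = 0.0

    # Noise: additive Gaussian noise, then clip back to the valid range.
    out += rng.normal(0.0, noise_std, size=out.shape)
    return np.clip(out, 0.0, 1.0)
```

Passing a seeded `np.random.default_rng(...)` makes the distortions reproducible, which helps when comparing learners on the same perturbed data.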
FreezeOut: Accelerate Training by Progressively Freezing Layers
The early layers of a deep neural net have the fewest parameters, but take up
the most computation. In this extended abstract, we propose to only train the
hidden layers for a set portion of the training run, freezing them out
one-by-one and excluding them from the backward pass. Through experiments on
CIFAR, we empirically demonstrate that FreezeOut yields savings of up to 20%
in wall-clock time during training with a 3% loss in accuracy for DenseNets, a 20%
speedup without loss of accuracy for ResNets, and no improvement for VGG
networks. Our code is publicly available at
https://github.com/ajbrock/FreezeOut
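The progressive-freezing idea can be sketched as a schedule that assigns each layer a point in training after which it is excluded from the backward pass. The example below assumes a simple linear schedule in which the first layer freezes at fraction `t0` of the run and the last layer trains to the end; the names `freeze_points` and `trainable_layers` and the default `t0=0.5` are hypothetical, and this is not the code from the linked repository.

```python
def freeze_points(num_layers, t0=0.5):
    """Fraction of the training run after which each layer is frozen.

    Assumed linear schedule: layer 0 freezes at t0, the last layer at 1.0
    (that is, it is trained for the whole run).
    """
    if num_layers == 1:
        return [1.0]
    step = (1.0 - t0) / (num_layers - 1)
    return [t0 + i * step for i in range(num_layers)]


def trainable_layers(progress, points):
    """Indices of layers still updated at `progress`, a value in [0, 1)."""
    return [i for i, t in enumerate(points) if progress < t]
```

In a training loop, `progress` would be the current iteration divided by the total number of iterations, and layers missing from the returned list would have their gradients disabled. Because layers freeze from the input side first, the backward pass can stop at the earliest layer that is still trainable.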