Unsupervised feature learning by augmenting single images
When deep learning is applied to visual object recognition, data augmentation
is often used to generate additional training data without extra labeling cost.
It helps to reduce overfitting and increase the performance of the algorithm.
In this paper we investigate if it is possible to use data augmentation as the
main component of an unsupervised feature learning architecture. To that end we
sample a set of random image patches and declare each of them to be a separate
single-image surrogate class. We then extend these trivial one-element classes
by applying a variety of transformations to the initial 'seed' patches. Finally
we train a convolutional neural network to discriminate between these surrogate
classes. The feature representation learned by the network can then be used in
various vision tasks. We find that this simple feature learning algorithm is
surprisingly successful, achieving competitive classification results on
several popular vision datasets (STL-10, CIFAR-10, Caltech-101). Comment: ICLR 2014 workshop track submission (7 pages, 4 figures, 1 table)
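The core idea above — sample random 'seed' patches, declare each one its own surrogate class, and expand the class by transforming the seed — can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name and parameters are hypothetical, the transformation set here (flips, rotations, contrast scaling) is a small subset of the paper's, and the subsequent ConvNet training step is omitted.

```python
import numpy as np

def make_surrogate_classes(image, n_seeds=4, patch=8, n_aug=3, rng=None):
    """Sample seed patches from `image` and expand each into its own
    surrogate class by applying random transformations (hypothetical
    sketch of the surrogate-class construction)."""
    rng = np.random.default_rng(rng)
    H, W = image.shape[:2]
    X, y = [], []
    for label in range(n_seeds):
        # Each random seed patch defines surrogate class `label`.
        top = int(rng.integers(0, H - patch + 1))
        left = int(rng.integers(0, W - patch + 1))
        seed = image[top:top + patch, left:left + patch]
        X.append(seed)
        y.append(label)
        # Extend the one-element class with transformed copies.
        for _ in range(n_aug):
            aug = seed.copy()
            if rng.random() < 0.5:                      # random horizontal flip
                aug = aug[:, ::-1]
            aug = np.rot90(aug, k=int(rng.integers(0, 4)))  # random 90-degree rotation
            aug = aug * rng.uniform(0.8, 1.2)               # random contrast scaling
            X.append(aug)
            y.append(label)
    return np.stack(X), np.array(y)

img = np.random.default_rng(0).random((32, 32))
X, y = make_surrogate_classes(img, n_seeds=4, patch=8, n_aug=3, rng=0)
```

A classifier trained to discriminate these labels never sees a human annotation, which is what makes the scheme unsupervised.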
Interpretable Transformations with Encoder-Decoder Networks
Deep feature spaces have the capacity to encode complex transformations of
their input data. However, understanding the relative feature-space
relationship between two transformed encoded images is difficult. For instance,
what is the relative feature space relationship between two rotated images?
What is decoded when we interpolate in feature space? Ideally, we want to
disentangle confounding factors, such as pose, appearance, and illumination,
from object identity. Disentangling these is difficult because they interact in
very nonlinear ways. We propose a simple method to construct a deep feature
space, with explicitly disentangled representations of several known
transformations. A person or algorithm can then manipulate the disentangled
representation, for example, to re-render an image with explicit control over
parameterized degrees of freedom. The feature space is constructed using a
transforming encoder-decoder network with a custom feature transform layer,
acting on the hidden representations. We demonstrate the advantages of explicit
disentangling on a variety of datasets and transformations, and as an aid for
traditional tasks, such as classification. Comment: Accepted at ICCV 2017
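One way to make the "custom feature transform layer" idea concrete is to let a known input transformation act as a structured linear map on the hidden code — for example, an input rotation by angle theta becomes a block-diagonal rotation over pairs of feature dimensions. The sketch below is an assumption-laden toy, not the paper's layer: the function name is hypothetical and the pairing of dimensions is the simplest possible choice.

```python
import numpy as np

def feature_transform(z, theta):
    """Apply a 2-D rotation by `theta` to each consecutive pair of
    feature dimensions of the code `z` (hypothetical sketch of a
    feature transform layer acting on hidden representations)."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s],
                  [s,  c]])
    pairs = z.reshape(-1, 2)        # group features into 2-D subspaces
    return (pairs @ R.T).reshape(z.shape)

z = np.array([1.0, 0.0, 0.0, 1.0])
z90 = feature_transform(z, np.pi / 2)   # rotate the code by 90 degrees
```

Because the transformation is explicit and parameterized, composing two feature-space rotations matches a single rotation by the summed angle — exactly the kind of interpretable feature-space relationship the abstract asks about for two rotated images.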