Unsupervised feature learning by augmenting single images
When deep learning is applied to visual object recognition, data augmentation
is often used to generate additional training data without extra labeling cost.
It helps to reduce overfitting and increase the performance of the algorithm.
In this paper we investigate whether it is possible to use data augmentation as the
main component of an unsupervised feature learning architecture. To that end, we
sample a set of random image patches and declare each of them to be a separate
single-image surrogate class. We then extend these trivial one-element classes
by applying a variety of transformations to the initial 'seed' patches. Finally,
we train a convolutional neural network to discriminate between these surrogate
classes. The feature representation learned by the network can then be used in
various vision tasks. We find that this simple feature learning algorithm is
surprisingly successful, achieving competitive classification results on
several popular vision datasets (STL-10, CIFAR-10, Caltech-101).
Comment: ICLR 2014 workshop track submission (7 pages, 4 figures, 1 table
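The surrogate-class construction described in this abstract can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: the function names (`sample_patch`, `augment`, `build_surrogate_dataset`), the choice of transformations (a horizontal flip and a brightness gain), and all parameter values are assumptions made for the example. Images are represented as plain lists of rows with values in [0, 1].

```python
import random


def sample_patch(image, size, rng):
    """Crop a random size x size patch from a 2D image (a list of rows)."""
    height, width = len(image), len(image[0])
    top = rng.randrange(height - size + 1)
    left = rng.randrange(width - size + 1)
    return [row[left:left + size] for row in image[top:top + size]]


def augment(patch, rng):
    """Apply a random flip and brightness gain (illustrative transformations only)."""
    out = [row[:] for row in patch]
    if rng.random() < 0.5:
        out = [row[::-1] for row in out]  # horizontal flip
    gain = rng.uniform(0.8, 1.2)  # assumed brightness range
    return [[min(1.0, value * gain) for value in row] for row in out]


def build_surrogate_dataset(images, n_classes, n_per_class, patch_size, seed=0):
    """Create one surrogate class per seed patch, filled with transformed copies."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label in range(n_classes):
        seed_patch = sample_patch(rng.choice(images), patch_size, rng)
        for _ in range(n_per_class):
            samples.append(augment(seed_patch, rng))
            labels.append(label)  # the label is just the index of the seed patch
    return samples, labels
```

The resulting `(samples, labels)` pairs could then be fed to any classifier; the abstract uses a convolutional neural network and takes its learned features as the representation.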
Multimodal Deep Learning for Robust RGB-D Object Recognition
Robust object recognition is a crucial ingredient of many, if not all,
real-world robotics applications. This paper leverages recent progress on
Convolutional Neural Networks (CNNs) and proposes a novel RGB-D architecture
for object recognition. Our architecture is composed of two separate CNN
processing streams - one for each modality - which are consecutively combined
with a late fusion network. We focus on learning with imperfect sensor data, a
typical problem in real-world robotics tasks. For accurate learning, we
introduce a multi-stage training methodology and two crucial ingredients for
handling depth data with CNNs. The first is an effective encoding of depth
information for CNNs that enables learning without the need for large depth
datasets. The second is a data augmentation scheme for robust learning with depth
images that corrupts them with realistic noise patterns. We present
state-of-the-art results on the RGB-D object dataset and show recognition in
challenging RGB-D real-world noisy settings.
Comment: Final version submitted to IROS'2015, results unchanged,
reformulation of some text passages in abstract and introductio
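The late-fusion idea in this abstract (two per-modality streams whose outputs are combined by a fusion network) can be illustrated as follows. This is a rough sketch under stated assumptions, not the paper's architecture: the fusion network is reduced to a single fully connected layer with a softmax, the stream outputs are stand-in feature vectors, and the names `linear`, `softmax`, and `late_fusion` as well as all numbers are hypothetical.

```python
import math


def linear(x, weights, bias):
    """Fully connected layer: one output score per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(rgb_features, depth_features, weights, bias):
    """Concatenate the two streams' feature vectors and classify them jointly."""
    fused = list(rgb_features) + list(depth_features)
    return softmax(linear(fused, weights, bias))


# Stand-in outputs of the RGB and depth streams (made-up values).
rgb = [0.2, 0.9]
depth = [0.7, 0.1]
# Fusion weights for two classes over the 4-dimensional fused vector (made-up values).
weights = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
bias = [0.0, 0.0]
probs = late_fusion(rgb, depth, weights, bias)
```

Fusing late means each stream can be trained on its own modality first, which fits the multi-stage training the abstract mentions; how the stages are actually scheduled is not specified here.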
Learning to Generate Chairs, Tables and Cars with Convolutional Networks
We train generative 'up-convolutional' neural networks which are able to
generate images of objects given object style, viewpoint, and color. We train
the networks on rendered 3D models of chairs, tables, and cars. Our experiments
show that the networks do not merely learn all images by heart, but rather find
a meaningful representation of 3D models allowing them to assess the similarity
of different models, interpolate between given views to generate the missing
ones, extrapolate views, and invent new objects not present in the training set
by recombining training instances, or even two different object classes.
Moreover, we show that such generative networks can be used to find
correspondences between different objects from the dataset, outperforming
existing approaches on this task.
Comment: v4: final PAMI version. New architecture figur
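The abstract does not define 'up-convolution'. One common reading is an upsampling step followed by an ordinary convolution, and the sketch below follows that reading as an assumption: each value of a feature map is placed in the top-left corner of a 2x2 block (zeros elsewhere), and the enlarged map is then convolved with zero padding. The function names and the single-channel, list-of-rows representation are hypothetical simplifications.

```python
def unpool2x(fmap):
    """Double the resolution: each value goes to the top-left of a 2x2 block."""
    height, width = len(fmap), len(fmap[0])
    out = [[0.0] * (2 * width) for _ in range(2 * height)]
    for i in range(height):
        for j in range(width):
            out[2 * i][2 * j] = fmap[i][j]
    return out


def conv2d_same(fmap, kernel):
    """Zero-padded 2D cross-correlation; the output has the input's size."""
    height, width = len(fmap), len(fmap[0])
    k_height, k_width = len(kernel), len(kernel[0])
    pad_y, pad_x = k_height // 2, k_width // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            acc = 0.0
            for a in range(k_height):
                for b in range(k_width):
                    y, x = i + a - pad_y, j + b - pad_x
                    if 0 <= y < height and 0 <= x < width:
                        acc += fmap[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def up_convolution(fmap, kernel):
    """Upsample by 2, then convolve (one possible form of up-convolution)."""
    return conv2d_same(unpool2x(fmap), kernel)
```

In a generative network, layers of this kind would be stacked so that a compact code (for example style, viewpoint, and color) is expanded step by step into a full-resolution image; the kernel values would be learned rather than fixed.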