Deep Epitomic Convolutional Neural Networks
Deep convolutional neural networks have recently proven extremely competitive
in challenging image recognition tasks. This paper proposes the epitomic
convolution as a new building block for deep neural networks. An epitomic
convolution layer replaces a pair of consecutive convolution and max-pooling
layers found in standard deep convolutional neural networks. The main version
of the proposed model uses mini-epitomes in place of filters and computes
responses invariant to small translations by epitomic search instead of
max-pooling over image positions. The topographic version of the proposed model
uses large epitomes to learn filter maps organized in translational
topographies. We show that error back-propagation can successfully learn
multiple epitomic layers in a supervised fashion. The effectiveness of the
proposed method is assessed in image classification tasks on standard
benchmarks. Our experiments on Imagenet indicate improved recognition
performance compared to standard convolutional neural networks of similar
architecture. Our models pre-trained on Imagenet perform excellently on
Caltech-101. We also obtain competitive image classification results on the
small-image MNIST and CIFAR-10 datasets.
Comment: 9 page
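As a rough illustration of the epitomic search described in the abstract above, the sketch below scores one input patch against every filter-sized window of a single mini-epitome and keeps the maximum. The function name `epitomic_response`, the square 2-D shapes, and the plain dot-product scoring are assumptions made for this example; the paper's actual layer (multi-channel inputs, strides, training by back-propagation) is not reproduced here.

```python
import numpy as np


def epitomic_response(patch, epitome, filter_size):
    """Best match between `patch` and any filter-sized window of `epitome`.

    patch:       (filter_size, filter_size) array, one image patch.
    epitome:     (V, V) array with V >= filter_size, one mini-epitome.
    filter_size: side length W of the filters contained in the epitome.

    Hypothetical helper: the search runs over positions inside the epitome
    rather than over neighbouring image positions as max-pooling would.
    """
    v = epitome.shape[0]
    w = filter_size
    best = -np.inf
    for i in range(v - w + 1):
        for j in range(v - w + 1):
            # Each window of the epitome acts as one candidate filter.
            sub_filter = epitome[i:i + w, j:j + w]
            best = max(best, float(np.sum(sub_filter * patch)))
    return best
```

Because neighbouring windows of the same epitome overlap, the candidate filters share parameters, which is one plausible reading of why the abstract calls the epitome a replacement for a convolution and max-pooling pair.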
Convolutional Feature Masking for Joint Object and Stuff Segmentation
The topic of semantic segmentation has witnessed considerable progress due to
the powerful features learned by convolutional neural networks (CNNs). The
current leading approaches for semantic segmentation exploit shape information
by extracting CNN features from masked image regions. This strategy introduces
artificial boundaries on the images and may impact the quality of the extracted
features. Moreover, operating in the raw image domain requires running the
network thousands of times on a single image, which is time-consuming. In this
paper, we propose to exploit shape information via masking convolutional
features. The proposal segments (e.g., super-pixels) are treated as masks on
the convolutional feature maps. The CNN features of segments are directly
masked out from these maps and used to train classifiers for recognition. We
further propose a joint method to handle objects and "stuff" (e.g., grass, sky,
water) in the same framework. State-of-the-art results are demonstrated on
the PASCAL VOC and new PASCAL-CONTEXT benchmarks, at a compelling
computational speed.
Comment: IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
201
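The idea of treating a segment as a mask on the convolutional feature maps, as described in the abstract above, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `mask_features`, the channel-first layout, and the nearest-neighbour resizing of the mask to feature-map resolution are all assumptions made here.

```python
import numpy as np


def mask_features(feature_map, segment_mask):
    """Zero out feature-map locations that fall outside a segment.

    feature_map:  (C, H, W) array of convolutional features for the whole image.
    segment_mask: (H_img, W_img) binary array, 1 inside the proposal segment.

    Hypothetical helper: the mask is brought to feature resolution by
    nearest-neighbour sampling, an assumption made for this sketch.
    """
    _, h, w = feature_map.shape
    h_img, w_img = segment_mask.shape
    # Pick one image row/column per feature row/column.
    rows = (np.arange(h) * h_img) // h
    cols = (np.arange(w) * w_img) // w
    small_mask = segment_mask[np.ix_(rows, cols)].astype(feature_map.dtype)
    # Broadcast the (H, W) mask across all C channels.
    return feature_map * small_mask[None, :, :]
```

The point of the design is that the feature maps are computed once per image, and each segment only costs a resize and a multiplication instead of a separate network evaluation on a masked image region.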
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning.
Hierarchical Deep Learning Architecture For 10K Objects Classification
The evolution of visual object recognition architectures based on the
Convolutional Neural Network and Convolutional Deep Belief Network paradigms has
revolutionized artificial vision science. These architectures extract and learn
real-world hierarchical visual features using supervised and unsupervised
learning approaches, respectively. Neither approach can yet scale up
realistically to recognize a very large number of objects, as high
as 10K. We propose a two-level hierarchical deep learning architecture, inspired
by the divide-and-conquer principle, that decomposes the large-scale recognition
architecture into root-level and leaf-level model architectures. Each root-level
and leaf-level model is trained exclusively to provide better results than are
possible with any 1-level deep learning architecture prevalent today. The
proposed architecture classifies objects in two steps. In the first step, the
root-level model classifies the object into a high-level category. In the second
step, the leaf-level recognition model for the recognized high-level category
is selected from among all the leaf models. This leaf-level model is presented
with the same input object image and classifies it into a specific category. We
also propose a blend of leaf-level models trained with either supervised or
unsupervised learning approaches. Unsupervised learning is suitable whenever
labelled data is scarce for a specific leaf-level model. Training of the
leaf-level models is currently in progress; we have trained 25 of
the 47 leaf-level models so far. The trained leaf models reach
a best-case top-5 error rate of 3.2% on the validation data set for the
particular leaf models. We also demonstrate that the validation error of the
leaf-level models saturates towards the above-mentioned error rate as the number
of epochs is increased beyond sixty.
Comment: As appeared in proceedings for CS & IT 2015 - Second International
Conference on Computer Science & Engineering (CSEN 2015
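The two-step classification described in the abstract above can be sketched as a simple routing function. Everything named here is hypothetical: `classify_two_level`, the representation of each model as a callable returning a score vector, and the dictionary keyed by root category index are assumptions for illustration, not details taken from the paper.

```python
from typing import Callable, Dict, Tuple

import numpy as np

# Hypothetical model interface: takes an image, returns one score per class.
Scorer = Callable[[np.ndarray], np.ndarray]


def classify_two_level(
    image: np.ndarray,
    root_model: Scorer,
    leaf_models: Dict[int, Scorer],
) -> Tuple[int, int]:
    """Route an image through a root model, then the matching leaf model.

    Returns (high_level_category, specific_class_within_that_category).
    """
    # Step 1: the root model picks the high-level category.
    category = int(np.argmax(root_model(image)))
    # Step 2: the leaf model for that category sees the same image
    # and picks the specific class.
    leaf_scores = leaf_models[category](image)
    return category, int(np.argmax(leaf_scores))
```

In this arrangement an error made by the root model cannot be corrected by the leaf model, so the root model's accuracy bounds the accuracy of the whole pipeline.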