15,890 research outputs found
Deep Epitomic Convolutional Neural Networks
Deep convolutional neural networks have recently proven extremely competitive
in challenging image recognition tasks. This paper proposes the epitomic
convolution as a new building block for deep neural networks. An epitomic
convolution layer replaces a pair of consecutive convolution and max-pooling
layers found in standard deep convolutional neural networks. The main version
of the proposed model uses mini-epitomes in place of filters and computes
responses invariant to small translations by epitomic search instead of
max-pooling over image positions. The topographic version of the proposed model
uses large epitomes to learn filter maps organized in translational
topographies. We show that error back-propagation can successfully learn
multiple epitomic layers in a supervised fashion. The effectiveness of the
proposed method is assessed in image classification tasks on standard
benchmarks. Our experiments on Imagenet indicate improved recognition
performance compared to standard convolutional neural networks of similar
architecture. Our models pre-trained on Imagenet perform excellently on
Caltech-101. We also obtain competitive image classification results on the
small-image MNIST and CIFAR-10 datasets.Comment: 9 page
Recurrent Segmentation for Variable Computational Budgets
State-of-the-art systems for semantic image segmentation use feed-forward
pipelines with fixed computational costs. Building an image segmentation system
that works across a range of computational budgets is challenging and
time-intensive as new architectures must be designed and trained for every
computational setting. To address this problem we develop a recurrent neural
network that successively improves prediction quality with each iteration.
Importantly, the RNN may be deployed across a range of computational budgets by
merely running the model for a variable number of iterations. We find that this
architecture is uniquely suited for efficiently segmenting videos. By
exploiting the segmentation of past frames, the RNN can perform video
segmentation at similar quality but reduced computational cost compared to
state-of-the-art image segmentation methods. When applied to static images in
the PASCAL VOC 2012 and Cityscapes segmentation datasets, the RNN traces out a
speed-accuracy curve that saturates near the performance of state-of-the-art
segmentation methods
- …