Recurrent Spatial Transformer Networks
We integrate the recently proposed spatial transformer network (SPN)
[Jaderberg et al., 2015] into a recurrent neural network (RNN) to form an
RNN-SPN model. We use the RNN-SPN to classify digits in cluttered MNIST
sequences. The proposed model achieves a single digit error of 1.5% compared to
2.9% for a convolutional network and 2.0% for a convolutional network with SPN
layers. The SPN outputs a zoomed, rotated and skewed version of the input
image. We investigate different down-sampling factors (ratio of pixels in the
input and output) for the SPN and show that the RNN-SPN model is able to down-sample
the input images without deteriorating performance. The down-sampling in
RNN-SPN can be thought of as adaptive down-sampling that minimizes the
information loss in the regions of interest. We attribute the superior
performance of the RNN-SPN to the fact that it can attend to a sequence of
regions of interest.
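The core sampling mechanism the abstract relies on can be sketched in a few lines. This is a minimal NumPy illustration of an affine spatial transformer glimpse (build a sampling grid from a 2x3 affine matrix, then bilinearly sample), not the authors' code; the canvas size, glimpse size, and the scale-0.5 zoom are illustrative assumptions, and border handling is simplified to clamping.

```python
import numpy as np

def affine_grid(theta, out_h, out_w):
    """Build an (out_h, out_w, 2) grid of normalized source coordinates
    from a 2x3 affine matrix theta (as in Jaderberg et al., 2015)."""
    ys = np.linspace(-1.0, 1.0, out_h)
    xs = np.linspace(-1.0, 1.0, out_w)
    gy, gx = np.meshgrid(ys, xs, indexing="ij")
    coords = np.stack([gx, gy, np.ones_like(gx)], axis=-1)  # target coords
    return coords @ theta.T                                  # source coords

def bilinear_sample(img, grid):
    """Bilinearly sample img (H, W) at normalized grid coords (h, w, 2);
    out-of-range coordinates are simply clamped in this sketch."""
    H, W = img.shape
    x = (grid[..., 0] + 1.0) * (W - 1) / 2.0
    y = (grid[..., 1] + 1.0) * (H - 1) / 2.0
    x0 = np.clip(np.floor(x).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, H - 2)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * img[y0, x0]
            + wx * (1 - wy) * img[y0, x0 + 1]
            + (1 - wx) * wy * img[y0 + 1, x0]
            + wx * wy * img[y0 + 1, x0 + 1])

# Adaptive down-sampling: a 28x28 glimpse from a 100x100 cluttered canvas
# (hypothetical sizes); scale 0.5 keeps half the field of view at the center.
canvas = np.random.rand(100, 100)
theta = np.array([[0.5, 0.0, 0.0],
                  [0.0, 0.5, 0.0]])
glimpse = bilinear_sample(canvas, affine_grid(theta, 28, 28))
```

Because both steps are differentiable in theta, an RNN can emit a new theta per step and be trained end-to-end, which is what lets the model attend to one region of interest at a time.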
ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing
We address the problem of finding realistic geometric corrections to a
foreground object such that it appears natural when composited into a
background image. To achieve this, we propose a novel Generative Adversarial
Network (GAN) architecture that utilizes Spatial Transformer Networks (STNs) as
the generator, which we call Spatial Transformer GANs (ST-GANs). ST-GANs seek
image realism by operating in the geometric warp parameter space. In
particular, we exploit an iterative STN warping scheme and propose a sequential
training strategy that achieves better results compared to naive training of a
single generator. One of the key advantages of ST-GAN is its applicability to
high-resolution images indirectly since the predicted warp parameters are
transferable between reference frames. We demonstrate our approach in two
applications: (1) visualizing how indoor furniture (e.g. from product images)
might be perceived in a room, (2) hallucinating how accessories like glasses
would look when matched with real portraits.
Comment: Accepted to CVPR 2018 (website & code:
https://chenhsuanlin.bitbucket.io/spatial-transformer-GAN/)
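The "operating in the geometric warp parameter space" idea can be illustrated with how iterative warp updates compose. This is a hedged sketch, not the ST-GAN implementation: the 6-dim affine parameterization, the helper names, and the example updates are all assumptions made for illustration. The point is that each generator iteration predicts a small corrective warp, and the running warp stays a low-dimensional parameter vector that can later be applied at any resolution.

```python
import numpy as np

def to_mat(p):
    """Lift a 6-dim affine warp parameter vector into a 3x3 homogeneous matrix."""
    return np.vstack([p.reshape(2, 3), [0.0, 0.0, 1.0]])

def compose(p_total, dp):
    """Compose an incremental warp dp with the running warp p_total,
    staying in the low-dimensional parameter space."""
    return (to_mat(dp) @ to_mat(p_total))[:2].ravel()

identity = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
# Two small corrective updates, as a generator might predict per iteration
# (hypothetical values).
dp1 = np.array([1.0, 0.0, 0.05, 0.0, 1.0, -0.02])
dp2 = np.array([0.98, 0.01, 0.0, -0.01, 0.98, 0.0])
p = compose(compose(identity, dp1), dp2)
```

Since the final warp is just six numbers rather than a pixel-level flow field, it can be re-rendered against a full-resolution foreground, which is the indirect high-resolution applicability the abstract mentions.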
Long Short-Term Memory Spatial Transformer Network
Spatial transformer network has been used in a layered form in conjunction
with a convolutional network to enable the model to transform data spatially.
In this paper, we propose a combined spatial transformer network (STN) and a
Long Short-Term Memory network (LSTM) to classify digits in sequences formed
from MNIST elements. The LSTM-STN model benefits from a top-down attention
mechanism supplied by the LSTM layer, so that the STN layer can transform the
short-term independent elements of a sequence individually rather than warping
the sequence as a whole, thus avoiding the distortion that may be caused when
the entire sequence is spatially transformed. It also avoids the influence of
this distortion on the subsequent classification by convolutional neural
networks, and achieves a single digit error of 1.6% compared with 2.2% for a
Convolutional Neural Network with an STN layer.
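The per-element attention described above amounts to a recurrent cell emitting one affine transform per sequence step. The toy cell below is a minimal sketch under stated assumptions (hidden and feature sizes, weight scales, and the `roll` helper are all made up for illustration); the zero-initialized readout plus identity bias is a common STN initialization trick that makes every step start from the identity warp.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, feat = 32, 64
Wh = rng.normal(scale=0.1, size=(hidden, hidden))  # recurrent weights
Wx = rng.normal(scale=0.1, size=(hidden, feat))    # input weights
# Readout initialized to zero so every step starts at the identity warp.
Wt = np.zeros((6, hidden))
b_theta = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])

def roll(features):
    """Run a toy recurrent cell over per-step image features and emit one
    2x3 affine matrix per step (one glimpse per element of the sequence)."""
    h = np.zeros(hidden)
    thetas = []
    for x in features:
        h = np.tanh(Wh @ h + Wx @ x)          # simple RNN update (LSTM in the paper)
        thetas.append((Wt @ h + b_theta).reshape(2, 3))
    return thetas

thetas = roll(rng.normal(size=(3, feat)))      # three per-digit transforms
```

Each theta_t then drives its own grid sampler, so one distorted element cannot corrupt the crops of its neighbours, which is the distortion-avoidance argument the abstract makes.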