30,895 research outputs found

    Recurrent Spatial Transformer Networks

    Get PDF
    We integrate the recently proposed spatial transformer network (SPN) [Jaderberg et. al 2015] into a recurrent neural network (RNN) to form an RNN-SPN model. We use the RNN-SPN to classify digits in cluttered MNIST sequences. The proposed model achieves a single digit error of 1.5% compared to 2.9% for a convolutional networks and 2.0% for convolutional networks with SPN layers. The SPN outputs a zoomed, rotated and skewed version of the input image. We investigate different down-sampling factors (ratio of pixel in input and output) for the SPN and show that the RNN-SPN model is able to down-sample the input images without deteriorating performance. The down-sampling in RNN-SPN can be thought of as adaptive down-sampling that minimizes the information loss in the regions of interest. We attribute the superior performance of the RNN-SPN to the fact that it can attend to a sequence of regions of interest

    ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing

    Full text link
    We address the problem of finding realistic geometric corrections to a foreground object such that it appears natural when composited into a background image. To achieve this, we propose a novel Generative Adversarial Network (GAN) architecture that utilizes Spatial Transformer Networks (STNs) as the generator, which we call Spatial Transformer GANs (ST-GANs). ST-GANs seek image realism by operating in the geometric warp parameter space. In particular, we exploit an iterative STN warping scheme and propose a sequential training strategy that achieves better results compared to naive training of a single generator. One of the key advantages of ST-GAN is its applicability to high-resolution images indirectly since the predicted warp parameters are transferable between reference frames. We demonstrate our approach in two applications: (1) visualizing how indoor furniture (e.g. from product images) might be perceived in a room, (2) hallucinating how accessories like glasses would look when matched with real portraits.Comment: Accepted to CVPR 2018 (website & code: https://chenhsuanlin.bitbucket.io/spatial-transformer-GAN/

    Long Short-Term Memory Spatial Transformer Network

    Full text link
    Spatial transformer network has been used in a layered form in conjunction with a convolutional network to enable the model to transform data spatially. In this paper, we propose a combined spatial transformer network (STN) and a Long Short-Term Memory network (LSTM) to classify digits in sequences formed by MINST elements. This LSTM-STN model has a top-down attention mechanism profit from LSTM layer, so that the STN layer can perform short-term independent elements for the statement in the process of spatial transformation, thus avoiding the distortion that may be caused when the entire sequence is spatially transformed. It also avoids the influence of this distortion on the subsequent classification process using convolutional neural networks and achieves a single digit error of 1.6\% compared with 2.2\% of Convolutional Neural Network with STN layer
    • …
    corecore