1,791 research outputs found

    Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition

    Recognizing irregular text in natural scene images is challenging due to the large variance in text appearance, such as curvature, orientation and distortion. Most existing approaches rely heavily on sophisticated model designs and/or extra fine-grained annotations, which, to some extent, increase the difficulty of algorithm implementation and data collection. In this work, we propose an easy-to-implement strong baseline for irregular scene text recognition, using off-the-shelf neural network components and only word-level annotations. It is composed of a 31-layer ResNet, an LSTM-based encoder-decoder framework and a 2-dimensional attention module. Despite its simplicity, the proposed method is robust and achieves state-of-the-art performance on both regular and irregular scene text recognition benchmarks. Code is available at: https://tinyurl.com/ShowAttendRead Comment: Accepted to Proc. AAAI Conference on Artificial Intelligence 201
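
The abstract names a 2-dimensional attention module but does not specify it. As a rough illustration of the general idea only, the sketch below applies plain dot-product attention over an H x W grid of feature vectors. It is not the authors' module: the function name `attend_2d`, the dot-product scoring, and the `query` vector (standing in for a decoder state) are all assumptions made for this example.

```python
import math


def attend_2d(features, query):
    """Generic dot-product attention over an H x W grid (illustrative, not the paper's module).

    features: list of H rows, each a list of W feature vectors (lists of C floats).
    query: list of C floats, a hypothetical stand-in for a decoder state.
    Returns (weights, glimpse): an H x W grid of weights that sums to 1,
    and the C-dimensional weighted sum of the feature vectors.
    """
    # Score every spatial position against the query.
    scores = [[sum(f * q for f, q in zip(vec, query)) for vec in row] for row in features]

    # Softmax jointly over all H * W positions (shifted by the max for stability).
    max_score = max(max(row) for row in scores)
    exps = [[math.exp(s - max_score) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    weights = [[e / total for e in row] for row in exps]

    # The glimpse is the attention-weighted sum of the feature vectors.
    channels = len(query)
    glimpse = [0.0] * channels
    for row_w, row_f in zip(weights, features):
        for w, vec in zip(row_w, row_f):
            for c in range(channels):
                glimpse[c] += w * vec[c]
    return weights, glimpse
```

Normalizing over the whole grid, rather than along a single axis, is what lets the weights follow text that curves or rotates across both image dimensions.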

    Mid-level Deep Pattern Mining

    Mid-level visual element discovery aims to find clusters of image patches that are both representative and discriminative. In this work, we study this problem from the perspective of pattern mining while relying on the recently popularized Convolutional Neural Networks (CNNs). Specifically, we find that for an image patch, activations extracted from the first fully-connected layer of CNNs have two appealing properties which enable their seamless integration with pattern mining. Patterns are then discovered from a large number of CNN activations of image patches through the well-known association rule mining. When we retrieve and visualize image patches with the same pattern, surprisingly, they are not only visually similar but also semantically consistent. We apply our approach to scene and object classification tasks, and demonstrate that our approach outperforms all previous works on mid-level visual element discovery by a sizeable margin with far fewer elements being used. Our approach also outperforms or matches recent works using CNN for these tasks. Source code of the complete system is available online. Comment: Published in Proc. IEEE Conf. Computer Vision and Pattern Recognition 201
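
For readers unfamiliar with association rule mining, the following is a minimal brute-force sketch of the support and confidence computation behind it. It is not the authors' system: the function name `mine_rules`, the representation of each transaction as a set of items, and the exhaustive enumeration (real miners prune with Apriori-style search) are assumptions made for illustration.

```python
from itertools import combinations


def mine_rules(transactions, min_support, min_confidence):
    """Brute-force association rule mining with single-item consequents (illustrative only).

    transactions: iterable of item collections (hypothetical stand-ins for
    the sets of items derived from image patches).
    Returns a list of (antecedent, consequent, support, confidence) tuples.
    """
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    items = sorted(set().union(*sets))

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(1 for t in sets if itemset <= t) / n

    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            s = support(itemset)
            if s == 0 or s < min_support:
                continue
            for consequent in combo:
                antecedent = itemset - {consequent}
                # confidence(A -> c) = support(A and c) / support(A)
                confidence = s / support(antecedent)
                if confidence >= min_confidence:
                    rules.append((tuple(sorted(antecedent)), consequent, s, confidence))
    return rules
```

A rule with high confidence and a class label as its consequent is the kind of pattern that is both frequent (representative) and predictive (discriminative).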

    Inhomogeneous states with checkerboard order in the t-J Model

    We study inhomogeneous states in the t-J model using an unrestricted Gutzwiller approximation. We find that $pa \times pa$ checkerboard order, where $p$ is a doping dependent number, emerges from Fermi surface instabilities of both the staggered flux phase and the Fermi liquid state with realistic band parameters. In both cases, the checkerboard order develops at wave vectors $(\pm 2\pi/pa, 0)$, $(0, \pm 2\pi/pa)$ that are tied to the peaks of the wave-vector dependent susceptibility, and is of the Lomer-Rice-Scott type. The properties of such periodic, inhomogeneous states are discussed in connection to the checkerboard patterns observed by STM in underdoped cuprates. Comment: Published Versio