2,883 research outputs found

Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition

Author: Li Hui
Shen Chunhua
Wang Peng
Zhang Guyu
Publication date: 16/03/2019

Recognizing irregular text in natural scene images is challenging due to the large variance in text appearance, such as curvature, orientation and distortion. Most existing approaches rely heavily on sophisticated model designs and/or extra fine-grained annotations, which, to some extent, increase the difficulty in algorithm implementation and data collection. In this work, we propose an easy-to-implement strong baseline for irregular scene text recognition, using off-the-shelf neural network components and only word-level annotations. It is composed of a 31-layer ResNet, an LSTM-based encoder-decoder framework and a 2-dimensional attention module. Despite its simplicity, the proposed method is robust and achieves state-of-the-art performance on both regular and irregular scene text recognition benchmarks. Code is available at: https://tinyurl.com/ShowAttendRead

Comment: Accepted to Proc. AAAI Conference on Artificial Intelligence 201
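
The 2-dimensional attention module mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `attend_2d`, the dot-product scoring, and the plain nested-list feature map are all assumptions made for illustration. The abstract does not state how the attention scores are computed. The sketch only shows the general idea of normalizing attention weights over both spatial axes of a feature map instead of over a 1-D sequence.

```python
import math


def attend_2d(features, query):
    """Attend over an H x W grid of D-dimensional feature vectors.

    features: nested list of shape H x W x D (hypothetical CNN output).
    query:    list of length D (hypothetical decoder state).
    Returns (weights, glimpse): an H x W grid of weights summing to 1,
    and the D-dimensional weighted sum of the feature vectors.
    """
    # Score every spatial location; dot-product scoring is an assumption.
    scores = [[sum(f * q for f, q in zip(vec, query)) for vec in row]
              for row in features]

    # Softmax over ALL H*W locations at once (subtract the max for stability).
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    weights = [[e / total for e in row] for row in exps]

    # The glimpse is the attention-weighted sum of the feature vectors.
    height, width, depth = len(features), len(features[0]), len(query)
    glimpse = [
        sum(weights[i][j] * features[i][j][d]
            for i in range(height) for j in range(width))
        for d in range(depth)
    ]
    return weights, glimpse


# Toy 2 x 2 feature map with 2-dimensional features.
toy_features = [[[1.0, 0.0], [0.0, 1.0]],
                [[5.0, 0.0], [0.0, 0.0]]]
toy_weights, toy_glimpse = attend_2d(toy_features, [2.0, 0.0])
```

In a recognizer of this kind, a decoder would typically recompute such a glimpse at every output character, so the model can follow curved or rotated text across the image plane.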

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Recovering Homography from Camera Captured Documents using Convolutional Neural Networks

Author: Dejan M. Petrović
Gerdt Müller
Marie Dahlström
Martin Lersch
Matti Siika-aho
Oskar Bengtsson
Piotr Chylenski
Svein Jarle Horn
Vincent G. H. Eijsink
Publication date: 01/01/2017

Removing perspective distortion from hand-held camera captured document images is one of the primitive tasks in document analysis, but unfortunately, no existing method can reliably remove the perspective distortion from document images automatically. In this paper, we propose a convolutional neural network based method for recovering homography from hand-held camera captured documents. Our proposed method works independently of the document's underlying content and is trained end-to-end in a fully automatic way. Specifically, this paper makes the following three contributions: firstly, we introduce a large-scale synthetic dataset for recovering homography from document images captured under different geometric and photometric transformations; secondly, we show that a generic convolutional neural network based architecture can be successfully used for regressing the corner positions of documents captured under wild settings; thirdly, we show that L1 loss can be reliably used for corner regression. Our proposed method gives state-of-the-art performance on the tested datasets, and has the potential to become an integral part of the document analysis pipeline.

Comment: 10 pages, 8 figures
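
The link between corner regression and homography recovery can be sketched as follows. This is an illustrative example, not the paper's code: the function names, the sample corner coordinates, and the choice of an axis-aligned target rectangle are assumptions. Given four regressed corners and four target corners, the eight unknown entries of the homography (with the last entry fixed to 1) follow from an 8 x 8 linear system. The L1 loss named in the abstract is shown as a mean absolute error over corner coordinates, which is one common way to define it.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def homography_from_corners(src, dst):
    """Return a 3x3 matrix H (H[2][2] = 1) mapping each src corner to dst."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), likewise for v.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(H, point):
    """Map a 2-D point through H using homogeneous coordinates."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def l1_corner_loss(predicted, target):
    """Mean absolute error over all corner coordinates."""
    total = sum(abs(px - tx) + abs(py - ty)
                for (px, py), (tx, ty) in zip(predicted, target))
    return total / (2 * len(predicted))


# Hypothetical regressed document corners (pixels) and a target rectangle.
regressed = [(30, 40), (410, 60), (390, 560), (50, 520)]
rectangle = [(0, 0), (400, 0), (400, 500), (0, 500)]
H = homography_from_corners(regressed, rectangle)
```

Warping the captured image with `H` would then map the document onto the target rectangle. In practice a library routine such as OpenCV's `cv2.getPerspectiveTransform` performs the same four-point computation.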

arXiv.org e-Print Archive

Brage NMBU

Crossref

Directory of Open Access Journals

VTT Research System

FigShare

AON: Towards Arbitrarily-Oriented Text Recognition

Author: Bai Fan
Cheng Zhanzhan
Niu Yi
Pu Shiliang
Xu Yangliu
Zhou Shuigeng
Publication date: 22/03/2018

Recognizing text from natural images is a hot research topic in computer vision due to its various applications. Despite several decades of research on optical character recognition (OCR), recognizing text from natural images is still a challenging task. This is because scene texts are often in irregular (e.g., curved, arbitrarily-oriented or seriously distorted) arrangements, which have not yet been well addressed in the literature. Existing methods for text recognition mainly work with regular (horizontal and frontal) texts and cannot be trivially generalized to handle irregular texts. In this paper, we develop the arbitrary orientation network (AON) to directly capture the deep features of irregular texts, which are combined into an attention-based decoder to generate the character sequence. The whole network can be trained end-to-end using only images and word-level annotations. Extensive experiments on various benchmarks, including the CUTE80, SVT-Perspective, IIIT5k, SVT and ICDAR datasets, show that the proposed AON-based method achieves state-of-the-art performance on irregular datasets, and is comparable to major existing methods on regular datasets.

Comment: Accepted by CVPR201

arXiv.org e-Print Archive

Crossref