2 research outputs found

Adaptive Embedding Gate for Attention-Based Scene Text Recognition

Authors: Chen Xiaoxue, Jin Lianwen, Luo Canjie, Wang Tianwei, Zhu Yuanzhi
Publication date: 26/08/2019

Scene text recognition has attracted particular research interest because it is a very challenging problem with various applications. The most cutting-edge methods are attentional encoder-decoder frameworks that learn the alignment between the input image and the output sequence. In particular, the decoder recurrently outputs predictions, using the prediction of the previous step as guidance at every time step. In this study, we point out that the inappropriate use of previous predictions in existing attention mechanisms restricts recognition performance and brings instability. To handle this problem, we propose a novel module, namely the adaptive embedding gate (AEG). The proposed AEG focuses on introducing high-order character language models into the attention mechanism by controlling the information transmission between adjacent characters. AEG is a flexible module and can be easily integrated into state-of-the-art attentional methods. We evaluate its effectiveness as well as robustness on a number of standard benchmarks, including the IIIT5K, SVT, SVT-P, CUTE80, and ICDAR datasets. Experimental results demonstrate that AEG can significantly boost recognition performance and bring better robustness.

arXiv.org e-Print Archive

Text Recognition in the Wild: A Survey

Authors: Chen Xiaoxue, Jin Lianwen, Luo Canjie, Wang Tianwei, Zhu Yuanzhi
Publication date: 03/12/2020

The history of text can be traced back over thousands of years. Rich and precise semantic information carried by text is important in a wide range of vision-based application scenarios. Therefore, text recognition in natural scenes has been an active research field in computer vision and pattern recognition. In recent years, with the rise and development of deep learning, numerous methods have shown promise in terms of innovation, practicality, and efficiency. This paper aims to (1) summarize the fundamental problems and the state of the art associated with scene text recognition; (2) introduce new insights and ideas; (3) provide a comprehensive review of publicly available resources; and (4) point out directions for future work. In summary, this literature review attempts to present the entire picture of the field of scene text recognition. It provides a comprehensive reference for people entering this field and could help inspire future research. Related resources are available at our GitHub repository: https://github.com/HCIILAB/Scene-Text-Recognition.

Comment: Accepted by ACM Computing Surveys

arXiv.org e-Print Archive