Adaptive Embedding Gate for Attention-Based Scene Text Recognition
Scene text recognition has attracted particular research interest because it
is a very challenging problem and has various applications. The most
cutting-edge methods are attentional encoder-decoder frameworks that learn the
alignment between the input image and output sequences. In particular, the
decoder recurrently outputs predictions, using the prediction of the previous
step as guidance at every time step. In this study, we point out that the
inappropriate use of previous predictions in existing attention mechanisms
restricts the recognition performance and brings instability. To handle this
problem, we propose a novel module, the adaptive embedding gate (AEG). The
proposed AEG introduces high-order character language models into the
attention mechanism by controlling the information transmission between
adjacent characters. AEG is a flexible module and can be easily integrated into
state-of-the-art attentional methods. We evaluate its effectiveness as well
as robustness on a number of standard benchmarks, including the IIIT5K, SVT,
SVT-P, CUTE80, and ICDAR datasets. Experimental results demonstrate that AEG
can significantly boost recognition performance and bring better robustness.
Text Recognition in the Wild: A Survey
The history of text can be traced back over thousands of years. Rich and
precise semantic information carried by text is important in a wide range of
vision-based application scenarios. Therefore, text recognition in natural
scenes has been an active research field in computer vision and pattern
recognition. In recent years, with the rise and development of deep learning,
numerous methods have shown promise in terms of innovation, practicality, and
efficiency. This paper aims to (1) summarize the fundamental problems and the
state-of-the-art associated with scene text recognition; (2) introduce new
insights and ideas; (3) provide a comprehensive review of publicly available
resources; (4) point out directions for future work. In summary, this
literature review attempts to present the entire picture of the field of scene
text recognition. It provides a comprehensive reference for people entering
this field and could help inspire future research. Related resources
are available at our GitHub repository:
https://github.com/HCIILAB/Scene-Text-Recognition
Comment: Accepted by ACM Computing Surveys