2 research outputs found

    Boosting Scene Character Recognition by Learning Canonical Forms of Glyphs

    Full text link
    As one of the fundamental problems in document analysis, scene character recognition has attracted considerable interests in recent years. But the problem is still considered to be extremely challenging due to many uncontrollable factors including glyph transformation, blur, noisy background, uneven illumination, etc. In this paper, we propose a novel methodology for boosting scene character recognition by learning canonical forms of glyphs, based on the fact that characters appearing in scene images are all derived from their corresponding canonical forms. Our key observation is that more discriminative features can be learned by solving specially-designed generative tasks compared to traditional classification-based feature learning frameworks. Specifically, we design a GAN-based model to make the learned deep feature of a given scene character be capable of reconstructing corresponding glyphs in a number of standard font styles. In this manner, we obtain deep features for scene characters that are more discriminative in recognition and less sensitive against the above-mentioned factors. Our experiments conducted on several publicly-available databases demonstrate the superiority of our method compared to the state of the art.Comment: Accepted by International Journal on Document Analysis and Recognition (IJDAR), will appear in ICDAR 2019. Code: https://github.com/Actasidiot/CGR

    Exploring Font-independent Features for Scene Text Recognition

    Full text link
    Scene text recognition (STR) has been extensively studied in last few years. Many recently-proposed methods are specially designed to accommodate the arbitrary shape, layout and orientation of scene texts, but ignoring that various font (or writing) styles also pose severe challenges to STR. These methods, where font features and content features of characters are tangled, perform poorly in text recognition on scene images with texts in novel font styles. To address this problem, we explore font-independent features of scene texts via attentional generation of glyphs in a large number of font styles. Specifically, we introduce trainable font embeddings to shape the font styles of generated glyphs, with the image feature of scene text only representing its essential patterns. The generation process is directed by the spatial attention mechanism, which effectively copes with irregular texts and generates higher-quality glyphs than existing image-to-image translation methods. Experiments conducted on several STR benchmarks demonstrate the superiority of our method compared to the state of the art.Comment: accepted by ACM MM 2020. Code: https://actasidiot.github.io/EFIFSTR
    corecore