AON: Towards Arbitrarily-Oriented Text Recognition
Recognizing text from natural images is a hot research topic in computer
vision due to its various applications. Despite several decades of research
on optical character recognition (OCR), recognizing text in natural images
remains a challenging task. This is because scene texts are
often in irregular (e.g. curved, arbitrarily-oriented or seriously distorted)
arrangements, which have not yet been well addressed in the literature.
Existing methods on text recognition mainly work with regular (horizontal and
frontal) texts and cannot be trivially generalized to handle irregular texts.
In this paper, we develop the arbitrary orientation network (AON) to directly
capture the deep features of irregular texts, which are combined and fed into an
attention-based decoder to generate the character sequence. The whole network can
be trained end-to-end by using only images and word-level annotations.
Extensive experiments on various benchmarks, including the CUTE80,
SVT-Perspective, IIIT5k, SVT and ICDAR datasets, show that the proposed
AON-based method achieves state-of-the-art performance on irregular
datasets and is comparable to major existing methods on regular datasets.
Comment: Accepted by CVPR201
Fused Text Segmentation Networks for Multi-oriented Scene Text Detection
In this paper, we introduce a novel end-to-end framework for multi-oriented
scene text detection from an instance-aware semantic segmentation perspective.
We present Fused Text Segmentation Networks, which combine multi-level features
during feature extraction, since text instances may rely on finer feature
representations than general objects. The network detects and segments text
instances jointly, leveraging the merits of both semantic
segmentation and region-proposal-based object detection. Not
involving any extra pipelines, our approach surpasses the current state of the
art on multi-oriented scene text detection benchmarks: ICDAR2015 Incidental
Scene Text and MSRA-TD500, reaching Hmean of 84.1% and 82.0%, respectively. Moreover,
we report a baseline on Total-Text, which contains curved text, that suggests
the effectiveness of the proposed approach.
Comment: Accepted by ICPR201
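The Hmean scores quoted in the abstract above are, under the usual convention for text detection benchmarks, the harmonic mean of precision and recall (the F-measure). The abstract does not give the underlying precision and recall, so the sketch below uses hypothetical values purely to illustrate the formula; the function name `hmean` is also an assumption, not something taken from the paper.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F-measure)."""
    # Guard against division by zero when both inputs are zero.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values for illustration only; these are NOT results from the paper.
example_precision = 0.90
example_recall = 0.80
print(f"Hmean = {hmean(example_precision, example_recall):.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector cannot reach a high Hmean by trading recall for precision or the reverse.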