Towards Robust Curve Text Detection with Conditional Spatial Expansion
It is challenging to detect curve texts due to their irregular shapes and
varying sizes. In this paper, we first investigate the deficiency of the
existing curve detection methods and then propose a novel Conditional Spatial
Expansion (CSE) mechanism to improve the performance of curve text detection.
Instead of regarding the curve text detection as a polygon regression or a
segmentation problem, we treat it as a region expansion process. Our CSE starts
with a seed arbitrarily initialized within a text region and progressively
merges neighborhood regions based on the extracted local features by a CNN and
contextual information of merged regions. The CSE is highly parameterized and
can be seamlessly integrated into existing object detection frameworks.
Enhanced by the data-dependent CSE mechanism, our curve text detection system
provides robust instance-level text region extraction with minimal
post-processing. Our analysis experiments show that CSE can handle texts
with various shapes, sizes, and orientations, and can effectively suppress the
false-positives coming from text-like textures or unexpected texts included in
the same RoI. Compared with the existing curve text detection algorithms, our
method is more robust and enjoys a simpler processing flow. It also sets a
new state-of-the-art performance on curve text benchmarks with an F-score of up to
78.4.
Comment: This paper has been accepted by the IEEE International Conference on
Computer Vision and Pattern Recognition (CVPR 2019).
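The region-expansion idea in the abstract above can be illustrated with a plain seeded region-growing loop. This is a minimal sketch and not the authors' CSE: the abstract says merging is driven by CNN features and the context of already merged regions, whereas here a hypothetical per-cell text score (`score_map`) and a fixed `threshold` stand in for that learned decision. The function name `expand_region` is also an assumption made for illustration.

```python
from collections import deque


def expand_region(score_map, seed, threshold=0.5):
    """Grow a region from `seed` over 4-connected cells.

    `score_map` is a hypothetical 2D grid of text-likelihood scores;
    a fixed threshold replaces the learned, context-conditioned merge
    decision described in the abstract.
    """
    rows, cols = len(score_map), len(score_map[0])
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        # Visit the four direct neighbours of the current cell.
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and score_map[nr][nc] >= threshold:
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region


if __name__ == "__main__":
    grid = [
        [0.9, 0.8, 0.1],
        [0.1, 0.7, 0.1],
        [0.9, 0.1, 0.1],
    ]
    print(sorted(expand_region(grid, seed=(0, 0))))
```

In the toy grid, the high-scoring cell at the bottom left is never merged because it is not connected to the seed, which loosely mirrors how expansion from a seed can leave out unrelated text in the same RoI.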
A Text Recognition Algorithm Based on a Dual-Attention Mechanism in Complex Driving Environment
In response to problems such as complex backgrounds, perspective distortion, faint handwriting, and mixed Chinese and English characters, we design an OCR framework comprising landmark extraction and correction, image enhancement, text detection, and text recognition. For detection, we design a DBNet variant based on a dual-attention mechanism and content-aware upsampling. For recognition, we design a module that incorporates center loss into CRNN + CTC to improve content awareness. Experimental results show that the improved text detection network increases accuracy by 5.09%, recall by 2.12%, and F-score by 3.46% on the ICDAR2015 dataset. The text recognition network improves the accuracy of recognizing Chinese and English characters by 1.2%.
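The recognition module above ends in CTC, so a short sketch of how a CTC output sequence is commonly turned into text may help. The abstract does not say which decoding strategy is used; the greedy (best-path) rule below is an assumption, and the function name `ctc_greedy_decode` and the blank index `0` are hypothetical choices for illustration.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse per-frame label indices into an output sequence.

    Standard CTC collapsing: merge consecutive repeats, then drop blanks.
    `frame_labels` is assumed to hold the highest-scoring class index
    for each time step of the recogniser's output.
    """
    decoded = []
    previous = blank
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


if __name__ == "__main__":
    # A blank between two identical labels keeps both of them.
    print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0]))
```

The blank symbol is what lets CTC represent doubled characters: without the blank frame between the two runs of `1`, they would collapse into a single label.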
Detection and Rectification of Arbitrary Shaped Scene Texts by using Text Keypoints and Links
Detection and recognition of scene texts of arbitrary shapes remain a grand
challenge due to the super-rich text shape variation in text line orientations,
lengths, curvatures, etc. This paper presents a mask-guided multi-task network
that detects and rectifies scene texts of arbitrary shapes reliably. Three
types of keypoints are detected, which accurately specify the centre line, and
hence the shape, of text instances. In addition, four types of keypoint links are
detected of which the horizontal links associate the detected keypoints of each
text instance and the vertical links predict a pair of landmark points (for
each keypoint) along the upper and lower text boundary, respectively. Scene
texts can be located and rectified by linking up the associated landmark points
(giving localization polygon boxes) and transforming the polygon boxes via thin
plate spline, respectively. Extensive experiments over several public datasets
show that the use of text keypoints is tolerant to the variation in text
orientations, lengths, and curvatures, and it achieves superior scene text
detection and rectification performance as compared with state-of-the-art
methods.
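The localization step described above, linking landmark points on the upper and lower text boundaries into a polygon box, can be sketched as follows. This is an illustrative assumption about the point ordering and not the paper's implementation: the helper name `polygon_from_landmarks` is hypothetical, the landmark pairs are taken as already predicted and ordered along the text line, and the thin plate spline rectification is not shown.

```python
def polygon_from_landmarks(upper, lower):
    """Link paired boundary landmarks into a closed polygon outline.

    `upper` and `lower` are lists of (x, y) points, one pair per
    centre-line keypoint, both ordered along the text line. The polygon
    walks the upper boundary forward and the lower boundary backward.
    """
    if len(upper) != len(lower):
        raise ValueError("each keypoint needs one upper and one lower landmark")
    return list(upper) + list(reversed(lower))


if __name__ == "__main__":
    # Hypothetical landmarks for a slightly curved text line.
    upper = [(0, 0), (1, -1), (2, 0)]
    lower = [(0, 2), (1, 1), (2, 2)]
    print(polygon_from_landmarks(upper, lower))
```

Walking one boundary forward and the other backward keeps the vertices in a consistent circular order, so the outline does not cross itself for a well-formed text line.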