Fused Text Segmentation Networks for Multi-oriented Scene Text Detection
In this paper, we introduce a novel end-to-end framework for multi-oriented
scene text detection from an instance-aware semantic segmentation perspective.
We present Fused Text Segmentation Networks, which combine multi-level features
during feature extraction, since text instances may rely on finer feature
representations than general objects. The framework detects and segments text
instances jointly and simultaneously, leveraging the merits of both the semantic
segmentation task and the region-proposal-based object detection task. Without
involving any extra pipelines, our approach surpasses the current state of the
art on multi-oriented scene text detection benchmarks, reaching Hmean scores of
84.1% on ICDAR2015 Incidental Scene Text and 82.0% on MSRA-TD500. Moreover,
we report a baseline on Total-Text, a dataset containing curved text, which
suggests the effectiveness of the proposed approach.
Comment: Accepted by ICPR201
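The Hmean figures quoted in the abstract above are harmonic means of precision and recall, the same quantity that other abstracts in this listing call the f-measure. As a minimal sketch, the computation looks like this; the function name `hmean` and the precision and recall values in the usage line are hypothetical and are not taken from the paper.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (also called F-measure or F1)."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected correctly.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision and recall, chosen only for illustration.
score = hmean(0.8, 0.6)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a detector cannot reach a high Hmean by trading away recall for precision or the reverse.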
Detecting Oriented Text in Natural Images by Linking Segments
Most state-of-the-art text detection methods are specific to horizontal Latin
text and are not fast enough for real-time applications. We introduce Segment
Linking (SegLink), an oriented text detection method. The main idea is to
decompose text into two locally detectable elements, namely segments and links.
A segment is an oriented box covering a part of a word or text line; a link
connects two adjacent segments, indicating that they belong to the same word or
text line. Both elements are detected densely at multiple scales by an
end-to-end trained, fully-convolutional neural network. Final detections are
produced by combining segments connected by links. Compared with previous
methods, SegLink improves along the dimensions of accuracy, speed, and ease of
training. It achieves an f-measure of 75.0% on the standard ICDAR 2015
Incidental (Challenge 4) benchmark, outperforming the previous best by a large
margin. It runs at over 20 FPS on 512x512 images. Moreover, without
modification, SegLink is able to detect long lines of non-Latin text, such as
Chinese.
Comment: To appear in CVPR 201
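The SegLink abstract above says that final detections come from combining segments connected by links. One simple way to picture the grouping part of that step is as connected components over a graph whose nodes are segments and whose edges are links. The sketch below uses a union-find structure for this; it is an illustration under that assumption, not the authors' implementation. The function name `group_segments` and the integer segment indices are hypothetical, and how each group of oriented boxes is merged into one output box is not described in the abstract, so it is left out.

```python
from collections import defaultdict


def group_segments(num_segments, links):
    """Group segment indices into connected components defined by links.

    num_segments: number of detected segments, indexed 0..num_segments-1.
    links: iterable of (a, b) index pairs, one per detected link.
    Returns a sorted list of groups, each a sorted list of segment indices.
    """
    parent = list(range(num_segments))

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    groups = defaultdict(list)
    for i in range(num_segments):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Hypothetical example: segments 0-1-2 form one word, 3-4 another.
words = group_segments(5, [(0, 1), (1, 2), (3, 4)])
```

In this example `words` holds two groups, one per word or text line. A segment with no links ends up in a group of its own.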
Cascaded Segmentation-Detection Networks for Word-Level Text Spotting
We introduce an algorithm for word-level text spotting that is able to
accurately and reliably determine the bounding regions of individual words of
text "in the wild". Our system is formed by the cascade of two convolutional
neural networks. The first network is fully convolutional and is in charge of
detecting areas containing text. This results in a very reliable but possibly
inaccurate segmentation of the input image. The second network (inspired by the
popular YOLO architecture) analyzes each segment produced in the first stage,
and predicts oriented rectangular regions containing individual words. No
post-processing (e.g. text line grouping) is necessary. With an execution time of
450 ms for a 1000-by-560 image on a Titan X GPU, our system achieves the
highest score to date among published algorithms on the ICDAR 2015 Incidental
Scene Text dataset benchmark.
Comment: 7 pages, 8 figures
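The abstract above describes the second network as predicting oriented rectangular regions. The sketch below shows how such a rectangle can be turned into four corner points for drawing or evaluation. It assumes a center, width, height, and rotation angle in radians as the parameterization; the abstract does not state which parameterization the authors use, and the function name `oriented_box_corners` is hypothetical.

```python
import math


def oriented_box_corners(cx, cy, w, h, theta):
    """Return the four corners of a rectangle rotated by theta radians.

    (cx, cy) is the center, w and h are the side lengths before rotation.
    Corners are listed in a consistent order around the rectangle,
    starting from the corner at (-w/2, -h/2) in the unrotated frame.
    """
    c, s = math.cos(theta), math.sin(theta)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset by theta, then translate to the box center.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


# Hypothetical box: 4 wide, 2 tall, centered at the origin, not rotated.
corners = oriented_box_corners(0.0, 0.0, 4.0, 2.0, 0.0)
```

With `theta` set to zero the result is the ordinary axis-aligned box, which makes the function easy to check by hand.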
- …