Text extraction in natural scenes using region-based method
Text in images is an important cue for image indexing and retrieval. Unfortunately, accurately and robustly extracting text from an image with a complex background is a challenging task. In this paper, a novel region-based text extraction method is proposed. First, candidate text regions are detected by an 8-connected object detection algorithm applied to the edge image. Then the non-text regions are filtered out using shape, texture and stroke width rules. Finally, the remaining regions are grouped into text lines. Since stroke width is an intrinsic and distinctive characteristic of text, the accuracy of the non-text filter is notably improved. The improved Stroke Width Transform proposed in the paper has lower computational complexity and is more accurate. Experimental results on a sample ICDAR competition dataset and our own dataset show that the proposed method performs best compared with five other methods.
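The candidate-detection step described in this abstract can be illustrated with a minimal sketch of 8-connected component labeling on a binary edge image. This is not the paper's implementation. The function name `label_8_connected`, the list-of-lists image representation, and the bounding-box output format are assumptions made for illustration only. The shape, texture and stroke width filters and the text-line grouping are not shown.

```python
from collections import deque


def label_8_connected(edge):
    """Return one bounding box (top, left, bottom, right) per 8-connected
    component of non-zero pixels in a binary edge image (list of rows)."""
    rows = len(edge)
    cols = len(edge[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not edge[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited edge pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                # Visit all 8 neighbours, diagonals included.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if ((dy or dx)
                                and 0 <= ny < rows and 0 <= nx < cols
                                and edge[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy edge image: the two top-left pixels touch only diagonally.
edge = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
boxes = label_8_connected(edge)
print(boxes)
```

In the toy image, the two diagonally adjacent pixels form a single candidate region under 8-connectivity, whereas 4-connectivity would split them into two. Each resulting bounding box would then be a candidate passed on to the non-text filters.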
An end-to-end, interactive Deep Learning based Annotation system for cursive and print English handwritten text
With the surging inclination towards carrying out tasks on computational
devices and digital media, any method that converts a task that was
previously carried out manually into a digitized version is always welcome.
Although many documentation tasks can be done online today,
there are still many applications and domains where handwritten text is
unavoidable, which makes the digitization of handwritten documents an
essential task. Over the past decades, there has been extensive research on
offline handwritten text recognition. In the recent past, most of these
attempts have shifted to Machine learning and Deep learning based approaches.
In order to design more complex and deeper networks, and ensure stellar
performances, it is essential to have larger quantities of annotated data. Most
of the databases available for offline handwritten text recognition today have
either been manually annotated or semi-automatically annotated with a lot of
manual involvement. These processes are very time-consuming and prone to human
error. To tackle this problem, we present an innovative, complete end-to-end
pipeline that annotates offline handwritten manuscripts written in both print
and cursive English, using Deep Learning and User Interaction techniques. This
novel method, which involves an architectural combination of a detection system
built upon a state-of-the-art text detection model, and a custom made Deep
Learning model for the recognition system, is combined with an easy-to-use
interactive interface, aiming to improve the accuracy of the detection,
segmentation, serialization and recognition phases, in order to ensure high
quality annotated data with minimal human interaction. Comment: 17 pages, 8 figures, 2 tables