    A New Approach for Text String Detection from Natural Scenes By Grouping & Partition

    In this paper we review and analyze different methods for finding strings of characters in natural scene images. The techniques reviewed include: extraction of character-string regions from scenery images based on the contours and thickness of characters; an efficient binarization and enhancement technique followed by a suitable connected component analysis procedure; text string detection from natural scenes by structure-based partition and grouping; and a robust algorithm for text detection in images. It is assumed that characters have closed contours and that a character string usually consists of characters lying on a straight line. Therefore, character-string regions can be extracted by finding closed contours and searching their neighbors. The binarization technique successfully processes natural scene images with shadows, non-uniform illumination, low contrast, and large signal-dependent noise. Connected component analysis is then used to produce the final binary images, which consist mainly of text regions. One technique selects candidate text characters from the connected components using gradient and color features. The text-line grouping method performs a Hough transform to fit text lines among the centroids of the text candidates; each fitted text line describes the orientation of a potential text string. A detected text string is represented by a rectangular region covering all characters whose centroids lie along its text line. To improve efficiency and accuracy, the algorithms are carried out at multiple scales. The proposed methods outperform state-of-the-art results on the public Robust Reading Dataset, which contains text only in horizontal orientation. Furthermore, the effectiveness of the methods in detecting text strings with arbitrary orientations is evaluated on the authors' self-collected Oriented Scene Text Dataset, which contains text strings in non-horizontal orientations.
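The Hough-transform grouping step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: each candidate-character centroid votes for every (theta, rho) line bin passing through it, and the most-voted bin yields the points of a candidate text line. The function name, bin resolutions, and toy centroids are all made up for illustration.

```python
import math
from collections import defaultdict

def hough_group(centroids, n_theta=180, rho_res=5.0):
    """Vote each centroid into (theta, rho) accumulator bins; the
    bin with the most votes corresponds to the dominant text line.
    Returns the centroids that voted for that bin."""
    acc = defaultdict(list)
    for (x, y) in centroids:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # normal-form line equation: rho = x*cos(theta) + y*sin(theta)
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(t, round(rho / rho_res))].append((x, y))
    best = max(acc, key=lambda k: len(acc[k]))
    return sorted(set(acc[best]))

# three collinear centroids on the line y = 10, plus one outlier
pts = [(0, 10), (20, 10), (40, 10), (5, 50)]
line = hough_group(pts)  # the outlier is excluded from the fitted line
```

A real implementation would repeat this to extract multiple lines (removing grouped points each round) and would tune `rho_res` to the expected character spacing.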

    Rotation-invariant features for multi-oriented text detection in natural images.

    Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human-computer interaction. However, most existing systems are designed to detect and recognize only horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations in natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed to capture the intrinsic characteristics of text. To better evaluate the proposed method and compare it with competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol that is better suited for benchmarking algorithms that detect texts of varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with state-of-the-art algorithms when handling horizontal texts and achieves significantly better performance on multi-oriented texts in complex natural scenes.
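The key property of a rotation-invariant feature, as used above, can be illustrated with a toy descriptor. This is not the paper's feature set, only a minimal sketch of the idea: a normalized histogram of pixel distances to the component centroid is unchanged when the component is rotated, because rotation preserves distances.

```python
import math

def radial_histogram(points, n_bins=8):
    """Illustrative rotation-invariant descriptor (not the paper's
    actual features): histogram of pixel distances to the component
    centroid, normalized by the maximum radius. Rotating the point
    set leaves all centroid distances, and hence the histogram,
    unchanged."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    dists = [math.hypot(x - cx, y - cy) for (x, y) in points]
    m = max(dists) or 1.0
    hist = [0] * n_bins
    for r in dists:
        # clamp the maximum-radius point into the last bin
        hist[min(int(r / m * n_bins), n_bins - 1)] += 1
    return [h / len(points) for h in hist]

# an L-shaped component and the same component rotated by 90 degrees
pts = [(0, 0), (1, 0), (2, 0), (0, 1)]
rot = [(-y, x) for (x, y) in pts]
```

Descriptors of this kind let a classifier score character candidates identically regardless of the text's orientation, which is what makes a multi-oriented detector possible.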

    Cascaded Segmentation-Detection Networks for Word-Level Text Spotting

    Full text link
    We introduce an algorithm for word-level text spotting that is able to accurately and reliably determine the bounding regions of individual words of text "in the wild". Our system is formed by the cascade of two convolutional neural networks. The first network is fully convolutional and is in charge of detecting areas containing text. This results in a very reliable but possibly inaccurate segmentation of the input image. The second network (inspired by the popular YOLO architecture) analyzes each segment produced in the first stage and predicts oriented rectangular regions containing individual words. No post-processing (e.g. text line grouping) is necessary. With an execution time of 450 ms for a 1000-by-560 image on a Titan X GPU, our system achieves the highest score to date among published algorithms on the ICDAR 2015 Incidental Scene Text dataset benchmark.
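The hand-off between the two stages of such a cascade can be sketched without any neural network. The stage-1 segmentation is just a binary mask; what stage 2 needs is one crop per connected region of that mask. The sketch below (illustrative only; the function name and the tiny mask are made up) groups foreground pixels into 4-connected regions and returns each region's bounding box, which in the real cascade would be cropped and fed to the word detector.

```python
from collections import deque

def segment_boxes(mask):
    """Group foreground pixels of a binary segmentation mask into
    4-connected regions via BFS flood fill, returning each region's
    axis-aligned bounding box as (x0, y0, x1, y1), inclusive."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                # flood-fill one region, tracking its extent
                q = deque([(x, y)])
                seen[y][x] = True
                x0 = x1 = x
                y0 = y1 = y
                while q:
                    cx, cy = q.popleft()
                    x0, x1 = min(x0, cx), max(x1, cx)
                    y0, y1 = min(y0, cy), max(y1, cy)
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                                   (cx, cy + 1), (cx, cy - 1)):
                        if 0 <= nx < w and 0 <= ny < h \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((nx, ny))
                boxes.append((x0, y0, x1, y1))
    return boxes

# toy stage-1 mask with two separate text regions
mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1]]
regions = segment_boxes(mask)
```

Because stage 2 predicts oriented word boxes inside each crop, the cascade can tolerate this coarse, axis-aligned first stage, which is why no text-line grouping post-processing is needed.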