
    New Moments Based Fuzzy Similarity Measure for Text Detection in Distorted Social Media Images

    Capturing or filming images with a cellphone and sharing them on social media has become part and parcel of everyday human activity. When an image is forwarded many times on social media, it may become heavily distorted by the different devices it passes through. This work deals with text detection in such distorted images. We consider images passed through three mobile devices on the WhatsApp social media platform, which results in four images (including the original image). Unlike existing methods, which aim at developing new detection techniques, we use the results produced by existing detectors to improve performance. The proposed method extracts Hu moments and applies fuzzy logic to the texts detected in the images. The similarity between the text detection results given by three existing text detection methods is studied to determine the best pair of texts. The same similarity estimation is then used in a novel way to remove extra background or non-text regions and to restore missing text information. Experimental results on our own dataset and on benchmark natural scene image datasets, namely MSRA-TD500, ICDAR2017-MLT, Total-Text, CTW1500 and COCO, show that the proposed method outperforms the existing methods.
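The abstract pairs Hu moments with a fuzzy similarity measure but does not give formulas. The sketch below computes the seven standard Hu moment invariants of a binary text mask and scores similarity with a Gaussian fuzzy membership over the moment distance; the membership function and its `width` parameter are assumptions for illustration, not the paper's exact measure.

```python
import numpy as np

def hu_moments(mask):
    """Seven standard Hu moment invariants of a binary mask
    (translation-, scale-, and rotation-invariant shape features)."""
    ys, xs = np.nonzero(mask)
    m00 = float(len(xs))
    xbar, ybar = xs.mean(), ys.mean()
    def eta(p, q):  # scale-normalized central moment
        return np.sum((xs - xbar) ** p * (ys - ybar) ** q) / m00 ** (1 + (p + q) / 2)
    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    h1 = n20 + n02
    h2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    h3 = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2
    h4 = (n30 + n12) ** 2 + (n21 + n03) ** 2
    h5 = ((n30 - 3 * n12) * (n30 + n12) * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
          + (3 * n21 - n03) * (n21 + n03) * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    h6 = ((n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)
          + 4 * n11 * (n30 + n12) * (n21 + n03))
    h7 = ((3 * n21 - n03) * (n30 + n12) * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
          - (n30 - 3 * n12) * (n21 + n03) * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    return np.array([h1, h2, h3, h4, h5, h6, h7])

def fuzzy_similarity(hu_a, hu_b, width=1.0):
    """Gaussian fuzzy membership over the Hu-moment distance (an assumed
    form): 1.0 for identical shapes, decaying toward 0 as shapes diverge."""
    d = np.linalg.norm(hu_a - hu_b)
    return float(np.exp(-(d / width) ** 2))
```

Because Hu invariants are translation-invariant, two copies of the same detected text region at different image positions score a similarity of 1.0, which is the behavior needed for matching detections across forwarded versions of an image.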

    A new context-based method for restoring occluded text in natural scene images

    Text recognition from natural scene images is an active research area because of its important real-world applications, including multimedia search and retrieval, and scene understanding through computer vision. It is often the case that portions of text in images are missed due to occlusion by objects in the background. This paper therefore presents a method for restoring occluded text to improve text recognition performance. The proposed method uses the Google Vision API to obtain labels for input images. We propose to use PixelLink-E2E methods for detecting text and obtaining recognition results. Using these results, the proposed method generates candidate words based on distance measures, employing lexicons created through natural scene text recognition. We extract the semantic similarity between labels and recognition results, which yields a Global Context Score (GCS). Next, we use the Natural Language Processing (NLP) system known as BERT to extract semantics between candidate words, which yields a Local Context Score (LCS). The global and local context scores are then fused to estimate a ranking for each candidate word. The word with the highest ranking is taken as the correction for the text occluded in the image. Experimental results on a dataset assembled from standard natural scene datasets and our own resources show that our approach improves text recognition performance significantly.
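The fusion step above can be sketched in a few lines. The abstract only says the two scores are "fused"; the weighted sum with parameter `alpha` below is an assumed fusion rule for illustration, and the candidate words and scores are toy values, not the paper's data.

```python
def rank_candidates(candidates, gcs, lcs, alpha=0.5):
    """Fuse the Global Context Score (GCS, from image labels) and the
    Local Context Score (LCS, from BERT) per candidate word, then return
    the candidates sorted by the fused score, best first.
    The weighted-sum fusion is an assumption, not the paper's formula."""
    fused = {w: alpha * gcs[w] + (1 - alpha) * lcs[w] for w in candidates}
    return sorted(candidates, key=lambda w: fused[w], reverse=True)

# Toy example: two candidate completions for a partially occluded word.
candidates = ["street", "stream"]
gcs = {"street": 0.9, "stream": 0.2}  # image labels suggest a road scene
lcs = {"street": 0.6, "stream": 0.8}  # language model slightly prefers "stream"
best = rank_candidates(candidates, gcs, lcs)[0]  # "street" wins: 0.75 vs 0.50
```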

    Multi-Script-Oriented Text Detection and Recognition in Video/Scene/Born Digital Images

    Achieving good text detection and recognition results for multi-script-oriented images is a challenging task. First, we explore bit plane slicing in order to exploit the most significant bit information to identify text components. A new iterative nearest neighbor symmetry is then proposed, based on the shapes of convex and concave deficiencies of text components in bit planes, to identify candidate planes. Further, we introduce a new concept called mutual nearest neighbor pair components, based on gradient direction, to identify representative pairs of texts in each candidate bit plane. The representative pairs are used to restore words with the help of the edge image of the input, which yields the text detection results (words). Second, we propose a new idea of fixing a window for the character components of arbitrarily oriented words based on the angular relationship between sub-bands and a fused band. For each window, we extract features in the contourlet wavelet domain to detect characters with the help of an SVM classifier. Further, we propose to explore an HMM for recognizing characters and words of any orientation using the same feature vector. The proposed method is evaluated on standard databases such as ICDAR, YVT video, ICDAR, SVT, MSRA scene data, ICDAR born digital data, and multi-lingual data to show its superiority to state-of-the-art methods. © 1991-2012 IEEE
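Bit plane slicing, the first step above, decomposes an 8-bit grayscale image into eight binary planes; the most significant plane (bit 7) tends to preserve high-contrast structures such as text strokes. A minimal sketch (the slicing itself is standard; which planes the paper then keeps as candidates depends on its nearest neighbor symmetry criterion, not shown here):

```python
import numpy as np

def bit_planes(gray):
    """Slice an 8-bit grayscale image into its 8 binary bit planes.
    planes[7] is the most significant bit: 1 where pixel value >= 128."""
    return [((gray >> b) & 1).astype(np.uint8) for b in range(8)]

# A bright (200) and a dark (10) pixel separate cleanly in the MSB plane.
gray = np.array([[200, 10]], dtype=np.uint8)
planes = bit_planes(gray)
```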
