36 research outputs found

    A Novel Dataset for English-Arabic Scene Text Recognition (EASTR)-42K and Its Evaluation Using Invariant Feature Extraction on Detected Extremal Regions

    © 2019 IEEE. The recognition of text in natural scene images is a practical yet challenging task due to the large variations in backgrounds, textures, fonts, and illumination. English is used extensively as a secondary language in Gulf countries alongside Arabic script. This paper therefore introduces the English-Arabic Scene Text Recognition (EASTR)-42K dataset of scene text images. The dataset includes text images in English and Arabic scripts, with the primary focus on Arabic script. It can be employed to evaluate text segmentation and recognition tasks. To provide insight to other researchers, experiments were carried out on the segmentation and classification of Arabic as well as English text, with reported error rates of 5.99% and 2.48%, respectively. The paper presents a novel technique that uses an adapted maximally stable extremal region (MSER) method and extracts scale-invariant features from the MSER-detected regions. To select discriminant and comprehensive features, the size of the invariant features is restricted and only those features that lie within the extremal region are considered. An adapted MDLSTM network is presented to tackle the complexities of cursive scene text. Research on Arabic scene text is in its infancy, so this paper presents benchmark work in the field of text analysis.

    Evaluation of handwritten Urdu text by integration of MNIST dataset learning experience

    © 2019 IEEE. Patterns of a similar nature may enhance learning if the experience gained during training is reused to maximize accuracy. This paper presents a novel way to exploit the transfer learning experience of similar patterns in handwritten Urdu text analysis. An MNIST pre-trained network is employed by transferring its learning experience to Urdu Nastaliq Handwritten Dataset (UNHD) samples. A convolutional neural network is used for feature extraction. The experiments were performed using deep multidimensional long short-term memory (MDLSTM) networks. The results show strong performance across a number of experiments distinguished by handwriting complexity. They indicate that the pre-trained network benefits the subsequent target networks, enabling them to focus on learning particular features. The experiments achieved very good accuracy on the UNHD dataset.

    Hierarchy Composition GAN for High-fidelity Image Synthesis

    Despite the rapid progress of generative adversarial networks (GANs) in image synthesis in recent years, existing image synthesis approaches work in either the geometry domain or the appearance domain alone, which often introduces various synthesis artifacts. This paper presents an innovative Hierarchical Composition GAN (HIC-GAN) that incorporates image synthesis in the geometry and appearance domains into an end-to-end trainable network and achieves superior synthesis realism in both domains simultaneously. We design an innovative hierarchical composition mechanism that is capable of learning realistic composition geometry and handling occlusions when multiple foreground objects are involved in image composition. In addition, we introduce a novel attention mask mechanism that guides the adaptation of the appearance of foreground objects and also helps to provide a better training reference for learning in the geometry domain. Extensive experiments on scene text image synthesis, portrait editing and indoor rendering tasks show that the proposed HIC-GAN achieves superior synthesis performance qualitatively and quantitatively. Comment: 11 pages, 8 figures

    Recognizing Handwriting Styles in a Historical Scanned Document Using Unsupervised Fuzzy Clustering

    The forensic attribution of the handwriting in a digitized document to multiple scribes is a challenging problem of high dimensionality. Unique handwriting styles may differ in a blend of several factors, including character size, stroke width, loops, ductus, slant angles, and cursive ligatures. Previous work on labeled data with hidden Markov models, support vector machines, and semi-supervised recurrent neural networks has achieved moderate to high success. In this study, we successfully detect hand shifts in a historical manuscript through fuzzy soft clustering in combination with linear principal component analysis. This advance demonstrates the successful deployment of unsupervised methods for writer attribution of historical documents and forensic document analysis. Comment: 26 pages in total, 5 figures and 2 tables
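
    The abstract above does not say which fuzzy clustering algorithm the authors use. As an illustration only, the following is a minimal sketch of standard fuzzy c-means, which assigns each sample a soft membership in every cluster. The function name `fuzzy_c_means`, the fuzzifier `m = 2.0`, and the toy 2-D points are assumptions for this example; the principal component analysis step mentioned in the abstract is omitted.

```python
import math

def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Standard fuzzy c-means from given initial centers (illustrative sketch).

    Returns (centers, memberships), where memberships[i][k] is the degree to
    which points[i] belongs to cluster k; each row sums to 1.
    """
    centers = [list(c) for c in centers]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1))
        memberships = []
        for p in points:
            # Clamp distances so a point sitting on a center does not divide by zero.
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            memberships.append(
                [1.0 / sum((dk / dj) ** exponent for dj in dists) for dk in dists]
            )
        # Center update: mean of the points weighted by membership^m.
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            centers[k] = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(len(points[0]))
            ]
    return centers, memberships

# Hypothetical toy data: two well-separated groups of 2-D feature vectors.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
centers, memberships = fuzzy_c_means(points, [(0.0, 0.0), (5.0, 5.0)])
```

    In a writer-attribution setting, each point would be a feature vector for a page or text line, and a change in the dominant membership along the manuscript would suggest a possible hand shift.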

    Arabic/Latin and Machine-printed/Handwritten Word Discrimination using HOG-based Shape Descriptor

    In this paper, we present an approach for identifying Arabic and Latin script and its type (machine-printed or handwritten) based on Histogram of Oriented Gradients (HOG) descriptors. HOGs are first applied at word level based on writing orientation analysis. Then, they are extended to word image partitions to capture fine and discriminative details. Pyramid HOGs are also used to study their effects at different observation levels of the image. Finally, co-occurrence matrices of HOG are computed to capture spatial information between pairs of pixels, which is not taken into account in basic HOG. A genetic algorithm is applied to select the potentially informative feature combinations that maximize the classification accuracy. The output is a relatively short descriptor that provides an effective input to a Bayes-based classifier. Experimental results on a set of words extracted from standard databases show that our identification system is robust and provides good word script and type identification: 99.07% of words are correctly classified.
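
    To make the basic HOG idea concrete, here is a minimal sketch of an orientation histogram for a single image cell. It is a simplified illustration, not the descriptor from the paper: it has no block normalisation, bin interpolation, pyramid levels, or co-occurrence matrices, and the function name `hog_cell_histogram` and the choice of 9 unsigned bins are assumptions.

```python
import math

def hog_cell_histogram(cell, n_bins=9):
    """L2-normalised histogram of unsigned gradient orientations for one cell.

    `cell` is a 2-D list of grey-level values. Gradients use central
    differences on interior pixels; each pixel votes with its gradient
    magnitude into the bin covering its orientation in [0, 180) degrees.
    """
    rows, cols = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region: no orientation to vote for
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The trailing modulo guards against an angle that rounds up to 180.0.
            hist[int(angle // bin_width) % n_bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# Hypothetical 4x4 cells: intensity changing left-to-right and top-to-bottom.
left_to_right = [[0, 0, 1, 1] for _ in range(4)]
top_to_bottom = [[0] * 4, [0] * 4, [1] * 4, [1] * 4]
h_lr = hog_cell_histogram(left_to_right)
h_tb = hog_cell_histogram(top_to_bottom)
```

    The two toy cells place all of their votes in different bins, which is the property that lets histograms of this kind separate stroke directions; a word-level descriptor would concatenate such histograms over many cells.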

    Deep Sparse Auto-Encoder Features Learning for Arabic Text Recognition

    Arabic text recognition is one of the most challenging current problems in pattern recognition and artificial intelligence. The topic remains an open and not fully addressed research field because of several factors. Complications arise from the cursive nature of Arabic writing, character similarities, unlimited vocabulary, the use of multiple sizes and mixed fonts, etc. To handle these challenges, automatic Arabic text recognition requires building a robust system that computes discriminative features and applies a rigorous classifier to achieve improved performance. In this work, we introduce a new deep learning based system that recognizes Arabic text contained in images. We propose a novel hybrid network combining a Bag-of-Features (BoF) framework for feature extraction, based on a deep Sparse Auto-Encoder (SAE), with Hidden Markov Models (HMMs) for sequence recognition. Our proposed system, termed BoF-deep SAE-HMM, is tested on four datasets: the printed Arabic line images of Printed KHATT (P-KHATT), the benchmark printed word images of Arabic Printed Text Image (APTI), the benchmark handwritten Arabic word images of IFN/ENIT, and the benchmark handwritten digit images of the Modified National Institute of Standards and Technology (MNIST) database.
