    Handwritten Bangla Character Recognition Using The State-of-Art Deep Convolutional Neural Networks

    In spite of advances in object recognition technology, Handwritten Bangla Character Recognition (HBCR) remains largely unsolved due to the presence of many ambiguous handwritten characters and excessively cursive Bangla handwriting. Even the best existing recognizers do not lead to satisfactory performance for practical applications related to Bangla character recognition and have much lower performance than those developed for English alpha-numeric characters. To improve the performance of HBCR, we herein present the application of state-of-the-art Deep Convolutional Neural Networks (DCNN), including VGG Network, All Convolution Network (All-Conv Net), Network in Network (NiN), Residual Network, FractalNet, and DenseNet, to HBCR. Deep learning approaches have the advantage of extracting and using feature information, improving the recognition of 2D shapes with a high degree of invariance to translation, scaling and other distortions. We systematically evaluated the performance of the DCNN models on the publicly available Bangla handwritten character dataset CMATERdb and achieved superior recognition accuracy when using DCNN models. This improvement would help in building an automatic HBCR system for practical applications. Comment: 12 pages, 22 figures, 5 tables. arXiv admin note: text overlap with arXiv:1705.0268

    Telugu OCR Framework using Deep Learning

    In this paper, we address the task of Optical Character Recognition (OCR) for the Telugu script. We present an end-to-end framework that segments the text image, classifies the characters and extracts lines using a language model. The segmentation is based on mathematical morphology. The classification module, which is the most challenging task of the three, is a deep convolutional neural network. The language is modelled as a third-degree Markov chain at the glyph level. Telugu script is a complex alphasyllabary and the language is agglutinative, making the problem hard. In this paper we apply the latest advances in neural networks to achieve state-of-the-art error rates. We also review convolutional neural networks in great detail and expound the statistical justification behind the many tricks needed to make Deep Learning work.
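
The glyph-level language model in the Telugu OCR abstract can be illustrated with a small count-based sketch. This is not the authors' implementation: the function names, the padding symbols, the add-one smoothing, and the use of Latin letters in place of Telugu glyph labels are all assumptions made for illustration. It only shows what conditioning each glyph on the three preceding glyphs looks like.

```python
from collections import defaultdict
import math

ORDER = 3                   # each glyph is conditioned on the three glyphs before it
START, END = "<s>", "</s>"  # hypothetical padding symbols, not from the paper


def train(sequences):
    """Count how often each glyph follows each three-glyph context."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = {END}
    for seq in sequences:
        padded = [START] * ORDER + list(seq) + [END]
        vocab.update(seq)
        for i in range(ORDER, len(padded)):
            context = tuple(padded[i - ORDER:i])
            counts[context][padded[i]] += 1
    return counts, vocab


def log_prob(seq, counts, vocab):
    """Log-probability of a glyph sequence, with add-one smoothing (an assumption)."""
    padded = [START] * ORDER + list(seq) + [END]
    total = 0.0
    for i in range(ORDER, len(padded)):
        context = tuple(padded[i - ORDER:i])
        seen = counts.get(context, {})
        numerator = seen.get(padded[i], 0) + 1
        denominator = sum(seen.values()) + len(vocab)
        total += math.log(numerator / denominator)
    return total


# Toy training data: letters stand in for glyph labels.
counts, vocab = train(["abcd", "abce", "abcd"])
```

In a pipeline like the one described, such a score could be used to rank competing classifier outputs for a line, preferring glyph sequences that the model has seen more often.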

    Indic Handwritten Script Identification using Offline-Online Multimodal Deep Network

    In this paper, we propose a novel approach to word-level Indic script identification that uses only character-level data in the training stage. The advantages of using character-level data for training are outlined in section I. Our method uses a multimodal deep network that takes both the offline and online modalities of the data as input in order to exploit information from both modalities jointly for the script identification task. We take handwritten data in either modality as input, and the opposite modality is generated through intermodality conversion. Thereafter, we feed this offline-online modality pair to our network. Hence, along with the advantage of utilizing information from both modalities, it can work as a single framework for both offline and online script identification, which removes the need to design two separate script identification modules, one for each modality. A further major contribution is a novel conditional multimodal fusion scheme that combines the information from the offline and online modalities adaptively, taking into account the real origin of the data being fed to our network. Exhaustive experiments were carried out on a data set consisting of English and six Indic scripts. Our proposed framework outperforms, by a clear margin, both frameworks based on traditional classifiers with handcrafted features and deep learning based methods. Extensive experiments show that, in our framework, using only character-level training data can achieve state-of-the-art performance similar to that obtained with traditional training on word-level data. Comment: Accepted in Information Fusion, Elsevier

    Large Scale Font Independent Urdu Text Recognition System

    OCR algorithms have improved significantly in recent years, mainly due to the increase in the capabilities of artificial intelligence algorithms. However, this advancement is not evenly distributed over all languages. Urdu is among the languages that have not received much attention, especially from the font-independent perspective. There exists no automated system that can reliably recognize printed Urdu text in images and videos across different fonts. To help bridge this gap, we have developed Qaida, a large scale data set with 256 fonts and a complete Urdu lexicon. We have also developed a Convolutional Neural Network (CNN) based classification model which can recognize Urdu ligatures with 84.2% accuracy. Moreover, we demonstrate that our recognition network can not only recognize text in the fonts it is trained on but can also reliably recognize text in unseen (new) fonts. To this end, this paper makes the following contributions: (i) we introduce a large scale, multi-font data set for printed Urdu text recognition; (ii) we have designed, trained and evaluated a CNN based model for Urdu text recognition; (iii) we experiment with incremental learning methods to produce state-of-the-art results for Urdu text recognition. All the experiment choices were thoroughly validated via detailed empirical analysis. We believe that this study can serve as the basis for further improvement in the performance of font-independent Urdu OCR systems.

    Handwritten Bangla Digit Recognition Using Deep Learning

    In spite of the advances in pattern recognition technology, Handwritten Bangla Character Recognition (HBCR) (such as alpha-numeric and special characters) remains largely unsolved due to the presence of many perplexing characters and excessive cursive in Bangla handwriting. Even the best existing recognizers do not lead to satisfactory performance for practical applications. To improve the performance of Handwritten Bangla Digit Recognition (HBDR), we herein present a new approach based on deep neural networks, which have recently shown excellent performance in many pattern recognition and machine learning applications but have not been thoroughly attempted for HBDR. We introduce Bangla digit recognition techniques based on Deep Belief Network (DBN), Convolutional Neural Networks (CNN), CNN with dropout, CNN with dropout and Gaussian filters, and CNN with dropout and Gabor filters. These networks have the advantage of extracting and using feature information, improving the recognition of two-dimensional shapes with a high degree of invariance to translation, scaling and other pattern distortions. We systematically evaluated the performance of our method on the publicly available Bangla numeral image database CMATERdb 3.1.1. In our experiments, we achieved a 98.78% recognition rate using the proposed method, CNN with Gabor features and dropout, which outperforms the state-of-the-art algorithms for HBDR. Comment: 12 pages, 10 figures, 3 tables

    Unsupervised Feature Learning for Writer Identification and Writer Retrieval

    Deep Convolutional Neural Networks (CNN) have shown great success in supervised classification tasks such as character classification or dating. Deep learning methods typically need a lot of annotated training data, which is not available in many scenarios. In these cases, traditional methods are often better than or equivalent to deep learning methods. In this paper, we propose a simple, yet effective, way to learn CNN activation features in an unsupervised manner. To this end, we train a deep residual network using surrogate classes. The surrogate classes are created by clustering the training dataset, where each cluster index represents one surrogate class. The activations from the penultimate CNN layer serve as features for subsequent classification tasks. We evaluate the feature representations on two publicly available datasets. The focus lies on the ICDAR17 competition dataset on historical document writer identification (Historical-WI). We show that the activation features trained without supervision are superior to descriptors of state-of-the-art writer identification methods. Additionally, we achieve comparable results in the case of handwriting classification using the ICFHR16 competition dataset on historical Latin script types (CLaMM16). Comment: ICDAR2017 camera ready (fixed p@2 values, missing table references)
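
The surrogate-class idea in the writer identification abstract, where each cluster index becomes a training label, can be sketched as follows. The abstract does not name the clustering algorithm or say which descriptors are clustered, so plain k-means on toy 2D points, the function name `kmeans_labels`, and the deterministic initialization are all assumptions chosen to keep the example small.

```python
def kmeans_labels(points, k, iterations=20):
    """Cluster points with plain k-means and return one cluster index per point.

    The returned indices play the role of surrogate class labels.
    Initialization uses the first k points so the result is deterministic.
    """
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda j: sum((x - y) ** 2 for x, y in zip(p, centers[j])),
            )
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy "descriptors": two well-separated groups of 2D points.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1), (5.0, 5.1)]
surrogate_labels = kmeans_labels(points, k=2)
```

In the setting the abstract describes, labels of this kind would serve as classification targets for the network in place of human annotations.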

    Deep learning for word-level handwritten Indic script identification

    We propose a novel method that uses convolutional neural networks (CNNs) for feature extraction. Rather than limiting ourselves to the conventional spatial domain representation, we use a multilevel 2D discrete Haar wavelet transform, in which image representations are scaled to a variety of different sizes. These are then used to train different CNNs to select features. To be precise, we use 10 different CNNs that together select a set of 10240 features, i.e. 1024 per CNN. With this, 11 different handwritten scripts are identified, where 1K words per script are used. In our test, we achieved a maximum script identification rate of 94.73% using a multi-layer perceptron (MLP). Our results outperform the state-of-the-art techniques. Comment: 11 pages, 6 figures, 2 tables
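
A multilevel 2D Haar decomposition of the kind mentioned in the Indic script abstract can be sketched in plain Python. This is an illustrative sketch, not the paper's pipeline: the function names are hypothetical, the coefficients use simple averaging (division by 4) rather than the orthonormal scaling, and the input is assumed to be a 2D list whose height and width are divisible by 2 at every level.

```python
def haar_level(image):
    """One 2D Haar step. Returns (approx, row_diff, col_diff, diag) sub-bands,
    each half the height and half the width of the input."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    approx = [[0.0] * cols for _ in range(rows)]
    row_diff = [[0.0] * cols for _ in range(rows)]
    col_diff = [[0.0] * cols for _ in range(rows)]
    diag = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            a, b = image[2 * r][2 * c], image[2 * r][2 * c + 1]
            d, e = image[2 * r + 1][2 * c], image[2 * r + 1][2 * c + 1]
            approx[r][c] = (a + b + d + e) / 4    # local average
            row_diff[r][c] = (a + b - d - e) / 4  # top row minus bottom row
            col_diff[r][c] = (a - b + d - e) / 4  # left column minus right column
            diag[r][c] = (a - b - d + e) / 4      # diagonal difference
    return approx, row_diff, col_diff, diag


def haar_pyramid(image, levels):
    """Apply haar_level repeatedly to the approximation band.
    Each entry of the result is a progressively smaller representation."""
    pyramid = []
    current = image
    for _ in range(levels):
        bands = haar_level(current)
        pyramid.append(bands)
        current = bands[0]
    return pyramid


flat = [[8.0] * 4 for _ in range(4)]    # constant 4x4 image
pyramid = haar_pyramid(flat, levels=2)  # 2x2 bands, then 1x1 bands
small = haar_level([[1.0, 2.0], [3.0, 4.0]])
```

Each level halves the image size, which matches the abstract's description of representations at several scales. The stated feature count is simple arithmetic: 10 CNNs times 1024 features each gives 10240.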

    PHOCNet: A Deep Convolutional Neural Network for Word Spotting in Handwritten Documents

    In recent years, deep convolutional neural networks have achieved state-of-the-art performance in various computer vision tasks such as classification, detection or segmentation. Due to their outstanding performance, CNNs are increasingly used in the field of document image analysis as well. In this work, we present a CNN architecture that is trained with the recently proposed PHOC representation. We show empirically that our CNN architecture is able to outperform state-of-the-art results for various word spotting benchmarks while exhibiting short training and test times. Comment: published as conference paper at the International Conference on Frontiers in Handwriting Recognition 201

    Improving patch-based scene text script identification with ensembles of conjoined networks

    This paper focuses on the problem of script identification in scene text images. Tackling this problem with state-of-the-art CNN classifiers is not straightforward, as they fail to address a key characteristic of scene text instances: their extremely variable aspect ratio. Instead of resizing input images to a fixed aspect ratio as in the typical use of holistic CNN classifiers, we propose a patch-based classification framework in order to preserve discriminative parts of the image that are characteristic of its class. We describe a novel method based on the use of ensembles of conjoined networks to jointly learn discriminative stroke-part representations and their relative importance in a patch-based classification scheme. Our experiments with this learning procedure demonstrate state-of-the-art results on two public script identification datasets. In addition, we propose a new public benchmark dataset for the evaluation of multi-lingual scene text end-to-end reading systems. Experiments on this dataset demonstrate the key role of script identification in a complete end-to-end system that combines our script identification method with a previously published text detector and an off-the-shelf OCR engine.