18 research outputs found

    Unconstrained Scene Text and Video Text Recognition for Arabic Script

    Building robust recognizers for Arabic has always been challenging. We demonstrate the effectiveness of an end-to-end trainable CNN-RNN hybrid architecture in recognizing Arabic text in videos and natural scenes. We outperform the previous state of the art on two publicly available video text datasets: ALIF and ACTIV. For the scene text recognition task, we introduce a new Arabic scene text dataset and establish baseline results. For scripts like Arabic, a major challenge in developing robust recognizers is the lack of large quantities of annotated data. We overcome this by synthesising millions of Arabic text images from a large vocabulary of Arabic words and phrases. Our implementation is built on top of the model introduced in [37], which has proven quite effective for English scene text recognition. The model follows a segmentation-free, sequence-to-sequence transcription approach. The network transcribes a sequence of convolutional features from the input image to a sequence of target labels. This does away with the need to segment the input image into constituent characters/glyphs, which is often difficult for Arabic script. Further, the ability of RNNs to model contextual dependencies yields superior recognition results. Comment: 5 page

    Adaptation de modèles de Markov cachés - Application à la reconnaissance de caractères imprimés

    We present in this paper a new algorithm for the adaptation of hidden Markov models (HMMs). The principle of our iterative adaptive algorithm is to alternate an HMM structure adaptation stage with an HMM Gaussian MAP adaptation stage. This algorithm is applied to the recognition of printed characters to adapt the models learned by a polyfont character recognition engine to new forms of characters. Comparing the results with those of the classic MAP and MLLR adaptations shows a slight increase in the performance of the recognition system.

    Multi-Character Field Recognition for Arabic and Chinese Handwriting

    Two methods, Symbolic Indirect Correlation (SIC) and Style Constrained Classification (SCC), are proposed for recognizing handwritten Arabic and Chinese words and phrases. SIC reassembles variable-length segments of an unknown query that match similar segments of labeled reference words. Recognition is based on the correspondence between the order of the feature vectors and of the lexical transcript in both the query and the references. SIC implicitly incorporates language context in the form of letter n-grams. SCC is based on the notion that the style (distortion or noise) of a character is a good predictor of the distortions arising in other characters, even of a different class, from the same source. It is adaptive in the sense that with a long-enough field, its accuracy converges to that of a style-specific classifier trained on the writer of the unknown query. Neither SIC nor SCC requires the query words to appear among the references.

    Effect of Ghost Character Theory on Arabic Script Based Languages Character Recognition

    Arabic script is used by more than a quarter of the world's population in the form of different languages such as Arabic, Persian, Urdu, Sindhi and Pashto, but each language has its own word meanings. The set of ش has 58 alphabets. Character recognition for Arabic script based languages is a difficult task due to complexities in this script that do not exist in other scripts. The analysis of Arabic script is very complicated due to its use of diacritical marks associated with each character and because it is written in many fonts and styles. This script has gained very little attention from researchers. This paper presents a novel technique named Ghost Character Recognition Theory that will help to develop a multilanguage character recognition system for Arabic script based languages based on Ghost Character Theory. The main benefit of the proposed approach is that it will work for all Arabic script based languages by focusing effort on the ghost character (basic skeleton) and developing a dictionary for every language. Handling all Arabic script based languages raises issues such as recognition rate relative to systems built for specific languages, but in general this is not a big issue for a multilingual system, and in the end we obtain a multilingual character recognition system.

    Non-english and non-latin signature verification systems: A survey

    Signatures continue to be an important biometric because they remain widely used as a means of personal verification, and therefore an automatic verification system is needed. Manual signature-based authentication of a large number of documents is a difficult and time-consuming task. Consequently, for many years, in the field of protected communication and financial applications, we have observed an explosive growth in biometric personal authentication systems that are closely connected with measurable unique physical characteristics (e.g. hand geometry, iris scan, fingerprints or DNA) or behavioural features. Substantial research has been undertaken in the field of signature verification involving English signatures, but to the best of our knowledge, very few works have considered non-English signatures such as Chinese, Japanese, Arabic etc. In order to convey the state of the art in the field to researchers, in this paper we present a survey of non-English and non-Latin signature verification systems.

    Structural Features Extraction for Handwritten Arabic Personal Names Recognition

    Due to the nature of handwriting, with its high degree of variability and imprecision, obtaining features that represent words is a difficult task. In this research, a feature extraction method for handwritten Arabic word recognition is investigated. Its major goal is to maximize the recognition rate with the least number of elements. This method incorporates many characteristics of handwritten characters based on structural information (loops, stems, legs, diacritics). Experiments are performed on Arabic personal names extracted from registers of the national Tunisian archive and on some Tunisian city names from the IFN-ENIT database. The results obtained are encouraging and open other perspectives in the domain of feature and classifier selection for Arabic handwritten word recognition.