13 research outputs found

    Offline arabic character recognition using genetic approach

    Many optical character recognition (OCR) techniques and tools have been developed for a plurality of languages. A successful OCR system improves interactivity between humans and computers in many applications, such as digitising and recognising written content. With regard to Arabic OCR, handwriting recognition is challenging because Arabic letters are cursive and change shape depending on their position. OCR systems have reached nearly perfect recognition of Arabic printed text, yet recognition of handwritten text is still in its infancy and needs to be greatly improved. Therefore, in this study, an approach to recognising Arabic characters based on genetic algorithms (GA) is proposed. The approach consists of two separate stages: feature extraction and GA-based character recognition. In the feature extraction stage, six features are detected for each character and represented as a feature vector of six integers. The feature vectors are then used in the next stage. Three genetic operators, namely selection, crossover and mutation, are implemented to search for similar vectors with the best fitness value in order to recognise the character. The data used in this study were collected from different sources and stored in a database. They consist of 12,500 printed words in 50 paragraphs and 15,000 words written by 100 different writers, male and female, aged 5 to 60 years. Pre-processing operations include segmenting paragraphs into lines, lines into words, and words into characters, detecting the skeleton, and determining the baseline and other horizontal zones. The experimental results show that the proposed method achieves a promising recognition accuracy of 90.46% for printed text and handwritten characters.
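
The recognition stage described above can be illustrated with a minimal sketch. The abstract does not specify the chromosome encoding, the fitness function, or how the fittest vector is mapped to a character, so everything below is an assumption: the template labels and values are hypothetical, fitness is the negative L1 distance to the query vector, selection is a two-way tournament, and the final label is that of the stored template nearest to the fittest individual.

```python
import random

# Hypothetical reference database: label -> 6-integer feature vector.
TEMPLATES = {
    "alef": [1, 2, 3, 4, 5, 6],
    "beh": [9, 9, 9, 9, 9, 9],
    "jeem": [0, 7, 0, 7, 0, 7],
}


def distance(a, b):
    """L1 distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def fitness(candidate, query):
    """Assumed fitness: higher when the candidate is closer to the query."""
    return -distance(candidate, query)


def select(population, query, rng, k=2):
    """Tournament selection: the fitter of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=lambda c: fitness(c, query))


def crossover(parent_a, parent_b, rng):
    """Single-point crossover of two feature vectors."""
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:]


def mutate(candidate, rng, rate=0.1):
    """Nudge each gene by +/-1 with probability `rate`, keeping it non-negative."""
    return [
        max(0, gene + rng.choice((-1, 1))) if rng.random() < rate else gene
        for gene in candidate
    ]


def recognise(query, templates=TEMPLATES, generations=30, size=12, seed=0):
    """Evolve vectors towards the query, then return the nearest template label."""
    rng = random.Random(seed)
    seeds = list(templates.values())
    # Start from copies of the stored templates.
    population = [list(seeds[i % len(seeds)]) for i in range(size)]
    for _ in range(generations):
        best = max(population, key=lambda c: fitness(c, query))
        children = [best]  # elitism: the fittest individual always survives
        while len(children) < size:
            child = crossover(
                select(population, query, rng), select(population, query, rng), rng
            )
            children.append(mutate(child, rng))
        population = children
    best = max(population, key=lambda c: fitness(c, query))
    return min(templates, key=lambda label: distance(templates[label], best))
```

Calling `recognise([1, 2, 3, 5, 5, 6])` searches for the vector most similar to the query and reports the label of the closest stored template. With templates this well separated, the result does not depend on the random seed.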

    Applying Genetic Algorithm in Multi Language's Characters Recognition


    Arabic Printed Word Recognition Using Windowed Bernoulli HMMs

    Hidden Markov Models (HMMs) are now widely used for off-line text recognition in many languages and, in particular, Arabic. In previous work, we proposed directly using columns of raw, binary image pixels, fed into embedded Bernoulli (mixture) HMMs, that is, embedded HMMs in which the emission probabilities are modeled with Bernoulli mixtures. The idea was to bypass feature extraction and to ensure that no discriminative information is filtered out during feature extraction, which in some sense is integrated into the recognition model. More recently, we extended the column bit vectors by means of a sliding window of adequate width to better capture image context at each horizontal position of the word image. However, these models might have limited capability to properly model vertical image distortions. In this paper, we consider three methods of window repositioning after window extraction to overcome this limitation. Each sliding window is translated (repositioned) to align its center with the center of mass. Using this approach, state-of-the-art results are reported on the Arabic Printed Text Image (APTI) database.
    The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no 287755. Also supported by the Spanish Government (Plan E, iTrans2 TIN2009-14511 and AECID 2011/2012 grant).
    Alkhoury, I.; Giménez Pastor, A.; Juan Císcar, A.; Andrés Ferrer, J. (2013). Arabic Printed Word Recognition Using Windowed Bernoulli HMMs. Lecture Notes in Computer Science. 8156:330-339. https://doi.org/10.1007/978-3-642-41181-6_34
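
The windowing and repositioning idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not enumerate its three repositioning methods, so only a vertical translation is shown, and the function names, zero padding at the image borders, and rounding of the shift are all hypothetical choices. The image is a list of rows of 0/1 pixels.

```python
def extract_window(image, col, width):
    """Return the `width`-wide window centred on column `col`, zero-padded at borders."""
    half = width // 2
    n_cols = len(image[0])
    return [
        [row[c] if 0 <= c < n_cols else 0 for c in range(col - half, col - half + width)]
        for row in image
    ]


def reposition_vertically(window):
    """Shift rows so the centre of mass of ink pixels lies at the vertical centre."""
    height = len(window)
    mass = sum(sum(row) for row in window)
    if mass == 0:
        return window  # an empty window has no centre of mass; leave it unchanged
    centre_of_mass = sum(r * sum(row) for r, row in enumerate(window)) / mass
    shift = round((height - 1) / 2 - centre_of_mass)
    blank = [0] * len(window[0])
    # Rows shifted in from outside the window are filled with background pixels.
    return [
        list(window[r - shift]) if 0 <= r - shift < height else list(blank)
        for r in range(height)
    ]


def windowed_features(image, width=3):
    """One flattened, repositioned bit vector per horizontal position of the image."""
    features = []
    for col in range(len(image[0])):
        window = reposition_vertically(extract_window(image, col, width))
        features.append([bit for row in window for bit in row])
    return features
```

Each resulting bit vector has `height * width` binary components, which is the kind of observation a Bernoulli mixture emission model takes as input.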

    ICDAR 2009 Arabic Handwriting Recognition Competition

    This paper describes the Arabic handwriting recognition competition held at ICDAR 2009. This third competition (the first was at ICDAR 2005 and the second at ICDAR 2007) again used the IfN/ENIT-database of Arabic handwritten Tunisian town names. Today, more than 82 research groups from universities, research centers, and industry are working with this database worldwide. This year, 7 groups with 17 systems participated in the competition. The systems were tested on known data and on two data sets unknown to the participants. The systems were compared based on the most important characteristic: the recognition rate. Additionally, the relative speed of the different systems was compared. Finally, a short description of the participating groups, their systems, and the results achieved is presented.

    ICDAR 2009-Arabic handwriting recognition competition


    Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform

    In this research, an off-line handwriting recognition system for the Arabic alphabet is introduced. The system contains three main stages: preprocessing, segmentation and recognition. In the preprocessing stage, the Radon transform was used in the design of algorithms for page, line and word skew correction as well as for word slant correction. In the segmentation stage, a Hough transform approach was used for line extraction. For line-to-word and word-to-character segmentation, a statistical method using a mathematical representation of the binary images of lines and words was used. Unlike most current handwriting recognition systems, our system simulates the human mechanism for image recognition, where images are encoded and saved in memory as groups according to their similarity to each other. Characters are decomposed into coefficient vectors using the fast wavelet transform; the vectors that represent a character in its different possible shapes are then saved as groups, with one representative for each group. Recognition is achieved by comparing the vector of the character to be recognized with the group representatives. Experiments showed that the proposed system achieves the recognition task with 90.26% accuracy. The system needs at most 3.41 seconds to recognize a single character in a text of 15 lines, where each line has 10 words on average.
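
The recognition step described here, wavelet coefficient vectors compared against group representatives, can be sketched roughly as below. The abstract names neither the wavelet family nor the distance measure, so the Haar wavelet, the Euclidean distance, the row-by-row flattening of the character image, and the group labels are all assumptions made for illustration.

```python
import math


def haar_fwt(signal):
    """Full orthonormal Haar decomposition; len(signal) must be a power of two."""
    approx = list(signal)
    details_all = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        # Coarser-level details go in front of finer-level ones.
        details_all = details + details_all
    return approx + details_all


def nearest_group(vector, representatives):
    """Label of the group representative closest to `vector` (Euclidean distance)."""
    return min(
        representatives,
        key=lambda label: math.dist(vector, representatives[label]),
    )


# Hypothetical group representatives built from flattened 2x2 binary glyphs.
REPRESENTATIVES = {
    "vertical": haar_fwt([1, 0, 1, 0]),
    "horizontal": haar_fwt([1, 1, 0, 0]),
}
```

For example, `nearest_group(haar_fwt([1, 0, 1, 1]), REPRESENTATIVES)` compares a noisy glyph with one representative per group rather than with every stored shape, which is what keeps the number of comparisons small.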

    Writer Identification of Arabic Handwritten Documents
