205 research outputs found

    Applying Genetic Algorithm in Multi Language's Characters Recognition

    Non-english and non-latin signature verification systems: A survey

    Signatures continue to be an important biometric because they remain widely used as a means of personal verification, and therefore an automatic verification system is needed. Manual signature-based authentication of a large number of documents is a difficult and time-consuming task. Consequently, for many years, in the field of protected communication and financial applications, we have observed an explosive growth in biometric personal authentication systems that are closely connected with measurable unique physical characteristics (e.g. hand geometry, iris scan, fingerprints or DNA) or behavioural features. Substantial research has been undertaken in the field of signature verification involving English signatures, but to the best of our knowledge, very few works have considered non-English signatures such as Chinese, Japanese and Arabic. In order to convey the state of the art in the field to researchers, in this paper we present a survey of non-English and non-Latin signature verification systems.

    Online handwriting Arabic recognition system using k-nearest neighbors classifier and DCT features

    With advances in machine learning techniques, handwriting recognition systems have gained a great deal of importance. Lately, the increasing popularity of handheld computers, digital notebooks, and smartphones has drawn more interest to the field of online handwriting recognition. In this paper, we propose an enhanced method for the recognition of Arabic handwritten words using a directions-based segmentation technique and discrete cosine transform (DCT) coefficients as structural features. The main contribution of this research is combining a total of 18 structural features extracted from DCT coefficients and using the k-nearest neighbors (KNN) classifier to classify the segmented characters based on the extracted features. A dataset consisting of 2500 words in total is used to validate the proposed method. The obtained average accuracy of 99.10% in the recognition of handwritten characters shows that the proposed approach, through its multiple phases, is efficient in separating, distinguishing, and classifying Arabic handwritten characters using the KNN classifier. The limited availability of online datasets of Arabic handwritten words is the main issue in this field. However, the dataset used will be made available for research via the website.
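
The pairing of DCT coefficients with a KNN classifier described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy signals (`ramp`, `arch`), the choice of 4 coefficients instead of the paper's 18 features, and the helper names `dct2` and `knn_predict` are all assumptions made for illustration.

```python
import math
from collections import Counter


def dct2(signal, n_coeffs):
    """Return the first n_coeffs unnormalised DCT-II coefficients of a 1-D signal."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n_coeffs)
    ]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to query (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for pen-trajectory signals; real features would come
# from segmented handwriting, which this sketch does not attempt.
def ramp(n, slope):
    return [slope * i for i in range(n)]


def arch(n, height):
    return [height * math.sin(math.pi * i / (n - 1)) for i in range(n)]


train = [(dct2(ramp(16, s), 4), "ramp") for s in (0.8, 1.0, 1.2)]
train += [(dct2(arch(16, h), 4), "arch") for h in (8.0, 10.0, 12.0)]

query = dct2(ramp(16, 1.1), 4)
prediction = knn_predict(train, query, k=3)
print(prediction)
```

Keeping only the leading coefficients gives a short, fixed-length descriptor regardless of stroke length, which is what makes a distance-based classifier such as KNN straightforward to apply.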

    Biometrics Writer Recognition for Arabic language: Analysis and Classification techniques using Subwords Features

    Handwritten text in any language is believed to convey a great deal of information about a writer’s personality and identity. Indeed, the handwritten signature has long been accepted as an authentication of the writer’s physical stamp on financial and legal deals as well as official/personal documents and works of art. Handwritten documents are frequently used as evidence in forensic tasks. Handwriting skills are learnt and developed from the early schooling stages. Research interest in behavioral biometrics was the main driving force behind the growth in research into Writer Identification (WI) from handwritten text, but the recent rise in terrorism associated with extreme religious ideologies spreading primarily, but not exclusively, from the Middle East has led to a surge of interest in WI from handwritten text in Arabic and similar languages. This thesis is the main outcome of extensive research investigations conducted with the aim of developing automatic identification of a person from handwritten Arabic text samples. My motivations and interests, as an Iraqi researcher, emanate from my multi-faceted desire to provide scientific support for my people in their fight against terrorism by providing forensic evidence, and to contribute to the ongoing digitization of the Iraqi National Archive as well as the wealth of religious and historical archives in Iraq and the Middle East. Good knowledge of the underlying language is invaluable in this project. Despite the rising interest in this recognition modality worldwide, Arabic writer identification has not been addressed as extensively as Latin writer identification. However, in recent years some new Arabic writer identification approaches have been proposed, some of which are reviewed in this thesis. Arabic is a cursive language when handwritten. This means that each and every writer in this language develops some unique features that could demonstrate the writer’s habits and style.
    These habits and styles are considered unique WI features and determining factors. The dominant existing approaches to WI are based on recognizing handwriting habits/styles that are embedded in certain parts/components of the written texts. Although the appearance of these components within long text contains rich information and clues to writer identity, the most common approaches to WI in Arabic in the literature are based on features extracted from paragraph(s), line(s), word(s), character(s), and/or a part of a character. Generally, Arabic words are made up of one or more subwords; at the end of each there is a connected stroke with a certain style, which seems to be most representative of a writer’s habits. Another feature of Arabic writing concerns diacritics, which are added to written words/subwords to add meaning and pronunciation. Subwords are more frequent in written Arabic text and appear as part of several different words or as full individual words. Thus, we propose a new approach based on a seemingly plausible hypothesis that subword-based WI yields a significant increase in accuracy over existing approaches. The most significant contributions of the thesis can be summarized as follows:
    - Developed a high-performing segmentation of scanned text images that combines threshold-based binarisation, morphological operations and an active shape model.
    - Defined digital measures and formed 15-dimensional feature vector representations of subwords that implicitly cover their diacritics and strokes. A pilot study incrementally added features according to their writer-discriminating power. This reduced the subword feature vector dimension to 8, two of which were modelled as time series.
    - For the text-dependent 8-dimensional WI scheme, we identify the best performing set of subwords (the best 22 subwords out of 49, followed by the best 11 out of these 22 subwords).
    - We established the validity of our hypothesis for different versions of subword-based WI schemes by providing empirical evidence when testing on a number of existing text-dependent and text-independent databases plus a simulated text-independent DB. The text-dependent scenario results exhibited the possible presence of the Doddington Zoo phenomenon.
    - The final optimal subword-based WI scheme not only removes the need to include diacritics as part of the subword but also demonstrates that including diacritics within subwords impairs the WI discriminating power of subwords. This should not be taken to discredit research based on diacritics-based WI. Also, this subword-body (without diacritics) based WI scheme eliminated the presence of the Doddington Zoo effect.
    - Finally, a significant but unintended consequence of using subwords for WI is that there is no difference between a text-independent scenario and a text-dependent one. In fact, we shall demonstrate that the text-dependent database of the 27 words can be used to simulate the testing of the scheme for a text-independent database without the need to record such a DB.
    Finally, we discussed ways of optimising the performance of our last scheme by considering possible ways of complementing it with various image texture analysis features extracted from subwords, lines, paragraphs or the entire file of the scanned image. These included LBP and Gabor filters. We also suggested the possible addition of a few more features.

    Recognition of handwritten Arabic characters

    The subject of handwritten character recognition has been receiving considerable attention in recent years due to the increased dependence on computers. Several methods for recognizing Latin, Chinese as well as Kanji characters have been proposed. However, work on recognition of Arabic characters has been relatively sparse. Techniques developed for recognizing characters in other languages cannot be used for Arabic since the nature of Arabic characters is different. The shape of a character is a function of its location within a word, where each character can have two to four different forms. Most of the techniques proposed to date for recognizing Arabic characters have relied on structural and topographic approaches. This thesis introduces a decision-theoretic approach to solve the problem. The proposed method involves, as a first step, digitization of the segmented character. The secondary parts of the character (dots and zigzags) are then isolated and identified separately, thereby reducing the recognition issue to a 20-class problem or less for each of the character forms. The moments of the horizontal and vertical projections of the remaining primary characters are calculated and normalized with respect to the zero-order moment. Simple measures of shape are obtained from the normalized moments and incorporated into a feature vector. Classification is accomplished using quadratic discriminant functions. The approach was evaluated using isolated, handwritten characters from a database established for this purpose. The classification rates varied from 97.5% to 100% depending on the form of the characters. These results indicate that the technique offers significantly better classification rates in comparison with existing methods.
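
The projection-moment step in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which "simple measures of shape" were used, so the centroid and spread computed here, the 5x5 toy image, and the function names `projections`, `normalized_moments` and `shape_features` are hypothetical choices. The quadratic discriminant classifier is not shown.

```python
def projections(image):
    """Return (row_profile, column_profile) of a binary image given as a list of rows."""
    row_profile = [sum(row) for row in image]
    column_profile = [sum(col) for col in zip(*image)]
    return row_profile, column_profile


def normalized_moments(profile, max_order=2):
    """Raw moments of orders 1..max_order, each divided by the zero-order moment."""
    m0 = sum(profile)
    if m0 == 0:
        raise ValueError("profile is empty (no foreground pixels)")
    return [
        sum((i ** p) * v for i, v in enumerate(profile)) / m0
        for p in range(1, max_order + 1)
    ]


def shape_features(image):
    """Centroid and spread of each projection (assumed shape measures, see lead-in)."""
    features = []
    for profile in projections(image):
        m1, m2 = normalized_moments(profile)
        features += [m1, m2 - m1 ** 2]  # mean position, variance about the mean
    return features


# Toy character: a vertical bar in the middle column.
image = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
features = shape_features(image)
print(features)  # [2.0, 2.0, 2.0, 0.0]
```

Dividing by the zero-order moment (the total pixel count) makes the features independent of stroke thickness, so the bar above has a spread of 2.0 along its length and 0.0 across it.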

    Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform

    In this research, an off-line handwriting recognition system for the Arabic alphabet is introduced. The system contains three main stages: preprocessing, segmentation and recognition. In the preprocessing stage, the Radon transform was used in the design of algorithms for page, line and word skew correction as well as for word slant correction. In the segmentation stage, a Hough transform approach was used for line extraction. For line-to-word and word-to-character segmentation, a statistical method using a mathematical representation of the binary images of lines and words was used. Unlike most current handwriting recognition systems, our system simulates the human mechanism for image recognition, where images are encoded and saved in memory as groups according to their similarity to each other. Characters are decomposed into coefficient vectors using the fast wavelet transform; the vectors that represent a character in its different possible shapes are then saved as groups, with one representative for each group. Recognition is achieved by comparing the vector of the character to be recognized with the group representatives. Experiments showed that the proposed system is able to achieve the recognition task with 90.26% accuracy. The system needs at most 3.41 seconds to recognize a single character in a text of 15 lines where each line has 10 words on average.
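
The recognition stage described here (wavelet coefficient vectors compared against one representative per group) can be sketched as below. The abstract does not name the wavelet family or the distance measure, so the Haar wavelet, the Euclidean distance, the 1-D toy signals, and the names `haar_fwt` and `nearest_group` are assumptions for illustration only.

```python
import math


def haar_fwt(signal):
    """Full orthonormal Haar decomposition; len(signal) must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Coarser levels are prepended, so the output is ordered coarse to fine.
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details


def nearest_group(vector, representatives):
    """Return the label of the representative closest to vector (Euclidean)."""
    return min(
        representatives,
        key=lambda label: math.dist(vector, representatives[label]),
    )


# Hypothetical group representatives built from toy 1-D profiles.
representatives = {
    "step": haar_fwt([4, 4, 2, 2]),
    "flat": haar_fwt([3, 3, 3, 3]),
}

unknown = haar_fwt([4, 3.5, 2, 2.5])
label = nearest_group(unknown, representatives)
print(label)
```

Comparing an input against one representative per group, rather than against every stored sample, keeps the number of distance computations proportional to the number of groups.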