13 research outputs found

Kannada Character Recognition System A Review

Author: Indira K.
Selvi S. Sethu
Publication date: 01/01/2009

Intensive research has been done on optical character recognition (OCR), and a large number of articles have been published on this topic during the last few decades. Many commercial OCR systems are now available in the market, but most of these systems work for Roman, Chinese, Japanese and Arabic characters. There is not a sufficient number of works on Indian language character recognition, especially for the Kannada script, which is one of the 12 major scripts in India. This paper presents a review of existing work on printed Kannada script and their results. The characteristics of Kannada script and the Kannada Character Recognition (KCR) system are discussed in detail. Finally, fusion at the classifier level is proposed to increase the recognition accuracy. Comment: 12 pages, 8 figures

arXiv.org e-Print Archive

CiteSeerX

A Lexicon of Connected Components for Arabic Optical Text Recognition

Author: Elarian Yousef
Idris Fayez
Publication date: 12/01/2011

Arabic is a cursive script that lacks the ease of character segmentation. Hence, we suggest a unit that is discrete in nature, viz. the connected component, for Arabic text recognition. A lexicon listing valid Arabic connected components is necessary to any system that is to use such a unit. Here, we produce and analyze a comprehensive lexicon of connected components. A lexicon can be extracted from corpora or synthesized from morphemes. We follow both approaches and merge their results. Besides, generation of a lexicon of connected components encompasses extra tokenization and point-normalization steps to make the size of the lexicon tractable. We produce a lexicon of surface-words, reduce it into a lexicon of connected components, and finally into a lexicon of point-normalized connected components. The lexicon of point-normalized connected components contains 684,743 entries, showing a percent decrease of 97.17% from the word-lexicon.

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung

A New Feature Extraction Method for TMNN-Based Arabic Character Classification

Author: AlBakoor Majida
Saeed Khalid
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 27/01/2012

This paper describes a hybrid method of typewritten Arabic character recognition by Toeplitz Matrices and Neural Networks (TMNN), applying a new technique for feature selection and data mining. The suggested algorithm reduces the NN input data to only the most significant and essential-for-classification points. Four items are determined to resemble the distribution percentage of the essential feature points in each part of the extracted character image. Feature points are detected by an algorithm designed for this purpose. This algorithm is of high performance and is intelligent enough to define the most significant points which satisfy the sufficient conditions to recognize almost all written fonts of Arabic characters. The number of essential feature points is reduced by at least 88%. Calculations and data size are consequently decreased by a large percentage. The authors achieved a recognition rate of 97.61%. The obtained results have proved high accuracy, high speed and powerful classification.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Applying Genetic Algorithm in Multi Language's Characters Recognition

Author: Hanan Aljuaid
Publication venue: IntechOpen
Publication date: 21/03/2012

IntechOpen

Optical character recognition of printed Odia documents

Author: Mishra M M K
Publication date: 12/05/2014

Optical Character Recognition (OCR) is a document image analysis method that involves the mechanical or electronic transformation of scanned or photographed images of typewritten or printed text into text that can be easily read by the computer. OCR has become a very widespread area of interest and research because of its ability to narrow the reading ability gap between computers and humans and because it improves human-machine interaction in many applications. Example applications include cheque verification and a large variety of banking, business and data entry applications. The project involved skew correction of Odia documents, line segmentation and eventual segmentation of Odia characters. The document is first segmented into its constituent lines; each line is then treated as one entity and segmented into words. Once the words are segmented, the characters are extracted one by one. The algorithms used here hold true for all the Devanagari scripts. Hence, examples of Telugu word segmentation are also given as a proof of the applied algorithm.

ethesis@nitr

Kurdish Optical Character Recognition

Author: Hassani Hossein
Yaseen Rasty
Publication venue: University of Kurdistan Hewler
Publication date: 30/06/2018

Currently, no offline tool is available for Optical Character Recognition (OCR) in Kurdish. Kurdish is spoken in different dialects and uses several scripts for writing. The Persian/Arabic script is widely used among these dialects. The Persian/Arabic script is written from Right to Left (RTL), it is cursive, and it uses unique diacritics. These features, particularly the last two, affect the segmentation stage in developing a Kurdish OCR. In this article, we introduce an enhanced character-segmentation-based method which addresses the mentioned characteristics. We applied the method to text-only images and tested the Kurdish OCR using documents of different fonts, font sizes, and image resolutions. The results of the experiments showed that the accuracy rate of character recognition of the proposed method was 90.82% on average.

The Scientific Journals of the University of Kurdistan Hewlêr

Graph Modeling based Segmentation of Handwritten Arabic Text into Constituent Subwords

Publication venue: MECS Publisher

Crossref

Auditing Electronic Files of Quran Using Optical Character Recognition

Author: Alaa Fathala Abdalazez Hamda
آلاء فتح الله عبد العزيز حمدة
Publication venue: Al-Quds University

Al-Quds University Digital Repository