30,396 research outputs found

A Review of Research on Devnagari Character Recognition

Authors: Dongre V J, Mankar V H
Publication date: 12/01/2011

English Character Recognition (CR) has been extensively studied over the last half century and has progressed to a level sufficient to produce technology-driven applications. The same is not true of Indian languages, which are complicated in terms of structure and computation. Rapidly growing computational power may enable the implementation of Indic CR methodologies. Digital document processing is gaining popularity for application to office and library automation, bank and postal services, publishing houses and communication technology. Devnagari, being the national language of India and spoken by more than 500 million people, should be given special attention so that document retrieval and analysis of rich ancient and modern Indian literature can be done effectively. This article is intended to serve as a guide and update for readers working in the Devnagari Optical Character Recognition (DOCR) area. An overview of DOCR systems is presented and the available DOCR techniques are reviewed. The current status of DOCR is discussed and directions for future research are suggested.

Comment: 8 pages, 1 figure, 8 tables, journal paper

arXiv.org e-Print Archive

CiteSeerX

Rapid Feature Extraction for Optical Character Recognition

Authors: Amin M. Ashraful, Hossain M. Zahid, Yan Hong
Publication date: 01/06/2012

Feature extraction is one of the fundamental problems of character recognition. The performance of a character recognition system depends on proper feature extraction and correct classifier selection. In this article, a rapid feature extraction method named Celled Projection (CP) is proposed, which computes the projection of each section formed by partitioning an image. The recognition performance of the proposed method is compared with other widely used feature extraction methods that have been studied intensively for many different scripts in the literature. The experiments were conducted using Bangla handwritten numerals with three different well-known classifiers, which demonstrate comparable results, including 94.12% recognition accuracy using celled projection.

Comment: 5 pages, 1 figure

arXiv.org e-Print Archive
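
The abstract above describes Celled Projection only in outline: partition the image, then compute a projection for each section. A minimal sketch of that idea follows. The grid size, the choice of a horizontal projection (foreground pixels counted per row inside each cell), and the function name `celled_projection` are assumptions made for illustration, not details taken from the paper.

```python
def celled_projection(image, rows, cols):
    """Concatenate per-cell horizontal projections of a binary image.

    `image` is a list of equal-length lists holding 0 (background) or
    1 (foreground). The image is split into a rows x cols grid (an assumed
    layout), and for every cell the number of foreground pixels in each of
    its pixel rows is appended to the feature vector.
    """
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the grid size")
    cell_h, cell_w = height // rows, width // cols

    features = []
    for r in range(rows):
        for c in range(cols):
            # Walk the pixel rows that belong to cell (r, c).
            for y in range(r * cell_h, (r + 1) * cell_h):
                row_slice = image[y][c * cell_w:(c + 1) * cell_w]
                features.append(sum(row_slice))
    return features


# Hypothetical 4x4 glyph split into a 2x2 grid of 2x2 cells.
glyph = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
vector = celled_projection(glyph, rows=2, cols=2)
```

The resulting vector has one entry per pixel row per cell (eight entries here) and could be passed to any standard classifier.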

A review on handwritten character and numeral recognition for Roman, Arabic, Chinese and Indian scripts

Authors: Azmi Aini Najwa, Nasien Dewi, Shamsuddin Siti Mariyam
Publication date: 22/08/2013

Handwritten character recognition (HCR) has been researched intensively for almost four decades. The research has covered popular scripts such as Roman, Arabic, Chinese and Indian. In this paper we present a review of HCR work on these four scripts. We summarize most of the papers published from 2005 onward and analyze the various methods used to create a robust HCR system. We also suggest some future directions for research on HCR.

Comment: 8 pages

arXiv.org e-Print Archive

Word Searching in Scene Image and Video Frame in Multi-Script Scenario using Dynamic Shape Coding

Authors: Bhattacharyya Avirup, Bhunia Ayan Kumar, Pal Umapada, Roy Partha Pratim
Publication date: 30/07/2018

Retrieval of text information from natural scene images and video frames is a challenging task due to inherent problems such as complex character shapes, low resolution and background noise. Available OCR systems often fail to retrieve such information from scene/video frames. Keyword spotting, an alternative way to retrieve information, performs efficient text searching in such scenarios. However, current word spotting techniques for scene/video images are script-specific and are mainly developed for Latin script. This paper presents a novel word spotting framework using dynamic shape coding for text retrieval in natural scene images and video frames. The framework is designed to search for a query keyword across multiple scripts with the help of on-the-fly script-wise keyword generation for the corresponding script. We use a two-stage word spotting approach based on Hidden Markov Models (HMMs) to detect the translated keyword in a given text line by identifying the script of the line. A novel unsupervised scheme based on dynamic shape coding is used to group similarly shaped characters to avoid confusion and to improve text alignment. Next, the hypothesis locations are verified to improve retrieval performance. To evaluate the proposed system for keyword search in natural scene images and video frames, we consider two popular Indic scripts, Bangla (Bengali) and Devanagari, along with English. Inspired by the zone-wise recognition approach in Indic scripts [1], zone-wise text information is used to improve on traditional word spotting performance in Indic scripts. For our experiments, a dataset consisting of images of different scenes and video frames in English, Bangla and Devanagari scripts was considered. The results show the effectiveness of the proposed word spotting approach.

Comment: Multimedia Tools and Applications, Springer

arXiv.org e-Print Archive

Handwritten digit Recognition using Support Vector Machine

Author: Sharma Anshuman
Publication date: 17/03/2012

Handwritten numeral recognition plays a vital role in postal automation services, especially in countries like India where multiple languages and scripts are used. The discrete Hidden Markov Model (HMM) and hybrids of Neural Networks (NN) and HMMs are popular methods in handwritten word recognition systems. The hybrid system gives better recognition results due to the better discrimination capability of the NN. A major problem in handwriting recognition is the huge variability and distortion of patterns. Elastic models based on local observations and dynamic programming, such as HMMs, are not efficient at absorbing this variability: their vision is local, they cannot cope with length variability, and they are very sensitive to distortions. The SVM is then used to estimate global correlations and classify the pattern. The Support Vector Machine (SVM) is an alternative to the NN. In handwriting recognition, the SVM gives a better recognition result. The aim of this paper is to develop an approach which improves the efficiency of handwriting recognition using an artificial neural network.

Comment: 7 pages

arXiv.org e-Print Archive

Describing Colors, Textures and Shapes for Content Based Image Retrieval - A Survey

Authors: Ahmad Jamil, Baik Sung Wook, Mehmood Irfan, Rho Seungmin, Sajjad Muhammad
Publication date: 24/02/2015

Visual media has always been the most enjoyed means of communication. From the advent of television to modern-day handheld computers, we have witnessed exponential growth in the number of images around us. Images undoubtedly carry a lot of information, which needs to be utilized effectively. Hence there is an intense need to index and store large image collections efficiently for effective, on-demand retrieval. For this purpose, low-level features extracted from image content, such as color, texture and shape, have been used. Content-based image retrieval systems employing these features have proven very successful. Image retrieval has promising applications in numerous fields and has therefore motivated researchers all over the world. New and improved ways to represent visual content are being developed each day, and a tremendous amount of research has been carried out in the last decade. In this paper we present a detailed overview of some of the powerful color, texture and shape descriptors for content-based image retrieval. A comparative analysis is also carried out to provide insight into the outstanding challenges in this field.

arXiv.org e-Print Archive

Classifier Fusion Method to Recognize Handwritten Kannada Numerals

Authors: Karthik S., Mamatha H. R., Srikanta Murthy K.
Publication date: 01/01/2013

Optical Character Recognition (OCR) is one of the important fields in the image processing and pattern recognition domain. Handwritten character recognition has always been a challenging task. Little work can be traced on the recognition of handwritten characters for the south Indian languages. Kannada is one such south Indian language, and is also one of the official languages of India. Accurate recognition of Kannada characters is challenging because of the high degree of similarity between the characters. Hence, good-quality features must be extracted and better classifiers are needed to improve the accuracy of OCR for Kannada characters. This paper explores the effectiveness of feature extraction methods such as run length count (RLC) and directional chain code (DCC) for the recognition of handwritten Kannada numerals. A classifier fusion method is implemented to improve the recognition rate. For the classifier fusion, we consider the K-nearest neighbour (KNN) classifier and a linear classifier (LC). The novelty of this method is that it achieves better accuracy with few features by using a classifier fusion approach. The proposed method achieves an average recognition rate of 96%.

Comment: 6 pages, 3 tables and 9 figures. Published in ICECT 2012 conference

arXiv.org e-Print Archive
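
The abstract above names the two classifiers that are fused (KNN and a linear classifier) but does not state the fusion rule. The sketch below shows one common option, a weighted sum of normalised per-class scores, purely as an illustration. The function names, the weight parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
def normalise(scores):
    """Scale non-negative class scores so that they sum to 1."""
    total = sum(scores.values())
    if total == 0:
        # No evidence at all: fall back to a uniform distribution.
        return {label: 1.0 / len(scores) for label in scores}
    return {label: value / total for label, value in scores.items()}


def fuse_scores(scores_a, scores_b, weight_a=0.5):
    """Weighted sum-rule fusion of two classifiers' per-class scores.

    Both dictionaries are assumed to cover the same set of class labels.
    Returns the winning label and the fused score for every label.
    """
    a, b = normalise(scores_a), normalise(scores_b)
    fused = {
        label: weight_a * a[label] + (1.0 - weight_a) * b[label]
        for label in a
    }
    best = max(fused, key=fused.get)
    return best, fused


# Hypothetical scores for one numeral image: KNN neighbour votes (k = 5)
# and linear-classifier scores for the candidate classes "3" and "8".
knn_votes = {"3": 3, "8": 2}
linear_scores = {"3": 0.2, "8": 0.8}
label, fused = fuse_scores(knn_votes, linear_scores)
```

With equal weights, the linear classifier's stronger preference for "8" outweighs the narrow KNN majority for "3"; changing `weight_a` shifts that balance.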

A Hybrid NN/HMM Modeling Technique for Online Arabic Handwriting Recognition

Authors: ALIMI Adel M., Boubaker Houcine, Kherallah Monji, Tagougui Najiba
Publication date: 02/01/2014

In this work we propose a hybrid NN/HMM model for online Arabic handwriting recognition. The proposed system is based on Hidden Markov Models (HMMs) and Multi-Layer Perceptron Neural Networks (MLPNNs). The input signal is segmented into continuous strokes, called segments, using the Beta-Elliptical strategy, which inspects the extremum points of the curvilinear velocity profile. A neural network trained with segment-level contextual information is used to extract character class probabilities. The output of this network is decoded by HMMs to provide character-level recognition. In evaluations on the ADAB database, we achieved 96.4% character recognition accuracy, a statistically significant improvement over the character recognition accuracies obtained by state-of-the-art online Arabic systems.

arXiv.org e-Print Archive

Enhancing the retrieval performance by combing the texture and edge features

Authors: Eisa Mohamed, Eletrebi Amira, Elhenawy Ebrahim
Publication date: 10/01/2013

In this paper, a new algorithm based on geometrical moments and local binary patterns (LBP) is proposed for content-based image retrieval (CBIR). In geometrical moments, each vector is compared with all the other vectors for edge map generation. The same concept is used in the LBP calculation, which generates nine LBP patterns from a given 3x3 pattern. Finally, nine LBP histograms are calculated and used as a feature vector for image retrieval. Moments are important features used in the recognition of different types of images. Two experiments were carried out to demonstrate the value of our algorithm. The results show a significant improvement in the evaluation measures compared with LBP and other existing transform-domain techniques.

Comment: 7 pages, 8 figures, one table

arXiv.org e-Print Archive
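
For readers unfamiliar with local binary patterns, the sketch below shows the basic 8-neighbour LBP operator and its histogram. It is the standard textbook operator, not the nine-pattern variant proposed in the abstract above. The neighbour ordering (clockwise from the top-left pixel) and the function names are assumptions chosen for illustration.

```python
def lbp_code(patch):
    """Return the basic 8-bit LBP code of a 3x3 grayscale patch.

    Each neighbour that is greater than or equal to the centre pixel sets
    one bit. Neighbours are visited clockwise starting at the top-left
    pixel (an assumed ordering; other orderings only permute the bits).
    """
    center = patch[1][1]
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2], patch[2][2], patch[2][1],
        patch[2][0], patch[1][0],
    ]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels of a 2-D image."""
    hist = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            patch = [row[x - 1:x + 2] for row in image[y - 1:y + 2]]
            hist[lbp_code(patch)] += 1
    return hist


# Hypothetical 3x3 patch with centre value 6.
example = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
code = lbp_code(example)
hist = lbp_histogram(example)
```

A histogram such as `hist` is what a retrieval system would typically compare between a query image and the stored images.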

Multiple models of Bayesian networks applied to offline recognition of Arabic handwritten city names

Authors: Ghanmy Nabil, Jayech Khlifia, Mahjoub Mohamed Ali, Miled Ikram
Publication date: 18/01/2013

In this paper we address the problem of offline Arabic handwritten word recognition. Offline recognition of handwritten words is a difficult task due to the high variability and uncertainty of human writing. The majority of recent systems are constrained by the size of the lexicon they must handle and by the number of writers. We propose an approach for multi-writer Arabic handwritten word recognition using multiple Bayesian networks. First, we cut the image into several blocks. For each block, we compute a vector of descriptors. Then, we use K-means to cluster the low-level features, including Zernike and Hu moments. Finally, we apply four variants of Bayesian network classifiers (Naïve Bayes, Tree Augmented Naïve Bayes (TAN), Forest Augmented Naïve Bayes (FAN) and Dynamic Bayesian Network (DBN)) to classify the whole image of a Tunisian city name. The results demonstrate that FAN and DBN outperform the other variants, with good recognition rates.

Comment: arXiv admin note: substantial text overlap with arXiv:1204.167

arXiv.org e-Print Archive