11,127 research outputs found

A Study of Sindhi Related and Arabic Script Adapted languages Recognition

Authors: Bhatti Zeeshan; Hakro Dil Nawaz; Moja G. N.; Talib A. Z.
Publication date: 13/12/2014

A large number of publications are available on Optical Character Recognition (OCR). Significant research and many articles exist for the Latin, Chinese and Japanese scripts. Arabic script is also one of the mature scripts from an OCR perspective. The adapted languages that share the Arabic script or its extended characters still lack OCR systems of their own. In this paper we present the efforts of researchers on Arabic and its related and adapted languages. The survey is organized in sections: the introduction is followed by the properties of the Sindhi language, and then the OCR process, techniques and methods used by various researchers are presented. The last section is dedicated to future work and the conclusion. Comment: 11 pages, 8 Figures, Sindh Univ. Res. Jour. (Sci. Ser.

arXiv.org e-Print Archive

Cursive Multilingual Characters Recognition Based on Hard Geometric Features

Authors: Harouni Majid; Rehman Amjad; Saba Tanzila
Publication date: 07/04/2019

The segmentation and recognition of cursive multilingual characters in the Arabic, Persian and Urdu languages have attracted researchers from academia and industry. However, despite several decades of research, multilingual character classification accuracy is still not up to the mark. This paper presents an automated approach for multilingual character segmentation and recognition. The proposed methodology explores characters based on their geometric features. However, due to uncertainty and the lack of dictionary support, a few characters are over-segmented. To improve the efficiency of the proposed methodology, a BPN is trained with a large number of segmentation points for cursive multilingual characters. The trained BPN identifies incorrectly segmented points efficiently and at high speed, which improves character recognition accuracy. For fair comparison, only a benchmark dataset is used. Comment: 1

arXiv.org e-Print Archive

Neural Computing for Online Arabic Handwriting Character Recognition using Hard Stroke Features Mining

Author: Rehman Amjad
Publication date: 15/01/2021

Online Arabic cursive character recognition is still a big challenge due to existing complexities including Arabic cursive script styles, writing speed, writer mood and so forth. Due to these unavoidable constraints, the accuracy of online Arabic character recognition is still low and leaves room for improvement. In this research, an enhanced method is proposed for detecting the desired critical points from the vertical and horizontal direction-length of handwriting stroke features for online Arabic script recognition. Each extracted stroke feature divides every isolated character into meaningful patterns known as tokens. A minimum feature set is extracted from these tokens for classification of characters using a multilayer perceptron with a back-propagation learning algorithm and a modified sigmoid-based activation function. In this work, two milestones are achieved: first, attaining a fixed number of tokens; second, minimizing the number of the most repetitive tokens. For experiments, handwritten Arabic characters are selected from the OHASD benchmark dataset to test and evaluate the proposed method. The proposed method achieves an average accuracy of 98.6%, comparable to state-of-the-art character recognition techniques. Comment: 16 page

arXiv.org e-Print Archive

Improving Accessibility of Archived Raster Dictionaries of Complex Script Languages

Authors: Alam Sawood; Mehmood Fateh ud din B; Nelson Michael L.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 03/09/2014

We propose an approach to index raster images of dictionary pages that requires very little manual effort and enables direct access to the appropriate pages of the dictionary for lookup. Accessibility is further improved by feedback and crowdsourcing that enable highlighting of the specific location on the page where the lookup word is found, annotation, digitization, and fielded searching. This approach is equally applicable to simple scripts and complex writing systems. Using our proposed approach, we have built a Web application called "Dictionary Explorer" which supports word indexes in various languages, and each language can have multiple dictionaries associated with it. Word lookup gives direct access to the appropriate pages of all the dictionaries of that language simultaneously. The application has exploration features like searching, pagination, and navigating the word index through a tree-like interface. The application also supports feedback, annotation, and digitization features. Apart from the scanned images, "Dictionary Explorer" aggregates results from various sources and user contributions in Unicode. We have evaluated the time required for indexing dictionaries of different sizes and complexities in the Urdu language and examined various trade-offs in our implementation. Using our approach, a single person can make a dictionary of 1,000 pages searchable in less than an hour. Comment: 11 pages, 5 images, 2 codes, 1 tabl
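One way a word index can map a lookup word to a scanned page is a sorted list of the first headword on each page, searched by bisection. The sketch below illustrates that idea only; the abstract does not describe the internal index structure of "Dictionary Explorer", so the `find_page` function, the `head_words` list and the sample English words are all hypothetical. Real dictionaries in Urdu or other complex scripts would also need a proper collation order rather than plain code-point comparison.

```python
from bisect import bisect_right

def find_page(head_words, word):
    """Return the 1-based page number whose range should contain `word`.

    Assumption: head_words[i] is the first headword printed on page i + 1,
    and the list is sorted in the same order used to compare `word`.
    """
    # bisect_right counts how many pages start at or before `word`.
    idx = bisect_right(head_words, word)
    # Words sorting before the first headword are sent to page 1.
    return max(idx, 1)

# Hypothetical four-page dictionary.
head_words = ["aardvark", "cat", "dog", "mango"]
page_for_cow = find_page(head_words, "cow")      # page starting at "cat"
page_for_zebra = find_page(head_words, "zebra")  # last page
```

With an index like this, only one headword per page has to be typed by hand, which is consistent with the small manual effort the abstract reports, although the actual indexing procedure may differ.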

arXiv.org e-Print Archive

Improved Dynamic Time Warping (DTW) Approach for Online Signature Verification

Authors: Jaini Azhar Ahmad; Rehman Amjad; Sulong Ghazali
Publication date: 26/03/2019

Online signature verification is the process of verifying time-series signature data, which is generally obtained from a tablet-based device. Unlike offline signature images, online signature data consists of points arranged in a time sequence. The aim of this research is to develop an improved approach to map the strokes in both test and reference signatures. Current methods make use of the Dynamic Time Warping (DTW) algorithm and its variants to segment the signatures before comparing each of their data dimensions. This paper presents a modified DTW algorithm with the proposed Lost Box Recovery Algorithm, which aims to improve the mapping performance for online signature verification. Comment: This paper is first author thesis pape
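For readers unfamiliar with DTW, the following is a minimal sketch of the classic textbook algorithm on one-dimensional sequences. It is not the modified DTW or the Lost Box Recovery Algorithm of the paper, whose details are not given in the abstract; the function name `dtw_distance` and the absolute-difference local cost are illustrative assumptions.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1-D numeric sequences.

    Assumption: the local cost is the absolute difference between samples.
    Real online signatures have several dimensions (x, y, pressure, ...).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# A reference stroke and the same stroke written more slowly (one sample repeated).
reference = [1, 2, 3]
slower = [1, 2, 2, 3]
stretched_cost = dtw_distance(reference, slower)
```

Because DTW lets one sample align with several samples in the other sequence, the stretched copy aligns with the reference at zero cost, which is why the algorithm suits signatures written at different speeds.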

arXiv.org e-Print Archive

Arabic Text Watermarking: A Review

Authors: Alotaibi Reem Ahmed; Elrefaei Lamiaa A.
Publication date: 01/07/2015

The use of the internet and its technologies and applications has increased rapidly, so protecting text from illegal use is much needed. Text watermarking is used for this purpose. Arabic text has many characteristics, such as the existence of diacritics, kashida (extension character) and points above or under its letters. Each Arabic letter can take different shapes with different Unicode code points. These characteristics are utilized in the watermarking process. In this paper, several methods in the area of Arabic text watermarking are discussed along with their advantages and disadvantages. The methods are compared in terms of capacity, robustness and imperceptibility. Comment: 16 pages, 4 tables and 19 figure
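To illustrate how the kashida can carry hidden bits, here is a toy sketch: a kashida (U+0640) inserted after a connecting letter encodes 1, and its absence encodes 0. This is not any specific method from the review. The function names, the letter sets and the carrier rule are simplifying assumptions, and the cover text is assumed to contain no kashida of its own.

```python
KASHIDA = "\u0640"  # ARABIC TATWEEL

# Simplified, assumed letter classes (not a full Unicode joining table).
DUAL_JOINING = set("بتثجحخسشصضطظعغفقكلمنهيئ")   # join on both sides
RIGHT_JOINING = set("اأإآدذرزوؤةى")             # join only to the previous letter
JOINABLE_NEXT = DUAL_JOINING | RIGHT_JOINING

def embed_bits(text, bits):
    """Insert a kashida after a carrier letter for each 1 bit, nothing for a 0 bit."""
    out, k = [], 0
    for i, ch in enumerate(text):
        out.append(ch)
        # A carrier is a dual-joining letter followed by a letter it connects to.
        is_carrier = (ch in DUAL_JOINING and i + 1 < len(text)
                      and text[i + 1] in JOINABLE_NEXT)
        if is_carrier and k < len(bits):
            if bits[k]:
                out.append(KASHIDA)
            k += 1
    if k < len(bits):
        raise ValueError("cover text has too few carrier positions")
    return "".join(out)

def extract_bits(marked):
    """Read one bit per carrier position; the caller truncates to the payload length."""
    bits, i = [], 0
    while i < len(marked) - 1:
        if marked[i] in DUAL_JOINING:
            nxt = marked[i + 1]
            if nxt == KASHIDA and i + 2 < len(marked) and marked[i + 2] in JOINABLE_NEXT:
                bits.append(1)
                i += 2  # skip the kashida
                continue
            if nxt in JOINABLE_NEXT:
                bits.append(0)
        i += 1
    return bits

cover = "سلام عليكم"
payload = [1, 0, 1, 1]
marked = embed_bits(cover, payload)
recovered = extract_bits(marked)[:len(payload)]
```

The sketch also hints at the trade-offs named in the abstract: capacity is bounded by the number of carrier positions, and robustness is weak because simply deleting every kashida restores the cover text and erases the mark.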

arXiv.org e-Print Archive

Directory of Open Access Journals

Cursive Overlapped Character Segmentation: An Enhanced Approach

Author: Rehman Amjad
Publication date: 23/03/2019

Segmentation of highly slanted and horizontally overlapped characters is a challenging and still open research area. Several techniques are reported in the literature, but they produce low accuracy when segmenting highly slanted characters and lead to low overall handwriting recognition precision. Accordingly, this paper presents a simple yet effective approach for character segmentation of such difficult slanted cursive words without using any slant correction technique. Instead, a new concept of a core-zone is introduced for segmenting such difficult slanted handwritten words. However, due to the inherent nature of cursive words, a few characters are over-segmented, and therefore a threshold is selected heuristically to overcome this problem. For fair comparison, difficult words are extracted from the IAM benchmark database. The experiments show promising results and high speed. Comment: 10 Page

arXiv.org e-Print Archive

Text line Segmentation in Compressed Representation of Handwritten Document using Tunneling Algorithm

Authors: Nagabhushan P; R Amarnath
Publication venue: International Journal of Intelligent Systems and Applications in Engineering
Publication date: 03/01/2019

In this research work, we perform text line segmentation directly in the compressed representation of an unconstrained handwritten document image. For this, we make use of text line terminal points, which is the current state of the art. The terminal points spotted along the left and right margins of a document image for every text line are considered as source and target respectively. The tunneling algorithm uses a single agent (or robot) to identify the coordinate positions in the compressed representation to perform text-line segmentation of the document. The agent starts at a source point, progressively tunnels a path between two adjacent text lines and reaches the probable target. The agent's navigation path from source to target, bypassing obstacles if any, separates the two adjacent text lines. However, the target point is known only when the agent reaches the destination; this applies to all source points, and hence we can analyze the correspondence between source and target nodes. Artificial Intelligence in Expert systems, dynamic programming and greedy strategies are employed for every search space while tunneling. Exhaustive experimentation is carried out on various benchmark datasets including ICDAR13, and the performance is reported. Comment: Compressed Representation, Handwritten Document Image, Text-Line Terminal Point, Text-Line Segmentation, Search Space, Gri

arXiv.org e-Print Archive

A review on handwritten character and numeral recognition for Roman, Arabic, Chinese and Indian scripts

Authors: Azmi Aini Najwa; Nasien Dewi; Shamsuddin Siti Mariyam
Publication date: 22/08/2013

There has been intensive research on handwritten character recognition (HCR) for almost the past four decades. The research has been done on some popular scripts such as Roman, Arabic, Chinese and Indian. In this paper we present a review of HCR work on these four popular scripts. We summarize most of the papers published from 2005 to the present and analyze the various methods for creating a robust HCR system. We also add some future directions for research on HCR. Comment: 8 page

arXiv.org e-Print Archive

Online Decision Process based on Machine Learning Techniques

Author: Saba Tanzila
Publication date: 15/01/2021

This paper analyses the role of the internet in marketing and its influence on the business decision-making process. It explains how decision makers collect a variety of information about customers through the internet and analyse this data to better use it in enhancing the processes and the overall performance of the organization. In addition, it explains how each department in an organization collaborates and uses this information through data warehousing. Accordingly, a business intelligence model is proposed for web segmentation that divides potential markets or consumers into specific groups and analyses them for better decision making. The model further aims to emphasize the importance of web opportunities in guiding web segmentation and gathering customer information. It is shown how a marketing information system, including customers, equipment and procedures analysis, contributes to helping decision makers make better decision

arXiv.org e-Print Archive