A Study of Techniques and Challenges in Text Recognition Systems
Text recognition is a core technology for Natural Language Processing (NLP) and digitization. These systems are critical for bridging the digitization gaps created by non-editable documents, and they contribute to finance, health care, machine translation, digital libraries, and a variety of other fields. In addition, the pandemic has increased the amount of digital information in the education sector, necessitating text recognition systems to handle it. Text recognition systems work on three categories of text: (a) machine-printed, (b) offline handwritten, and (c) online handwritten. The major goal of this research is to examine the process of typewritten text recognition systems. The availability of historical documents and other traditional materials in many types of texts is another major challenge for convergence. Although this research examines a variety of languages, the Gurmukhi language receives the most focus. This paper presents an analysis of all prior text recognition algorithms for the Gurmukhi language. In addition, work on degraded texts in various languages is evaluated based on accuracy and F-measure.
A Neural Network-based Approach for the Machine Vision of Character Recognition
In this paper, an attempt is made to develop off-line recognition strategies for isolated handwritten English characters (A to Z) and digits (0 to 9). The challenges in handwritten character recognition lie wholly in the variation and distortion of handwritten characters, since different people may use different handwriting styles and stroke directions to draw the same character shapes of a script they know. The paper provides a review of the process of character recognition using neural networks. Character recognition methods are listed under two main headings: offline and online. The offline methods use the properties of static images and are further divided into four methods: clustering, feature extraction, pattern matching, and artificial neural networks. The online methods are subdivided into the k-NN classifier and the direction-based algorithm. Character preprocessing uses binarization, thresholding, and segmentation. The neural-network-based method improves character recognition. The proposed method uses a feed-forward back-propagation network to classify the characters; the ANN is trained using the back-propagation algorithm. In the proposed system, each English letter or numeral is represented by binary numbers that are taken as input and fed to an ANN. Neural network followed by Back Propagation Algorithm which comprises Training
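The feed-forward network trained with back-propagation that this abstract describes can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 3x3 binary glyphs, the single hidden layer of four sigmoid units, the squared-error loss, the learning rate, and the names `train` and `SAMPLES`.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    # Each weight row ends with a bias weight, hence the appended 1.0.
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in w_out]
    return h, y


def total_loss(samples, w_hidden, w_out):
    # Sum of squared errors over all samples and output units.
    loss = 0.0
    for x, t in samples:
        _, y = forward(x, w_hidden, w_out)
        loss += sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    return loss


def train(samples, n_hidden=4, epochs=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
             for _ in range(n_out)]
    loss_before = total_loss(samples, w_hidden, w_out)
    for _ in range(epochs):
        for x, t in samples:
            h, y = forward(x, w_hidden, w_out)
            # Output deltas: squared-error gradient (up to a constant factor)
            # passed through the sigmoid derivative y * (1 - y).
            d_out = [(yi - ti) * yi * (1 - yi) for yi, ti in zip(y, t)]
            # Hidden deltas: output deltas propagated back through w_out.
            d_hid = [hj * (1 - hj) * sum(d_out[k] * w_out[k][j] for k in range(n_out))
                     for j, hj in enumerate(h)]
            for k in range(n_out):
                for j, v in enumerate(h + [1.0]):
                    w_out[k][j] -= lr * d_out[k] * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] -= lr * d_hid[j] * v
    return w_hidden, w_out, loss_before, total_loss(samples, w_hidden, w_out)


# Hypothetical 3x3 binary glyphs (flattened row by row) with one-hot targets.
SAMPLES = [
    ([0, 1, 0, 0, 1, 0, 0, 1, 0], [1, 0, 0]),  # "I"
    ([1, 0, 0, 1, 0, 0, 1, 1, 1], [0, 1, 0]),  # "L"
    ([1, 1, 1, 0, 1, 0, 0, 1, 0], [0, 0, 1]),  # "T"
]
```

Calling `train(SAMPLES)` returns the learned weights together with the loss before and after training, so the effect of back-propagation on the training error can be inspected directly. A real recogniser would use larger bitmaps and many samples per class.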
Recognition of Arabic handwritten words
Recognizing Arabic handwritten words is a difficult problem due to the deformations introduced by different writing styles. Moreover, the cursive nature of Arabic writing makes correct segmentation of characters an almost impossible task. While there are many subsystems in an Arabic word recognition system, in this work we develop a subsystem to recognize Part of Arabic Words (PAW). We try to solve this problem using three different approaches: implicit segmentation and two variants of the holistic approach. Although Rothacker reached similar conclusions while this work was being prepared, we report the difficulty of locating characters in PAW using the Scale Invariant Feature Transform under the first approach. In the second and third approaches, we use the holistic approach to recognize PAW using a Support Vector Machine (SVM) and Active Shape Models (ASM). Although a few works use SVM to recognize PAW, they use a small dataset; we use a large dataset and a different set of features. We also explain the errors that SVM and ASM make and propose some remedies for these errors as future work.
A Comparative study of Arabic handwritten characters invariant feature
This paper focuses on the invariant features of Arabic handwritten
characters. It presents the results of a comparative study of several
feature extraction techniques for handwritten characters, based on the Hough
transform, Fourier transform, Wavelet transform and Gabor filter. The
results show that the Hough transform and Gabor filter are insensitive to
rotation and translation; the Fourier transform is sensitive to rotation but
insensitive to translation; and, in contrast to the Hough transform and Gabor
filter, the Wavelet transform is sensitive to both rotation and translation.
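The Fourier behaviour reported in this abstract can be illustrated on a toy image. The sketch below is not the paper's feature extractor. It assumes the magnitude of the 2D discrete Fourier transform as the feature, a circular (wrap-around) shift as the translation, and a hypothetical 4x4 image; the names `dft2_magnitude`, `circular_shift`, `rotate90` and `IMG` are illustrative.

```python
import cmath


def dft2_magnitude(img):
    """Magnitude of the 2D discrete Fourier transform of a small image."""
    n_rows, n_cols = len(img), len(img[0])
    mags = []
    for u in range(n_rows):
        row = []
        for v in range(n_cols):
            total = 0j
            for x in range(n_rows):
                for y in range(n_cols):
                    angle = -2 * cmath.pi * (u * x / n_rows + v * y / n_cols)
                    total += img[x][y] * cmath.exp(1j * angle)
            row.append(abs(total))
        mags.append(row)
    return mags


def circular_shift(img, dx, dy):
    """Translate the image by (dx, dy) pixels with wrap-around."""
    n_rows, n_cols = len(img), len(img[0])
    return [[img[(x - dx) % n_rows][(y - dy) % n_cols] for y in range(n_cols)]
            for x in range(n_rows)]


def rotate90(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(r) for r in zip(*img[::-1])]


def close(a, b, tol=1e-9):
    """True when two magnitude grids agree element by element."""
    return all(abs(p - q) <= tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))


# Hypothetical test image: a short horizontal bar in the top row.
IMG = [
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
```

Here `close(dft2_magnitude(IMG), dft2_magnitude(circular_shift(IMG, 1, 2)))` holds, because a shift only changes the phase of each coefficient, whereas the magnitude grid of `rotate90(IMG)` differs from that of `IMG`, because the spectrum rotates with the image. With non-circular translation of real character images the invariance is only approximate.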
Handwritten and printed text separation in historical documents
Historical documents present many challenges for Optical Character Recognition Systems
(OCR), especially documents of poor quality containing handwritten annotations,
stamps, signatures, and historical fonts. As most OCRs recognize either machine-printed
or handwritten texts, printed and handwritten parts have to be separated before using
the respective recognition system. This thesis addresses the problem of segmentation of
handwritings and printings in historical Latin text documents. To alleviate the problem
of lack of data containing handwritten and machine-printed components located on the
same page or even overlapping each other as well as their pixel-wise annotations, the data
synthesis method proposed in [12] was applied and new datasets were generated. The
newly created images and their pixel-level labels were used to train the Fully Convolutional
Model (FCN) introduced in [5]. The newly trained model has shown better results in the
separation of machine-printed and handwritten text in historical documents.
A high level approach to Arabic sentence recognition
The aim of this work is to develop a sentence recognition system inspired by the human reading process. Cognitive studies have observed that humans tend to read a word as a whole. The reader considers the global word shape and uses contextual knowledge to infer a word and discriminate it from other possible words. The sentence recognition system is fully integrated: a word-level recogniser (the baseline system) combined with a linguistic-knowledge post-processing module. The baseline system is a holistic word-based recognition approach characterised as a probabilistic ranking task. Its output is multiple recognition hypotheses (an N-best word lattice). The basic unit is the word rather than the character; the system does not rely on any segmentation or require baseline detection. The linguistic knowledge used to re-rank the output of the baseline system is the standard n-gram Statistical Language Model (SLM). The candidates are re-ranked using the phrase perplexity score. The system is an OCR system based on HMMs built with the HTK Toolkit. The baseline system relies on global transformation features extracted from binary word images. The feature extraction technique is the block-based Discrete Cosine Transform (DCT) applied to the whole word image, with feature vectors extracted from non-overlapping sub-blocks of 8x8 pixels. The HMMs applied to the task are mono-model discrete one-dimensional HMMs (Bakis model). A balanced database of actual scanned and synthetic word images has been constructed to ensure an even distribution of word samples. The Arabic words are typewritten in five fonts at a size of 14 points in a plain style. The statistical language models and lexicon words are extracted from the Holy Qur'an. The systems are applied to word images with no overlap between the training and testing datasets. The actual scanned database is used to evaluate the word recogniser.
The synthetic database provides a large amount of data for reliable training of sentence recognition systems. The word recogniser is evaluated in mono-font and multi-font contexts. The two types of word recogniser achieve final recognition accuracies of 99.30% and 73.47% in mono-font and multi-font contexts, respectively. The average accuracy achieved by the sentence recogniser is 67.24%, which improves to 78.35% on average with 5-gram post-processing. An evaluation of the complexity and accuracy of the post-processing module found the 4-gram model more suitable than the 5-gram: it is much faster while still improving the average accuracy to 76.89%.
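The block-based DCT feature extraction mentioned in this abstract (non-overlapping 8x8 sub-blocks over the whole word image) might be sketched as follows. The abstract does not say which coefficients are kept per block or how partial blocks at the image border are handled, so the `keep` parameter, the dropping of incomplete edge blocks, the orthonormal DCT-II scaling, and the function names are all assumptions made for illustration.

```python
import math

BLOCK = 8  # sub-block size stated in the abstract


def dct2_block(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def block_dct_features(img, keep=1):
    """Concatenate the top-left keep x keep DCT coefficients of each 8x8 block.

    Blocks are taken left to right, top to bottom, without overlap.
    Incomplete blocks at the right and bottom edges are dropped (assumption).
    """
    rows, cols = len(img), len(img[0])
    feats = []
    for r in range(0, rows - BLOCK + 1, BLOCK):
        for c in range(0, cols - BLOCK + 1, BLOCK):
            block = [row[c:c + BLOCK] for row in img[r:r + BLOCK]]
            coeffs = dct2_block(block)
            feats.extend(coeffs[u][v] for u in range(keep) for v in range(keep))
    return feats
```

For a binary word image whose left 8x8 block is all ones and whose right 8x8 block is all zeros, `block_dct_features(img)` yields two values, approximately 8 and 0, since the DC coefficient of a constant block of ones under this scaling is the pixel sum divided by 8. Keeping more low-frequency coefficients per block (`keep=2` or more) trades a longer feature vector for more shape detail.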
Word Searching in Scene Image and Video Frame in Multi-Script Scenario using Dynamic Shape Coding
Retrieval of text information from natural scene images and video frames is a
challenging task due to its inherent problems like complex character shapes,
low resolution, background noise, etc. Available OCR systems often fail to
retrieve such information in scene/video frames. Keyword spotting, an
alternative way to retrieve information, performs efficient text searching in
such scenarios. However, current word spotting techniques in scene/video images
are script-specific and they are mainly developed for Latin script. This paper
presents a novel word spotting framework using dynamic shape coding for text
retrieval in natural scene images and video frames. The framework is designed to
search for a query keyword across multiple scripts with the help of on-the-fly
script-wise keyword generation for the corresponding script. We have used a
two-stage word spotting approach using Hidden Markov Model (HMM) to detect the
translated keyword in a given text line by identifying the script of the line.
A novel unsupervised dynamic shape coding based scheme has been used to group
similar shape characters to avoid confusion and to improve text alignment.
Next, the hypotheses locations are verified to improve retrieval performance.
To evaluate the proposed system for searching keywords in natural scene images
and video frames, we have considered two popular Indic scripts, Bangla
(Bengali) and Devanagari, along with English. Inspired by the zone-wise
recognition approach in Indic scripts [1], zone-wise text information has been
used to improve the traditional word spotting performance in Indic scripts. For
our experiment, a dataset consisting of images of different scenes and video
frames of English, Bangla and Devanagari scripts was considered. The results
obtained show the effectiveness of our proposed word spotting approach.
Comment: Multimedia Tools and Applications, Springe