A Study of Techniques and Challenges in Text Recognition Systems

Author: Kaur Gurvir, Kumar Ajit
Publication venue: Auricle Global Society of Education and Research
Publication date: 20/09/2023

Text recognition is the core system for Natural Language Processing (NLP) and digitalization. These systems are critical in bridging the digitization gaps left by non-editable documents, and they contribute to finance, health care, machine translation, digital libraries, and a variety of other fields. In addition, the pandemic has increased the amount of digital information in the education sector, necessitating the deployment of text recognition systems to deal with it. Text recognition systems work on three categories of text: (a) machine-printed, (b) offline handwritten, and (c) online handwritten. The major goal of this research is to examine the process of typewritten text recognition systems. The availability of historical documents and other traditional materials in many types of texts is another major challenge for convergence. Although this research examines a variety of languages, Gurmukhi receives the most focus. This paper presents an analysis of all prior text recognition algorithms for the Gurmukhi language. In addition, work on degraded texts in various languages is evaluated based on accuracy and F-measure.
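The abstract names accuracy and F-measure as its evaluation metrics but does not say which F-measure variant is used. The following is a minimal sketch assuming the common balanced F1 (beta = 1) computed from true positives, false positives, and false negatives; the function names and the sample counts are illustrative and do not come from the paper.

```python
def accuracy(correct: int, total: int) -> float:
    """Fraction of items recognized correctly; 0.0 for an empty set."""
    return correct / total if total else 0.0


def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from raw counts. beta = 1 gives the balanced F1 (assumed here)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative counts only, not results reported by the paper.
acc = accuracy(correct=90, total=100)
f1 = f_measure(tp=8, fp=2, fn=2)
```

Accuracy alone can look high when most characters are easy, so reporting the F-measure alongside it shows how precision and recall trade off on degraded text.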

International Journal on Recent and Innovation Trends in Computing and Communication

Segmentation of touching characters in upper zone in printed Gurmukhi script

Author: G. S. Lehal, M. K. Jindal, R. K. Sharma
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2009

This paper presents a new technique for segmenting touching characters in the upper zone of printed Gurmukhi script. The technique is based on the structural properties of Gurmukhi characters. The concavity and convexity of the characters are studied and, using top profile projections, the touching characters in the upper zone are segmented. A recognition rate of 91% is achieved for segmenting the touching characters in the upper zone.
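The abstract does not give the algorithm in detail, so the following is only a rough sketch of the general idea of a top profile: for each column, measure how far the first ink pixel lies below the top edge, then treat columns where that depth peaks (a concavity seen from above) as candidate cut points. The binary-image representation, the function names, and the strict local-maximum rule are all assumptions made for illustration, not the authors' method.

```python
from typing import List

Image = List[List[int]]  # binary image rows: 1 = ink, 0 = background


def top_profile(img: Image) -> List[int]:
    """Per column, the row index of the first ink pixel from the top.

    Columns with no ink get the image height.
    """
    height = len(img)
    return [
        next((r for r, px in enumerate(col) if px), height)
        for col in zip(*img)
    ]


def candidate_cuts(profile: List[int]) -> List[int]:
    """Columns where the profile is a strict local maximum (a concavity).

    Flat-bottomed valleys are not detected by this simple rule.
    """
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i] > profile[i - 1] and profile[i] > profile[i + 1]
    ]


# Two strokes that touch at the bottom of a tiny upper-zone image.
upper_zone = [
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
]
profile = top_profile(upper_zone)
cuts = candidate_cuts(profile)
```

In this toy image the two strokes are joined only along the bottom row, and the deepest point of the top profile falls in the middle column, which is where a cut would separate them.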

CiteSeerX

Crossref

A Technique for Character Segmentation in Middle zone of Handwritten Hindi words using Hybrid Approach

Author: Preeti Sharma, Manoj Kumar Sachan
Publication venue: Auricle Global Society of Education and Research
Publication date: 31/07/2017

India is a country where people speak many languages and write in many scripts. Devanagari is one of the most popular scripts in India and is used to write the Hindi, Sanskrit, Sindhi, Marathi, and Nepali languages. This research work is performed on the Hindi language. A large number of precious and essential documents are available in handwritten form and need to be converted into editable form. Optical Character Recognition (OCR) makes it easier to convert handwritten text into editable form. Character segmentation is an important phase of OCR: it segments the characters from handwritten words, which enhances the accuracy of the OCR system. In this paper, a hybrid approach is used to segment the characters in words that contain single and multiple touching characters. The proposed system is tested on a dataset of handwritten words written by different writers. The dataset contains more than 300 handwritten words in the Hindi language. The accuracy of the proposed hybrid system is evaluated at 96%, which is better than that of existing techniques.

International Journal on Future Revolution in Computer Science & Communication Engineering

Segmentation of Isolated and Touching Characters in Offline Handwritten Gurmukhi Script Recognition

Publication venue: MECS Publisher

Crossref

Deep Learning Based Models for Offline Gurmukhi Handwritten Character and Numeral Recognition

Author: Bhatia Karamjit, Mahto Manoj Kumar, Sharma Rajendra Kumar
Publication venue: Universitat Autonoma de Barcelona
Publication date: 01/01/2021

Over the last few years, several researchers have worked on handwritten character recognition and have proposed various techniques to improve recognition performance for Indic and non-Indic scripts. Here, a Deep Convolutional Neural Network is proposed that learns deep features for offline Gurmukhi handwritten character and numeral recognition (HCNR). The proposed network works efficiently for training as well as testing and exhibits good recognition performance. Two primary datasets, comprising offline handwritten Gurmukhi characters and Gurmukhi numerals, have been employed in the present work. The testing accuracies achieved using the proposed network are 98.5% for characters and 98.6% for numerals.

Electronic Letters on Computer Vision and Image Analysis (ELCVIA - Universitat Autònoma de Barcelona)

Diposit Digital de Documents de la UAB

Segmentation of Hanacaraka Characters using Double Projection Profile and Hough Transform

Author: Adipranata Rudy, Budhi Gregorius Satia, Liliana
Publication venue: Springer International Publishing
Publication date: 27/10/2018

Hanacaraka is the ancient Javanese script, one of Indonesia's ethnic scripts, used on the island of Java. The difficulties in segmenting Hanacaraka characters are the inconsistent spacing between lines and the inconsistent size and thickness of the characters. The inconsistencies in row spacing and letter size are caused by the pair letters, the last vowel, and the consonant letters in one phoneme, while the inconsistent thickness is due to the writing style of Hanacaraka itself. Image preprocessing is needed to obtain input without skew. To correct skewed text documents, we use the Hough transform to estimate the edges of the text area. After that, a horizontal projection profile is used to segment the lines, followed by a vertical projection profile to segment each character. This segmentation method gives good results for printed documents. Segmenting handwritten documents is harder because the rows in the document are uneven and very close together, which causes them to overlap. When a line is segmented incorrectly, none of the characters on that line are segmented correctly either. This problem can be eliminated with a connectivity test. Before that, the line must be segmented together with the overlap area. Parts of characters below or above the main character can then be eliminated because they are not connected to the main character.
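The two-pass projection profile step described in the abstract can be sketched as follows. This is a minimal illustration, assuming an already deskewed binary image stored as nested lists with 1 for ink; the function names and the rule of cutting at any all-zero row or column are assumptions, and the Hough deskewing and connectivity test mentioned in the abstract are not included.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image rows: 1 = ink, 0 = background
Box = Tuple[int, int, int, int]  # (top, bottom, left, right), ends exclusive


def ink_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) spans, end exclusive, where the profile is non-zero."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def horizontal_profile(img: Image) -> List[int]:
    """Ink count per row; gaps between text lines show up as zeros."""
    return [sum(row) for row in img]


def vertical_profile(img: Image) -> List[int]:
    """Ink count per column; gaps between characters show up as zeros."""
    return [sum(col) for col in zip(*img)]


def segment(img: Image) -> List[List[Box]]:
    """Split into lines by horizontal profile, then each line into characters."""
    lines = []
    for top, bottom in ink_runs(horizontal_profile(img)):
        band = img[top:bottom]
        boxes = [
            (top, bottom, left, right)
            for left, right in ink_runs(vertical_profile(band))
        ]
        lines.append(boxes)
    return lines


# Toy page: two text lines, each holding two blobs.
page = [
    [1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
]
boxes_per_line = segment(page)
```

Cutting only at completely empty rows works when lines are cleanly separated, which matches the abstract's remark that the method suits printed documents; tightly packed or overlapping handwritten rows leave no empty row to cut at, which is why the authors add a connectivity test.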

Scientific Repository