A review of Arabic text recognition dataset
Building a robust Optical Character Recognition (OCR) system for languages, such as Arabic with cursive scripts,
has always been challenging. These challenges increase if the text contains diacritics of different sizes for
characters and words. Apart from the complexity of the used font, these challenges must be addressed in
recognizing the text of the Holy Quran. To solve these challenges, the OCR system would have to undergo
different phases. Each problem would have to be addressed using different approaches, thus, researchers are
studying these challenges and proposing various solutions. This has motivated this study to review Arabic OCR
datasets, because the dataset plays a major role in determining the nature of the OCR systems. State-of-the-art
approaches in segmentation and recognition are discovered with the implementation of Recurrent Neural
Networks (Long Short-Term Memory-LSTM and Gated Recurrent Unit-GRU) with the use of the Connectionist
Temporal Classification (CTC). This also includes deep learning models and the implementation of GRU in the Arabic
domain. This paper contributes by profiling Arabic text recognition datasets, thereby determining the nature of
the OCR systems developed, and identifies research directions for building Arabic text recognition datasets.
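As a rough illustration of how CTC output from a recurrent network (LSTM or GRU) is turned into text, the sketch below applies greedy CTC decoding: take the best label per frame, merge consecutive repeats, then drop blanks. It is not taken from the reviewed paper; the blank index `BLANK`, the function name `ctc_greedy_decode`, and the toy alphabet are assumptions made for the example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (an illustration choice)


def ctc_greedy_decode(frame_scores, alphabet):
    """Greedy CTC decoding: best label per frame, merge repeats, drop blanks.

    frame_scores: list of per-frame score lists, one score per label.
    alphabet: list mapping label index to character; index BLANK is unused.
    """
    # Pick the highest-scoring label for every frame.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    # Merge runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove blanks; a blank between two equal labels keeps both characters.
    return "".join(alphabet[label] for label in collapsed if label != BLANK)


# Toy alphabet and scores, invented for this example.
alphabet = ["", "a", "b"]
frames = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.8, 0.1],    # a (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.1, 0.8],    # b
]
decoded = ctc_greedy_decode(frames, alphabet)
print(decoded)
```

Greedy decoding is the simplest option; systems of the kind surveyed here may instead use beam search, possibly with a language model.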
Word Searching in Scene Image and Video Frame in Multi-Script Scenario using Dynamic Shape Coding
Retrieval of text information from natural scene images and video frames is a
challenging task due to its inherent problems like complex character shapes,
low resolution, background noise, etc. Available OCR systems often fail to
retrieve such information in scene/video frames. Keyword spotting, an
alternative way to retrieve information, performs efficient text searching in
such scenarios. However, current word spotting techniques in scene/video images
are script-specific and they are mainly developed for Latin script. This paper
presents a novel word spotting framework using dynamic shape coding for text
retrieval in natural scene images and video frames. The framework is designed to
search for a query keyword across multiple scripts with the help of on-the-fly
script-wise keyword generation for the corresponding script. We have used a
two-stage word spotting approach using Hidden Markov Model (HMM) to detect the
translated keyword in a given text line by identifying the script of the line.
A novel unsupervised dynamic shape coding based scheme has been used to group
similar shape characters to avoid confusion and to improve text alignment.
Next, the hypotheses locations are verified to improve retrieval performance.
To evaluate the proposed system for searching keywords in natural scene images
and video frames, we have considered two popular Indic scripts, Bangla
(Bengali) and Devanagari, along with English. Inspired by the zone-wise
recognition approach in Indic scripts[1], zone-wise text information has been
used to improve the traditional word spotting performance in Indic scripts. For
our experiment, a dataset consisting of images of different scenes and video
frames of English, Bangla and Devanagari scripts was considered. The results
obtained showed the effectiveness of our proposed word spotting approach.
Comment: Multimedia Tools and Applications, Springe
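The idea of grouping similar-shaped characters before matching can be sketched as follows. This is only a simplified illustration: the paper describes an unsupervised, dynamic scheme used with HMM-based spotting, whereas the groups in `SHAPE_GROUPS` below are a fixed, hand-written assumption, and the helper names `build_shape_code`, `encode`, and `spot` are hypothetical.

```python
# Hypothetical shape groups, hand-written for illustration only.
# The paper derives its groups in an unsupervised, dynamic way.
SHAPE_GROUPS = [set("ceo"), set("il1"), set("uv"), set("mn")]


def build_shape_code(groups):
    """Map every character in a group to one representative of that group."""
    code = {}
    for group in groups:
        representative = min(group)  # deterministic choice of representative
        for ch in group:
            code[ch] = representative
    return code


def encode(word, code):
    """Rewrite a word in shape-code space; unknown characters map to themselves."""
    return "".join(code.get(ch, ch) for ch in word.lower())


def spot(keyword, line_words, code):
    """Return indices of words whose shape code matches the keyword's."""
    target = encode(keyword, code)
    return [i for i, word in enumerate(line_words) if encode(word, code) == target]


code = build_shape_code(SHAPE_GROUPS)
# "cxit" stands in for a recognizer confusing "e" with "c".
hits = spot("exit", ["cxit", "open", "EXIT"], code)
print(hits)
```

Matching in shape-code space tolerates confusions inside a group at the cost of more false hypotheses, which is why a verification step like the one the abstract mentions is useful afterwards.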
Advancements and Challenges in Arabic Optical Character Recognition: A Comprehensive Survey
Optical character recognition (OCR) is a vital process that involves the
extraction of handwritten or printed text from scanned or printed images,
converting it into a format that can be understood and processed by machines.
This enables further data processing activities such as searching and editing.
The automatic extraction of text through OCR plays a crucial role in digitizing
documents, enhancing productivity, improving accessibility, and preserving
historical records. This paper seeks to offer an exhaustive review of
contemporary applications, methodologies, and challenges associated with Arabic
Optical Character Recognition (OCR). A thorough analysis is conducted on
prevailing techniques utilized throughout the OCR process, with a dedicated
effort to discern the most efficacious approaches that demonstrate enhanced
outcomes. To ensure a thorough evaluation, a meticulous keyword-search
methodology is adopted, encompassing a comprehensive analysis of articles
relevant to Arabic OCR, including both backward and forward citation reviews.
In addition to presenting cutting-edge techniques and methods, this paper
critically identifies research gaps within the realm of Arabic OCR. By
highlighting these gaps, we shed light on potential areas for future
exploration and development, thereby guiding researchers toward promising
avenues in the field of Arabic OCR. The outcomes of this study provide valuable
insights for researchers, practitioners, and stakeholders involved in Arabic
OCR, ultimately fostering advancements in the field and facilitating the
creation of more accurate and efficient OCR systems for the Arabic language.
Handwritten OCR for Indic Scripts: A Comprehensive Overview of Machine Learning and Deep Learning Techniques
The potential uses of cursive optical character recognition (OCR) in a number of areas, particularly document digitization, archiving, and even language preservation, have attracted considerable interest lately. The goal of this research is to provide a thorough understanding of both cutting-edge OCR methods and the unique difficulties presented by Indic scripts. A thorough literature search was conducted for this study, covering relevant publications, conference proceedings, and scientific files up to the year 2023. Applying inclusion criteria that restrict the review to studies addressing handwritten OCR on Indic scripts yielded 53 research publications. The review provides a thorough analysis of the methodologies and approaches employed in the chosen studies. Deep neural networks, conventional feature-based methods, machine learning techniques, and hybrid systems have all been investigated as viable answers to the problem of effectively deciphering Indic scripts, because these scripts are famously challenging to write. To operate, these systems require pre-processing techniques, segmentation schemes, and language models. The outcomes of this systematic examination demonstrate that although handwritten OCR for Indic scripts has advanced significantly, room for improvement remains. Future research could focus on developing trustworthy models that can handle a range of writing styles and enhance accuracy on less-studied Indic scripts. The field may advance with the creation of collected datasets and defined standards.
A Study of Techniques and Challenges in Text Recognition Systems
Text recognition is the core system for Natural Language Processing (NLP) and digitalization. These systems are critical in bridging the gaps in digitization produced by non-editable documents, as well as contributing to finance, health care, machine translation, digital libraries, and a variety of other fields. In addition, as a result of the pandemic, the amount of digital information in the education sector has increased, necessitating the deployment of text recognition systems to deal with it. Text recognition systems work on three different categories of text: (a) machine-printed, (b) offline handwritten, and (c) online handwritten texts. The major goal of this research is to examine the process of typewritten text recognition systems. The availability of historical documents and other traditional materials in many types of texts is another major challenge for convergence. Although this research examines a variety of languages, the Gurmukhi language receives the most focus. This paper presents an analysis of all prior text recognition algorithms for the Gurmukhi language. In addition, work on degraded texts in various languages is evaluated based on accuracy and F-measure.
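For readers unfamiliar with the two evaluation measures named above, the sketch below computes accuracy and F-measure from confusion counts, using the standard definitions with F-measure as the harmonic mean of precision and recall. The function names and the counts are assumptions for illustration and are not results from the paper.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all decisions that were correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (the F1 score)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented counts: 8 correct detections, 2 false alarms, 2 misses, 8 true rejections.
acc = accuracy(tp=8, tn=8, fp=2, fn=2)
f1 = f_measure(tp=8, fp=2, fn=2)
print(acc, f1)
```

Accuracy counts true negatives while F-measure ignores them, so the two can diverge sharply on degraded documents where most candidate regions contain no valid text.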