Advancements and Challenges in Arabic Optical Character Recognition: A Comprehensive Survey
Optical character recognition (OCR) is a vital process that involves the
extraction of handwritten or printed text from scanned or printed images,
converting it into a format that can be understood and processed by machines.
This enables further data processing activities such as searching and editing.
The automatic extraction of text through OCR plays a crucial role in digitizing
documents, enhancing productivity, improving accessibility, and preserving
historical records. This paper seeks to offer an exhaustive review of
contemporary applications, methodologies, and challenges associated with Arabic
Optical Character Recognition (OCR). A thorough analysis is conducted of the
prevailing techniques used at each stage of the OCR process, with particular
attention to identifying the approaches that deliver the best
results. To ensure a thorough evaluation, a meticulous keyword-search
methodology is adopted, encompassing a comprehensive analysis of articles
relevant to Arabic OCR, including both backward and forward citation reviews.
In addition to presenting cutting-edge techniques and methods, this paper
critically identifies research gaps within the realm of Arabic OCR. By
highlighting these gaps, we shed light on potential areas for future
exploration and development, thereby guiding researchers toward promising
avenues in the field of Arabic OCR. The outcomes of this study provide valuable
insights for researchers, practitioners, and stakeholders involved in Arabic
OCR, ultimately fostering advancements in the field and facilitating the
creation of more accurate and efficient OCR systems for the Arabic language.
InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write
Digital note-taking is gaining popularity, offering a durable, editable, and
easily indexable way of storing notes in vectorized form, known as digital
ink. However, a substantial gap remains between this way of note-taking and
traditional pen-and-paper note-taking, a practice still favored by a vast
majority. Our work, InkSight, aims to bridge the gap by empowering physical
note-takers to effortlessly convert their work (offline handwriting) to digital
ink (online handwriting), a process we refer to as Derendering. Prior research
on the topic has focused on the geometric properties of images, resulting in
limited generalization beyond their training domains. Our approach combines
reading and writing priors, allowing training a model in the absence of large
amounts of paired samples, which are difficult to obtain. To our knowledge,
this is the first work that effectively derenders handwritten text in arbitrary
photos with diverse visual characteristics and backgrounds. Furthermore, it
generalizes beyond its training domain into simple sketches. Our human
evaluation reveals that 87% of the samples produced by our model on the
challenging HierText dataset are considered a valid tracing of the input
image, and 67% look like a pen trajectory traced by a human. Interactive
visualizations of 100 word-level model outputs for each of the three public
datasets are available in our Hugging Face space:
https://huggingface.co/spaces/Derendering/Model-Output-Playground. Model
release is in progress.
Deep Learning: segmentation of documents from the Archivo General de Indias with DhSegment and NeuralLineSegmenter
The amount of information stored in the form of historical documents is enormous, and processing them is highly tedious. This work is intended to go one step further in facilitating the extraction of information from these documents. This is not easy, since many historical documents are in poor condition or their handwriting is practically illegible to the human eye. The aim of this project is to apply machine learning, specifically deep learning, to segment digitized images of these documents: that is, to differentiate and separate the areas that make up the document, such as text, background, or ornament zones. This allows each area to be processed separately, which helps in extracting the information.
Universidad de Sevilla. Máster en Ingeniería de Telecomunicación.
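As a rough illustration of the segmentation idea described above (a minimal sketch, not the project's actual pipeline): a network such as dhSegment produces per-pixel class scores, and labeling each pixel with its highest-scoring class yields separate zone masks. The class names and score values below are hypothetical, chosen only to show the shape of the data.

```python
import numpy as np

# Hypothetical zone classes for a document page.
CLASSES = ["background", "text", "ornament"]

def label_zones(score_maps: np.ndarray) -> np.ndarray:
    """score_maps: (num_classes, H, W) per-pixel class scores, as a
    segmentation network would output. Returns an (H, W) map of class
    indices, picking the highest-scoring class per pixel."""
    return np.argmax(score_maps, axis=0)

def extract_zone_mask(labels: np.ndarray, cls: str) -> np.ndarray:
    """Boolean mask selecting the pixels of one zone, so each area can
    be processed separately (e.g. cropping text regions for OCR)."""
    return labels == CLASSES.index(cls)

# Toy 2x2 "image" with synthetic scores, one map per class.
scores = np.array([
    [[0.90, 0.10], [0.20, 0.80]],  # background scores
    [[0.05, 0.80], [0.10, 0.10]],  # text scores
    [[0.05, 0.10], [0.70, 0.10]],  # ornament scores
])
labels = label_zones(scores)
print(labels)                           # per-pixel class indices
print(extract_zone_mask(labels, "text"))
```

Each mask can then be passed to a zone-specific processing step, which is the separation of areas the abstract refers to.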