Search CORE

484 research outputs found

Zone Segmentation and Thinning based Algorithm for Segmentation of Devnagari Text

Author: Er. Japneet Kaur, Er. Suppandeep Kaur, D
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/11/2015
Field of study

Character segmentation of handwritten documents is an challenging research topic due to its diverse application environment.OCR can be used for automated processing and handling of forms, old corrupted reports, bank cheques, postal codes and structures. Now Segmentation of a word into characters is one of the major challenge in optical character recognition. This is even more challenging when we segment characters in an offline handwritten document and the next hurdle is presence of broken ,touching and overlapped characters in devnagari script. So, in this paper we have introduced an algorithm that will segment both broken as well as touching characters in devnagari script. Now to segment these characters the algorithm uses both zone segmentation and thinning based techniques. We have used 85 words each for isolated, broken, touching and both broken as well as touching characters individually. Results achieved while segmentation of broken as well as touching are 96.2 % on an average

International Journal on Recent and Innovation Trends in Computing and Communication

A comprehensive survey of handwritten document benchmarks: structure, usage and evaluation

Author
Publication venue: Springer
Publication date: 24/12/2015
Field of study

Springer - Publisher Connector

Segmentation of Isolated and Touching Characters in Offline Handwritten Gurmukhi Script Recognition

Author
Publication venue: 'MECS Publisher'
Publication date
Field of study

Crossref

MatriVasha: A Multipurpose Comprehensive Database for Bangla Handwritten Compound Characters

Author: Ferdous Jannatul
Hossain Syed Akhter
Karmaker Suvrajit
Rabby A K M Shahariar Azad
Publication venue
Publication date: 06/05/2020
Field of study

At present, recognition of the Bangla handwriting compound character has been an essential issue for many years. In recent years there have been application-based researches in machine learning, and deep learning, which is gained interest, and most notably is handwriting recognition because it has a tremendous application such as Bangla OCR. MatrriVasha, the project which can recognize Bangla, handwritten several compound characters. Currently, compound character recognition is an important topic due to its variant application, and helps to create old forms, and information digitization with reliability. But unfortunately, there is a lack of a comprehensive dataset that can categorize all types of Bangla compound characters. MatrriVasha is an attempt to align compound character, and it's challenging because each person has a unique style of writing shapes. After all, MatrriVasha has proposed a dataset that intends to recognize Bangla 120(one hundred twenty) compound characters that consist of 2552(two thousand five hundred fifty-two) isolated handwritten characters written unique writers which were collected from within Bangladesh. This dataset faced problems in terms of the district, age, and gender-based written related research because the samples were collected that includes a verity of the district, age group, and the equal number of males, and females. As of now, our proposed dataset is so far the most extensive dataset for Bangla compound characters. It is intended to frame the acknowledgment technique for handwritten Bangla compound character. In the future, this dataset will be made publicly available to help to widen the research.Comment: 19 fig, 2 tabl

arXiv.org e-Print Archive

Character Recognition

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021
Field of study

Character recognition is one of the pattern recognition technologies that are most widely used in practical applications. This book presents recent advances that are relevant to character recognition, from technical topics such as image processing, feature extraction or classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field

Directory of Open Access Books (DOAB)

Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition

Author: Jin Lianwen
Lyons Terry
Ni Hao
Sun Zenghui
Xie Zecheng
Publication venue
Publication date: 01/01/2017
Field of study

Online handwritten Chinese text recognition (OHCTR) is a challenging problem as it involves a large-scale character set, ambiguous segmentation, and variable-length input sequences. In this paper, we exploit the outstanding capability of path signature to translate online pen-tip trajectories into informative signature feature maps using a sliding window-based method, successfully capturing the analytic and geometric properties of pen strokes with strong local invariance and robustness. A multi-spatial-context fully convolutional recurrent network (MCFCRN) is proposed to exploit the multiple spatial contexts from the signature feature maps and generate a prediction sequence while completely avoiding the difficult segmentation problem. Furthermore, an implicit language model is developed to make predictions based on semantic context within a predicting feature sequence, providing a new perspective for incorporating lexicon constraints and prior knowledge about a certain language in the recognition procedure. Experiments on two standard benchmarks, Dataset-CASIA and Dataset-ICDAR, yielded outstanding results, with correct rates of 97.10% and 97.15%, respectively, which are significantly better than the best result reported thus far in the literature.Comment: 14 pages, 9 figure

arXiv.org e-Print Archive

UCL Discovery

Oxford University Research Archive

Handwritten Arabic Documents Segmentation into Text Lines using Seam Carving

Author: Daldali M
Souhar Abdelghani
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 24/02/2022
Field of study

Inspired from human perception and common text documents characteristics based on readability constraints, an Arabic text line segmentation approach is proposed using seam carving. Taking the gray scale of the image as input data, this technique offers better results at extracting handwritten text lines without the need for the binary representation of the document image. In addition to its fast processing time, its versatility permits to process a multitude of document types, especially documents presenting low text-to-background contrast such as degraded historical manuscripts or complex writing styles like cursive handwriting. Even if our focus in this paper was on Arabic text segmentation, this method is language independent. Tests on a public database of 123 handwritten Arabic documents showed a line detection rate of 97.5% for a matching score of 90%

Re-UNIR