Offline Recognition of Malayalam and Kannada Handwritten Documents Using Deep Learning
Handwritten text is digitised for a variety of reasons, and the results are used by many government entities, including banks, post offices, and archaeological departments. Handwriting recognition is nevertheless a difficult task, because everyone has a different writing style. There are essentially two approaches to handwriting recognition: holistic and analytic. Earlier handwriting recognition methods are time-consuming; with the progress of deep neural networks, the task has become more straightforward. Furthermore, the bulk of existing solutions are limited to a single language. To recognise multilanguage handwritten manuscripts offline, this work employs an analytic approach. It describes how to convert Malayalam and Kannada handwritten manuscripts into editable text. Lines are first separated from the input document. Word segmentation is then performed, and each word is finally broken down into individual characters. An artificial neural network is used for feature extraction and classification. The result is then converted to a word document.
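The abstract does not say how lines, words, and characters are separated. As a minimal sketch, assuming a binarised page (1 = ink, 0 = background) and a simple projection-profile method that may differ from the authors' technique, the segmentation steps could look like this. The function names `segment_runs`, `split_lines`, and `split_columns` are hypothetical.

```python
from typing import List, Tuple


def segment_runs(profile: List[int], threshold: int = 0) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where the profile exceeds threshold."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                      # a run of ink begins
        elif value <= threshold and start is not None:
            runs.append((start, i))        # the run ends at the first blank entry
            start = None
    if start is not None:
        runs.append((start, len(profile)))  # run reaches the end of the profile
    return runs


def split_lines(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Row ranges of text lines, found from the horizontal projection profile."""
    row_profile = [sum(row) for row in image]
    return segment_runs(row_profile)


def split_columns(image: List[List[int]], top: int, bottom: int) -> List[Tuple[int, int]]:
    """Column ranges containing ink within rows [top, bottom) of one text line."""
    width = len(image[0])
    col_profile = [sum(image[r][c] for r in range(top, bottom)) for c in range(width)]
    return segment_runs(col_profile)


# Toy 5x6 "page": two text lines separated by a blank row.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
lines = split_lines(page)                    # [(1, 3), (4, 5)]
first_line_parts = split_columns(page, 1, 3)  # [(0, 2), (4, 6)]
```

In this sketch the same run-finding routine serves both line and column splitting. Distinguishing word gaps from character gaps would need an additional gap-width rule, which is left out here.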
Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval
Recognition and retrieval of textual content from the large document
collections have been a powerful use case for the document image analysis
community. Often the word is the basic unit for recognition as well as
retrieval. Systems that rely only on the text recogniser (OCR) output are not
robust enough in many situations, especially when the word recognition rates
are poor, as in the case of historic documents or digital libraries. An
alternative has been word spotting based methods that retrieve/match words
based on a holistic representation of the word. In this paper, we fuse the
noisy output of text recogniser with a deep embeddings representation derived
out of the entire word. We use average and max fusion for improving the ranked
results in the case of retrieval. We validate our methods on a collection of
Hindi documents. We improve word recognition rate by 1.4 and retrieval by 11.13
in the mAP.
Comment: 15 pages, 8 figures. Accepted in IAPR International Workshop on
Document Analysis Systems (DAS) 2020. Project page:
http://cvit.iiit.ac.in/research/projects/cvit-projects/fused-text-recogniser-and-deep-embeddings-improve-word-recognition-and-retrieval
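The average and max fusion mentioned in the abstract can be illustrated with a small sketch. It assumes each candidate word image already has two similarity scores for a query, one from the text recogniser output and one from the deep embedding, both on a comparable scale. How the paper computes and normalises these scores is not stated here, so the score values and the names `fuse_scores` and `rank` are hypothetical.

```python
from typing import Dict, List


def fuse_scores(text_scores: Dict[str, float],
                embed_scores: Dict[str, float],
                mode: str = "average") -> Dict[str, float]:
    """Combine two per-candidate similarity scores by average or max fusion."""
    fused = {}
    # Only candidates scored by both sources are fused (an assumption of this sketch).
    for key in text_scores.keys() & embed_scores.keys():
        a, b = text_scores[key], embed_scores[key]
        if mode == "average":
            fused[key] = (a + b) / 2.0
        elif mode == "max":
            fused[key] = max(a, b)
        else:
            raise ValueError(f"unknown fusion mode: {mode}")
    return fused


def rank(scores: Dict[str, float]) -> List[str]:
    """Candidates sorted by descending score; ties are broken by candidate id."""
    return sorted(scores, key=lambda k: (-scores[k], k))


# Hypothetical similarity scores of three candidate word images for one query.
text_scores = {"img1": 0.9, "img2": 0.1, "img3": 0.6}
embed_scores = {"img1": 0.2, "img2": 0.95, "img3": 0.6}

avg_ranking = rank(fuse_scores(text_scores, embed_scores, "average"))
max_ranking = rank(fuse_scores(text_scores, embed_scores, "max"))
```

With these toy scores, average fusion favours the candidate on which both sources agree moderately, while max fusion favours the candidate with the single strongest score.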
Handwritten Text Recognition for Bengali
© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Handwritten text recognition of Bengali
is a difficult task because of complex character shapes
due to the presence of modified/compound characters
as well as zone-wise writing styles of different individuals.
Most of the research published so far on Bengali
handwriting recognition deals with either isolated
character recognition or isolated word recognition,
and only a few papers have addressed recognition
of continuous handwritten Bengali. In this paper
we present a study on continuous handwritten
Bengali. We follow a classical line-based recognition
approach with a system based on hidden Markov
models and n-gram language models. These models
are trained with automatic methods from annotated
data. We investigate both the maximum likelihood
approach and the minimum phone error approach for
training the optical models. We also investigate the
use of word-based language models and character-based
language models. The latter approach allows us
to deal with the out-of-vocabulary word problem at
test time when the training set is of limited size. The
experiments yielded encouraging results.
This work has been partially supported through the European Union’s H2020 grant READ (Recognition and Enrichment of Archival Documents) (Ref: 674943) and partially supported by MINECO/FEDER, UE under project TIN2015-70924-C2-1-R.
Sánchez Peiró, JA.; Pal, U. (2016). Handwritten Text Recognition for Bengali. IEEE. https://doi.org/10.1109/ICFHR.2016.010
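To illustrate why a character-based language model helps with out-of-vocabulary words, the sketch below compares a word-level unigram model with a character bigram model. It assumes add-one smoothing and a toy training list of Latin placeholder words. Neither the smoothing scheme nor the data comes from the paper, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple

Model = Tuple[Counter, Counter, int]


def train_char_bigram(words: List[str]) -> Model:
    """Count character bigrams, with <s> and </s> marking word boundaries."""
    bigrams, contexts, alphabet = Counter(), Counter(), set()
    for word in words:
        symbols = ["<s>"] + list(word) + ["</s>"]
        alphabet.update(symbols)
        for prev, cur in zip(symbols, symbols[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, len(alphabet)


def char_log_prob(word: str, model: Model) -> float:
    """Log-probability of a word under the add-one-smoothed character bigram model."""
    bigrams, contexts, vocab_size = model
    symbols = ["<s>"] + list(word) + ["</s>"]
    total = 0.0
    for prev, cur in zip(symbols, symbols[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total


def word_unigram_prob(word: str, words: List[str]) -> float:
    """Unsmoothed word-level unigram probability: zero for any unseen word."""
    return Counter(words)[word] / len(words)


# Toy training vocabulary (placeholder strings, not real Bengali data).
training_words = ["bana", "naba", "baba"]
model = train_char_bigram(training_words)

oov_word_prob = word_unigram_prob("nana", training_words)  # unseen word
oov_char_score = char_log_prob("nana", model)              # still finite
```

The unseen word "nana" gets zero probability from the word-level model, but the character model assigns it a finite score because all of its character transitions were observed in other words.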
Spectral Graph-based Features for Recognition of Handwritten Characters: A Case Study on Handwritten Devanagari Numerals
Interpretation of different writing styles, unconstrained cursiveness and
relationship between different primitive parts is an essential and challenging
task for recognition of handwritten characters. As feature representation is
inadequate, appropriate interpretation/description of handwritten characters
seems to be a challenging task. Although existing research in handwritten
characters is extensive, it still remains a challenge to get the effective
representation of characters in feature space. In this paper, we make an
attempt to circumvent these problems by proposing an approach that exploits the
robust graph representation and spectral graph embedding concept to
characterise and effectively represent handwritten characters, taking into
account writing styles, cursiveness and relationships. To corroborate the
efficacy of the proposed method, extensive experiments were carried out on the
standard handwritten numeral dataset of the Computer Vision and Pattern
Recognition Unit, Indian Statistical Institute, Kolkata. The experimental
results are promising and can inform future studies.
Comment: 16 pages, 8 figures