1,723 research outputs found
Advances in Handwritten Keyword Indexing and Search Technologies
Many extensive manuscript collections are available in archives and libraries all over the world, but their textual contents remain practically inaccessible, buried under thousands of terabytes worth of high-resolution images. If perfect or sufficiently accurate text-image transcripts were available, textual content could be indexed directly for plaintext access using conventional information retrieval systems. But the results of fully automated transcriptions generally lack the level of accuracy needed for reliable text indexing and search purposes. Additionally, manual or even computer-assited transcription is entierely unsustainable when dealing with the extensive image collections typically considered for indexing. This paper explains how accurate indexing and search commands can be implemented directly on the digital images themselves without the need to explicitly resort to image transcripts. Results obtained using the proposed techniques on several relevant historical data sets are presented, clearly supporting the considerable potential of these technologies
Text Recognition in Multimedia Documents: A Study of two Neural-based OCRs Using and Avoiding Character Segmentation
International audienceText embedded in multimedia documents represents an important semantic information that helps to automatically access the content. This paper proposes two neural-based OCRs that handle the text recognition problem in different ways. The first approach segments a text image into individual characters before recognizing them, while the second one avoids the segmentation step by integrating a multi-scale scanning scheme that allows to jointly localize and recognize characters at each position and scale. Some linguistic knowledge is also incorporated into the proposed schemes to remove errors due to recognition confusions. Both OCR systems are applied to caption texts embedded in videos and in natural scene images and provide outstanding results showing that the proposed approaches outperform the state-of-the-art methods
Does color modalities affect handwriting recognition? An empirical study on Persian handwritings using convolutional neural networks
Most of the methods on handwritten recognition in the literature are focused
and evaluated on Black and White (BW) image databases. In this paper we try to
answer a fundamental question in document recognition. Using Convolutional
Neural Networks (CNNs), as eye simulator, we investigate to see whether color
modalities of handwritten digits and words affect their recognition accuracy or
speed? To the best of our knowledge, so far this question has not been answered
due to the lack of handwritten databases that have all three color modalities
of handwritings. To answer this question, we selected 13,330 isolated digits
and 62,500 words from a novel Persian handwritten database, which have three
different color modalities and are unique in term of size and variety. Our
selected datasets are divided into training, validation, and testing sets.
Afterwards, similar conventional CNN models are trained with the training
samples. While the experimental results on the testing set show that CNN on the
BW digit and word images has a higher performance compared to the other two
color modalities, in general there are no significant differences for network
accuracy in different color modalities. Also, comparisons of training times in
three color modalities show that recognition of handwritten digits and words in
BW images using CNN is much more efficient
Word Searching in Scene Image and Video Frame in Multi-Script Scenario using Dynamic Shape Coding
Retrieval of text information from natural scene images and video frames is a
challenging task due to its inherent problems like complex character shapes,
low resolution, background noise, etc. Available OCR systems often fail to
retrieve such information in scene/video frames. Keyword spotting, an
alternative way to retrieve information, performs efficient text searching in
such scenarios. However, current word spotting techniques in scene/video images
are script-specific and they are mainly developed for Latin script. This paper
presents a novel word spotting framework using dynamic shape coding for text
retrieval in natural scene image and video frames. The framework is designed to
search query keyword from multiple scripts with the help of on-the-fly
script-wise keyword generation for the corresponding script. We have used a
two-stage word spotting approach using Hidden Markov Model (HMM) to detect the
translated keyword in a given text line by identifying the script of the line.
A novel unsupervised dynamic shape coding based scheme has been used to group
similar shape characters to avoid confusion and to improve text alignment.
Next, the hypotheses locations are verified to improve retrieval performance.
To evaluate the proposed system for searching keyword from natural scene image
and video frames, we have considered two popular Indic scripts such as Bangla
(Bengali) and Devanagari along with English. Inspired by the zone-wise
recognition approach in Indic scripts[1], zone-wise text information has been
used to improve the traditional word spotting performance in Indic scripts. For
our experiment, a dataset consisting of images of different scenes and video
frames of English, Bangla and Devanagari scripts were considered. The results
obtained showed the effectiveness of our proposed word spotting approach.Comment: Multimedia Tools and Applications, Springe
Character Recognition
Character recognition is one of the pattern recognition technologies that are most widely used in practical applications. This book presents recent advances that are relevant to character recognition, from technical topics such as image processing, feature extraction or classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field
Stroke-Based Cursive Character Recognition
International audienceHuman eye can see and read what is written or displayed either in natural handwriting or in printed format. The same work in case the machine does is called handwriting recognition. Handwriting recognition can be broken down into two categories: off-line and on-line. ..
Recent Trends and Techniques in Text Detection and Text Localization in a Natural Scene: A Survey
Text information extraction from natural scene images is a rising area of research. Since text in natural scene images generally carries valuable details, detecting and recognizing scene text has been deemed essential for a variety of advanced computer vision applications. There has been a lot of effort put into extracting text regions from scene text images in an effective and reliable manner. As most text recognition applications have high demand of robust algorithms for detecting and localizing texts from a given scene text image, so the researchers mainly focus on the two important stages text detection and text localization. This paper provides a review of various techniques of text detection and text localization
- …