1,221 research outputs found

Unconstrained Scene Text and Video Text Recognition for Arabic Script

Author: Jain Mohit
Jawahar C. V.
Mathew Minesh
Publication date: 07/11/2017

Building robust recognizers for Arabic has always been challenging. We demonstrate the effectiveness of an end-to-end trainable CNN-RNN hybrid architecture in recognizing Arabic text in videos and natural scenes. We outperform the previous state of the art on two publicly available video text datasets, ALIF and ACTIV. For the scene text recognition task, we introduce a new Arabic scene text dataset and establish baseline results. For scripts like Arabic, a major challenge in developing robust recognizers is the lack of a large quantity of annotated data. We overcome this by synthesising millions of Arabic text images from a large vocabulary of Arabic words and phrases. Our implementation is built on top of the model introduced in [37], which has proven quite effective for English scene text recognition. The model follows a segmentation-free, sequence-to-sequence transcription approach. The network transcribes a sequence of convolutional features from the input image to a sequence of target labels. This does away with the need to segment the input image into constituent characters/glyphs, which is often difficult for Arabic script. Further, the ability of RNNs to model contextual dependencies yields superior recognition results. Comment: 5 page
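
The segmentation-free transcription described in this abstract can be illustrated with a minimal sketch. It assumes a CTC-style output layer with a blank label, which the abstract itself does not name, so treat that as an assumption rather than the authors' stated method. The function name `ctc_greedy_decode`, the blank index, and the toy Latin alphabet are all hypothetical and chosen only for illustration.

```python
from itertools import groupby

BLANK = 0  # assumed index of the blank label (hypothetical convention)


def ctc_greedy_decode(frame_scores, alphabet):
    """Turn per-frame label scores into a label string without segmentation.

    frame_scores: list of per-frame score lists, one score per label index.
    alphabet: maps label index to output symbol; index BLANK is the blank.
    """
    # Pick the best-scoring label independently for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Merge consecutive repeats first, then drop blanks.
    collapsed = [label for label, _ in groupby(best_path)]
    return "".join(alphabet[label] for label in collapsed if label != BLANK)


# Toy input: five frames over the labels (blank, "a", "b").
toy_alphabet = ["", "a", "b"]
toy_scores = [
    [0.1, 0.8, 0.1],    # "a"
    [0.1, 0.8, 0.1],    # "a" again, merged with the previous frame
    [0.9, 0.05, 0.05],  # blank, separates the two "a" runs
    [0.1, 0.8, 0.1],    # "a"
    [0.1, 0.1, 0.8],    # "b"
]
decoded = ctc_greedy_decode(toy_scores, toy_alphabet)
```

No character boundaries are supplied anywhere: the frame sequence alone determines the output, which is the property the abstract highlights as useful for a cursive script such as Arabic.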

arXiv.org e-Print Archive

Crossref

Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues

Author: Alahari Karteek
Jawahar C. V.
Mishra Anand
Publication venue: 'Elsevier BV'
Publication date: 12/01/2016

Recognizing scene text is a challenging problem, even more so than the recognition of scanned documents. This problem has gained significant attention from the computer vision community in recent years, and several methods based on energy minimization frameworks and deep learning approaches have been proposed. In this work, we focus on the energy minimization framework and propose a model that exploits both bottom-up and top-down cues for recognizing cropped words extracted from street images. The bottom-up cues are derived from individual character detections in an image. We build a conditional random field model on these detections to jointly model the strength of the detections and the interactions between them. These interactions are top-down cues obtained from a lexicon-based prior, i.e., language statistics. The optimal word represented by the text image is obtained by minimizing the energy function corresponding to the random field model. We evaluate our proposed algorithm extensively on a number of cropped scene text benchmark datasets, namely Street View Text, ICDAR 2003, 2011 and 2013, and IIIT 5K-word, and show better performance than comparable methods. We perform a rigorous analysis of all the steps in our approach and analyze the results. We also show that state-of-the-art convolutional neural network features can be integrated into our framework to further improve the recognition performance.
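
The idea of combining detection strength with a language prior in one energy function can be sketched on a toy example. This is a simplification under stated assumptions, not the paper's model: it assumes a simple left-to-right chain of character positions and minimizes the energy exactly by dynamic programming, whereas the abstract does not specify the graph structure or the inference method. The function name `minimize_chain_energy`, the cost values, and the default pairwise cost are hypothetical.

```python
def minimize_chain_energy(unary, pairwise, default_pair_cost=1.0):
    """Find the labelling with the lowest total energy on a chain.

    unary: list of dicts, one per character position, mapping a candidate
           character to its detection cost (lower means a stronger detection).
    pairwise: dict mapping (previous_char, char) to a language-prior cost;
              pairs not listed get default_pair_cost.
    Returns (word, energy).
    """
    # best[label] = (lowest energy of any labelling ending in label, labelling)
    best = {label: (cost, [label]) for label, cost in unary[0].items()}
    for node in unary[1:]:
        new_best = {}
        for label, cost in node.items():
            # Cheapest way to reach this label from any previous label.
            prev_energy, prev_path = min(
                (energy + pairwise.get((path[-1], label), default_pair_cost), path)
                for energy, path in best.values()
            )
            new_best[label] = (prev_energy + cost, prev_path + [label])
        best = new_best
    energy, path = min(best.values())
    return "".join(path), energy


# Toy bottom-up cues: at position 2 the detector slightly prefers "o" over "a".
toy_unary = [
    {"c": 0.2, "e": 0.3},
    {"a": 0.4, "o": 0.3},
    {"t": 0.1, "l": 0.6},
]
# Toy top-down cue: the bigrams "ca" and "at" are favoured by the prior.
toy_pairwise = {("c", "a"): 0.0, ("a", "t"): 0.0}

word, energy = minimize_chain_energy(toy_unary, toy_pairwise)
```

In this toy setup the detector alone would pick "o" in the middle position, but the pairwise prior makes "cat" the lowest-energy word, which mirrors how top-down cues can correct weak bottom-up detections.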

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Recent Trends and Techniques in Text Detection and Text Localization in a Natural Scene: A Survey

Author: Das Pranab
Prasad Vijay
Publication venue: Assam Don Bosco University
Publication date: 30/06/2021

Text information extraction from natural scene images is a growing area of research. Since text in natural scene images generally carries valuable details, detecting and recognizing scene text has been deemed essential for a variety of advanced computer vision applications. Considerable effort has gone into extracting text regions from scene images effectively and reliably. Because most text recognition applications demand robust algorithms for detecting and localizing text in a given scene image, researchers mainly focus on two important stages: text detection and text localization. This paper provides a review of various techniques for text detection and text localization.

Assam Don Bosco University Journals

Figure Text Extraction in Biomedical Literature

Author: A Ahmed
B Gatos
B Martins
B Rafkind
C Ringlstetter
C Thillou
CE Kahn
D Chen
D Glasner
D Kim
Daehyun Kim
DH Kim
EM Riseman
EM Zamora
FJ Damerau
H Hsieh
H Shatkay
H Stehouwer
H Yu
Hong Yu
JD Thompson
JJ Weinman
M Anthimopoulos
M Donoser
M Li
M Paterson
MA Hearst
MP Jones
MP Schambach
P Ruch
P Shivakumara
R Fattal
R Gonzalez
RA Wagner
RF Murphy
RL Kashyap
S Agarwal
S Xu
SM Lucas
V Hodge
VI Levenshtein
Vladimir N. Uversky
X Chen
X Tong
Y Qian
Z Kou
Z Liu
Publication venue: Public Library of Science
Publication date: 01/01/2011

Background: Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engin

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central