27,210 research outputs found
Recognition of off-line printed Arabic text using Hidden Markov Models.
yesThis paper describes a technique for automatic recognition of off-line printed Arabic text using Hidden Markov Models. In this work different sizes of overlapping and non-overlapping hierarchical windows are used to generate 16 features from each vertical sliding strip. Eight different Arabic fonts were used for testing (viz. Arial, Tahoma, Akhbar, Thuluth, Naskh, Simplified Arabic, Andalus, and Traditional Arabic). It was experimentally proven that different fonts have their highest recognition rates at different numbers of states (5 or 7) and codebook sizes (128 or 256).
Arabic text is cursive, and each character may have up to four different shapes based on its location in a word. This research work considered each shape as a different class, resulting in a total of 126 classes (compared to 28 Arabic letters). The achieved average recognition rates were between 98.08% and 99.89% for the eight experimental fonts.
The main contributions of this work are the novel hierarchical sliding window technique using only 16 features for each sliding window, considering each shape of Arabic characters as a separate class, bypassing the need for segmenting Arabic text, and its applicability to other languages
Off-Line Cursive Script Recognition Based on Continuous Density HMM
A system for off-line cursive script recognition is presented. A new normalization technique (based on statistical methods) to compensate for the variability of writing style is described. The key problem of segmentation is avoided by applying a sliding window on the handwritten words. A feature vector is extracted from each frame isolated by the window. The feature vectors are used as observations in letter-oriented continuous density HMMs that perform the recognition. Feature extraction and modeling techniques are illustrated. In order to allow the comparison of the results, the system has been trained and tested using the same data and experimental conditions as in other published works. The performance of the system is evaluated in terms of character and word (with and without lexicon) recognition rate. Results comparable to those of more complex systems have been achieved
Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues
Recognizing scene text is a challenging problem, even more so than the
recognition of scanned documents. This problem has gained significant attention
from the computer vision community in recent years, and several methods based
on energy minimization frameworks and deep learning approaches have been
proposed. In this work, we focus on the energy minimization framework and
propose a model that exploits both bottom-up and top-down cues for recognizing
cropped words extracted from street images. The bottom-up cues are derived from
individual character detections from an image. We build a conditional random
field model on these detections to jointly model the strength of the detections
and the interactions between them. These interactions are top-down cues
obtained from a lexicon-based prior, i.e., language statistics. The optimal
word represented by the text image is obtained by minimizing the energy
function corresponding to the random field model. We evaluate our proposed
algorithm extensively on a number of cropped scene text benchmark datasets,
namely Street View Text, ICDAR 2003, 2011 and 2013 datasets, and IIIT 5K-word,
and show better performance than comparable methods. We perform a rigorous
analysis of all the steps in our approach and analyze the results. We also show
that state-of-the-art convolutional neural network features can be integrated
in our framework to further improve the recognition performance
- …