3 research outputs found
A hypothesize-and-verify framework for Text Recognition using Deep Recurrent Neural Networks
Deep LSTM is an ideal candidate for text recognition. However text
recognition involves some initial image processing steps like segmentation of
lines and words which can induce error to the recognition system. Without
segmentation, learning very long range context is difficult and becomes
computationally intractable. Therefore, alternative soft decisions are needed
at the pre-processing level. This paper proposes a hybrid text recognizer using
a deep recurrent neural network with multiple layers of abstraction and long
range context along with a language model to verify the performance of the deep
neural network. In this paper we construct a multi-hypotheses tree architecture
with candidate segments of line sequences from different segmentation
algorithms at its different branches. The deep neural network is trained on
perfectly segmented data and tests each of the candidate segments, generating
unicode sequences. In the verification step, these unicode sequences are
validated using a sub-string match with the language model and best first
search is used to find the best possible combination of alternative hypothesis
from the tree structure. Thus the verification framework using language models
eliminates wrong segmentation outputs and filters recognition errors
A Novel Deep Convolutional Neural Network Architecture Based on Transfer Learning for Handwritten Urdu Character Recognition
Deep convolutional neural networks (CNN) have made a huge impact on computer vision and set the state-of-the-art in providing extremely definite classification results. For character recognition, where the training images are usually inadequate, mostly transfer learning of pre-trained CNN is often utilized. In this paper, we propose a novel deep convolutional neural network for handwritten Urdu character recognition by transfer learning three pre-trained CNN models. We fine-tuned the layers of these pre-trained CNNs so as to extract features considering both global and local details of the Urdu character structure. The extracted features from the three CNN models are concatenated to train with two fully connected layers for classification. The experiment is conducted on UNHD, EMILLE, DBAHCL, and CDB/Farsi dataset, and we achieve 97.18% average recognition accuracy which outperforms the individual CNNs and numerous conventional classification methods