
81 research outputs found

A hypothesize-and-verify framework for Text Recognition using Deep Recurrent Neural Networks

Author: Chaudhury Santanu
Rajeswar Sai
Ray Anupama
Publication date: 26/02/2015

Deep LSTM is an ideal candidate for text recognition. However, text recognition involves initial image processing steps, such as segmentation of lines and words, which can introduce errors into the recognition system. Without segmentation, learning very long-range context is difficult and becomes computationally intractable. Therefore, alternative soft decisions are needed at the pre-processing level. This paper proposes a hybrid text recognizer that uses a deep recurrent neural network with multiple layers of abstraction and long-range context, along with a language model to verify the performance of the deep neural network. We construct a multi-hypothesis tree architecture whose branches hold candidate segments of line sequences from different segmentation algorithms. The deep neural network is trained on perfectly segmented data and tests each of the candidate segments, generating Unicode sequences. In the verification step, these Unicode sequences are validated using a sub-string match with the language model, and best-first search is used to find the best possible combination of alternative hypotheses from the tree structure. Thus the verification framework using language models eliminates wrong segmentation outputs and filters recognition errors.
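The verification step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy lexicon standing in for the language model, and the whole-word lookup used in place of the paper's sub-string match are all assumptions made for illustration. Each line contributes one candidate transcription per segmentation algorithm, and a best-first search expands the cheapest partial combination first.

```python
import heapq

def mismatch_cost(text, lexicon):
    """Hypothetical verification score: number of words missing from the lexicon."""
    return sum(1 for word in text.split() if word not in lexicon)

def best_first_select(candidates_per_line, lexicon):
    """Pick one candidate per line so that the total mismatch cost is minimal.

    candidates_per_line[i] lists the recognizer outputs for line i,
    one per segmentation hypothesis (an assumed data layout).
    """
    # Heap entries are (cost so far, tuple of chosen transcriptions).
    heap = [(0, ())]
    while heap:
        cost, chosen = heapq.heappop(heap)
        depth = len(chosen)
        if depth == len(candidates_per_line):
            # Costs are non-negative, so the first complete path popped is optimal.
            return list(chosen), cost
        for text in candidates_per_line[depth]:
            step = mismatch_cost(text, lexicon)
            heapq.heappush(heap, (cost + step, chosen + (text,)))
    raise ValueError("a line has no candidate hypotheses")

lexicon = {"deep", "networks", "read", "text"}
candidates = [
    ["deep net works", "deep networks"],  # two segmentations of line 1
    ["re ad text", "read text"],          # two segmentations of line 2
]
best, cost = best_first_select(candidates, lexicon)
print(best, cost)
```

In this toy setting the per-line costs are independent, so the search reduces to picking the best candidate for each line; a priority queue becomes more useful once the score depends on neighbouring choices.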

arXiv.org e-Print Archive

Crossref

Optical recognition of English fonts in document images using eigenfaces

Author: Al-Khaffaf Hasan S. M.
Musa Nadia A.
Publication venue: 'Universidad de Santander - UDES'
Publication date: 28/12/2018

Introduction: In this paper, a system for recognizing fonts has been designed and implemented. The system is based on the Eigenfaces method. Because font recognition works in conjunction with other methods like Optical Character Recognition (OCR), we used Decapod and OCRopus software as a framework to present the method. Materials and Methods: In our experiments, text typeset in three English fonts (Comic Sans MS, DejaVu Sans Condensed, Times New Roman) was used. Results and Discussion: The system is tested thoroughly using synthetic and degraded data. The experimental results show that the Eigenfaces algorithm is very good at recognizing fonts in clean synthetic data as well as in degraded data. The correct recognition rate of Eigenfaces on synthetic data is 99% based on Euclidean distance. The overall accuracy of Eigenfaces is 97% on 6144 degraded samples, using Euclidean distance as the performance criterion. Conclusions: The experimental results indicate that the Eigenfaces method is suitable for font recognition in degraded documents. The remaining three percent of incorrect classifications can be mitigated by relying on intra-word font information.

Revistas UDES (Universidad de Santander)

Semi-supervised spectral clustering with automatic propagation of pairwise constraints

Author: Benoit Alexandre
Filip Andrei
Ionescu Bogdan
Lambert Patrick
Voiron Nicolas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/06/2015

In our data-driven world, clustering is of major importance in helping end-users and decision makers understand information structures. Supervised learning techniques rely on ground truth to perform the classification and are usually subject to overtraining issues. On the other hand, unsupervised clustering techniques study the structure of the data without having any training data available. Given the difficulty of the task, unsupervised learning tends to provide inferior results to supervised learning. To boost their performance, a compromise is to use learning only for some of the ambiguous classes. In this context, this paper studies the impact of pairwise constraints on unsupervised Spectral Clustering. We introduce a new generalization of constraint propagation which maximizes partitioning quality while reducing annotation costs. Experiments show the efficiency of the proposed scheme.
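To make the idea of propagating pairwise constraints concrete, here is a minimal sketch of the classic transitive-closure propagation of must-link and cannot-link constraints. The abstract does not specify the paper's generalization, so this is a stand-in technique, and the function name and data layout are assumptions made for illustration.

```python
from itertools import combinations

def propagate_constraints(n_points, must_link, cannot_link):
    """Expand annotated pairs into every constraint they imply.

    must_link / cannot_link are lists of (i, j) index pairs (assumed layout).
    Returns two sets of (i, j) pairs with i < j.
    """
    parent = list(range(n_points))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Must-link is transitive: merge the linked points into groups.
    for a, b in must_link:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(n_points):
        groups.setdefault(find(i), []).append(i)

    must = set()
    for members in groups.values():
        must.update(combinations(members, 2))

    # A cannot-link between two points separates their whole groups.
    cannot = set()
    for a, b in cannot_link:
        ra, rb = find(a), find(b)
        if ra == rb:
            raise ValueError(f"contradictory constraints on points {a} and {b}")
        for i in groups[ra]:
            for j in groups[rb]:
                cannot.add((min(i, j), max(i, j)))
    return must, cannot

must, cannot = propagate_constraints(5, [(0, 1), (1, 2)], [(2, 3)])
print(sorted(must), sorted(cannot))
```

In this example three annotated pairs yield six pairwise constraints, which is the sense in which propagation can reduce annotation cost. The resulting pairs could then be used to adjust the affinity matrix before spectral clustering.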

Crossref

Hal - Université Grenoble Alpes

HAL Université de Savoie

Variable Selection and Feature Extraction Through Artificial Intelligence Techniques

Author: Cateni Silvia
Colla Valentina
Vannocci Marco
Vannucci Marco
Publication venue: 'IntechOpen'
Publication date: 01/01/2013

IntechOpen

Crossref

Archivio della ricerca della Scuola Superiore Sant'Anna

Benchmark Classification of Handwritten Dataset by New Operator

Author: Rajiv Ranjan, Mohit Vats, Sachin Jain
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/12/2015

In recent years, many new classifiers and feature extraction algorithms have been proposed and tested on various OCR databases, and these techniques have been used in a wide range of applications. Various systematic papers and inventions in OCR have been reported in the literature. OCR is one of the most important and active research areas in pattern recognition. Today, OCR research deals with a diverse range of complex problems. Important research topics in OCR include degraded text (heavy noise) and the analysis and recognition of complex documents (including texts, images, graphs, tables and video documents). In the proposed system we use a new operator. Recognition of handwritten Devnagari characters is one of the biggest problems at present: Devnagari characters are not recognized efficiently and accurately by electronic devices. Many researchers have proposed algorithms for recognizing characters. Character recognition requires many processing steps, and no single technique or algorithm can perform the whole recognition and give a more accurate result. The objective of this dissertation work is to propose a new operator, named the Kirsch operator, and an algorithm for obtaining accurate results.
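For readers unfamiliar with the Kirsch operator named in this abstract, the following is a minimal sketch of the standard Kirsch compass edge detector, which takes the maximum response over eight rotated 3x3 kernels. How the authors apply it to Devnagari characters is not described here, so the function names and the plain list-of-lists image format are assumptions made for illustration.

```python
# Border positions of a 3x3 kernel, listed clockwise from the top-left corner.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def kirsch_kernels():
    """Build the eight compass kernels by rotating the border weights."""
    base = [5, 5, 5, -3, -3, -3, -3, -3]
    kernels = []
    for shift in range(8):
        kernel = [[0] * 3 for _ in range(3)]  # centre weight stays 0
        for i, (r, c) in enumerate(RING):
            kernel[r][c] = base[(i - shift) % 8]
        kernels.append(kernel)
    return kernels

def kirsch_edges(image):
    """Return the maximum compass response at each interior pixel.

    image is a list of equal-length rows of grey levels; border pixels
    are left at 0 because their 3x3 neighbourhood is incomplete.
    """
    height, width = len(image), len(image[0])
    kernels = kirsch_kernels()
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            out[y][x] = max(
                sum(
                    k[dy][dx] * image[y + dy - 1][x + dx - 1]
                    for dy in range(3)
                    for dx in range(3)
                )
                for k in kernels
            )
    return out

# A vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 10, 10] for _ in range(4)]
edges = kirsch_edges(step)
print(edges[1][1])
```

Each kernel sums to zero, so flat regions give a zero response and only intensity changes produce large values.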

International Journal on Recent and Innovation Trends in Computing and Communication

Deep Learning vs Spectral Clustering into an active clustering with pairwise constraints propagation

Author: Benoit Alexandre
Ionescu Bogdan
Lambert Patrick
Voiron Nicolas
Publication venue: HAL CCSD
Publication date: 15/06/2016

In our data-driven world, categorization is of major importance in helping end-users and decision makers understand information structures. Supervised learning techniques rely on annotated samples that are often difficult to obtain, and training often overfits. On the other hand, unsupervised clustering techniques study the structure of the data without having any training data available. Given the difficulty of the task, supervised learning often outperforms unsupervised learning. A compromise is to use partial knowledge, selected in a smart way, in order to boost performance while minimizing learning costs; this is called semi-supervised learning. In such a use case, Spectral Clustering has proved to be an efficient method. Deep Learning has also outperformed several state-of-the-art classification approaches, and it is interesting to test it in our context. In this paper, we first introduce the concept of Deep Learning into an active semi-supervised clustering process and compare it with Spectral Clustering. Second, we introduce constraint propagation and demonstrate how it maximizes partitioning quality while reducing annotation costs. Experimental validation is conducted on two different real datasets. Results show the potential of the clustering methods.

Hal - Université Grenoble Alpes

HAL Université de Savoie

Disambiguating Visual Verbs

Author: Gella Spandana
Keller Frank
Lapata Mirella
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/02/2019

Edinburgh Research Explorer