14,012 research outputs found
WRITER IDENTIFICATION BY TEXTURE ANALYSIS BASED ON KANNADA HANDWRITING
Writer identification problem is one of the important area of research due to its various applications and is a challenging task. The major research on writer identification is based on handwritten English documents with text independent and dependent. However, there is no significant work on identification of writers based on Kannada document. Hence, in this paper, we propose a text-independent method for off-line writer identification based on Kannada handwritten scripts. By observing each individualâs handwriting as a different texture image, a set of features based on Discrete Cosine Transform, Gabor filtering and gray level co-occurrence matrix, are extracted from preprocessed document image blocks. Experimental results demonstrate that the Gabor energy features are more potential than the DCTs and GLCMs based features for writer identification from 20 people
Handwritten Document Analysis for Automatic Writer Recognition
In this paper, we show that both the writer identification and the writer verification tasks can be carried out using local features such as graphemes extracted from the segmentation of cursive handwriting. We thus enlarge the scope of the possible use of these two tasks which have been, up to now, mainly evaluated on script handwritings. A textual based Information Retrieval model is used for the writer identification stage. This allows the use of a particular feature space based on feature frequencies. Image queries are handwritten documents projected in this feature space. The approach achieves 95% correct identification on the PSI_DataBase and 86% on the IAM_DataBase. Then writer hypothesis retrieved are analysed during a verification phase. We call upon a mutual information criterion to verify that two documents may have been produced by the same writer or not. Hypothesis testing is used for this purpose. The proposed method is first scaled on the PSI_DataBase then evaluated on the IAM_DataBase. On both databases, similar performance of nearly 96% correct verification is reported, thus making the approach general and very promising for large scale applications in the domain of handwritten document querying and writer verification
Writer identification and verification in handwritten documents
In this communication we apply an Information Retrieval model for the writer identification task. Queries are
handwreitten document images projected on a suitable feature set. The handwritten document database is indexed
according to the vector space model originaly used for textual information. The approach uses both the image and
textual description of handwritten documents. Identified documents are then processed by the verification stage. We use
a mutual information criterion so as to verify that each identified document can have been written by the writer of the
query. Decision operates using an hypothesis test. The approcah is evaluated on two different database and proves to
be robust to the variability of handwriting. Perspectives are oriented towards the use of large handwritten document
databaseDans cette communication, nous appliquons un modĂšle de recherche dâinformation pour la tĂąche
dâidentification du scripteur. Les requĂȘtes sont des images de documents qui sont tout dâabord projetĂ©es dans
un espace de caractéristiques. La base de documents manuscrits est indexée selon le principe du modÚle
vectoriel de recherche dâinformation textuelle. Lâapproche exploite donc Ă la fois la reprĂ©sentation mixte image
et textuelle spĂ©cifique dâun document manuscrit. Les documents identifiĂ©s Ă lâissue de cette Ă©tape font
ensuite lâobjet dâune analyse complĂ©mentaire pour vĂ©rifier les hypothĂšses Ă©mises. Nous proposons dâutiliser
un critĂšre dâinformation mutuelle pour vĂ©rifier que chacun des documents identifiĂ©s peut avoir Ă©tĂ© produit par
le scripteur de la requĂȘte. Nous utilisons un test dâhypothĂšse Ă cet effet. Lâapproche est testĂ©e sur deux bases
dâĂ©critures diffĂ©rentes et montre une grande robustesse aux diffĂ©rentes Ă©critures. Lâapproche semble donc
trĂšs intĂ©ressante pour des applications Ă plus grande Ă©chelle nĂ©cessitant dâinterroger des bases de
documents manuscrits
Analysis of texture and connected-component contours for the automatic identification of writers
Recent advances in "off-line" writer identification allow for new applications in handwritten text retrieval from archives of scanned historical documents. This paper describes new algorithms for forensic or historical writer identification, using the contours of fragmented connected-components in free-style handwriting. The writer is considered to be characterized by a stochastic pattern generator, producing a family of character fragments (fraglets). Using a codebook of such fraglets from an independent training set, the probability distribution of fraglet contours was computed for an independent test set. Results revealed a high sensitivity of the fraglet histogram in identifying individual writers on the basis of a paragraph of text. Large-scale experiments on the optimal size of Kohonen maps of fraglet contours were performed, showing usable classification rates within a non-critical range of Kohonen map dimensions. The proposed automatic approach bridges the gap between image-statistics approaches and purely knowledge-based manual character-based methods
Writer identification using curvature-free features
Feature engineering takes a very important role in writer identification which has been widely studied in the literature. Previous works have shown that the joint feature distribution of two properties can improve the performance. The joint feature distribution makes feature relationships explicit instead of roping that a trained classifier picks up a non-linear relation present in the data. In this paper, we propose two novel and curvature-free features: run-lengths of local binary pattern (LBPruns) and cloud of line distribution (COLD) features for writer identification. The LBPruns is the joint distribution of the traditional run-length and local binary pattern (LBP) methods, which computes the run-lengths of local binary patterns on both binarized and gray scale images. The COLD feature is the joint distribution of the relation between orientation and length of line segments obtained from writing contours in handwritten documents. Our proposed LBPruns and COLD are textural-based curvature-free features and capture the line information of handwritten texts instead of the curvature information. The combination of the LBPruns and COLD features provides a significant improvement on the CERUG data set, handwritten documents on which contain a large number of irregular-curvature strokes. The results of proposed features evaluated on other two widely used data sets (Firemaker and IAM) demonstrate promising results
Deep Adaptive Learning for Writer Identification based on Single Handwritten Word Images
There are two types of information in each handwritten word image: explicit
information which can be easily read or derived directly, such as lexical
content or word length, and implicit attributes such as the author's identity.
Whether features learned by a neural network for one task can be used for
another task remains an open question. In this paper, we present a deep
adaptive learning method for writer identification based on single-word images
using multi-task learning. An auxiliary task is added to the training process
to enforce the emergence of reusable features. Our proposed method transfers
the benefits of the learned features of a convolutional neural network from an
auxiliary task such as explicit content recognition to the main task of writer
identification in a single procedure. Specifically, we propose a new adaptive
convolutional layer to exploit the learned deep features. A multi-task neural
network with one or several adaptive convolutional layers is trained
end-to-end, to exploit robust generic features for a specific main task, i.e.,
writer identification. Three auxiliary tasks, corresponding to three explicit
attributes of handwritten word images (lexical content, word length and
character attributes), are evaluated. Experimental results on two benchmark
datasets show that the proposed deep adaptive learning method can improve the
performance of writer identification based on single-word images, compared to
non-adaptive and simple linear-adaptive approaches.Comment: Under view of Pattern Recognitio
Novel geometric features for off-line writer identification
Writer identification is an important field in forensic document examination. Typically, a writer identification system consists of two main steps: feature extraction and matching and the performance depends significantly on the feature extraction step. In this paper, we propose a set of novel geometrical features that are able to characterize different writers. These features include direction, curvature, and tortuosity. We also propose an improvement of the edge-based directional and chain code-based features. The proposed methods are applicable to Arabic and English handwriting. We have also studied several methods for computing the distance between feature vectors when comparing two writers. Evaluation of the methods is performed using both the IAM handwriting database and the QUWI database for each individual feature reaching Top1 identification rates of 82 and 87 % in those two datasets, respectively. The accuracies achieved by Kernel Discriminant Analysis (KDA) are significantly higher than those observed before feature-level writer identification was implemented. The results demonstrate the effectiveness of the improved versions of both chain-code features and edge-based directional features
- âŠ