29 research outputs found
TEXT CONTENT DEPENDENT WRITER IDENTIFICATION
A text-content-based personal identification system is vital for identifying the writer of an unknown document using a
set of handwritten samples from alleged known writers. Text written on a paper document is usually captured as an image by a scanner
or camera for computer processing. The most challenging problem encountered in text image processing is the extraction of a robust
feature vector from a set of inconsistent handwritten text images obtained from the same writer at different times. In this work, a new
feature extraction method is used to produce active text features for developing an effective personal identification system.
The features form a feature vector, which is fed as input to a classification algorithm based on a Support Vector Machine
(SVM). An experiment was conducted to identify the writers of query handwritten texts. The results show satisfactory performance of the
proposed system: it was able to identify the writers of the query handwritten texts.
WRITER IDENTIFICATION BY TEXTURE ANALYSIS BASED ON KANNADA HANDWRITING
Writer identification is an important and challenging area of research because of its many applications. Most research on writer identification is based on handwritten English documents, using both text-independent and text-dependent methods. However, there is no significant work on identifying writers from Kannada documents. Hence, in this paper, we propose a text-independent method for off-line writer identification based on Kannada handwritten scripts. Treating each individual’s handwriting as a different texture image, a set of features based on the Discrete Cosine Transform, Gabor filtering, and the gray level co-occurrence matrix is extracted from preprocessed document image blocks. Experimental results on handwriting from 20 people demonstrate that the Gabor energy features are more effective than the DCT- and GLCM-based features for writer identification.
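As a rough illustration of one of the texture descriptors named in this abstract, the sketch below builds a gray level co-occurrence matrix for a tiny image block and derives its contrast. The function names, the horizontal offset, and the 4-level quantisation are assumptions chosen for the example; the paper's actual block size, offsets, and GLCM statistics are not given here.

```python
def glcm(block, levels, dx=1, dy=0):
    """Count how often gray level i is followed by level j at offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(block), len(block[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[block[r][c]][block[r2][c2]] += 1
    return counts


def normalise(counts):
    """Turn raw co-occurrence counts into joint probabilities."""
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """GLCM contrast: large when neighbouring pixels differ strongly."""
    n = len(p)
    return sum(((i - j) ** 2) * p[i][j] for i in range(n) for j in range(n))


# Hypothetical 4x4 block already quantised to gray levels 0..3.
block = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(block, levels=4)
glcm_contrast = contrast(normalise(counts))
print(counts)
print(glcm_contrast)
```

In a texture-based pipeline, statistics like this one would be computed per block and concatenated into the feature vector passed to the classifier.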
Writer Identification of Arabic Handwritten Digits
This paper addresses the identification of Arabic handwritten digits. In addition to digit identifiability, the paper presents digit recognition. The digit image is divided into grids based on the distribution of the black pixels in the image. Several types of features are extracted (viz. gradient, curvature, density, horizontal and vertical run lengths, stroke, and concavity features) from the grid segments. K-Nearest Neighbor and Nearest Mean classifiers are used. A database of 70000 Arabic handwritten digit samples written by 700 writers is used in the analysis and experiments.
The identifiability of isolated and combined digits is tested. The analysis of the results indicates that Arabic digits 3 (٣), 4 (٤), 8 (٨), and 9 (٩) are more identifiable than other digits, while Arabic digits 0 (٠) and 1 (١) are the least identifiable. In addition, the paper shows that combining the writer’s digits increases the discriminative power of Arabic handwritten digits. Combining the features of all digits, K-NN provided the best accuracy in text-independent writer identification, with a top-1 result of 88.14%, a top-5 result of 94.81%, and a top-10 result of 96.48%.
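The top-1, top-5, and top-10 figures quoted here are rank-based accuracies. A minimal sketch of how such a score can be computed with a nearest-neighbour ranking follows; the two-dimensional features, the writer labels, and the function names are invented for illustration and are not the paper's data or code.

```python
import math


def rank_writers(query, gallery):
    """Order writer ids by the distance of their closest gallery sample."""
    best = {}
    for writer, feat in gallery:
        d = math.dist(query, feat)
        if writer not in best or d < best[writer]:
            best[writer] = d
    return sorted(best, key=best.get)


def top_k_accuracy(queries, gallery, k):
    """Fraction of queries whose true writer is among the k best-ranked writers."""
    hits = sum(
        1 for true_writer, feat in queries
        if true_writer in rank_writers(feat, gallery)[:k]
    )
    return hits / len(queries)


# Hypothetical gallery of (writer, feature vector) pairs.
gallery = [
    ("A", (0.0, 0.0)), ("A", (0.2, 0.1)),
    ("B", (1.0, 1.0)), ("B", (0.9, 1.2)),
    ("C", (2.0, 0.0)), ("C", (2.1, 0.3)),
]
queries = [("A", (0.1, 0.0)), ("B", (1.0, 0.9)), ("C", (1.2, 0.8))]

top1 = top_k_accuracy(queries, gallery, k=1)
top2 = top_k_accuracy(queries, gallery, k=2)
print(top1, top2)
```

The third query is closest to a sample from writer B, so it counts as a miss at top-1 but as a hit at top-2, which is why top-k scores can only grow as k increases.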
Offline Writer Identification Using Convolutional Neural Network Activation Features
Convolutional neural networks (CNNs) have recently become the
state-of-the-art tool for large-scale image classification. In this work we
propose the use of activation features from CNNs as local descriptors for
writer identification. A global descriptor is then formed by means of GMM
supervector encoding, which is further improved by normalization with the
KL-Kernel. We evaluate our method on two publicly available datasets: the ICDAR
2013 benchmark database and the CVL dataset. While we perform comparably to the
state of the art on CVL, our proposed method yields about 0.21 absolute
improvement in terms of mAP on the challenging bilingual ICDAR dataset.
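For readers unfamiliar with the mAP metric used in this abstract, the following sketch shows one common way to compute mean average precision over ranked retrieval lists. The writer labels and rankings are made up for the example, and benchmark protocols may differ in details such as whether the query itself is excluded from the ranking.

```python
def average_precision(ranked_labels, query_label):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits, precisions = 0, []
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(results):
    """Average the per-query AP over all (query_label, ranked_labels) pairs."""
    return sum(average_precision(r, q) for q, r in results) / len(results)


# Hypothetical retrieval output: the true writer of each query document and
# the writers of the retrieved documents, best match first.
results = [
    ("w1", ["w1", "w2", "w1", "w3"]),
    ("w2", ["w3", "w2", "w1", "w2"]),
]
map_score = mean_average_precision(results)
print(map_score)
```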
Construction and evaluation of classifiers for forensic document analysis
In this study we illustrate a statistical approach to questioned document
examination. Specifically, we consider the construction of three classifiers
that predict the writer of a sample document based on categorical data. To
evaluate these classifiers, we use a data set with a large number of writers
and a small number of writing samples per writer. Since the resulting
classifiers were found to have near perfect accuracy using leave-one-out
cross-validation, we propose a novel Bayesian-based cross-validation method for
evaluating the classifiers.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS379 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
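Leave-one-out cross-validation, mentioned in this abstract, can be sketched in a few lines. The classifier below is a plain 1-nearest-neighbour rule with Hamming distance on categorical features; it is a stand-in chosen for brevity and is not one of the three classifiers from the study, and the data are invented.

```python
def hamming(a, b):
    """Number of positions at which two categorical feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(train, sample):
    """1-nearest neighbour; on ties, min() keeps the earliest training item."""
    return min(train, key=lambda item: hamming(item[1], sample))[0]


def leave_one_out_accuracy(data):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i, (writer, sample) in enumerate(data):
        train = data[:i] + data[i + 1:]
        correct += predict(train, sample) == writer
    return correct / len(data)


# Hypothetical categorical descriptions of writing samples.
data = [
    ("w1", ("a", "x", "p")),
    ("w1", ("a", "x", "q")),
    ("w2", ("b", "y", "p")),
    ("w2", ("b", "y", "q")),
    ("w3", ("c", "z", "r")),
]
loo_accuracy = leave_one_out_accuracy(data)
print(loo_accuracy)
```

Writer w3 has only one sample, so holding it out leaves nothing of that writer to learn from and the prediction must be wrong. This is the kind of effect that a small number of samples per writer can have on a leave-one-out estimate.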
Deep Adaptive Learning for Writer Identification based on Single Handwritten Word Images
There are two types of information in each handwritten word image: explicit
information which can be easily read or derived directly, such as lexical
content or word length, and implicit attributes such as the author's identity.
Whether features learned by a neural network for one task can be used for
another task remains an open question. In this paper, we present a deep
adaptive learning method for writer identification based on single-word images
using multi-task learning. An auxiliary task is added to the training process
to enforce the emergence of reusable features. Our proposed method transfers
the benefits of the learned features of a convolutional neural network from an
auxiliary task such as explicit content recognition to the main task of writer
identification in a single procedure. Specifically, we propose a new adaptive
convolutional layer to exploit the learned deep features. A multi-task neural
network with one or several adaptive convolutional layers is trained
end-to-end, to exploit robust generic features for a specific main task, i.e.,
writer identification. Three auxiliary tasks, corresponding to three explicit
attributes of handwritten word images (lexical content, word length and
character attributes), are evaluated. Experimental results on two benchmark
datasets show that the proposed deep adaptive learning method can improve the
performance of writer identification based on single-word images, compared to
non-adaptive and simple linear-adaptive approaches.
Comment: Under view of Pattern Recognitio
Feature-extraction methods for historical manuscript dating based on writing style development
Paleographers and philologists perform significant research in finding the dates of ancient manuscripts to understand their historical contexts. The traditional process of estimating these dates using classical paleography is subjective, tedious, and often time-consuming. An automatic system based on pattern recognition techniques that infers these dates would be a valuable tool for scholars. In this study, the development of handwriting styles over time in the Dead Sea Scrolls, a collection of ancient manuscripts, is used to create a model that predicts the date of a query manuscript. To extract the handwriting styles, several dedicated feature-extraction techniques have been explored. Additionally, a self-organizing time map is used as a codebook. Support vector regression is used to estimate a date based on the feature vector of a manuscript. Date estimation with the grapheme-based technique outperforms the other feature-extraction techniques in identifying the chronological style development of handwriting in this study of the Dead Sea Scrolls.
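The codebook step in this abstract can be pictured as follows: each grapheme descriptor from a manuscript is assigned to its nearest codeword, and the normalised counts form the feature vector passed to the regressor. This is only a sketch under stated assumptions. The codewords here are hand-picked two-dimensional points, whereas the study trains a self-organizing time map, and all names are hypothetical.

```python
import math


def codebook_histogram(descriptors, codebook):
    """Normalised histogram of nearest-codeword assignments."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: math.dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical codewords and grapheme descriptors from one manuscript.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.1), (0.9, 0.2), (0.8, -0.1), (0.2, 0.9)]

feature_vector = codebook_histogram(descriptors, codebook)
print(feature_vector)
```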