Sparse Radial Sampling LBP for Writer Identification
In this paper we present the use of Sparse Radial Sampling Local Binary
Patterns, a variant of Local Binary Patterns (LBP) for text-as-texture
classification. By adapting and extending the standard LBP operator to the
particularities of text we get a generic text-as-texture classification scheme
and apply it to writer identification. In experiments on CVL and ICDAR 2013
datasets, the proposed feature-set demonstrates State-Of-the-Art (SOA)
performance. Among the SOA, the proposed method is the only one that is based
on dense extraction of a single local feature descriptor. This makes it fast
and applicable at the earliest stages in a DIA pipeline without the need for
segmentation, binarization, or extraction of multiple features.
Comment: Submitted to the 13th International Conference on Document Analysis
and Recognition (ICDAR 2015)
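For orientation, the standard LBP operator that the abstract says it adapts can be sketched as follows. This is a minimal illustration of the basic 8-neighbour LBP only, not the Sparse Radial Sampling variant proposed in the paper; the function names `lbp_code` and `lbp_histogram`, the neighbour ordering, and the `>=` comparison convention are assumptions made for this example.

```python
def lbp_code(image, r, c):
    """Basic 8-neighbour LBP code of pixel (r, c) in a 2-D list of intensities.

    Each neighbour that is at least as bright as the centre sets one bit.
    The neighbour ordering (clockwise from top-left) is an arbitrary choice.
    """
    center = image[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

The histogram of codes over an image region serves as a texture descriptor; dense extraction, as described in the abstract, means computing it over every pixel rather than at selected keypoints.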
Novel geometric features for off-line writer identification
Writer identification is an important field in forensic document examination. Typically, a writer identification system consists of two main steps, feature extraction and matching, and its performance depends significantly on the feature extraction step. In this paper, we propose a set of novel geometrical features that are able to characterize different writers. These features include direction, curvature, and tortuosity. We also propose an improvement of the edge-based directional and chain code-based features. The proposed methods are applicable to Arabic and English handwriting. We have also studied several methods for computing the distance between feature vectors when comparing two writers. Evaluation of the methods is performed using both the IAM handwriting database and the QUWI database for each individual feature, reaching Top-1 identification rates of 82% and 87% on those two datasets, respectively. The accuracies achieved by Kernel Discriminant Analysis (KDA) are significantly higher than those observed before feature-level writer identification was implemented. The results demonstrate the effectiveness of the improved versions of both chain-code features and edge-based directional features.
Print-Scan Resilient Text Image Watermarking Based on Stroke Direction Modulation for Chinese Document Authentication
Print-scan resilient watermarking has emerged as an attractive approach to document security. This paper proposes a stroke direction modulation technique for watermarking in Chinese text images. The resulting watermark is robust to print-photocopy-scan, yet provides relatively high embedding capacity without losing transparency. During the embedding phase, the angles of rotatable strokes are quantized to embed the bits. This requires several stages of preprocessing, including stroke generation, junction searching, rotatable stroke decision and character partition. Moreover, shuffling is applied to equalize the uneven embedding capacity. For data detection, denoising and deskewing mechanisms are used to compensate for the distortions induced by hardcopy. Experimental results show that our technique attains high detection accuracy against distortions resulting from print-scan operations, good quality photocopies and benign attacks, in accord with the future goal of soft authentication.
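The idea of quantizing a stroke angle to carry one bit can be illustrated with a minimal sketch. It assumes plain quantization index modulation with a hypothetical step size `delta` in degrees; the abstract does not specify the paper's actual quantizer, and the names `embed_bit` and `detect_bit` are chosen for this example only.

```python
def embed_bit(angle_deg, bit, delta=8.0):
    """Snap an angle to the nearest lattice point that encodes `bit`.

    Bit 0 uses multiples of delta; bit 1 uses the same lattice shifted
    by delta / 2. Both delta and the two-lattice scheme are assumptions.
    """
    offset = bit * delta / 2.0
    return round((angle_deg - offset) / delta) * delta + offset


def detect_bit(angle_deg, delta=8.0):
    """Recover the bit whose lattice lies closest to the observed angle."""
    d0 = abs(angle_deg - embed_bit(angle_deg, 0, delta))
    d1 = abs(angle_deg - embed_bit(angle_deg, 1, delta))
    return 0 if d0 <= d1 else 1
```

Under this scheme a bit survives as long as the distortion of the angle stays below `delta / 4`, which shows the trade-off between robustness (larger `delta`) and transparency (smaller rotation of the stroke).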
Deep Adaptive Learning for Writer Identification based on Single Handwritten Word Images
There are two types of information in each handwritten word image: explicit
information which can be easily read or derived directly, such as lexical
content or word length, and implicit attributes such as the author's identity.
Whether features learned by a neural network for one task can be used for
another task remains an open question. In this paper, we present a deep
adaptive learning method for writer identification based on single-word images
using multi-task learning. An auxiliary task is added to the training process
to enforce the emergence of reusable features. Our proposed method transfers
the benefits of the learned features of a convolutional neural network from an
auxiliary task such as explicit content recognition to the main task of writer
identification in a single procedure. Specifically, we propose a new adaptive
convolutional layer to exploit the learned deep features. A multi-task neural
network with one or several adaptive convolutional layers is trained
end-to-end, to exploit robust generic features for a specific main task, i.e.,
writer identification. Three auxiliary tasks, corresponding to three explicit
attributes of handwritten word images (lexical content, word length and
character attributes), are evaluated. Experimental results on two benchmark
datasets show that the proposed deep adaptive learning method can improve the
performance of writer identification based on single-word images, compared to
non-adaptive and simple linear-adaptive approaches.
Comment: Under view of Pattern Recognition
Interactive Transcription of Old Text Documents
Nowadays, there are huge collections of handwritten text documents in libraries
all over the world. The high demand for these resources has led to the creation
of digital libraries in order to facilitate the preservation and provide electronic
access to these documents. However, text transcriptions of these document
images are not always available to allow users to quickly search information, or
computers to process the information, search patterns or draw out statistics.
The problem is that manual transcription of these documents is an expensive
task in terms of both cost and time. This thesis presents a novel
approach for efficient Computer Assisted Transcription (CAT) of handwritten text
documents using state-of-the-art Handwriting Text Recognition (HTR) systems.
The objective of CAT approaches is to efficiently complete a transcription
task through human-machine collaboration, as the effort required to generate a
manual transcription is high, and automatically generated transcriptions from
state-of-the-art systems still do not reach the accuracy required. This thesis
is centered on a special application of CAT, that is, the transcription of old
text documents when the quantity of user effort available is limited, and thus,
the entire document cannot be revised. In this approach, the objective is to
generate the best possible transcription by means of the user effort available.
This thesis provides a comprehensive view of the CAT process from feature
extraction to user interaction.
First, a statistical approach to generalise interactive transcription is
proposed. As its direct application is unfeasible, some assumptions are made to
apply it to two different tasks. First, on the interactive transcription of
handwritten text documents, and next, on the interactive detection of the document
layout.
Next, the digitisation and annotation process of two real old text documents
is described. This process was carried out because of the scarcity of similar
resources and the need for annotated data to thoroughly test all the developed
tools and techniques in this thesis. These two documents were carefully selected
to represent the general difficulties that are encountered when dealing with
HTR. Baseline results are presented on these two documents to establish a
benchmark with a standard HTR system. Finally, these annotated documents
were made freely available to the community. It must be noted that all the
techniques and methods developed in this thesis have been assessed on these
two real old text documents.
Then, a CAT approach for HTR when user effort is limited is studied and
extensively tested. The ultimate goal of applying CAT is achieved by putting
together three processes. Given a recognised transcription from an HTR system,
the first process consists in locating (possibly) incorrect words and employs the
user effort available to supervise them (if necessary). As most words are not
expected to be supervised due to the limited user effort available, only a few are
selected to be revised. The system presents to the user a small subset of these
words according to an estimation of their correctness, or to be more precise,
according to their confidence level. Next, the second process starts once these
low confidence words have been supervised. This process updates the recognition
of the document taking user corrections into consideration, which improves
the quality of those words that were not revised by the user. Finally, the last
process adapts the system from the partially revised (and possibly not perfect)
transcription obtained so far. In this adaptation, the system intelligently selects
the correct words of the transcription. As a result, the adapted system will
better recognise future transcriptions. Transcription experiments using this CAT
approach show that this approach is mostly effective when user effort is low.
The last contribution of this thesis is a method for balancing the final
transcription quality and the supervision effort applied using our previously
described CAT approach. In other words, this method allows the user to control
the amount of errors in the transcriptions obtained from a CAT approach. The
motivation of this method is to let users decide on the final quality of the desired
documents, as partially erroneous transcriptions can be sufficient to convey the
meaning, and the user effort required to transcribe them might be significantly
lower when compared to obtaining a totally manual transcription. Consequently,
the system estimates the minimum user effort required to reach the amount of
error defined by the user. Error estimation is performed by computing
separately the error produced by each recognised word, and thus, asking the user to
only revise the ones in which most errors occur.
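The effort estimation described in this paragraph can be sketched roughly as follows. This is an illustrative simplification, not the estimator developed in the thesis: it assumes each recognised word comes with a confidence that can be read as its probability of being correct, that a supervised word becomes error-free, and that the target is an expected word error rate. The function name `words_to_supervise` is hypothetical.

```python
def words_to_supervise(confidences, max_error_rate):
    """Return indices of the words to revise, least confident first.

    Words are added until the expected error rate of the remaining,
    unsupervised words drops to max_error_rate or below.
    Assumption: expected error of a word = 1 - confidence.
    """
    n = len(confidences)
    if n == 0:
        return []
    # Visit words from lowest to highest confidence.
    order = sorted(range(n), key=lambda i: confidences[i])
    expected_errors = sum(1.0 - c for c in confidences)
    selected = []
    for i in order:
        if expected_errors / n <= max_error_rate:
            break
        # Supervising word i is assumed to remove its expected error.
        expected_errors -= 1.0 - confidences[i]
        selected.append(i)
    return selected
```

The length of the returned list is the estimated minimum number of supervisions under these assumptions; a looser error target yields a shorter list and therefore less user effort.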
Additionally, an interactive prototype is presented, which integrates most
of the interactive techniques presented in this thesis. This prototype has been
developed to be used by palaeographic experts, who do not have any background
in HTR technologies. After slight fine-tuning by an HTR expert, the prototype
lets transcribers manually annotate the document or employ the CAT
approach presented. All automatic operations, such as recognition, are performed
in the background, detaching the transcriber from the details of the system. The
prototype was assessed by an expert transcriber and was shown to be adequate and
efficient for its purpose. The prototype is freely available under the GNU General
Public Licence (GPL).
Serrano Martínez-Santos, N. (2014). Interactive Transcription of Old Text Documents [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/37979