Recognizing Handwriting Styles in a Historical Scanned Document Using Unsupervised Fuzzy Clustering
The forensic attribution of the handwriting in a digitized document to
multiple scribes is a challenging problem of high dimensionality. Unique
handwriting styles may be dissimilar in a blend of several factors including
character size, stroke width, loops, ductus, slant angles, and cursive
ligatures. Previous work on labeled data with Hidden Markov models, support
vector machines, and semi-supervised recurrent neural networks has achieved
moderate to high success. In this study, we successfully detect hand shifts in
a historical manuscript through fuzzy soft clustering in combination with
linear principal component analysis. This advance demonstrates the successful
deployment of unsupervised methods for writer attribution of historical
documents and forensic document analysis.
Comment: 26 pages in total, 5 figures and 2 tables
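As a rough illustration of the soft clustering idea in this abstract, the sketch below runs a plain fuzzy c-means over per-line feature vectors and flags the positions where the dominant cluster changes. It is a minimal sketch under stated assumptions, not the authors' pipeline: the feature values, the names `fuzzy_c_means` and `detect_hand_shifts`, and the choice of two clusters are all hypothetical, and the PCA step is omitted (the inputs are assumed to be already reduced to a few components).

```python
import math
import random


def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means; returns (centers, memberships).

    memberships[i][k] is the degree (0..1) to which point i belongs to
    cluster k; each row sums to 1. `m` > 1 is the fuzzifier.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        # Centers are means weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )
        # Memberships follow from relative distances to the centers.
        for i, p in enumerate(points):
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            u[i] = [
                1.0 / sum((dists[k] / dj) ** exponent for dj in dists)
                for k in range(n_clusters)
            ]
    return centers, u


def detect_hand_shifts(memberships):
    """Indices where the dominant cluster differs from the previous item."""
    labels = [max(range(len(row)), key=row.__getitem__) for row in memberships]
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Hypothetical features per text line, e.g. (stroke width, slant) after
# dimensionality reduction. Lines 0-2 and 3-5 mimic two different hands.
lines = [(1.0, 0.20), (1.1, 0.25), (0.9, 0.18),
         (3.0, 0.90), (3.1, 0.95), (2.9, 1.00)]
centers, memberships = fuzzy_c_means(lines)
shifts = detect_hand_shifts(memberships)
```

Because the memberships are soft, a line written in a transitional or ambiguous style would show up as a row with no clearly dominant value, which a hard clustering would hide.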
Leveraging Expert Models for Training Deep Neural Networks in Scarce Data Domains: Application to Offline Handwritten Signature Verification
This paper introduces a novel approach to leverage the knowledge of existing
expert models for training new Convolutional Neural Networks, on domains where
task-specific data are limited or unavailable. The presented scheme is applied
in offline handwritten signature verification (OffSV) which, akin to other
biometric applications, suffers from inherent data limitations due to
regulatory restrictions. The proposed Student-Teacher (S-T) configuration
utilizes feature-based knowledge distillation (FKD), combining graph-based
similarity for local activations with global similarity measures to supervise
the student's training, using only handwritten text data. Remarkably, the models
trained using this technique exhibit comparable, if not superior, performance
to the teacher model across three popular signature datasets. More importantly,
these results are attained without employing any signatures during the feature
extraction training process. This study demonstrates the efficacy of leveraging
existing expert models to overcome data scarcity challenges in OffSV and
potentially other related domains.
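To make the notion of a global similarity measure in feature-based distillation concrete, the sketch below scores how closely a student's feature vectors align with a teacher's, using cosine similarity. This is only an assumed stand-in: the abstract does not specify the exact loss, the graph-based term for local activations is not sketched here, and the names `cosine_similarity` and `global_feature_loss` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def global_feature_loss(student_batch, teacher_batch):
    """Mean (1 - cosine similarity) over paired student/teacher features.

    The loss is 0 when every student vector points in the same direction
    as its teacher vector, regardless of scale.
    """
    losses = [
        1.0 - cosine_similarity(s, t)
        for s, t in zip(student_batch, teacher_batch)
    ]
    return sum(losses) / len(losses)


# Hypothetical feature vectors for two handwritten-text images.
teacher = [[1.0, 0.0], [0.0, 2.0]]
aligned_student = [[2.0, 0.0], [0.0, 1.0]]     # same directions, other scale
orthogonal_student = [[0.0, 1.0], [1.0, 0.0]]  # unrelated directions

loss_aligned = global_feature_loss(aligned_student, teacher)
loss_orthogonal = global_feature_loss(orthogonal_student, teacher)
```

In a real Student-Teacher setup, a loss of this kind would be minimized by gradient descent over the student's weights while the teacher stays frozen.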
Learning features for offline handwritten signature verification
Handwritten signatures are the most socially and legally accepted means for identifying a person. Over the last few decades, several researchers have approached the problem of automating their recognition, using a variety of techniques from machine learning and pattern recognition. In particular, most of the research effort has been devoted to obtaining good feature representations for signatures, by designing new feature extractors, as well as experimenting with feature extractors developed for other purposes. To this end, researchers have used insights from graphology, computer vision, and signal processing, among other areas. In spite of the advancements in the field, building classifiers that can separate genuine signatures from skilled forgeries (forgeries made targeting a particular individual) is still an open research problem.
In this thesis, we propose to address this problem from another perspective, by learning the feature representations directly from signature images. The hypothesis is that, in the absence of a good model of the data generation process, it is better to learn the features from data. As a first contribution, we propose a method to learn Writer-Independent features using a surrogate objective, followed by training Writer-Dependent classifiers using the learned features. Furthermore, we define an extension that allows leveraging the knowledge of skilled forgeries (from a subset of users) in the feature learning process. We observed that such features generalize well to new users, obtaining state-of-the-art results on four widely used datasets in the literature.
As a second contribution, we investigate three issues of signature verification systems: (i) learning a fixed-sized vector representation for signatures of varied size; (ii) analyzing the impact of the resolution of the scanned signatures on system performance; and (iii) evaluating how features generalize to new operating conditions with and without fine-tuning. We propose methods to handle signatures of varied size, and our experiments show results comparable to the state of the art while removing the requirement that all input images have the same size.
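One common way to obtain a fixed-sized vector from inputs of varied size is spatial pyramid pooling; the abstract does not name the method used, so the sketch below is an assumption offered only to illustrate the idea. The function name `spatial_pyramid_max_pool` and the pyramid levels are hypothetical.

```python
def spatial_pyramid_max_pool(image, levels=(1, 2, 4)):
    """Max-pool a 2-D grid into a fixed-length vector.

    For each level n the grid is split into n x n bins and the maximum of
    each bin is kept, giving sum(n * n for n in levels) values in total,
    whatever the height and width of the input.
    """
    height, width = len(image), len(image[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end, so every bin
            # covers at least one row (bins may overlap on small inputs).
            r0 = (i * height) // n
            r1 = -(-((i + 1) * height) // n)
            for j in range(n):
                c0 = (j * width) // n
                c1 = -(-((j + 1) * width) // n)
                pooled.append(
                    max(image[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1))
                )
    return pooled


# Two hypothetical feature maps of different sizes (5x7 and 3x10).
small = [[r * 7 + c for c in range(7)] for r in range(5)]
wide = [[r * 10 + c for c in range(10)] for r in range(3)]

vec_small = spatial_pyramid_max_pool(small)
vec_wide = spatial_pyramid_max_pool(wide)
```

Both outputs have 1 + 4 + 16 = 21 entries, so a classifier with a fixed input dimension could consume either one.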
As a third contribution, we propose to formulate the problem of signature verification as a meta-learning problem. This formulation also learns directly from signature images and allows the direct optimization of the objective (separating genuine signatures and skilled forgeries), instead of relying on surrogate objectives for learning the features. Furthermore, we show that this method extends naturally to formulating the adaptation (training) for new users as one-class classification.
As a fourth contribution, we analyze the limitations of these systems in an Adversarial Machine Learning setting, where an active adversary attempts to disrupt the system. We characterize new threats posed by Adversarial Examples on a taxonomy of threats to biometric systems, and conduct extensive experiments to evaluate the success of attacks under different scenarios of attacker’s goals and knowledge of the system under attack. We observed that both systems that rely on handcrafted features and those that use learned features are susceptible to adversarial attacks in a wide range of scenarios, including partial-knowledge scenarios where the attacker does not have full access to the trained classifiers. While some defenses proposed in the literature increase the robustness of the systems, this research highlights the scenarios where such systems are still vulnerable.
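To illustrate what an adversarial example means for a verification system, the sketch below applies a single signed-gradient step (in the style of the fast gradient sign method) to a toy linear verifier. The abstract does not say which attacks were evaluated, so this is an assumed, simplified instance; the weights, feature values, and function names are hypothetical.

```python
def linear_score(weights, bias, features):
    """Verifier score; a score above 0 is treated as 'accept as genuine'."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def signed_gradient_step(weights, features, epsilon, increase_score=True):
    """Shift each feature by +/- epsilon along the sign of the gradient.

    For a linear score the gradient with respect to the input is simply
    the weight vector, so no automatic differentiation is needed here.
    """
    direction = 1 if increase_score else -1
    return [
        x + direction * epsilon * sign(w)
        for w, x in zip(weights, features)
    ]


# Hypothetical verifier and a forgery that is initially rejected.
weights, bias = [0.5, -1.0, 0.25], -0.2
forgery = [0.2, 0.4, 0.0]

adversarial = signed_gradient_step(weights, forgery, epsilon=0.5)
score_before = linear_score(weights, bias, forgery)
score_after = linear_score(weights, bias, adversarial)
```

Here the attacker is assumed to know the weights exactly (a full-knowledge scenario); in the partial-knowledge scenarios the abstract mentions, the gradient would have to come from a substitute model instead.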