Semi-supervised learning through adversary networks for baseline detection
The aim of this paper is to propose a new strategy for the semantic segmentation of document images in order to extract baselines. Inspired by the work of Grüning [7], we use a convolutional model with residual layers enriched by an attention mechanism, called ARU-Net, together with a post-processing step that aggregates predictions and data augmentation that enriches the database. Then, to strengthen the ARU-Net and help it explicitly model dependencies between feature maps, we add a "Squeeze and Excitation" module as proposed by Hu et al. [9]. Finally, to exploit the amount of unlabeled data available, we apply semi-supervised learning, based on ARU-Net, through adversarial networks. Compared to Grüning's work, this approach shows interesting predictive qualities, with easier processing and less task-specific error correction. The resulting improvement in performance is a success.
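To make the "Squeeze and Excitation" idea concrete, here is a minimal NumPy sketch of channel recalibration in the spirit of Hu et al. It is not the authors' ARU-Net code: the function name `squeeze_excite`, the reduction ratio `r`, the random weights `w1` and `w2`, and the omission of bias terms are all assumptions made for illustration.

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Rescale the channels of x (C, H, W) with gates computed from global statistics.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); both are hypothetical
    stand-ins for learned weights, and biases are left out for brevity.
    """
    z = x.mean(axis=(1, 2))                # squeeze: global average pool -> (C,)
    h = np.maximum(w1 @ z, 0.0)            # bottleneck layer followed by ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))    # sigmoid gate per channel, in (0, 1)
    return x * s[:, None, None]            # excitation: rescale each feature map

rng = np.random.default_rng(0)
C, r = 8, 4                                # illustrative channel count and reduction ratio
x = rng.normal(size=(C, 5, 5))
w1 = rng.normal(size=(C // r, C))
w2 = rng.normal(size=(C, C // r))
y = squeeze_excite(x, w1, w2)
```

Because each gate lies strictly between 0 and 1, the block can only attenuate a feature map relative to the others, which is how it expresses dependencies between channels.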
Lesende Algorithmen: Projekt READ
This is a preprint. The accepted article appeared in: Das Mittelalter, Band 24, Heft 1, Seiten 224-227, ISSN (Online) 2196-6869, DOI: 10.1515/mial-2019-0016
The paper briefly introduces Transkribus and the READ project, focusing on applications of interest to medievalists.
Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks
In this paper, we introduce a fully convolutional network for the document
layout analysis task. While state-of-the-art methods are using models
pre-trained on natural scene images, our method Doc-UFCN relies on a U-shaped
model trained from scratch for detecting objects from historical documents. We
consider the line segmentation task, and more generally the layout analysis
problem, as a pixel-wise classification task, so our model outputs a
pixel-labeling of the input images. We show that Doc-UFCN outperforms
state-of-the-art methods on various datasets and also demonstrate that
parts pre-trained on natural scene images are not required to reach good
results. In addition, we show that pre-training on multiple document datasets
can improve performance. We evaluate the models using various metrics to
ensure a fair and complete comparison between the methods.
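Treating layout analysis as pixel-wise classification means that a network emits one score per class for every pixel, and the label map is the per-pixel argmax. The sketch below shows only that final step on random scores. It is not Doc-UFCN itself, and the function name `pixel_labels`, the class count `K`, and the image size are assumptions chosen for illustration.

```python
import numpy as np

def pixel_labels(logits):
    """Turn per-class score maps (K, H, W) into a label map (H, W) and probabilities."""
    shifted = logits - logits.max(axis=0, keepdims=True)   # stabilise the softmax
    e = np.exp(shifted)
    probs = e / e.sum(axis=0, keepdims=True)               # softmax over the class axis
    return probs.argmax(axis=0), probs

rng = np.random.default_rng(0)
K, H, W = 3, 4, 6                   # hypothetical: e.g. background, text line, other
logits = rng.normal(size=(K, H, W))
labels, probs = pixel_labels(logits)
```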
Recognizing Handwriting Styles in a Historical Scanned Document Using Unsupervised Fuzzy Clustering
The forensic attribution of the handwriting in a digitized document to
multiple scribes is a challenging problem of high dimensionality. Unique
handwriting styles may be dissimilar in a blend of several factors including
character size, stroke width, loops, ductus, slant angles, and cursive
ligatures. Previous work on labeled data with Hidden Markov models, support
vector machines, and semi-supervised recurrent neural networks has achieved
moderate to high success. In this study, we successfully detect hand shifts in
a historical manuscript through fuzzy soft clustering in combination with
linear principal component analysis. This advance demonstrates the successful
deployment of unsupervised methods for writer attribution of historical
documents and forensic document analysis.
Comment: 26 pages in total, 5 figures and 2 tables
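As a rough illustration of combining linear principal component analysis with fuzzy soft clustering, the sketch below projects feature vectors onto their leading components and then runs a standard fuzzy c-means loop. It is not the study's pipeline: the synthetic two-group data, the helper names `pca_project` and `fuzzy_c_means`, the fuzzifier `m = 2`, and the fixed iteration count are all assumptions.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def fuzzy_c_means(X, c, m=2.0, iters=100, seed=0):
    """Standard fuzzy c-means: returns soft memberships U (n, c) and cluster centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)                  # each row sums to 1
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]   # membership-weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)       # membership update
    return U, centers

# Synthetic stand-in for per-sample handwriting features from two hypothetical scribes.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, size=(30, 6)),
               rng.normal(3.0, 0.3, size=(30, 6))])
Z = pca_project(X, 2)
U, centers = fuzzy_c_means(Z, c=2)
labels = U.argmax(axis=1)          # hard assignment taken from the soft memberships
```

The soft memberships in `U` are what distinguish this from hard clustering: a sample written in a transitional style would receive comparable weight in both clusters instead of being forced into one.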