
    Semi-supervised learning through adversary networks for baseline detection

    The aim of this paper is to propose a new strategy for the semantic segmentation of document images in order to extract baselines. Inspired by the work of Grüning [7], we used a convolutional model with residual layers enriched by an attention mechanism, called ARU-Net, together with a post-processing step to agglomerate predictions and data augmentation to enrich the database. Then, to strengthen the ARU-Net and help it explicitly model dependencies between feature maps, we added a "Squeeze-and-Excitation" module as proposed by Hu et al. [9]. Finally, to exploit the large amount of unlabeled data available, we used semi-supervised learning, based on ARU-Net, through the use of adversarial networks. Compared to Grüning's work, this approach shows interesting predictive qualities, with simpler processing and less task-specific error correction, and yields a clear performance improvement.
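    The Squeeze-and-Excitation module mentioned above can be summarized in a few lines: squeeze each feature map to a single statistic, pass the channel descriptor through a small bottleneck network, and rescale the maps by the resulting gates. A minimal NumPy sketch (the shapes, reduction ratio, and weight matrices are illustrative assumptions, not the paper's configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

def squeeze_excite(feature_maps, w1, w2):
    """Squeeze-and-Excitation: reweight channels by global context.

    feature_maps: (C, H, W); w1: (C//r, C); w2: (C, C//r),
    where r is the bottleneck reduction ratio.
    """
    # Squeeze: global average pooling -> one descriptor per channel
    z = feature_maps.mean(axis=(1, 2))               # (C,)
    # Excitation: bottleneck MLP, ReLU then sigmoid gate in (0, 1)
    s = np.maximum(w1 @ z, 0.0)                      # (C//r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ s)))           # (C,)
    # Scale: multiply each feature map by its channel gate
    return feature_maps * gate[:, None, None]

C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1   # hypothetical weights
w2 = rng.standard_normal((C, C // r)) * 0.1
y = squeeze_excite(x, w1, w2)                 # same shape as x
```

    In a real network the gates are learned end-to-end, so informative channels are amplified and weak ones suppressed before the next convolution.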

    Lesende Algorithmen: Projekt READ (Reading Algorithms: The READ Project)

    This is a preprint. The accepted article appeared in: Das Mittelalter, Vol. 24, Issue 1, pp. 224–227, ISSN (Online) 2196-6869, DOI: 10.1515/mial-2019-0016. The paper briefly introduces Transkribus and project READ, focusing on applications of interest to medievalists.

    Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks

    In this paper, we introduce a fully convolutional network for the document layout analysis task. While state-of-the-art methods use models pre-trained on natural scene images, our method, Doc-UFCN, relies on a U-shaped model trained from scratch to detect objects in historical documents. We treat the line segmentation task, and more generally the layout analysis problem, as a pixel-wise classification task; our model therefore outputs a pixel labeling of the input images. We show that Doc-UFCN outperforms state-of-the-art methods on various datasets and also demonstrate that parts pre-trained on natural scene images are not required to reach good results. In addition, we show that pre-training on multiple document datasets can improve performance. We evaluate the models using various metrics to provide a fair and complete comparison between the methods.
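    The pixel-wise formulation above means the network emits one score per class at every pixel, and the label map is obtained by taking the best-scoring class per pixel. A minimal sketch, using random scores in place of real network output (the class count and image size are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for the network's output: per-class scores for each
# pixel, shape (num_classes, H, W), e.g. background / text line / border.
num_classes, H, W = 3, 5, 5
logits = rng.standard_normal((num_classes, H, W))

# Pixel-wise classification: each pixel is assigned the class with
# the highest score, giving a label map the same size as the input.
label_map = logits.argmax(axis=0)          # (H, W), values in {0, 1, 2}
```

    Connected components of the "text line" class in such a label map are what a post-processing step would then group into line objects.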

    Recognizing Handwriting Styles in a Historical Scanned Document Using Unsupervised Fuzzy Clustering

    The forensic attribution of the handwriting in a digitized document to multiple scribes is a challenging, high-dimensional problem. Unique handwriting styles may differ in a blend of several factors, including character size, stroke width, loops, ductus, slant angles, and cursive ligatures. Previous work on labeled data with hidden Markov models, support vector machines, and semi-supervised recurrent neural networks has achieved moderate to high success. In this study, we successfully detect hand shifts in a historical manuscript through fuzzy soft clustering in combination with linear principal component analysis. This advance demonstrates the successful deployment of unsupervised methods for writer attribution of historical documents and forensic document analysis.
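    The combination described above, linear PCA followed by fuzzy soft clustering, can be sketched compactly: project high-dimensional handwriting features onto a few principal components, then assign each sample a soft membership in each cluster with fuzzy c-means. A minimal NumPy sketch on toy data (the feature dimensions, cluster count, and fuzzifier m are illustrative assumptions, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "handwriting feature" vectors: two hypothetical scribes with
# different stroke-width / slant statistics, in 6 dimensions.
a = rng.normal(0.0, 0.3, size=(30, 6))
b = rng.normal(2.0, 0.3, size=(30, 6))
X = np.vstack([a, b])

# Linear PCA via SVD: center, then project onto the top-2 components.
Xc = X - X.mean(axis=0)
_, _, vt = np.linalg.svd(Xc, full_matrices=False)
P = Xc @ vt[:2].T                                  # (60, 2)

def fuzzy_cmeans(P, c=2, m=2.0, iters=50):
    """Plain fuzzy c-means: soft membership of each point in c clusters."""
    n = len(P)
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)              # rows sum to 1
    for _ in range(iters):
        W = U ** m                                 # fuzzified weights
        centers = (W.T @ P) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(P[:, None, :] - centers[None], axis=2) + 1e-9
        U = 1.0 / (d ** (2 / (m - 1)))             # closer -> higher membership
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

U, centers = fuzzy_cmeans(P)
hard = U.argmax(axis=1)   # harden memberships to inspect the split
```

    A hand shift would then show up as a change in the dominant membership as one moves through the manuscript in reading order.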