9,756 research outputs found

    OTS: A One-shot Learning Approach for Text Spotting in Historical Manuscripts

    Full text link
    Historical manuscript processing poses challenges like limited annotated training data and novel class emergence. To address this, we propose a novel One-shot learning-based Text Spotting (OTS) approach that accurately and reliably spots novel characters with just one annotated support sample. Drawing inspiration from cognitive research, we introduce a spatial alignment module that finds, focuses on, and learns the most discriminative spatial regions in the query image based on one support image. Especially, since the low-resource spotting task often faces the problem of example imbalance, we propose a novel loss function called torus loss which can make the embedding space of distance metric more discriminative. Our approach is highly efficient and requires only a few training samples while exhibiting the remarkable ability to handle novel characters, and symbols. To enhance dataset diversity, a new manuscript dataset that contains the ancient Dongba hieroglyphics (DBH) is created. We conduct experiments on publicly available VML-HD, TKH, NC datasets, and the new proposed DBH dataset. The experimental results demonstrate that OTS outperforms the state-of-the-art methods in one-shot text spotting. Overall, our proposed method offers promising applications in the field of text spotting in historical manuscripts

    Efficient few-shot learning for pixel-precise handwritten document layout analysis

    Get PDF
    Layout analysis is a task of uttermost importance in ancient handwritten document analysis and represents a fundamental step toward the simplification of subsequent tasks such as optical character recognition and automatic transcription. However, many of the approaches adopted to solve this problem rely on a fully supervised learning paradigm. While these systems achieve very good performance on this task, the drawback is that pixel-precise text labeling of the entire training set is a very time-consuming process, which makes this type of information rarely available in a real-world scenario. In the present paper, we address this problem by proposing an efficient few-shot learning framework that achieves performances comparable to current state-of-the-art fully supervised methods on the publicly available DIVA-HisDB dataset.Comment: Accepted for publication at IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 202

    Query by String word spotting based on character bi-gram indexing

    Full text link
    In this paper we propose a segmentation-free query by string word spotting method. Both the documents and query strings are encoded using a recently proposed word representa- tion that projects images and strings into a common atribute space based on a pyramidal histogram of characters(PHOC). These attribute models are learned using linear SVMs over the Fisher Vector representation of the images along with the PHOC labels of the corresponding strings. In order to search through the whole page, document regions are indexed per character bi- gram using a similar attribute representation. On top of that, we propose an integral image representation of the document using a simplified version of the attribute model for efficient computation. Finally we introduce a re-ranking step in order to boost retrieval performance. We show state-of-the-art results for segmentation-free query by string word spotting in single-writer and multi-writer standard datasetsComment: To be published in ICDAR201

    Web-based scientific exploration and analysis of 3D scanned cuneiform datasets for collaborative research

    Get PDF
    The three-dimensional cuneiform script is one of the oldest known writing systems and a central object of research in Ancient Near Eastern Studies and Hittitology. An important step towards the understanding of the cuneiform script is the provision of opportunities and tools for joint analysis. This paper presents an approach that contributes to this challenge: a collaborative compatible web-based scientific exploration and analysis of 3D scanned cuneiform fragments. The WebGL -based concept incorporates methods for compressed web-based content delivery of large 3D datasets and high quality visualization. To maximize accessibility and to promote acceptance of 3D techniques in the field of Hittitology, the introduced concept is integrated into the Hethitologie-Portal Mainz, an established leading online research resource in the field of Hittitology, which until now exclusively included 2D content. The paper shows that increasing the availability of 3D scanned archaeological data through a web-based interface can provide significant scientific value while at the same time finding a trade-off between copyright induced restrictions and scientific usability

    Automatic handwriter identification using advanced machine learning

    Get PDF
    Handwriter identification a challenging problem especially for forensic investigation. This topic has received significant attention from the research community and several handwriter identification systems were developed for various applications including forensic science, document analysis and investigation of the historical documents. This work is part of an investigation to develop new tools and methods for Arabic palaeography, which is is the study of handwritten material, particularly ancient manuscripts with missing writers, dates, and/or places. In particular, the main aim of this research project is to investigate and develop new techniques and algorithms for the classification and analysis of ancient handwritten documents to support palaeographic studies. Three contributions were proposed in this research. The first is concerned with the development of a text line extraction algorithm on colour and greyscale historical manuscripts. The idea uses a modified bilateral filtering approach to adaptively smooth the images while still preserving the edges through a nonlinear combination of neighboring image values. The proposed algorithm aims to compute a median and a separating seam and has been validated to deal with both greyscale and colour historical documents using different datasets. The results obtained suggest that our proposed technique yields attractive results when compared against a few similar algorithms. The second contribution proposes to deploy a combination of Oriented Basic Image features and the concept of graphemes codebook in order to improve the recognition performances. The proposed algorithm is capable to effectively extract the most distinguishing handwriter’s patterns. The idea consists of judiciously combining a multiscale feature extraction with the concept of grapheme to allow for the extraction of several discriminating features such as handwriting curvature, direction, wrinkliness and various edge-based features. The technique was validated for identifying handwriters using both Arabic and English writings captured as scanned images using the IAM dataset for English handwriting and ICFHR 2012 dataset for Arabic handwriting. The results obtained clearly demonstrate the effectiveness of the proposed method when compared against some similar techniques. The third contribution is concerned with an offline handwriter identification approach based on the convolutional neural network technology. At the first stage, the Alex-Net architecture was employed to learn image features (handwritten scripts) and the features obtained from the fully connected layers of the model. Then, a Support vector machine classifier is deployed to classify the writing styles of the various handwriters. In this way, the test scripts can be classified by the CNN training model for further classification. The proposed approach was evaluated based on Arabic Historical datasets; Islamic Heritage Project (IHP) and Qatar National Library (QNL). The obtained results demonstrated that the proposed model achieved superior performances when compared to some similar method
    • …
    corecore