36,152 research outputs found

    PapyRow: A Dataset of Row Images from Ancient Greek Papyri for Writers Identification

    Papyrology is the discipline that studies texts written on ancient papyri. An important problem faced by papyrologists and, more generally, by paleographers is to identify the writers, also known as scribes, who contributed to the drawing up of a manuscript. Traditionally, paleographers perform qualitative evaluations to distinguish the writers, and in recent years these techniques have been combined with computer-based tools that automatically measure quantities such as the height and width of letters, distances between characters, inclination angles, and the number and types of abbreviations. Recently emerged approaches in digital paleography combine powerful machine learning algorithms with high-quality digital images. Some of these approaches have been used for feature extraction, others to classify writers with machine learning algorithms or deep learning systems. However, traditional techniques require a preliminary feature engineering step that involves an expert in the field. For this reason, publishing a well-labeled dataset is always a challenge and a stimulus for the academic world, as researchers can test their methods and then compare their results from the same starting point. In this paper, we propose a new dataset of handwriting on papyri for the task of writer identification. This dataset is derived directly from the GRK-Papyri dataset, and the samples are obtained with some image enhancement operations. This paper presents not only the details of the dataset but also the operations of resizing, rotation, background smoothing, and row segmentation used to overcome the difficulties posed by the image degradation of this dataset. It is prepared and made freely available for non-commercial research along with its confirmed ground-truth information related to the task of writer identification.
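The abstract names row segmentation as one of the preprocessing steps but does not say how it is done. As a minimal sketch, assuming a horizontal projection profile (a common generic technique, not necessarily the one used for PapyRow), text rows can be located by counting ink pixels per image row. The function name `segment_rows`, the `min_ink` threshold, and the toy `page` are all hypothetical.

```python
from typing import List, Tuple


def segment_rows(binary: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) row-index pairs, end exclusive, one per text row.

    `binary` is a 2-D list in which 1 marks ink and 0 marks background.
    `min_ink` is an assumed threshold: rows with fewer ink pixels count as gaps.
    """
    # Horizontal projection profile: number of ink pixels in each image row.
    profile = [sum(line) for line in binary]

    rows: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text row begins here
        elif ink < min_ink and start is not None:
            rows.append((start, y))        # the text row ended on the previous line
            start = None
    if start is not None:                  # text row runs to the bottom edge
        rows.append((start, len(profile)))
    return rows


# Toy binarized page: two text rows separated by blank lines.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
]
rows = segment_rows(page)
```

On degraded papyri a plain threshold like this would likely need the background smoothing and rotation correction mentioned in the abstract to be applied first, since skewed or noisy lines blur the gaps in the profile.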

    GR-RNN: Global-Context Residual Recurrent Neural Networks for Writer Identification

    This paper presents an end-to-end neural network system to identify writers through handwritten word images, which jointly integrates global-context information and a sequence of local fragment-based features. The global-context information is extracted from the tail of the neural network by a global average pooling step. The sequence of local and fragment-based features is extracted from a low-level deep feature map which contains subtle information about the handwriting style. The spatial relationship between the sequence of fragments is modeled by the recurrent neural network (RNN) to strengthen the discriminative ability of the local fragment features. We leverage the complementary information between the global-context and local fragments, resulting in the proposed global-context residual recurrent neural network (GR-RNN) method. The proposed method is evaluated on four public data sets and experimental results demonstrate that it can provide state-of-the-art performance. In addition, the neural networks trained on gray-scale images provide better results than neural networks trained on binarized and contour images, indicating that texture information plays an important role for writer identification. The source code will be available: https://github.com/shengfly/writer-identification. Comment: To appear: Pattern Recognition
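The global average pooling step mentioned in the abstract can be illustrated in isolation. This is a minimal sketch in plain Python, assuming a feature map laid out as channels x height x width. It shows only the pooling operation, not the GR-RNN architecture or its fusion with the local fragment features, and the name `global_average_pool` and the toy `fmap` are hypothetical.

```python
from typing import List

# Assumed layout: channels x height x width.
FeatureMap = List[List[List[float]]]


def global_average_pool(fmap: FeatureMap) -> List[float]:
    """Collapse each channel of a feature map to its mean activation."""
    pooled = []
    for channel in fmap:
        total = sum(sum(row) for row in channel)
        count = sum(len(row) for row in channel)
        pooled.append(total / count)
    return pooled


# Toy feature map with two channels of size 2 x 2.
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
global_context = global_average_pool(fmap)  # one value per channel
```

Because every spatial position contributes equally, the pooled vector summarizes the whole word image and discards position, which is why the abstract pairs it with a position-aware sequence of local fragments.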

    Ethnicity and the Writing of Medieval Scottish history

    Historians have long tended to define medieval Scottish society in terms of interactions between ethnic groups. This approach was developed over the course of the long nineteenth century, a formative period for the study of medieval Scotland. At that time, many scholars based their analysis upon scientific principles, long since debunked, which held that medieval 'peoples' could only be understood in terms of 'full ethnic packages'. This approach was combined with a positivist historical narrative that defined Germanic Anglo-Saxons and Normans as the harbingers of advances of Civilisation. While the prejudices of that era have largely faded away, the modern discipline still relies all too often on a dualistic ethnic framework. This is particularly evident in a structure of periodisation that draws a clear line between the 'Celtic' eleventh century and the 'Norman' twelfth. Furthermore, dualistic oppositions based on ethnicity continue, particularly in discussions of the law, kingship, lordship and religion.

    Retelling the Future: Don Juan Manuel's "Exenplo XI" and the Power of Fiction

    In this paper I look at how “Exenplo XI” is both a product and a reflection of the various traditions and cultures of medieval Iberia, and how Juan Manuel forges a new version of this story from these inherited traditions to showcase problems of concern to his fourteenth-century audience, namely the tension between ecclesiastical and Andalusi systems of thought and their representatives. I also consider how the author’s manipulation of the frame and of the power of fiction itself echoes Don Yllán’s manipulation of magic to test the dean’s mettle. I then turn to the lessons of “Exenplo XI” regarding the transmission of knowledge and who controls it, as well as the function of speculative fiction and its ability to explore alternative realities and potential futures for both its fictional audience (Conde Lucanor) and contemporary twenty-first-century readers.

    Automatic handwriter identification using advanced machine learning

    Handwriter identification is a challenging problem, especially for forensic investigation. This topic has received significant attention from the research community, and several handwriter identification systems have been developed for various applications, including forensic science, document analysis and the investigation of historical documents. This work is part of an investigation to develop new tools and methods for Arabic palaeography, which is the study of handwritten material, particularly ancient manuscripts with missing writers, dates, and/or places. In particular, the main aim of this research project is to investigate and develop new techniques and algorithms for the classification and analysis of ancient handwritten documents to support palaeographic studies. Three contributions are proposed in this research. The first is concerned with the development of a text line extraction algorithm for colour and greyscale historical manuscripts. The idea uses a modified bilateral filtering approach to adaptively smooth the images while still preserving the edges through a nonlinear combination of neighbouring image values. The proposed algorithm aims to compute a median and a separating seam and has been validated on both greyscale and colour historical documents using different datasets. The results obtained suggest that the proposed technique compares favourably with a few similar algorithms. The second contribution combines Oriented Basic Image features with the concept of a grapheme codebook in order to improve recognition performance. The proposed algorithm is capable of effectively extracting the most distinguishing patterns of a handwriter. The idea consists of judiciously combining multiscale feature extraction with the concept of the grapheme to allow for the extraction of several discriminating features such as handwriting curvature, direction, wrinkliness and various edge-based features. The technique was validated for identifying handwriters using both Arabic and English writings captured as scanned images, using the IAM dataset for English handwriting and the ICFHR 2012 dataset for Arabic handwriting. The results obtained demonstrate the effectiveness of the proposed method when compared against some similar techniques. The third contribution is concerned with an offline handwriter identification approach based on convolutional neural network (CNN) technology. In the first stage, the Alex-Net architecture was employed to learn image features from handwritten scripts, with the features obtained from the fully connected layers of the model. Then, a support vector machine classifier is deployed to classify the writing styles of the various handwriters. In this way, the test scripts are processed by the trained CNN model and then passed on for further classification. The proposed approach was evaluated on Arabic historical datasets: the Islamic Heritage Project (IHP) and the Qatar National Library (QNL). The obtained results demonstrated that the proposed model achieved superior performance when compared to some similar methods.
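The edge-preserving smoothing described for the first contribution can be illustrated with the standard bilateral filter. The sketch below is the textbook version applied to a single row of pixel intensities, not the modified variant the abstract refers to, whose details are not given here. The function name, the parameter defaults, and the toy `row` are assumptions for illustration.

```python
import math
from typing import List, Sequence


def bilateral_filter_1d(
    signal: Sequence[float],
    radius: int = 2,
    sigma_space: float = 1.0,
    sigma_range: float = 10.0,
) -> List[float]:
    """Smooth a 1-D intensity signal while keeping sharp edges.

    Each output value is a weighted mean of its neighbours. The weight is the
    product of a spatial Gaussian (nearby samples count more) and a range
    Gaussian (samples with similar intensity count more), which makes the
    combination nonlinear in the input values.
    """
    n = len(signal)
    out = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            spatial = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2))
            tonal = math.exp(-((signal[i] - signal[j]) ** 2) / (2 * sigma_range ** 2))
            weight = spatial * tonal
            weighted_sum += weight * signal[j]
            weight_total += weight
        # weight_total is at least 1.0 because the centre sample has weight 1.
        out.append(weighted_sum / weight_total)
    return out


# Toy scanline: dark ink (about 10) next to bright background (about 200).
row = [10, 12, 11, 10, 200, 201, 199, 200]
smoothed = bilateral_filter_1d(row)
```

Samples on the far side of the intensity jump receive a range weight close to zero, so the small fluctuations on each side are averaged out while the ink-to-background edge stays sharp.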