
    A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis

    Automatic analysis of scanned historical documents comprises a wide range of image analysis tasks, which are often challenging for machine learning due to a lack of human-annotated learning samples. With the advent of deep neural networks, a promising way to cope with the lack of training data is to pre-train models on images from a different domain and then fine-tune them on historical documents. In current research, a typical example of such cross-domain transfer learning is the use of neural networks that have been pre-trained on the ImageNet database for object recognition. It remains a mostly open question whether or not this pre-training helps to analyse historical documents, which have fundamentally different image properties when compared with ImageNet. In this paper, we present a comprehensive empirical survey on the effect of ImageNet pre-training for diverse historical document analysis tasks, including character recognition, style classification, manuscript dating, semantic segmentation, and content-based retrieval. While we obtain mixed results for semantic segmentation at the pixel level, we observe a clear trend across different network architectures that ImageNet pre-training has a positive effect on classification as well as content-based retrieval.

    Characterization and digital restauration of XIV-XV centuries written parchments by means of non-destructive techniques. Three case studies

    Parchment is the primary writing medium of the majority of documents with cultural importance. Unfortunately, this material suffers from several mechanisms of degradation that affect its chemical-physical structure and the readability of the text. Due to the unique and delicate character of these objects, the use of non-destructive techniques is mandatory. In this work, three partially degraded handwritten parchments dating back to the XIV-XV centuries were analyzed by means of X-ray fluorescence spectroscopy, µ-ATR Fourier transform infrared spectroscopy, and reflectance and UV-induced fluorescence spectroscopy. The elemental and molecular results provided the identification of the inks, pigments, and superficial treatments. In particular, all manuscripts have been written with iron gall inks, while the capital letters have been realized with cinnabar and azurite. Furthermore, multispectral UV fluorescence imaging and multispectral VIS-NIR imaging proved to be a good approach for the digital restoration of manuscripts that suffer from the loss of inked areas or from the presence of brown spotting. Indeed, by using ultraviolet radiation and collecting the images at different spectral ranges it is possible to enhance the readability of the text, while by illuminating with visible light and collecting the images at longer wavelengths, the hiding effect of brown spots can be attenuated.

    Searchin’ His Eyes, Lookin’ for Traces: Piri Reis’ World Map of 1513 & its Islamic Iconographic Connections (A Reading Through Bagdat 334 and Proust)

    The remnant of the 1513 world map of the Ottoman corsair (and later admiral) Muhiddin Piri, a.k.a. Piri Reis, with its focus on the Atlantic and the New World, can be ranked as one of the most famous and controversial maps in the annals of the history of cartography. Following its discovery at Topkapi Palace in 1929, this early modern Ottoman map has raised baffling questions regarding its fons et origo. Some scholars posited ancient sea kings or aliens from outer space as the original creators, while the influence of Columbus' own map and early Renaissance cartographers tantalized others. One question that remains unanswered is how Islamic cartography influenced Piri Reis' work. This paper presents hitherto unnoticed iconographical connections between the classical Islamic mapping tradition and the Piri Reis map.

    The Dunhuang Chinese Sky: A Comprehensive Study of the Oldest Known Star Atlas

    This paper presents an analysis of the star atlas included in the medieval Chinese manuscript (Or.8210/S.3326), discovered in 1907 by the archaeologist Aurel Stein at the Silk Road town of Dunhuang and now held in the British Library. Although partially studied by a few Chinese scholars, it has never been fully displayed and discussed in the Western world. This set of sky maps (12 hour angle maps in quasi-cylindrical projection and a circumpolar map in azimuthal projection), displaying the full sky visible from the Northern hemisphere, is up to now the oldest complete preserved star atlas from any civilisation. It is also the first known pictorial representation of the quasi-totality of the Chinese constellations. This paper describes the history of the physical object - a roll of thin paper drawn with ink. We analyse the stellar content of each map (1339 stars, 257 asterisms) and the texts associated with the maps. We establish the precision with which the maps are drawn (1.5 to 4 degrees for the brightest stars) and examine the type of projections used. We conclude that precise mathematical methods were used to produce the atlas. We also discuss the dating of the manuscript and its possible author, and confirm the dates 649-684 (early Tang dynasty) as most probable based on available evidence. This is at variance with a prior estimate around +940. Finally, we present a brief comparison with later sky maps, both in China and in Europe. (19 pages, 5 tables, 8 figures)

    Recognizing Degraded Handwritten Characters

    In this paper, Slavonic manuscripts from the 11th century written in Glagolitic script are investigated. State-of-the-art optical character recognition methods produce poor results for degraded handwritten document images, largely due to a lack of suitable results from basic pre-processing steps such as binarization and image segmentation. Therefore, a new, binarization-free approach is presented that is independent of pre-processing deficiencies. It additionally incorporates local information in order to also recognize fragmented or faded characters. The proposed algorithm consists of two steps: character classification and character localization. First, scale-invariant feature transform (SIFT) features are extracted and classified using support vector machines. On this basis, interest points are clustered according to their spatial information. Then, characters are localized and eventually recognized by a weighted voting scheme over the pre-classified local descriptors. Preliminary results show that the proposed system can handle highly degraded manuscript images with background noise, e.g. stains, tears, and faded characters.
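    The classify-then-vote scheme described above can be sketched as follows. This is a toy illustration, not the paper's implementation: synthetic 128-D vectors stand in for real SIFT descriptors (the actual system extracts them from manuscript images), and the two-class setup and descriptor distributions are invented. The point is the final step: each local descriptor's per-class confidence contributes a weighted vote toward the character label of its spatial cluster.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic 128-D "descriptors" for two character classes, with shifted
# means so the toy SVM has something separable to learn.
train_X = np.vstack([rng.normal(0, 1, (50, 128)),
                     rng.normal(3, 1, (50, 128))])
train_y = np.array([0] * 50 + [1] * 50)

# Step 1: classify local descriptors with a support vector machine.
svm = SVC(probability=True).fit(train_X, train_y)

# Step 2: descriptors from one spatial cluster of interest points,
# i.e. one candidate character location.
cluster_desc = rng.normal(3, 1, (10, 128))
probs = svm.predict_proba(cluster_desc)  # per-descriptor class scores

# Weighted voting: sum each descriptor's class probabilities, so
# confidently classified descriptors contribute more to the label.
votes = probs.sum(axis=0)
predicted_char = int(np.argmax(votes))
```

    Because the vote aggregates many local descriptors, a character can still be recognized when some of its descriptors are missing or misclassified, which is what makes the approach tolerant of fragmented or faded glyphs.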

    Putting the Text back into Context: A Codicological Approach to Manuscript Transcription

    Textual scholars have tended to produce editions which present the text without its manuscript context. Even though digital editions now often present single-witness editions with facsimiles of the manuscripts, the text itself is still transcribed and represented as a linguistic object rather than a physical one. Indeed, this is explicitly stated as the theoretical basis for the de facto standard of markup for digital texts: the Guidelines of the Text Encoding Initiative (TEI). These explicitly treat texts as semantic units such as paragraphs, sentences, verses and so on, rather than physical elements such as pages, openings, or surfaces, and some scholars have argued that this is the only viable model for representing texts. In contrast, this chapter presents arguments for considering the document as a physical object in the markup of texts. The theoretical arguments of what constitutes a text are first reviewed, with emphasis on those used by the TEI and other theoreticians of digital markup. A series of cases is then given in which a document-centric approach may be desirable, with both modern and medieval examples. Finally, a step forward in this direction is presented, namely the results of the Genetic Edition Working Group in the Manuscript Special Interest Group of the TEI: this includes a proposed standard for documentary markup, whereby aspects of codicology and mise en page can be included in digital editions, putting the text back into its manuscript context.
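    The contrast the chapter draws can be illustrated with a small markup fragment. This is a hedged sketch of document-centric (documentary) transcription: the text is organised by physical surfaces and zones rather than by paragraphs and sentences. The element names follow the TEI P5 sourceDoc module; the folio number and the transcribed text are invented for illustration.

```xml
<!-- Documentary markup: physical layout first, not linguistic units. -->
<sourceDoc>
  <surface n="12r">                  <!-- one physical page -->
    <zone type="main">               <!-- the main writing block -->
      <line>In principio erat uerbum</line>
      <line>et uerbum erat apud deum</line>
    </zone>
    <zone type="margin">             <!-- a marginal annotation -->
      <line>nota</line>
    </zone>
  </surface>
</sourceDoc>
```

    A conventional TEI transcription would instead wrap the same words in semantic elements such as paragraphs or verse lines, discarding the page, zone, and line topography that a codicological reading depends on.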

    Recent developments in New Testament textual criticism

    This is a preprint version of an article published in Early Christianity 2.2 (2011). The article provides an overview of recent developments in New Testament textual criticism. The four sections cover editions, manuscripts, citational evidence, and methodology. Particular attention is paid to the Editio Critica Maior, the development of electronic resources, newly discovered manuscripts, and the Coherence-Based Genealogical Method.

    The Orality of a Silent Age: The Place of Orality in Medieval Studies

    'The Orality of a Silent Age: The Place of Orality in Medieval Studies' uses a brief survey of current work on Old English poetry as the point of departure for arguing that, although useful, the concepts of orality and literacy have, in medieval studies, been extended further beyond their literal referents of spoken and written communication than is heuristically useful. Recent emphasis on literate methods and contexts for the writing of our surviving Anglo-Saxon poetry, in contradistinction to the previous emphasis on oral ones, provides the basis for this criticism. Despite a significant amount of revisionist work, the concept of orality remains something of a vortex into which a range of only partly related issues have been sucked: authorial originality/communal property; impromptu composition/meditated composition; authorial and audience alienation/immediacy. The relevance of orality to these issues is not in dispute; the problem is that they do not vary along specifically oral/literate axes. The article suggests that this is symptomatic of a wider modernist discourse in medieval studies whereby modern, literate society is (implicitly) contrasted with medieval, oral society: the extension of the orality/literacy axis beyond its literal reference has to some extent facilitated the perpetuation of an earlier contrast between primitivity and modernity which still deserves to be questioned and disputed. Pruning back our conceptions of the oral and the literate to their stricter denotations, we might hope to see more clearly what areas of medieval studies would benefit from alternative interpretations.