
    Into the Wide – Into the Deep: Manuscript Research in the Digital Age. Introduction

    Manuscript research is a wide field of scholarship which is integrated in core disciplines such as history, philology, or library science. Yet manuscript research is also crucial in other fields such as archaeology, art history, musicology or Egyptology, to name but a few. For all these disciplines, manuscripts are fundamental sources. There are different approaches to different types of manuscripts, but questions and perspectives, methodologies and tools are often quite similar. Innovations and new research strategies from one discipline can be transferred to and adopted by others. This article is an introduction to the second volume of the anthology "Codicology and Palaeography in the Digital Age" and gives an overview of current aspects in the field of manuscript studies in both theory and practice by showing how the contributions to the volume at hand relate to one another and to those of its predecessor. The texts are roughly assigned to five interrelated areas of manuscript research: (I) the photographic capturing of the manuscript surface, (II) the description of the manuscript for a catalogue, (III) the scientific examination of material aspects, (IV) the analysis of the script and (V) the deep encoding of the text itself.

    A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis

    Automatic analysis of scanned historical documents comprises a wide range of image analysis tasks, which are often challenging for machine learning due to a lack of human-annotated learning samples. With the advent of deep neural networks, a promising way to cope with the lack of training data is to pre-train models on images from a different domain and then fine-tune them on historical documents. In the current research, a typical example of such cross-domain transfer learning is the use of neural networks that have been pre-trained on the ImageNet database for object recognition. It remains a mostly open question whether or not this pre-training helps to analyse historical documents, which have fundamentally different image properties when compared with ImageNet. In this paper, we present a comprehensive empirical survey on the effect of ImageNet pre-training for diverse historical document analysis tasks, including character recognition, style classification, manuscript dating, semantic segmentation, and content-based retrieval. While we obtain mixed results for semantic segmentation at pixel-level, we observe a clear trend across different network architectures that ImageNet pre-training has a positive effect on classification as well as content-based retrieval

    Computer-Aided Palaeography, Present and Future

    The field of digital palaeography has received increasing attention in recent years, partly because palaeographers often seem subjective in their views and do not or cannot articulate their reasoning, thereby creating a field of authorities whose opinions are closed to debate. One response to this is to make palaeographical arguments more quantitative, although this approach is by no means accepted by the wider humanities community, with some arguing that handwriting is inherently unquantifiable. This paper therefore asks how palaeographical method might be made more objective and therefore more widely accepted by non-palaeographers while still answering critics within the field. Previous suggestions for objective methods before computing are considered first, and some of their shortcomings are discussed. Similar discussion in forensic document analysis is then introduced and is found relevant to palaeography, though with some reservations. New techniques of "digital" palaeography are then introduced; these have proven successful in forensic analysis and are becoming increasingly accepted there, but they have not yet found acceptance in the humanities communities. The reasons why are discussed, and some suggestions are made for how the software might be designed differently to achieve greater acceptance. Finally, a prototype framework is introduced which is designed to provide a common basis for experiments in "digital" palaeography, ideally enabling scholars to exchange quantitative data about scribal hands, exchange processes for generating this data, articulate both the results themselves and the processes used to produce them, and therefore to ground their arguments more firmly and perhaps find greater acceptance

    Who is Patrick? – Answers from the Saint Patrick's Confessio HyperStack. Supporting Digital Humanities, Copenhagen 17 - 18 November 2011, Conference Proceedings

    Not everyone realizes that there are two Latin works, still surviving, that can definitely be attributed to Saint Patrick’s own authorship. On 14 September 2011 the Royal Irish Academy published his writings online in a freely accessible form, both in the original Latin and in a variety of modern languages (including Irish). Designed to be of interest to the general public as well as to academic researchers, the Saint Patrick’s Confessio Hypertext Stack includes such features as digital images of the medieval manuscripts involved, a specially commissioned historical reconstruction that evocatively describes life in pre-Viking Ireland, articles, audio presentations, and some ten thousand internal and external digital links that make it truly a resource to be explored.

    Recognizing Degraded Handwritten Characters

    In this paper, Slavonic manuscripts from the 11th century written in Glagolitic script are investigated. State-of-the-art optical character recognition methods produce poor results for degraded handwritten document images. This is largely due to a lack of suitable results from basic pre-processing steps such as binarization and image segmentation. Therefore, a new, binarization-free approach is presented that is independent of pre-processing deficiencies. It additionally incorporates local information in order to also recognize fragmented or faded characters. The proposed algorithm consists of two steps: character classification and character localization. First, scale-invariant feature transform (SIFT) features are extracted and classified using support vector machines. On this basis, interest points are clustered according to their spatial information. Then, characters are localized and eventually recognized by a weighted voting scheme of pre-classified local descriptors. Preliminary results show that the proposed system can handle highly degraded manuscript images with background noise, e.g. stains, tears, and faded characters.
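    The clustering and voting steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each local descriptor has already been classified and arrives as a hypothetical tuple (x, y, label, confidence), it swaps in a simple greedy distance-based clustering for whatever spatial clustering the paper uses, and the function names and the radius parameter are illustrative only.

```python
from collections import defaultdict
from math import hypot


def cluster_points(points, radius):
    """Greedy spatial clustering (an assumed stand-in for the paper's method).

    Each point is a tuple (x, y, label, confidence). A point joins the first
    cluster whose running centroid lies within `radius`; otherwise it starts
    a new cluster.
    """
    clusters = []
    for p in points:
        x, y = p[0], p[1]
        for c in clusters:
            if hypot(x - c["cx"], y - c["cy"]) <= radius:
                c["members"].append(p)
                n = len(c["members"])
                # Incremental update of the cluster centroid.
                c["cx"] += (x - c["cx"]) / n
                c["cy"] += (y - c["cy"]) / n
                break
        else:
            clusters.append({"cx": x, "cy": y, "members": [p]})
    return clusters


def weighted_vote(cluster):
    """Return the label with the highest summed confidence in one cluster."""
    scores = defaultdict(float)
    for _, _, label, confidence in cluster["members"]:
        scores[label] += confidence
    return max(scores, key=scores.get)


def localize_and_recognize(points, radius=20.0):
    """Return one ((x, y), label) pair per cluster of local descriptors."""
    return [
        ((c["cx"], c["cy"]), weighted_vote(c))
        for c in cluster_points(points, radius)
    ]
```

    Calling localize_and_recognize on a list of such tuples yields one estimated character position and label per spatial cluster. Weighting votes by classifier confidence means a few strong descriptors can outweigh many weak ones, which is one plausible way to tolerate fragmented or faded strokes.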

    Towards style-based dating of historical documents


    Bearding Ritter von Köchel in his lair

    As editor of the next iteration of the Köchel Catalogue, I have to deal with the current (sixth) edition’s Appendix C, devoted to "Doubtful and Misattributed Works." My goal is to reduce the potentially vast dimensions of that appendix to only those works for which some connection to Mozart cannot be ruled out. In the decades since 1964, when the current edition of Köchel was published, many of the works listed in Appendix C have been convincingly attributed to other composers. Other works therein can confidently be dismissed as never having had any meaningful connection to Mozart. Yet even after removing the reattributed and trivially misattributed works from the appendix, we are left with a handful of works that may possibly have had something to do with Mozart, even if clear evidence one way or the other remains elusive. One must, of course, be cautious in removing questionable and doubtful works from the catalogue, as the present case-study will illustrate. The work under consideration, catalogued as K6 Anh. C 9.07, is an unaccompanied piece for three or four voices with the text "Venerabilis barba capucinorum." ..

    The Palaeographical Method under the Light of a Digital Approach

    This paper has the twofold aim of reflecting upon a humanities computing approach to palaeography, and of making such reflections, together with the related experimental results, fruitful at the implementation level. Firstly, the paper explores the methodological issues related to the use of a digital tool to support the palaeographical analysis of medieval handwriting. It claims that humanities computing methods can assist in making explicit those processes of palaeographical research that encompass detailed analyses, in particular of the handwriting and, more generally, of other idiosyncratic features of written cultural artefacts. Thus, palaeographical tools are to be contextualised and used within a broader methodological framework where their role is to mediate the vision, the comparison, the representation, the analysis and the interpretation of these objects. Secondly, the paper attempts to evaluate the experiments carried out with a specific software application and, in so doing, to test a humanities computing approach to palaeography at a practical level, so as to direct future implementations. Some of these implementations have already been carried out by the current developers of the application in question, with whom the author collaborates closely, while others are still in progress and in need of future iterative refinements.