32,645 research outputs found
A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis
Automatic analysis of scanned historical documents comprises a wide range of
image analysis tasks, which are often challenging for machine learning due to a
lack of human-annotated learning samples. With the advent of deep neural
networks, a promising way to cope with the lack of training data is to
pre-train models on images from a different domain and then fine-tune them on
historical documents. In the current research, a typical example of such
cross-domain transfer learning is the use of neural networks that have been
pre-trained on the ImageNet database for object recognition. It remains a
mostly open question whether or not this pre-training helps to analyse
historical documents, which have fundamentally different image properties when
compared with ImageNet. In this paper, we present a comprehensive empirical
survey on the effect of ImageNet pre-training for diverse historical document
analysis tasks, including character recognition, style classification,
manuscript dating, semantic segmentation, and content-based retrieval. While we
obtain mixed results for semantic segmentation at pixel-level, we observe a
clear trend across different network architectures that ImageNet pre-training
has a positive effect on classification as well as content-based retrieval
Characterization and digital restauration of XIV-XV centuries written parchments by means of non-destructive techniques. Three case studies
Parchment is the primary writing medium of the majority of documents with cultural importance. Unfortunately, this material suffers from several mechanisms of degradation that affect its chemical-physical structure and the readability of the text. Due to the unique and delicate character of these objects, the use of non-destructive techniques is mandatory. In this work, three partially degraded
handwritten parchments dating back to the XIV-XV centuries were analyzed by means of X-ray fluorescence spectroscopy, µ-ATR Fourier transform infrared spectroscopy, and reflectance and UV-induced fluorescence spectroscopy. The elemental and molecular results provided the identification of the inks, pigments, and superficial treatments. In particular, all manuscripts were written with iron gall inks, while the capital letters were realized with cinnabar and azurite. Furthermore, multispectral UV fluorescence imaging and multispectral VIS-NIR imaging proved to be a good approach for the digital restoration of manuscripts that suffer from the loss of inked areas or from the presence of brown spotting. Indeed, by using ultraviolet radiation and collecting the images at different spectral ranges, it is possible to enhance the readability of the text, while by illuminating with visible light and collecting the images at longer wavelengths, the hiding effect of brown spots can be attenuated
Searchin’ His Eyes, Lookin’ for Traces: Piri Reis’ World Map of 1513 & its Islamic Iconographic Connections (A Reading Through Bagdat 334 and Proust)
The remnant of the 1513 world map of the Ottoman corsair (and later admiral) Muhiddin Piri, a.k.a. Piri Reis, with its focus on the Atlantic and the New World can be ranked as one of the most famous and controversial maps in the annals of the history of cartography. Following its discovery at Topkapi Palace in 1929, this early modern Ottoman map has raised baffling questions regarding its fons et origo. Some scholars posited ancient sea kings or aliens from outer space as the original creators; while the influence of Columbus’ own map and early Renaissance cartographers tantalized others. One question that remains unanswered is how Islamic cartography influenced Piri Reis’ work. This paper presents hitherto unnoticed iconographical connections between the classical Islamic mapping tradition and the Piri Reis map
The Dunhuang Chinese sky: a comprehensive study of the oldest known star atlas
This paper presents an analysis of the star atlas included in the medieval
Chinese manuscript (Or.8210/S.3326), discovered in 1907 by the archaeologist
Aurel Stein at the Silk Road town of Dunhuang and now held in the British
Library. Although partially studied by a few Chinese scholars, it has never
been fully displayed and discussed in the Western world. This set of sky maps
(12 hour angle maps in quasi-cylindrical projection and a circumpolar map in
azimuthal projection), displaying the full sky visible from the Northern
hemisphere, is up to now the oldest complete preserved star atlas from any
civilisation. It is also the first known pictorial representation of the
quasi-totality of the Chinese constellations. This paper describes the history
of the physical object - a roll of thin paper drawn with ink. We analyse the
stellar content of each map (1339 stars, 257 asterisms) and the texts
associated with the maps. We establish the precision with which the maps are
drawn (1.5 to 4 degrees for the brightest stars) and examine the type of
projections used. We conclude that precise mathematical methods were used to
produce the atlas. We also discuss the dating of the manuscript and its
possible author and confirm the dates 649-684 (early Tang dynasty) as most
probable based on available evidence. This is at variance with a prior estimate
around +940. Finally we present a brief comparison with later sky maps, both in
China and in Europe.
Recognizing Degraded Handwritten Characters
In this paper, Slavonic manuscripts from the 11th
century written in Glagolitic script are
investigated. State-of-the-art optical character recognition methods produce poor results
for degraded handwritten document images. This is largely due to a lack of suitable
results from basic pre-processing steps such as binarization and image segmentation.
Therefore, a new, binarization-free approach will be presented that is independent of
pre-processing deficiencies. It additionally incorporates local information in order to
also recognize fragmented or faded characters. The proposed algorithm consists of
two steps: character classification and character localization. First, scale-invariant
feature transform (SIFT) features are extracted and classified using support vector machines.
On this basis, interest points are clustered according to their spatial information. Then,
characters are localized and eventually recognized by a weighted voting scheme of
pre-classified local descriptors. Preliminary results show that the proposed system can
handle highly degraded manuscript images with background noise, e.g. stains, tears,
and faded characters
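The final localization step above, a weighted vote over pre-classified local descriptors, can be sketched as follows. This is a minimal illustration under assumptions: each descriptor has already been assigned a class label and a confidence weight (e.g. by the SVM), and the character takes the label with the highest summed weight.

```python
# Hypothetical sketch of the weighted voting scheme: each clustered
# local descriptor contributes its (label, confidence) pair, and the
# character is assigned the label with the largest total weight.
from collections import defaultdict


def vote_character(classified_descriptors):
    """Return the winning label from (label, weight) descriptor votes."""
    scores = defaultdict(float)
    for label, weight in classified_descriptors:
        scores[label] += weight
    return max(scores, key=scores.get)


# Three low-confidence votes for 'a' outweigh one stronger vote for 'o',
# which is how fragmented characters can still be recognized correctly.
print(vote_character([("a", 0.4), ("a", 0.3), ("o", 0.6), ("a", 0.2)]))  # prints "a"
```

Accumulating weights rather than counting raw votes lets a few weak but consistent local matches override a single confident outlier, which matters for faded or fragmented glyphs.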
Putting the Text back into Context: A Codicological Approach to Manuscript Transcription
Textual scholars have tended to produce editions which present the text without its
manuscript context. Even though digital editions now often present single-witness
editions with facsimiles of the manuscripts, nevertheless the text itself is still transcribed
and represented as a linguistic object rather than a physical one. Indeed, this is explicitly
stated as the theoretical basis for the de facto standard of markup for digital texts: the
Guidelines of the Text Encoding Initiative (TEI). These explicitly treat texts as semantic
units such as paragraphs, sentences, verses and so on, rather than physical elements
such as pages, openings, or surfaces, and some scholars have argued that this is the only
viable model for representing texts. In contrast, this chapter presents arguments for
considering the document as a physical object in the markup of texts. The theoretical
arguments of what constitutes a text are first reviewed, with emphasis on those used
by the TEI and other theoreticians of digital markup. A series of cases is then given in
which a document-centric approach may be desirable, with both modern and medieval
examples. Finally, a step forward in this direction is presented, namely the results of
the Genetic Edition Working Group in the Manuscript Special Interest Group of the
TEI: this includes a proposed standard for documentary markup, whereby aspects of
codicology and mise en page can be included in digital editions, putting the text back
into its manuscript context
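The document-centric markup the chapter advocates can be illustrated with a minimal, invented TEI P5 fragment: the transcription is organised by physical surfaces, zones, and lines (the `sourceDoc` module that grew out of the Genetic Edition work) rather than by semantic units such as paragraphs. The text content here is purely illustrative.

```xml
<sourceDoc xmlns="http://www.tei-c.org/ns/1.0">
  <!-- A physical page (folio 1 recto), not a semantic division -->
  <surface n="1r">
    <zone type="main">
      <line>In principio erat uerbum</line>
      <line>et uerbum erat apud deum</line>
    </zone>
    <!-- Marginalia belong to the page layout, not the text flow -->
    <zone type="margin">
      <line>nota</line>
    </zone>
  </surface>
</sourceDoc>
```

Here the mise en page (main block versus margin, line breaks as written) is first-class in the markup, in contrast to a `<p>`/`<s>`-based transcription.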
Recent developments in New Testament textual criticism
This is a preprint version of an article published in Early Christianity 2.2 (2011).
The article provides an overview of recent developments in New Testament Textual Criticism. The four sections cover editions, manuscripts, citational evidence and methodology. Particular attention is paid to the Editio Critica Maior, the development of electronic resources, newly discovered manuscripts, and the Coherence Based Genealogical Method
The Orality of a Silent Age: The Place of Orality in Medieval Studies
'The Orality of a Silent Age: The Place of Orality in Medieval Studies' uses a brief survey of current work on Old English poetry as the point of departure for arguing that although useful, the concepts of orality and literacy have, in medieval studies, been extended further beyond their literal referents of spoken and written communication than is heuristically useful. Recent emphasis on literate methods and contexts for the writing of our surviving Anglo-Saxon poetry, in contradistinction to the previous emphasis on oral ones, provides the basis for this criticism. Despite a significant amount of revisionist work, the concept of orality remains something of a vortex into which a range of only partly related issues have been sucked: authorial originality/communal property; impromptu composition/meditated composition; authorial and audience alienation/immediacy. The relevance of orality to these issues is not in dispute; the problem is that they do not vary along specifically oral/literate axes. The article suggests that this is symptomatic of a wider modernist discourse in medieval studies whereby modern, literate society is (implicitly) contrasted with medieval, oral society: the extension of the orality/literacy axis beyond its literal reference has to some extent facilitated the perpetuation of an earlier contrast between primitivity and modernity which still deserves to be questioned and disputed. Pruning back our conceptions of the oral and the literate to their stricter denotations, we might hope to see more clearly what areas of medieval studies would benefit from alternative interpretations