771 research outputs found

    “Standing-off Trees and Graphs”: On the Affordance of Technologies for the Assertive Edition

    Starting from the observation that the existing models of digital scholarly editions can be expressed in many technologies, this paper goes beyond the simple opposition of ‘XML’ and ‘graph’. It studies the implicit context of the technologies as applied to digital scholarly editions: embedded mark-up in XML/TEI trees, graph representations in RDF, and stand-off annotation as realised in annotation tools widely used for information extraction. It describes the affordances of the encoding methods offered. It takes as a test case the “assertive edition” (Vogeler 2019), in which the text is considered in a double role: as a palaeographical and linguistic phenomenon, and as a representation of information. It concludes that the affordances of XML help to detect sequential and hierarchical properties of a text, while those of RDF best cover the representation of knowledge as semantic networks of statements. The relationship between them can be expressed by the metaphor of ‘layers’, for which stand-off annotation technologies seem to be best suited. However, there is no standardised technical formalism to create stand-off annotations beyond graphical tools sharing interface elements. The contribution concludes with a call to accept the advantages of each technology, and to make efforts to discuss the best way to combine these technologies.

    Scholarly Music Editions as Graph: Semantic Modelling of the Anton Webern Gesamtausgabe

    This paper presents a first draft of the ongoing research at the Anton Webern Gesamtausgabe (Basel, CH) to apply RDF-based semantic models for the purpose of a scholarly digital music edition. A brief overview of different historical positions to approach music from a graph-theoretical perspective is followed by a list of music-related and other RDF vocabularies that may support this goal, such as MusicOWL, DoReMus, CIDOC CRMinf, or the NIE-INE ontologies. Using the example of some of Webern’s sketches for two drafted Goethe settings (M306 & M307), a preliminary graph-based model for philological knowledge and processes is envisioned, which incorporates existing ontologies from the context of cultural heritage and music. Finally, possible use-cases, and the consequences of such an approach to scholarly music editions, are discussed.

    Introduction


    Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

    The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration. Research efforts that seek to automatically process facsimiles and extract information from them are multiplying, with document layout analysis as a first essential step. Although the identification and categorization of segments of interest in document images have seen significant progress over the last years thanks to deep learning techniques, many challenges remain, including the use of finer-grained segmentation typologies and the handling of complex, heterogeneous documents such as historical newspapers. Moreover, most approaches consider visual features only, ignoring the textual signal. In this context, we introduce a multimodal approach for the semantic segmentation of historical newspapers that combines visual and textual features. Based on a series of experiments on diachronic Swiss and Luxembourgish newspapers, we investigate, among other things, the predictive power of visual and textual features and their capacity to generalize across time and sources. Results show consistent improvement of multimodal models in comparison to a strong visual baseline, as well as better robustness to high material variance.

    Towards Resolution Services for Text URIs

    In this paper we address the lack of fully resolvable URIs for texts and their citable units in the currently emerging Graph of Ancient World Data. We identify three main architectural components that are required to provide resolution services for text URIs: 1) a registry of text services; 2) an identifier resolution service; 3) a document metadata scheme to represent the relations between texts in the registry, as well as between these texts and related external resources (e.g. library catalogues). After presenting some of the use cases that a central registry providing resolvable URIs for texts would enable, we discuss each component in detail. We conclude by considering three examples where the proposed document metadata scheme is used to describe digital texts; this scheme contains a minimal yet extensible set of metadata that can be used to explore and aggregate texts coming from a network of distributed repositories.

    Modelling Medieval Vagueness

    The project An Agile Approach Towards Computational Modeling of Historiographical Uncertainty is building a taxonomy of historiographical uncertainty. We are focusing on early medieval texts as our case studies, because they are characterised by a high degree of “high stakes” uncertainty and a varied historiography marked by vivid debate. The additional factor of manuscript text transmission ensures that the material aspect of textual study will also be covered in our attempt to build an adaptable taxonomy of historiographical uncertainty. Computational humanities need a robust methodological platform that can be applied to a wide variety of projects. Uncertainty in general, and geographical uncertainty in particular, are crucial aspects of this platform. We investigate a methodology of visualising geographical locales in historical texts and their historiographies that explicitly models uncertainty in

    Towards a Corpus of Historical German Plays with Emotion Annotations

    In this paper, we present initial work-in-progress annotation results of a project investigating computational methods of emotion analysis for historical German plays around 1800. We report on the development of an annotation scheme focussing on emotions that are important from a literary studies perspective for this time span, as well as on the annotation process we have developed. We annotate emotions expressed or attributed by characters of the plays in the written texts. The scheme consists of 13 hierarchically structured emotion concepts as well as the source (who experiences or attributes the emotion) and target (who or what the emotion is directed towards). We have annotated five example plays of our corpus with two annotators per play and report on annotation distributions and agreement statistics. We were able to collect over 6,500 emotion annotations and found fair agreement for most concepts, around a κ-value of 0.4. We discuss how we plan to improve annotator consistency and continue our work. The results also have implications for similar projects in the context of Digital Humanities.

    Modelling Cross-Document Interdependencies in Medieval Charters of the St. Katharinenspital in Regensburg

    To overcome the limitations of structural XML mark-up, graph-based data models and graph databases, as well as event-based ontologies like CIDOC-CRM (FORTH-ICS 2018), have been considered for the creation of digital editions. We apply the graph-based approach to model charter regests and extend it with the CIDOC-CRM ontology, as it allows us to integrate information from different sources into a flexible data model. By implementing the ontology within the Neo4j graph database (Neo4j 2018), we create a sustainable data source that allows explorative search queries and, finally, the integration of the database into various technical systems. Our use case is the charters of the St. Katharinenspital, a former medieval hospital in Regensburg, Germany. By analysing charter abstracts with natural language processing (NLP) methods and using additional data sources related to the charters, we generate additional metadata. The extracted information allows the modelling of cross-document interdependencies of charter regests and their related entities. Building upon this, we develop an exploratory web application that allows users to investigate a graph-based digital edition. Each entity is displayed in its unique context, i.e., it is shown together with its related entities (next neighbours) in the graph. We use this to enhance the result lists of a full-text search and to generate entity-specific detail pages.

    Connecting TEI Content Into an Ontology of the Editorial Domain

    We have argued elsewhere that, in order to support interoperable annotations, editions should provide machine-readable identifiers for texts and text fragments, as well as information about the text fragments’ type and structure. That is to say, they should be embedded in a Linked Open Data context that facilitates interchange and interpretation of annotations. In this article, assuming a TEI context, we consider the practical question of how the relevant RDF triples are to be derived. How is the edition to know which URIs are to be assigned to which elements in the XML hierarchy, and what are the relevant classes and properties? We discuss different options. Our preference is to generate the relevant triples upon ingestion of the XML file into a version control system and then to store the triples in the TEI xenoData element. We briefly consider in which cases the fine-grained annotation that we want to facilitate might be appropriate, or not.