1,623 research outputs found

    Event-based Access to Historical Italian War Memoirs

    Full text link
    The progressive digitization of historical archives provides new, often domain specific, textual resources that report on facts and events which have happened in the past; among these, memoirs are a very common type of primary source. In this paper, we present an approach for extracting information from Italian historical war memoirs and turning it into structured knowledge. This is based on the semantic notions of events, participants and roles. We evaluate quantitatively each of the key-steps of our approach and provide a graph-based representation of the extracted knowledge, which allows to move between a Close and a Distant Reading of the collection.Comment: 23 pages, 6 figure

    GermEval 2014 Named Entity Recognition Shared Task: Companion Paper

    Get PDF
    This paper describes the GermEval 2014 Named Entity Recognition (NER) Shared Task workshop at KONVENS. It provides background information on the motivation of this task, the data-set, the evaluation method, and an overview of the participating systems, followed by a discussion of their results. In contrast to previous NER tasks, the GermEval 2014 edition uses an extended tagset to account for derivatives of names and tokens that contain name parts. Further, nested named entities had to be predicted, i.e. names that contain other names. The eleven participating teams employed a wide range of techniques in their systems. The most successful systems used state-of-the- art machine learning methods, combined with some knowledge-based features in hybrid systems

    GermEval 2014 Named Entity Recognition Shared Task: Companion Paper

    Get PDF
    This paper describes the GermEval 2014 Named Entity Recognition (NER) Shared Task workshop at KONVENS. It provides background information on the motivation of this task, the data-set, the evaluation method, and an overview of the participating systems, followed by a discussion of their results. In contrast to previous NER tasks, the GermEval 2014 edition uses an extended tagset to account for derivatives of names and tokens that contain name parts. Further, nested named entities had to be predicted, i.e. names that contain other names. The eleven participating teams employed a wide range of techniques in their systems. The most successful systems used state-of-the- art machine learning methods, combined with some knowledge-based features in hybrid systems

    Generating and visualizing a soccer knowledge base

    Get PDF
    This demo abstract describes the SmartWeb Ontology-based Information Extraction System (SOBIE). A key feature of SOBIE is that all information is extracted and stored with respect to the SmartWeb ontology. In this way, other components of the systems, which use the same ontology, can access this information in a straightforward way. We will show how information extracted by SOBIE is visualized within its original context, thus enhancing the browsing experience of the end user

    Alts, Abbreviations, and AKAs:historical onomastic variation and automated named entity recognition

    Get PDF
    The accurate automated identification of named places is a major concern for scholars in the digital humanities, and especially for those engaged in research that depends upon the gazetteer-led recognition of specific aspects. The field of onomastics examines the linguistic roots and historical development of names, which have for the most part only standardised into single officially recognised forms since the late nineteenth century. Even slight spelling variations can introduce errors in geotagging techniques, and these differences in place-name spellings are thus vital considerations when seeking high rates of correct geospatial identification in historical texts. This article offers an overview of typical name-based variation that can cause issues in the accurate geotagging of any historical resource. The article argues that the careful study and documentation of these variations can assist in the development of more complete onymic records, which in turn may inform geotaggers through a cycle of variational recognition. It demonstrates how patterns in regional naming variation and development, across both specific and generic name elements, can be identified through the historical records of each known location. The article uses examples taken from a digitised corpus of writing about the English Lake District, a collection of 80 texts that date from between 1622 and 1900. Four of the more complex spelling-based problems encountered during the creation of a manual gazetteer for this corpus are examined. Specifically, the article demonstrates how and why such variation must be expected, particularly in the years preceding the standardisation of place-name spellings. It suggests how procedural developments may be undertaken to account for such georeferential issues in the Named Entity Recognition strategies employed by future projects. Similarly, the benefits of such multi-genre corpora to assist in completing onomastic records is also shown through examples of new name forms discovered for prominent sites in the Lake District. This focus is accompanied by a discussion of the influence of literary works on place-name standardisation – an aspect not typically accounted for in traditional onomastic study – to illustrate the extent to which authorial interests in regional toponymic histories can influence linguistic development
    • …
    corecore