54 research outputs found

    Brat2Viz: a tool and pipeline for visualizing narratives from annotated texts

    Narrative extraction from text is a complex task that starts by identifying a set of narrative elements (actors, events, times) and the semantic links between them (temporal, referential, semantic roles). The outcome is a structure or set of structures which can then be represented graphically, thus opening room for further and alternative exploration of the plot. Such visualization can also be useful during the ongoing annotation process. Manual annotation of narratives can be a complex effort, and the possibility offered by the Brat annotation tool of annotating directly on the text does not seem sufficiently helpful. In this paper, we propose Brat2Viz, a tool and a pipeline that displays visualizations of narrative information annotated in Brat. Brat2Viz reads the annotation file of Brat, produces an intermediate representation in the declarative language DRS (Discourse Representation Structure), and from this obtains the visualization. Currently, we make available two visualization schemes: MSC (Message Sequence Chart) and Knowledge Graphs. The modularity of the pipeline enables future extension to new annotation sources, different annotation schemes, and alternative visualizations or representations. We illustrate the pipeline using examples from a European Portuguese news corpus.
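
    As a rough illustration of the first step of such a pipeline, the sketch below reads Brat standoff annotations (text-bound `T` lines and binary relation `R` lines) and turns them into labelled edges of the kind a knowledge graph view could draw. It is not the Brat2Viz implementation: the intermediate DRS step is skipped entirely, and the labels `Actor`, `Event`, and `Agent`, as well as the helper names `parse_ann` and `to_edges`, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    id: str
    type: str
    start: int
    end: int
    text: str


@dataclass
class Relation:
    id: str
    type: str
    arg1: str
    arg2: str


def parse_ann(content):
    """Parse text-bound (T) and relation (R) lines of a Brat .ann file."""
    entities = {}
    relations = []
    for line in content.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T") and len(fields) >= 3:
            # Second field is "Type start end"; discontinuous spans use ';'
            ann_type, span = fields[1].split(" ", 1)
            offsets = span.replace(";", " ").split()
            entities[ann_id] = Entity(
                ann_id, ann_type, int(offsets[0]), int(offsets[-1]), fields[2]
            )
        elif ann_id.startswith("R") and len(fields) >= 2:
            # Second field is "Type Arg1:Tx Arg2:Ty"
            parts = fields[1].split()
            args = dict(p.split(":", 1) for p in parts[1:])
            relations.append(Relation(ann_id, parts[0], args["Arg1"], args["Arg2"]))
        # Other annotation kinds (events, attributes, notes) are ignored here.
    return entities, relations


def to_edges(entities, relations):
    """Return (source text, relation type, target text) triples."""
    return [(entities[r.arg1].text, r.type, entities[r.arg2].text) for r in relations]


# Hypothetical annotations over the text "Anna arrived"
SAMPLE = "T1\tActor 0 4\tAnna\nT2\tEvent 5 12\tarrived\nR1\tAgent Arg1:T2 Arg2:T1\n"

entities, relations = parse_ann(SAMPLE)
edges = to_edges(entities, relations)
print(edges)
```

    The triples could then be handed to any graph-drawing library; which one a real pipeline uses is left open here.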

    Proceedings of the Second Workshop on Annotation of Corpora for Research in the Humanities (ACRH-2). 29 November 2012, Lisbon, Portugal

    Proceedings of the Second Workshop on Annotation of Corpora for Research in the Humanities (ACRH-2), held in Lisbon, Portugal on 29 November 2012

    Dr. Livingstone, I presume? Polishing of foreign character identification in literary texts

    Character identification is a key element for many narrative-related tasks. To implement it, the base form (or lemma) of the character's name needs to be identified, so that different appearances of the same character in the narrative can be aligned. In this paper we tackle this problem in translated texts (English–Finnish translation direction), where the challenge of lemmatizing foreign names in an agglutinative language appears. To solve this problem, we present and compare several methods. The results show that the method based on a search for the shortest version of the name proves to be the easiest, best performing (83.4% F1), and most resource-independent.

    Detecting protagonists in German plays around 1800 as a classification task

    In this paper, we aim to identify protagonists in plays automatically. To this end, we train a classifier using various features and investigate the importance of each feature. A challenging aspect here is that the number of words spoken by a character is a very strong baseline. We can show, however, that a) the stage presence of characters and b) the topics used in their speech can help to detect protagonists even above the baseline.

    Narrative Information Extraction with Non-Linear Natural Language Processing Pipelines

    Computational narrative focuses on methods to algorithmically analyze, model, and generate narratives. Most current work in story generation, drama management or even literature analysis relies on manually authoring domain knowledge in some specific formal representation language, which is expensive to generate. In this dissertation we explore how to automatically extract narrative information from unannotated natural language text, how to evaluate the extraction process, how to improve the extraction process, and how to use the extracted information in story generation applications. As our application domain, we use Vladimir Propp's narrative theory and the corresponding Russian and Slavic folktales as our corpus. Our hypothesis is that incorporating narrative-level domain knowledge (i.e., Proppian theory) into core natural language processing (NLP) and information extraction can improve the performance of tasks (such as coreference resolution) and the quality of the extracted narrative information. We devised a non-linear information extraction pipeline framework which we implemented in Voz, our narrative information extraction system. Finally, we studied how to map the output of Voz to an intermediate computational narrative model and use it as input for an existing story generation system, thus further connecting existing work in NLP and computational narrative. As far as we know, it is the first end-to-end computational narrative system that can automatically process a corpus of unannotated natural language stories, extract explicit domain knowledge from them, and use it to generate new stories. Our user study results show that specific errors introduced during the information extraction process can be mitigated downstream and have virtually no effect on the perceived quality of the generated stories compared to generating stories using handcrafted domain knowledge. Ph.D., Computer Science -- Drexel University, 201

    Report on the 2015 NSF Workshop on Unified Annotation Tooling

    On March 30 & 31, 2015, an international group of twenty-three researchers with expertise in linguistic annotation convened in Sunny Isles Beach, Florida to discuss problems with and potential solutions for the state of linguistic annotation tooling. The participants comprised 14 researchers from the U.S. and 9 from outside the U.S., with 7 countries and 4 continents represented, and hailed from fields and specialties including computational linguistics, artificial intelligence, speech processing, multi-modal data processing, clinical & medical natural language processing, linguistics, documentary linguistics, sign-language linguistics, corpus linguistics, and the digital humanities. The motivating problem of the workshop was the balkanization of annotation tooling, namely, that even though linguistic annotation requires sophisticated tool support to efficiently generate high-quality data, the landscape of tools for the field is fractured, incompatible, inconsistent, and lacks key capabilities. The overall goal of the workshop was to chart the way forward, centering on five key questions: (1) What are the problems with the current tool landscape? (2) What are the possible benefits of solving some or all of these problems? (3) What capabilities are most needed? (4) How should we go about implementing these capabilities? And, (5) How should we ensure longevity and sustainability of the solution? I surveyed the participants before their arrival, which provided significant raw material for ideas, and the workshop discussion itself resulted in the identification of ten specific classes of problems and five sets of most-needed capabilities. Importantly, we identified annotation project managers in computational linguistics as the key recipients and users of any solution, thereby succinctly addressing questions about the scope and audience of potential solutions. We discussed management and sustainability of potential solutions at length. The participants agreed on sixteen recommendations for future work. This technical report contains a detailed discussion of all these topics, a point-by-point review of the discussion in the workshop as it unfolded, detailed information on the participants and their expertise, and the summarized data from the surveys.

    A Scenario-directed Computational Framework To Aid Decision-making And Systems Development

    Scenarios are narratives that illustrate future possibilities or existing systems, and help policy makers and system designers choose among alternative courses of action. Scenario-based decision-making crosses many domains and multiple perspectives. Domain-specific techniques for encoding, simulating, and manipulating scenarios exist; however, there is no general-purpose scenario representation capable of supporting the wide spectrum of formality from executable simulation programs to free-form text to streaming media descriptions. The claim of this research is that there is a computer-readable scenario framework that can capture the semantics of a problem domain and make scenarios an active part of decision making. The challenge is to define a representation for scenarios that supports a wide range of discussion and comprehension activities while remaining independent of content and access mechanisms. This dissertation describes a scenario ontology derived by examining alternate forms of narrative: thought experiments, mental models, case-based reasoning, use cases, design patterns, screenwriting, film-editing, intelligent agents, and other narrative domains. The scenario conceptual model was based on an analysis of forms of narrative and the activities of storytelling. This method separates what a narrative is from how it is used. The research contribution is the development of the hyperscenario framework. A hyperscenario is a scenario representation containing link structures for navigation between scenario elements. The hyperscenario framework consists of the scenario ontology, scenario grammar, and a scenario specification called Scenario Markup Language (SCML). The results of the web-enabled simulation experiment validate the improvement in decision-making due to the hyperscenario framework. Ph.D. Committee Chair: Moore, Melody; Committee Member: Dampier, David; Committee Member: Harrold, Mary Jean; Committee Member: Mark, Leo; Committee Member: Rugaber, Spence

    The Anglo-Scottish Ballad and its Imaginary Contexts

    This is the first book to combine contemporary debates in ballad studies with the insights of modern textual scholarship. Just like canonical literature and music, the ballad should not be seen as a uniquely authentic item inextricably tied to a documented source, but rather as an unstable structure subject to the vagaries of production, reception, and editing. Among the matters addressed are topics central to the subject, including ballad origins, oral and printed transmission, sound and writing, agency and editing, and textual and melodic indeterminacy and instability. While drawing on the time-honoured materials of ballad studies, the book offers a theoretical framework for the discipline to complement the largely ethnographic approach that has dominated in recent decades. Primarily directed at the community of ballad and folk song scholars, the book will be of interest to researchers in several adjacent fields, including folklore, oral literature, ethnomusicology, and textual scholarship.