Narrative Information Extraction with Non-Linear Natural Language Processing Pipelines
Computational narrative focuses on methods to algorithmically analyze, model, and generate narratives. Most current work in story generation, drama management, or even literature analysis relies on manually authoring domain knowledge in some specific formal representation language, which is expensive to produce. In this dissertation we explore how to automatically extract narrative information from unannotated natural language text, how to evaluate the extraction process, how to improve the extraction process, and how to use the extracted information in story generation applications. As our application domain, we use Vladimir Propp's narrative theory, with the corresponding Russian and Slavic folktales as our corpus. Our hypothesis is that incorporating narrative-level domain knowledge (i.e., Proppian theory) into core natural language processing (NLP) and information extraction can improve both the performance of tasks such as coreference resolution and the extracted narrative information. We devised a non-linear information extraction pipeline framework, which we implemented in Voz, our narrative information extraction system. Finally, we studied how to map the output of Voz to an intermediate computational narrative model and use it as input for an existing story generation system, thus further connecting existing work in NLP and computational narrative. As far as we know, this is the first end-to-end computational narrative system that can automatically process a corpus of unannotated natural language stories, extract explicit domain knowledge from them, and use it to generate new stories. Our user study results show that specific errors introduced during the information extraction process can be mitigated downstream and have virtually no effect on the perceived quality of the generated stories compared to generating stories using handcrafted domain knowledge.
Ph.D., Computer Science -- Drexel University, 201
Understanding Semantic Implicit Learning through distributional linguistic patterns: A computational perspective
The research presented in this PhD dissertation provides a computational perspective on Semantic Implicit Learning (SIL). It puts forward the idea that SIL does not depend on semantic knowledge as classically conceived but on semantic-like knowledge gained through distributional analysis of massive linguistic input. Using methods borrowed from the machine learning and artificial intelligence literature, we construct computational models that can simulate, in a human-like way, the performance observed during behavioural tasks of semantic implicit learning. We link this methodology to the current literature on implicit learning, arguing that this behaviour is a necessary by-product of efficient language processing.
Chapter 1 introduces the computational problem posed by implicit learning in general, and semantic implicit learning in particular, as well as the computational framework used to tackle them.
Chapter 2 introduces distributional semantics models as a way to learn semantic-like representations from exposure to linguistic input.
Chapter 3 reports two studies on large datasets of semantic priming which seek to identify the computational model of semantic knowledge that best fits the data under conditions that resemble SIL tasks. We find that a model which acquires semantic-like knowledge gained through distributional analysis of massive linguistic input provides the best fit to the data.
Chapter 4 generalises the results of the previous two studies by looking at the performance of the same models in languages other than English.
Chapter 5 applies the results of the two previous chapters to eight datasets of semantic implicit learning. Crucially, these datasets use various semantic manipulations and speakers of different L1s, enabling us to test the predictions of different models of semantics.
Chapter 6 examines more closely two assumptions which we have taken for granted throughout this thesis. Firstly, we test whether a simpler model based on phonological information can explain the generalisation patterns observed in the tasks. Secondly, we examine whether our definition of the computational problem in Chapter 5 is reasonable.
Chapter 7 summarises and discusses the implications for implicit language learning and computational models of cognition. Furthermore, we offer one more study that seeks to bridge the literature on distributional models of semantics to 'deeper' models of semantics by learning semantic relations.
There are two main contributions of this dissertation to the general field of implicit learning research. Firstly, we highlight the superiority of distributional models of semantics in modelling unconscious semantic knowledge. Secondly, we question whether 'deep' semantic knowledge is needed to achieve above-chance performance in SIL tasks. We show how a simple model that learns through distributional analysis of the patterns found in the linguistic input can match the behavioural results in different languages. Furthermore, we link these models to more general problems faced in psycholinguistics, such as language processing and learning of semantic relations.
Alexandros Onassis Foundatio
Recent Advances in Social Data and Artificial Intelligence 2019
The importance and usefulness of subjects and topics involving social data and artificial intelligence are becoming widely recognized. This book contains invited review, expository, and original research articles dealing with, and presenting state-of-the-art accounts of, the recent advances in the subjects of social data and artificial intelligence, and potentially their links to Cyberspace