4 research outputs found
Reflexiones sobre los alcances de los modelos computacionales para la generación de narrativas (Reflections on the Scope of Computational Models for Narrative Generation)
The automatic generation of narratives is a fascinating and complex field. There are systems that produce weather reports and write short stories and poetry, among other functions. In this article I analyze one of their principal features, the use of predefined narrative structures, which I use to classify existing programs into two large groups, compare them, and discuss the scope of each. I conclude by setting out the reasons why digital humanities scholars should participate actively in this field of study.
Data-Driven Solutions to Bottlenecks in Natural Language Generation
Concept-to-text generation suffers from what can be called generation bottlenecks - aspects of the generated text which should change for different subject domains, and which are usually hard to obtain or require manual work. Some examples are domain-specific content, a type system, a dictionary, discourse style and lexical style. These bottlenecks have stifled attempts to create generation systems that are generic, or at least apply to a wide range of domains in non-trivial applications.
This thesis comprises two parts. In the first, we propose data-driven solutions that automate obtaining the information and models required to solve some of these bottlenecks. Specifically, we present an approach to mining domain-specific paraphrasal templates from a simple text corpus; an approach to extracting a domain-specific taxonomic thesaurus from Wikipedia; and a novel document planning model which determines both ordering and discourse relations, and which can be extracted from a domain corpus. We evaluate each solution individually and independently from its ultimate use in generation, and show significant improvements in each.
In the second part of the thesis, we describe a framework for creating generation systems that rely on these solutions, as well as on hybrid concept-to-text and text-to-text generation, and which can be automatically adapted to any domain using only a domain-specific corpus. We illustrate the breadth of applications this framework supports with three examples: biography generation and company description generation, which we use to evaluate the framework itself and the contribution of our solutions; and justification of machine learning predictions, a novel application which we evaluate in a task-based study to show its importance to users.
A meta-information structure for representing arguments in science text
The research for this thesis has been concerned with defining and demonstrating the existence of certain semantic elements in English natural language science text which can be called meta-information. Meta-information is described as being the organisational, rather than the conceptual, properties of an author's 'message' in text. Conceptual information is that subject-related output from a document which readers assimilate or synthesise with their current state-of-knowledge. Meta-information reflects the organisation or structural format used by an author to present conceptual information for transfer from text to readers. The example used here to demonstrate the existence of meta-information is a format for the presentation of empirical argument in science text. At its most simple, a meta-informational element could be a report section-heading like INTRODUCTION, which describes (we assume) the contents of the subsequent text. At a lower level of analysis, the phrase 'This paper describes' contains some semantic inference that the complete statement is one of an introductory nature; therefore, such a statement could be labelled as one of INTRODUCTION for meta-informational purposes. A 'grammar', or set of meta-informational elements, has been developed as a means of identifying certain semantic aspects of text. This grammar is based on some experimental evidence and the consensus view of readers and writers of science text who produced what has been called a conventional format for empirical argument presentation. An initial set of rules for implementing this grammar has also been developed. The rules have been tested for replicability with positive results. Although analysis of full text has shown deviation from a 'conventional' argument structure, readers' summaries of the same text conform to this structure.
Thus, a model of the phenomenon of information transfer from text to readers, which includes a structural transformation process based on the experimental results, has been built. A computer simulation is given to demonstrate the model in an interactive program-user system designed to produce summaries of whole text. The thesis is that evidence exists for the presence of meta-information in science text and that if a grammar appropriate to the kind of output information required by users is built, highly structured text could be produced so that the process of information transfer is optimised.