22 research outputs found

    Instance-based natural language generation

    In recent years, ranking approaches to Natural Language Generation have become increasingly popular. They abandon the idea of generation as a deterministic decision-making process in favour of approaches that combine overgeneration with ranking at some stage in processing. In this thesis, we investigate the use of instance-based ranking methods for surface realization in Natural Language Generation. Our approach to instance-based Natural Language Generation employs two basic components: a rule system that generates a number of realization candidates from a meaning representation, and an instance-based ranker that scores the candidates according to their similarity to examples taken from a training corpus. The instance-based ranker uses information retrieval methods to rank output candidates. Our approach is corpus-based in that it uses a treebank (a subset of the Penn Treebank II containing management succession texts) in combination with manual semantic markup to automatically produce a generation grammar. Furthermore, the corpus is also used by the instance-based ranker. The semantic annotation of a test portion of the compiled subcorpus serves as input to the generator. In this thesis, we develop an efficient search technique for identifying the optimal candidate based on the A* algorithm, detail the annotation scheme and grammar construction algorithm, and show how a Rete-based production system can be used for efficient candidate generation. Furthermore, we examine the output of the generator and discuss issues such as input coverage (completeness), fluency and faithfulness that are relevant to surface generation in general.

    Learning to Order Facts for Discourse Planning in Natural Language Generation

    This paper presents a machine learning approach to discourse planning in natural language generation. More specifically, we address the problem of learning the most natural ordering of facts in discourse plans for a specific domain. We discuss our methodology and how it was instantiated using two different machine learning algorithms. A quantitative evaluation performed in the domain of museum exhibit descriptions indicates that our approach performs significantly better than manually constructed ordering rules. Being retrainable, the resulting planners can be ported easily to other similar domains, without requiring language technology expertise. Comment: 8 pages, 4 figures, 1 table

    Lexical choice and conceptual perspective in the generation of plural referring expressions

    A fundamental part of the process of referring to an entity is to categorise it (for instance, as the woman). Where multiple categorisations exist, this implicitly involves the adoption of a conceptual perspective. A challenge for the automatic Generation of Referring Expressions is to identify a set of referents coherently, adopting the same conceptual perspective. We describe and evaluate an algorithm to achieve this. The design of the algorithm is motivated by the results of psycholinguistic experiments. Peer-reviewed.

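    The EDUCAGENT platform: intelligent conversational agents and virtual environments applied to teaching (La plataforma EDUCAGENT: agentes conversacionales inteligentes y entornos virtuales aplicados a la docencia)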

    With the development of Web 2.0 and the wide reach of social networks, a large number of e-learning environments and applications that enable new forms of communication and interaction among users have been introduced. Within this framework, virtual worlds and conversational agents facilitate the creation of educational applications that intensify perception among their users and provide more natural communication adapted to the specific characteristics and preferences of each user. In this paper, we describe a multi-agent system developed for teaching support and students' self-learning. Through the system, students are presented with cases and problems to solve, which also allow them to self-assess their learning, especially in distance-education initiatives and online courses. The EducAgent platform was developed at the Carlos III University of Madrid within the Convocatoria de Apoyo a Experiencias de Innovación e Internacionalización Docente. Its main objective is the creation of an innovative virtual space, following the principles of the European Higher Education Area, that makes subjects and e-learning initiatives more flexible, participatory and attractive. One of the most important characteristics of the platform is that it facilitates a more natural interaction between the system and students by means of conversational agents. We describe the main features of the EducAgent platform and its application in the new European Computer Science Degree at the Carlos III University of Madrid. Work carried out within the 9th Convocatoria de Apoyo a Experiencias de Innovación e Internacionalización Docente of the Carlos III University of Madrid and partially funded by the projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02. Published.

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 table

    Application of fuzzy sets in data-to-text system

    This PhD dissertation addresses the convergence of two distinct paradigms: fuzzy sets and natural language generation. The object of study is the integration of fuzzy set-derived techniques that model imprecision and uncertainty in human language into systems that generate textual information from numeric data, commonly known as data-to-text systems. This dissertation covers an extensive state-of-the-art review, potential convergence points, two real data-to-text applications that integrate fuzzy sets (in the meteorology and learning analytics domains), and a model that encompasses the most relevant elements in the linguistic description of data discipline and provides a framework for building and integrating fuzzy set-based approaches into natural language generation/data-to-text systems.