5 research outputs found

    Comparing the performance of two TAG-based surface realisers using controlled grammar traversal

    Get PDF
    International audienceWe present GENSEM, a tool for generating input semantic representations for two sentence generators based on the same reversible Tree Adjoining Grammar. We then show how GENSEM can be used to produced large and controlled benchmarks and test the relative performance of these generators

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    Get PDF
    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them.Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 tabl

    Error Mining on Dependency Trees

    Get PDF
    International audienceIn recent years, error mining approaches were developed to help identify the most likely sources of parsing failures in parsing systems using handcrafted grammars and lexicons. However the techniques they use to enumerate and count n-grams builds on the sequential nature of a text corpus and do not easily extend to structured data. In this paper, we propose an algorithm for mining trees and apply it to detect the most likely sources of generation failure. We show that this tree mining algorithm permits identifying not only errors in the generation system (grammar, lexicon) but also mismatches between the structures contained in the input and the input structures expected by our generator as well as a few idiosyncrasies/error in the input data

    Using Regular Tree Grammars to enhance Sentence Realisation

    Get PDF
    International audienceFeature-based regular tree grammars (FRTG) can be used to generate the derivation trees of a feature-based tree adjoining grammar (FTAG). We make use of this fact to specify and implement both an FTAG-based sentence realiser and a benchmark generator for this realiser. We argue furthermore that the FRTG encoding enables us to improve on other proposals based on a grammar of TAG derivation trees in several ways. It preserves the compositional semantics that can be encoded in feature-based TAGs; it increases efficiency and restricts overgeneration; and it provides a uniform resource for generation, benchmark construction, and parsing

    Natural language generation in the LOLITA system an engineering approach

    Get PDF
    Natural Language Generation (NLG) is the automatic generation of Natural Language (NL) by computer in order to meet communicative goals. One aim of NL processing (NLP) is to allow more natural communication with a computer and, since communication is a two-way process, a NL system should be able to produce as well as interpret NL text. This research concerns the design and implementation of a NLG module for the LOLITA system. LOLITA (Large scale, Object-based, Linguistic Interactor, Translator and Analyser) is a general purpose base NLP system which performs core NLP tasks and upon which prototype NL applications have been built. As part of this encompassing project, this research shares some of its properties and methodological assumptions: the LOLITA generator has been built following Natural Language Engineering principles uses LOLITA's SemNet representation as input and is implemented in the functional programming language Haskell. As in other generation systems the adopted solution utilises a two component architecture. However, in order to avoid problems which occur at the interface between traditional planning and realisation modules (known as the generation gap) the distribution of tasks between the planner and plan-realiser is different: the plan-realiser, in the absence of detailed planning instructions, must perform some tasks (such as the selection and ordering of content) which are more traditionally performed by a planner. This work largely concerns the development of the plan- realiser and its interface with the planner. Another aspect of the solution is the use of Abstract Transformations which act on the SemNet input before realisation leading to an increased ability for creating paraphrases. The research has lead to a practical working solution which has greatly increased the power of the LOLITA system. The research also investigates how NLG systems can be evaluated and the advantages and disadvantages of using a functional language for the generation task
    corecore