188 research outputs found

    The TUNA-REG Challenge 2009: overview and evaluation results

    The TUNA-REG'09 Challenge was one of the shared-task evaluation competitions at Generation Challenges 2009. TUNA-REG'09 used data from the TUNA Corpus of paired representations of entities and human-authored referring expressions. The shared task was to create systems that generate referring expressions for entities, given representations of sets of entities and their properties. Four teams submitted six systems to TUNA-REG'09. We evaluated the six systems and two sets of human-authored referring expressions using several automatic intrinsic measures, a human-assessed intrinsic evaluation and a human task-performance experiment. This report describes the TUNA-REG task and the evaluation methods used, and presents the evaluation results.
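
    Among the automatic intrinsic measures used in the TUNA evaluations is the Dice coefficient, which scores the overlap between the attribute set a system selects and a human-authored one. A minimal Python sketch (the function and the example attribute names are illustrative, not taken from the challenge's evaluation code):

        def dice(system_attrs: set, reference_attrs: set) -> float:
            """Dice coefficient between attribute sets: 2|A & B| / (|A| + |B|)."""
            if not system_attrs and not reference_attrs:
                return 1.0  # two empty sets count as a perfect match
            return 2 * len(system_attrs & reference_attrs) / (len(system_attrs) + len(reference_attrs))

        # Example: the system selects {colour, size}; the human used {colour, type}.
        print(dice({"colour", "size"}, {"colour", "type"}))  # 0.5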

    Proceedings of the 12th European Workshop on Natural Language Generation (ENLG 2009)


    Generation of referring expressions under uncertainty with model theory

    Thesis (Doctorate in Computer Science), Universidad Nacional de Córdoba, Facultad de Matemática, Astronomía, Física y Computación, 2016. In this thesis we investigate the automatic generation of referring expression rankings in uncertain contexts. The potential applications of referring expression generation that must refer to the real world (e.g., robot software, GPS systems) suffer from uncertainty due to noisy sensor data and incomplete models of reality. We extend techniques and algorithms from model theory and simulation with a finite probability distribution that represents this uncertainty. Our goal is to generate a ranking of referring expressions ordered by the probability of each being interpreted correctly in context. First, we developed referring expression generation techniques and algorithms that extend classical automata minimization algorithms, applied to the characterization of first-order models. These algorithms were then extended with probabilities learned from corpora using machine learning techniques. The resulting algorithms were evaluated using automatic metrics and human judgements against benchmarks from the area. Finally, we collected a new corpus of referring expressions to points of interest on city maps at different zoom levels, and evaluated the algorithms on this corpus, which is relevant to applications involving real-world maps.
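
    As a toy illustration of the ranking idea in the abstract, suppose uncertainty is represented as a finite distribution over candidate models, each assigning property sets to entities; an expression is then scored by the total probability of the models in which it denotes exactly the target. The representation, entity names and numbers below are invented for illustration and are not the thesis's algorithms:

        from itertools import combinations

        # Finite distribution over models: each model maps entities to property sets,
        # with a probability reflecting sensor/modelling uncertainty.
        models = [
            ({"e1": {"red", "ball"}, "e2": {"red", "cube"}}, 0.7),
            ({"e1": {"green", "ball"}, "e2": {"red", "cube"}}, 0.3),
        ]

        def denotation(expr, model):
            """Entities whose properties include every property in the expression."""
            return {e for e, props in model.items() if expr <= props}

        def p_correct(expr, target):
            """Probability that `expr` denotes exactly the target across models."""
            return sum(p for m, p in models if denotation(expr, m) == {target})

        # Rank candidate expressions for e1 by interpretation probability.
        props = ["red", "green", "ball", "cube"]
        candidates = [set(c) for r in (1, 2) for c in combinations(props, r)]
        for expr in sorted(candidates, key=lambda e: -p_correct(e, "e1"))[:3]:
            print(sorted(expr), p_correct(expr, "e1"))  # "ball" alone scores 1.0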

    USP-EACH: frequency-based greedy attribute selection for referring expressions generation

    Both greedy and domain-oriented REG algorithms have significant strengths, but tend to perform poorly on human-likeness criteria as measured by, e.g., Dice scores. In this work we describe an attempt to combine both perspectives into a single attribute selection strategy, used as part of the Dale & Reiter Incremental algorithm in the REG Challenge 2008, and present results in both the Furniture and People domains.
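
    A minimal sketch of the combined strategy as described: the Incremental algorithm's distractor-elimination loop, with attributes tried in corpus-frequency order. The frequencies and the toy Furniture-style domain below are made up for illustration:

        # Attribute preference order derived from corpus frequency (invented counts).
        freq = {"type": 950, "colour": 620, "size": 180, "orientation": 40}

        # Toy domain: the target referent plus two distractors.
        domain = {
            "target": {"type": "chair", "colour": "red",  "size": "large"},
            "d1":     {"type": "chair", "colour": "blue", "size": "large"},
            "d2":     {"type": "desk",  "colour": "red",  "size": "small"},
        }

        def select_attributes(target_id, domain, freq):
            """Greedily add attributes, most frequent first, until no distractor remains."""
            target = domain[target_id]
            distractors = {k: v for k, v in domain.items() if k != target_id}
            chosen = {}
            for attr in sorted(freq, key=freq.get, reverse=True):
                ruled_out = {k for k, v in distractors.items() if v.get(attr) != target.get(attr)}
                if ruled_out:  # Incremental-style: keep any attribute that discriminates
                    chosen[attr] = target[attr]
                    distractors = {k: v for k, v in distractors.items() if k not in ruled_out}
                if not distractors:
                    break
            return chosen

        print(select_attributes("target", domain, freq))
        # {'type': 'chair', 'colour': 'red'} -> "the red chair"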

    RFID Technology in Intelligent Tracking Systems in Construction Waste Logistics Using Optimisation Techniques

    Construction waste disposal is an urgent issue for protecting our environment. This paper proposes a waste management system and illustrates its work process using plasterboard waste as an example: plasterboard creates a hazardous gas when landfilled with household waste, and its recycling rate in the UK is less than 10%. The proposed system integrates RFID technology, rule-based reasoning, Ant Colony Optimisation and knowledge technology for auditing and tracking plasterboard waste, guiding operational staff, arranging vehicles and planning schedules, and it also provides evidence to verify disposal. The system relies on RFID equipment for collecting logistical data and uses digital imaging equipment to provide further evidence; the reasoning core in the third layer is responsible for generating schedules, route plans and guidance; and the last layer delivers the results to users. The paper first introduces the current plasterboard disposal situation and addresses the logistical problem that is now the main barrier to a higher recycling rate, followed by a discussion of the proposed system in terms of both its system-level structure and its process structure. Finally, an example scenario illustrates the system's use.
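
    The route-planning step can be sketched with a textbook ant-colony loop over collection sites; everything below (site coordinates, parameters, colony size) is illustrative and not taken from the paper's system:

        import math
        import random

        random.seed(0)

        # Illustrative coordinates for a depot and three plasterboard collection sites.
        coords = {"depot": (0, 0), "siteA": (2, 1), "siteB": (1, 3), "siteC": (4, 2)}
        sites = list(coords)

        def dist(a, b):
            (x1, y1), (x2, y2) = coords[a], coords[b]
            return math.hypot(x1 - x2, y1 - y2)

        # Pheromone on every directed edge; standard ACO parameters.
        pheromone = {(a, b): 1.0 for a in sites for b in sites if a != b}
        ALPHA, BETA, RHO, Q = 1.0, 2.0, 0.5, 1.0

        def build_tour():
            """One ant builds a depot-to-depot tour, weighting edges by pheromone x 1/distance."""
            tour, unvisited = ["depot"], set(sites) - {"depot"}
            while unvisited:
                cur, options = tour[-1], sorted(unvisited)
                weights = [pheromone[(cur, s)] ** ALPHA * (1 / dist(cur, s)) ** BETA for s in options]
                nxt = random.choices(options, weights)[0]
                tour.append(nxt)
                unvisited.remove(nxt)
            return tour + ["depot"]

        def tour_length(tour):
            return sum(dist(a, b) for a, b in zip(tour, tour[1:]))

        best = None
        for _ in range(50):                            # iterations
            tours = [build_tour() for _ in range(10)]  # ants per iteration
            for edge in pheromone:
                pheromone[edge] *= 1 - RHO             # evaporation
            for t in tours:
                for a, b in zip(t, t[1:]):
                    pheromone[(a, b)] += Q / tour_length(t)  # shorter tours deposit more
            cand = min(tours, key=tour_length)
            if best is None or tour_length(cand) < tour_length(best):
                best = cand

        print(best, round(tour_length(best), 2))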

    Sociolinguistic study of the Moroccan community of Edinburgh


    Towards generic relation extraction

    A vast amount of usable electronic data is in the form of unstructured text. The relation extraction task aims to identify useful information in text (e.g., PersonW works for OrganisationX, GeneY encodes ProteinZ) and recode it in a format, such as a relational database, that can be used more effectively for querying and automated reasoning. However, adapting conventional relation extraction systems to new domains or tasks requires significant effort from annotators and developers. Furthermore, previous adaptation approaches based on bootstrapping start from example instances of the target relations, thus requiring that the correct relation type schema be known in advance. Generic relation extraction (GRE) addresses the adaptation problem by applying generic techniques that achieve comparable accuracy when transferred, without modification of model parameters, across domains and tasks. I present new state-of-the-art models for GRE that incorporate governor-dependency information. I also introduce a dimensionality reduction step into the GRE relation characterisation sub-task, which serves to capture latent semantic information and leads to significant improvements over an unreduced model. Comparison of dimensionality reduction techniques suggests that latent Dirichlet allocation (LDA), a probabilistic generative approach, successfully incorporates a larger and more interdependent feature set than a model based on singular value decomposition (SVD) and performs as well as or better than SVD in all experimental settings. Finally, I introduce multi-document summarisation as an extrinsic test bed for GRE and present results which demonstrate that the relative performance of GRE models is consistent across tasks and that the GRE-based representation leads to significant improvements over a standard baseline from the literature. Taken together, the experimental results 1) show that GRE can be improved using dependency parsing and dimensionality reduction, 2) demonstrate the utility of GRE for the content selection step of extractive summarisation and 3) validate the GRE claim of modification-free adaptation for the first time with respect to both domain and task. This thesis also introduces data sets derived from publicly available corpora for the purpose of rigorous intrinsic evaluation in the news and biomedical domains.
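
    The dimensionality reduction step can be illustrated with a truncated SVD over a relation-instance-by-feature matrix; the features and counts below are invented, and the thesis's preferred LDA variant would replace the SVD factorisation:

        import numpy as np

        # Toy co-occurrence matrix: rows are relation instances, columns are
        # lexical/dependency features (invented counts).
        features = ["works_for", "nsubj", "encodes", "dobj", "gene", "org"]
        X = np.array([
            [3, 2, 0, 1, 0, 2],   # "PersonW works for OrganisationX"-style instances
            [2, 3, 0, 2, 0, 3],
            [0, 1, 4, 2, 3, 0],   # "GeneY encodes ProteinZ"-style instances
            [0, 2, 3, 3, 2, 0],
        ], dtype=float)

        # Truncated SVD: keep the top-k singular directions as latent dimensions.
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        k = 2
        latent = U[:, :k] * s[:k]  # instance representations in the reduced space

        def cos(a, b):
            return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

        # Instances of the same latent relation cluster together after reduction.
        print(round(cos(latent[0], latent[1]), 2))  # same relation: high
        print(round(cos(latent[0], latent[2]), 2))  # different relation: low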