56 research outputs found

    A Call for Standardization and Validation of Text Style Transfer Evaluation

    Text Style Transfer (TST) evaluation is, in practice, inconsistent. Therefore, we conduct a meta-analysis on human and automated TST evaluation and experimentation that thoroughly examines existing literature in the field. The meta-analysis reveals a substantial standardization gap in human and automated evaluation. In addition, we find a validation gap: only a few automated metrics have been validated using human experiments. To this end, we thoroughly scrutinize both the standardization and validation gap and reveal the resulting pitfalls. This work also paves the way to close the standardization and validation gap in TST evaluation by calling out requirements to be met by future research.
    Comment: Accepted to Findings of ACL 202

    Deep Learning for Text Style Transfer: A Survey

    Text style transfer is an important task in natural language generation, which aims to control certain attributes in the generated text, such as politeness, emotion, humor, and many others. It has a long history in the field of natural language processing, and has recently regained significant attention thanks to the promising performance brought by deep neural models. In this paper, we present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017. We discuss the task formulation, existing datasets and subtasks, evaluation, as well as the rich methodologies in the presence of parallel and non-parallel data. We also provide discussions on a variety of important topics regarding the future development of this task. Our curated paper list is at https://github.com/zhijing-jin/Text_Style_Transfer_Survey
    Comment: Computational Linguistics Journal 202

    Proceedings of the 12th European Workshop on Natural Language Generation (ENLG 2009)


    MOG 2007: Workshop on Multimodal Output Generation: CTIT Proceedings

    This volume brings together a wide variety of work offering different perspectives on multimodal generation. Two different strands of work can be distinguished: half of the gathered papers present current work on embodied conversational agents (ECAs), while the other half presents current work on multimedia applications. Two general research questions are shared by all: what output modalities are most suitable in which situation, and how should different output modalities be combined?

    Reference Production as Search: The Impact of Domain Size on the Production of Distinguishing Descriptions

    When producing a description of a target referent in a visual context, speakers need to choose a set of properties that distinguish it from its distractors. Computational models of language production/generation usually model this as a search process and predict that the time taken will increase both with the number of distractors in a scene and with the number of properties required to distinguish the target. These predictions are reminiscent of classic findings in visual search; however, unlike models of reference production, visual search models also predict that search can become very efficient under certain conditions, something that reference production models do not consider. This paper investigates the predictions of these models empirically. In two experiments, we show that the time taken to plan a referring expression (as reflected by speech onset latencies) is influenced by distractor set size and by the number of properties required, but this crucially depends on the discriminability of the properties under consideration. We discuss the implications for current models of reference production and recent work on the role of salience in visual search.
    Peer-reviewed

    D4.1. Technologies and tools for corpus creation, normalization and annotation

    The objectives of the Corpus Acquisition and Annotation (CAA) subsystem are the acquisition and processing of monolingual and bilingual language resources (LRs) required in the PANACEA context. The CAA subsystem therefore includes: i) a Corpus Acquisition Component (CAC) for extracting monolingual and bilingual data from the web, ii) a component for cleanup and normalization (CNC) of these data, and iii) a text processing component (TPC), which consists of NLP tools including modules for sentence splitting, POS tagging, lemmatization, parsing and named entity recognition.

    Generación de expresiones referenciales bajo incertidumbre con teoría de modelos

    Thesis (Doctor of Computer Science), Universidad Nacional de Córdoba, Facultad de Matemática, Astronomía, Física y Computación, 2016. In this thesis we investigate the automatic generation of referring expression rankings in uncertain contexts. The potential applications of automatic generation of referring expressions that need to refer to the real world (e.g. robot software, GPS systems, etc.) suffer from uncertainty due to noisy sensor data and incomplete models. We extend techniques and algorithms from model theory and simulations with a finite probability distribution that represents this uncertainty. Our goal is to generate a ranking of referring expressions ordered by the probability of being interpreted correctly in context. First, we developed techniques and algorithms for generating referring expressions that extend classical algorithms for automata minimization, applied to the characterization of first-order models. These algorithms were then extended using probabilities learned from corpora with machine learning techniques. The resulting algorithms were evaluated using automatic metrics and human judgements on benchmark data from the area. Finally, we collected a new corpus of referring expressions for points of interest in city maps at different zoom levels. The algorithms were evaluated on this corpus, which is relevant to applications with maps of the real world.