7 research outputs found

    Imitation learning for language generation from unaligned data

    Natural language generation (NLG) is the task of generating natural language from a meaning representation. Rule-based approaches require domain-specific and manually constructed linguistic resources, while most corpus-based approaches rely on aligned training data and/or phrase templates. The latter are needed to restrict the search space for the structured prediction task defined by the unaligned datasets. In this work we propose the use of imitation learning for structured prediction, which learns an incremental model that handles the large search space while avoiding explicitly enumerating it. We adapted the Locally Optimal Learning to Search (Chang et al., 2015) framework, which allows us to train against non-decomposable loss functions such as the BLEU or ROUGE scores while not assuming gold standard alignments. We evaluate our approach on three datasets using both automatic measures and human judgements and achieve results comparable to the state-of-the-art approaches developed for each of them. Furthermore, we performed an analysis of the datasets which examines common issues with NLG evaluation.

    Data-driven Natural Language Generation: Paving the Road to Success

    We argue that there are currently two major bottlenecks to the commercial use of statistical machine learning approaches for natural language generation (NLG): (a) the lack of reliable automatic evaluation metrics for NLG, and (b) the scarcity of high-quality in-domain corpora. We address the first problem by thoroughly analysing current evaluation metrics and motivating the need for a new, more reliable metric. The second problem is addressed by presenting a novel framework for developing and evaluating a high-quality corpus for NLG training.
    Comment: WiNLP workshop at ACL 201

    Referenceless Quality Estimation for Natural Language Generation

    Traditional automatic evaluation measures for natural language generation (NLG) use costly human-authored references to estimate the quality of a system output. In this paper, we propose a referenceless quality estimation (QE) approach based on recurrent neural networks, which predicts a quality score for an NLG system output by comparing it to the source meaning representation only. Our method outperforms traditional metrics and a constant baseline in most respects; we also show that synthetic data helps to increase correlation results by 21% compared to the base system. Our results are comparable to results obtained in similar QE tasks despite the more challenging setting.
    Comment: Accepted as a regular paper to 1st Workshop on Learning to Generate Natural Language (LGNL), Sydney, 10 August 201

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them.
    Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 tabl

    Evaluating the State-of-the-Art of End-to-End Natural Language Generation: The E2E NLG Challenge

    This paper provides a comprehensive analysis of the first shared task on End-to-End Natural Language Generation (NLG) and identifies avenues for future research based on the results. This shared task aimed to assess whether recent end-to-end NLG systems can generate more complex output by learning from datasets containing higher lexical richness, syntactic complexity and diverse discourse phenomena. Introducing novel automatic and human metrics, we compare 62 systems submitted by 17 institutions, covering a wide range of approaches, including machine learning architectures (with the majority implementing sequence-to-sequence models, seq2seq) as well as systems based on grammatical rules and templates. Seq2seq-based systems have demonstrated great potential for NLG in the challenge. We find that seq2seq systems generally score high in terms of word-overlap metrics and human evaluations of naturalness, with the winning SLUG system (Juraska et al., 2018) being seq2seq-based. However, vanilla seq2seq models often fail to correctly express a given meaning representation if they lack a strong semantic control mechanism applied during decoding. Moreover, seq2seq models can be outperformed by hand-engineered systems in terms of overall quality, as well as complexity, length and diversity of outputs. This research has influenced, inspired and motivated a number of recent studies outwith the original competition, which we also summarise as part of this paper.
    Comment: Computer Speech and Language, final accepted manuscript (in press

    Effects of Exposure to L1 Translation in Vocabulary Acquisition in English as a Foreign Language with College Students

    Vocabulary acquisition is one of the major challenges for language learners, and the lack of adequate vocabulary is the first impediment to successful communication. A literature review of vocabulary teaching and learning identified an important gap: most research is conducted under controlled conditions. There is a need to understand the influence of vocabulary instruction in real classroom settings, and this study specifically examines the influence of vocabulary teaching methodologies in the classroom. The study was conducted at a private university with 37 participants in a pilot study and 166 in the main study, both divided into control and experimental groups, using a pretest-posttest design to analyse the influence of explicit vocabulary instruction in classes. Vocabulary knowledge was assessed before and after the interventions with an adapted version of the Vocabulary Knowledge Scale (VKS) (Paribakht & Wesche, 1993). The research consisted of two phases. First, a pilot study assessed explicit vocabulary instruction through visual exposure to target vocabulary with Spanish translation and aural input. This stage focused on the first step of vocabulary learning mentioned by Nation (2013): noticing. The pilot study showed no significant difference between the control and experimental groups. It was therefore decided to include an additional activity to enhance vocabulary learning. The second phase, which included 166 students, employed a web-based vocabulary activity as well as the visual exposure. This was introduced to evoke the second step of vocabulary learning: retrieval. This methodology gave participants the opportunity to explore vocabulary with a new learning tool, allowing students not only to notice target vocabulary but also to retrieve it. The results of the main study were encouraging: the experimental group outperformed the control group in the posttest (p<0.001), showing significant improvement in most words. We may assume that the additional methodology included in the main study could be responsible for the vocabulary enhancement. After the intervention, a semi-structured interview with participants from the experimental group elicited information about their ideas on their own learning and the methodology used. Participants gave a positive opinion of the web-based activities and acknowledged the importance of vocabulary development in their language-learning process. This study highlights the positive influence of explicit vocabulary instruction in English-learning classroom settings. Technology provides opportunities to replicate this methodology with little time investment; this can be a beneficial tool for teachers and students. Finally, pedagogical implications are discussed.
    Palacios Vivar, C. (2022). Effects of Exposure to L1 Translation in Vocabulary Acquisition in English as a Foreign Language with College Students [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/183071