    Uniform Complexity for Text Generation

    Large pre-trained language models have shown promising results in a wide array of tasks such as narrative generation, question answering, and machine translation. Likewise, recent literature has focused heavily on controlling salient properties of generated texts, including sentiment, topic, and coherence, to produce more human-like outputs. In this work, we introduce Uniform Complexity for Text Generation (UCTG), a challenge to make existing models generate text whose complexity is uniform with respect to the input prompts used. For example, if the reading level of an input text prompt is appropriate for low-level learners (e.g., A2 in the CEFR), then the text generated by an NLG system should also assume this level for increased readability. In a controlled narrative generation task, we surveyed over 160 linguistic and cognitively-motivated features for evaluating text readability and found that GPT-2 models and even humans struggle to preserve the linguistic complexity of the input prompts used. Ultimately, we lay down potential methods and approaches that can be incorporated into the general framework of steering language models towards addressing this important challenge.

    Foreword to the Special Issue: "Towards the Multilingual Web of Data"

    We are pleased to introduce this special issue on the topic of “Towards the Multilingual Web of Data”, which we feel is a timely and valuable topic in our increasingly multilingual and interconnected world. The Web of Data has increasingly become a space where concepts are described not only with logic and ontologies but also with linguistic information in the form of multilingual lexicons, terminologies and thesauri. In particular, this has led to the creation of a growing cloud of linguistic linked open data, which bridges the world of ontologies with dictionaries, corpora and other linguistic resources. This raises several challenges, such as ontology localization, cross-lingual question answering, cross-lingual ontology and data matching, representation of lexical information on the Web of Data, etc. Furthermore, Natural Language Processing (NLP) and machine learning for linked data can benefit from exploiting multilingual language resources, such as annotated corpora, wordnets, bilingual dictionaries, etc., if they are themselves formally represented and linked by following the linked data principles. A critical mass of language resources as linked data on the Web is leading to a new generation of linked data-aware NLP techniques and tools which, in turn, will serve as a basis for a richer, multilingual Web.

    ZYN: Zero-Shot Reward Models with Yes-No Questions

    In this work, we address the problem of directing the text generations of an LLM towards a desired behavior, aligning the generated text with the preferences of the human operator. We propose using another language model as a critic: it acts as a zero-shot reward model when prompted with a Yes-No question that represents the user preferences, without requiring further labeled data. This zero-shot reward model provides the learning signal to further fine-tune the base LLM using reinforcement learning, as in RLAIF; yet our approach is also compatible with other contexts such as quality-diversity search. Extensive evidence of the capabilities of the proposed ZYN framework is provided through experiments in different domains related to text generation, including detoxification; optimizing the sentiment of movie reviews, or any other attribute; steering the opinion the model may have about a particular topic; and personalizing prompt generators for text-to-image tasks. Code to be released at https://github.com/vicgalle/zero-shot-reward-models/
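
    One way to read the Yes-No critic idea is sketched below. This is an illustrative assumption, not the authors' released code: it supposes the critic model exposes a logit for a "Yes" token and a "No" token after the question, and it turns the pair into a scalar reward by renormalising over those two answers. The function names, the prompt template, and the choice of a two-way softmax are all hypothetical.

```python
import math


def build_critic_prompt(text: str, question: str) -> str:
    """Hypothetical prompt template: show the text, then ask the Yes-No question."""
    return f"Text: {text}\nQuestion: {question}\nAnswer (Yes or No):"


def yes_no_reward(logit_yes: float, logit_no: float) -> float:
    """Reward in [0, 1]: probability of "Yes" renormalised over {"Yes", "No"}.

    Assumes the two logits come from the critic's next-token distribution
    after the prompt; how they are obtained is outside this sketch.
    """
    # Subtract the larger logit before exponentiating for numerical stability.
    m = max(logit_yes, logit_no)
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)


if __name__ == "__main__":
    prompt = build_critic_prompt("What a wonderful film.", "Is this movie review positive?")
    print(prompt)
    # The logits here are made-up numbers standing in for a critic model's output.
    print(yes_no_reward(logit_yes=2.0, logit_no=-2.0))
```

    Under these assumptions, the resulting scalar could serve as the reward in a reinforcement learning fine-tuning loop; the abstract does not specify the exact scoring rule, so the two-way renormalisation is only one plausible choice.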

    Social media in the English classroom: a study on the use of WhatsApp Messenger by English Teaching Training Program students of Universidad Andrés Bello Casona de Las Condes campus

    Thesis (Teacher of English for Primary and Secondary Education, and the academic degree of Licentiate in Education). The reasons behind the use of WhatsApp Messenger (WM) by English Teaching Training Program (ETTP) students, and its possible effects on their engagement, are a problem that has not been addressed in the Chilean context. The present study was designed to fill this gap. Its purpose was to examine the dynamics of the English class regarding the use of mobile devices. Moreover, it aimed to examine the reasons behind the use of WM by ETTP students of the UNAB Casona de Las Condes campus and its possible effects on their engagement in the English class. The method followed the characteristics of a sequential explanatory design. The results were obtained through two observations, a questionnaire, and a focus group. The study concluded that the use of smartphones, and specifically of WM, has grown exponentially, to the point that it constantly affects our daily routines and habits, as well as what happens inside the classroom. The results revealed several themes attributed to disengagement that might trigger students to use WM in the English class, such as boredom, short attention span, and demotivation.

    Eliciting New Wikipedia Users' Interests via Automatically Mined Questionnaires: For a Warm Welcome, Not a Cold Start

    Every day, thousands of users sign up as new Wikipedia contributors. Once they have joined, these users have to decide which articles to contribute to, which users to seek out and learn from or collaborate with, etc. Any such task is hard and potentially frustrating given the sheer size of Wikipedia. Supporting newcomers in their first steps by recommending articles they would enjoy editing or editors they would enjoy collaborating with is thus a promising route toward converting them into long-term contributors. Standard recommender systems, however, rely on users' histories of previous interactions with the platform. As such, these systems cannot make high-quality recommendations to newcomers without any previous interactions, the so-called cold-start problem. The present paper addresses the cold-start problem on Wikipedia by developing a method for automatically building short questionnaires that, when completed by a newly registered Wikipedia user, can be used for a variety of purposes, including article recommendations that can help new editors get started. Our questionnaires are constructed based on the text of Wikipedia articles as well as the history of contributions by already onboarded Wikipedia editors. We assess the quality of our questionnaire-based recommendations in an offline evaluation using historical data, as well as in an online evaluation with hundreds of real Wikipedia newcomers, concluding that our method provides cohesive, human-readable questions that perform well against several baselines. By addressing the cold-start problem, this work can help with the sustainable growth and maintenance of Wikipedia's diverse editor community. Comment: Accepted at the 13th International AAAI Conference on Web and Social Media (ICWSM-2019).