Uniform Complexity for Text Generation
Large pre-trained language models have shown promising results in a wide
array of tasks such as narrative generation, question answering, and machine
translation. Likewise, the current trend in literature has deeply focused on
controlling salient properties of generated texts including sentiment, topic,
and coherence to produce more human-like outputs. In this work, we introduce
Uniform Complexity for Text Generation, or UCTG, which serves as a challenge to
make existing models generate uniformly complex text with respect to the inputs
or prompts used. For example, if the reading level of an input text prompt is
appropriate for low-level learners (e.g., A2 in the CEFR), then the text
generated by an NLG system should also assume this particular level for increased
readability. In a controlled narrative generation task, we surveyed over 160
linguistic and cognitively-motivated features for evaluating text readability
and found that GPT-2 models and even humans struggle to preserve the
linguistic complexity of the input prompts used. Ultimately, we lay down potential
methods and approaches which can be incorporated into the general framework of
steering language models towards addressing this important challenge.
Foreword to the Special Issue: "Towards the Multilingual Web of Data"
We are pleased to introduce this special issue on the topic of “Towards the Multilingual Web of Data”, which we feel is a timely and valuable topic in our increasingly multilingual and interconnected world. The Web of Data has increasingly become a space where concepts are described not only with logic and ontologies but also with linguistic information in the form of multilingual lexicons, terminologies and thesauri. In particular, this has led to the creation of a growing cloud of linguistic linked open data, which bridges the world of ontologies with dictionaries, corpora and other linguistic resources. This raises several challenges, such as ontology localization, cross-lingual question answering, cross-lingual ontology and data matching, representation of lexical information on the Web of Data, etc.
Furthermore, Natural Language Processing (NLP) and machine learning for linked data can benefit from exploiting multilingual language resources, such as annotated corpora, wordnets, bilingual dictionaries, etc., if they are themselves formally represented and linked by following the linked data principles. A critical mass of language resources as linked data on the Web is leading to a new generation of linked data-aware NLP techniques and tools which, in turn, will serve as a basis for a richer, multilingual Web.
ZYN: Zero-Shot Reward Models with Yes-No Questions
In this work, we address the problem of directing the text generations of an
LLM towards a desired behavior, aligning the generated text with the
preferences of the human operator. We propose using another language model as a
critic, that is, as a zero-shot reward model prompted with a Yes-No
question that represents the user preferences, without requiring further
labeled data. This zero-shot reward model provides the learning signal to
further fine-tune the base LLM using reinforcement learning, as in RLAIF; yet
our approach is also compatible with other contexts such as quality-diversity
search. Extensive evidence of the capabilities of the proposed ZYN framework is
provided through experiments in different domains related to text generation,
including detoxification; optimizing sentiment of movie reviews, or any other
attribute; steering the opinion about a particular topic the model may have;
and personalizing prompt generators for text-to-image tasks. Code to be
released at https://github.com/vicgalle/zero-shot-reward-models/
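As a rough illustration of the idea in this abstract, the sketch below turns a critic's answer to a Yes-No question into a scalar reward. It is not the authors' implementation: the prompt template, the function names `build_prompt` and `yes_no_reward`, and the choice of reward (the probability of "Yes" under a softmax restricted to the two answer tokens) are all assumptions made for illustration, and the logits are hard-coded rather than read from a real model.

```python
import math


def build_prompt(text: str, question: str) -> str:
    """Hypothetical template: show the generated text, then ask the critic a Yes-No question."""
    return f"Text: {text}\nQuestion: {question}\nAnswer (Yes or No):"


def yes_no_reward(logit_yes: float, logit_no: float) -> float:
    """Assumed reward: P(Yes) when the softmax is restricted to the Yes and No tokens.

    A softmax over two logits equals a sigmoid of their difference; the two
    branches keep the computation numerically stable for large differences.
    """
    diff = logit_yes - logit_no
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


# Placeholder values: in practice the critic model would be run on `prompt`
# and these would be its next-token logits for "Yes" and "No".
prompt = build_prompt("What a lovely film.", "Is this movie review positive?")
reward = yes_no_reward(logit_yes=2.0, logit_no=-1.0)
```

A reward in [0, 1] of this kind could then be passed to a reinforcement learning loop as the learning signal; how the paper scales or combines it is not stated in the abstract.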
Social media in the English classroom: a study on the use of WhatsApp Messenger by English Teaching Training Program students of Universidad Andrés Bello Casona de Las Condes campus
Thesis (Teacher of English for Primary and Secondary Education, and the academic degree of Licentiate in Education)
The reasons behind the use of WhatsApp Messenger (WM) by English Teaching Training Program (ETTP) students, and its possible effects on their engagement, are a problem that has not been addressed in the Chilean context. The present study was designed to fill this gap. The purpose of this study was to examine the dynamics of the English class regarding the use of mobile devices. Moreover, this study aimed at examining the reasons behind the use of WM by ETTP students of UNAB Casona de Las Condes Campus and its possible effects on their engagement in the English class. The method used in this investigation followed the characteristics of a sequential explanatory design. The results were obtained through two observations, a questionnaire, and a focus group. This research study concluded that the use of smartphones, and specifically of WM, has grown exponentially, to the point that it constantly affects our daily routines and habits, as well as what happens inside the classroom. The results revealed several themes attributed to disengagement that might trigger students to use WM in the English class, such as boredom, short attention span, and demotivation.
Eliciting New Wikipedia Users' Interests via Automatically Mined Questionnaires: For a Warm Welcome, Not a Cold Start
Every day, thousands of users sign up as new Wikipedia contributors. Once
joined, these users have to decide which articles to contribute to, which users
to seek out and learn from or collaborate with, etc. Any such task is a hard
and potentially frustrating one given the sheer size of Wikipedia. Supporting
newcomers in their first steps by recommending articles they would enjoy
editing or editors they would enjoy collaborating with is thus a promising
route toward converting them into long-term contributors. Standard recommender
systems, however, rely on users' histories of previous interactions with the
platform. As such, these systems cannot make high-quality recommendations to
newcomers without any previous interactions -- the so-called cold-start
problem. The present paper addresses the cold-start problem on Wikipedia by
developing a method for automatically building short questionnaires that, when
completed by a newly registered Wikipedia user, can be used for a variety of
purposes, including article recommendations that can help new editors get
started. Our questionnaires are constructed based on the text of Wikipedia
articles as well as the history of contributions by the already onboarded
Wikipedia editors. We assess the quality of our questionnaire-based
recommendations in an offline evaluation using historical data, as well as an
online evaluation with hundreds of real Wikipedia newcomers, concluding that
our method provides cohesive, human-readable questions that perform well
against several baselines. By addressing the cold-start problem, this work can
help with the sustainable growth and maintenance of Wikipedia's diverse editor
community. Comment: Accepted at the 13th International AAAI Conference on Web and Social
Media (ICWSM-2019)