169 research outputs found

    UniPCM: Universal Pre-trained Conversation Model with Task-aware Automatic Prompt

    Full text link
    Recent research has shown that multi-task pre-training greatly improves the model's robustness and transfer ability, which is crucial for building a high-quality dialog system. However, most previous works on multi-task pre-training rely heavily on human-defined input format or prompt, which is not optimal in quality and quantity. In this work, we propose to use Task-based Automatic Prompt generation (TAP) to automatically generate high-quality prompts. Using the high-quality prompts generated, we scale the corpus of the pre-trained conversation model to 122 datasets from 15 dialog-related tasks, resulting in Universal Pre-trained Conversation Model (UniPCM), a powerful foundation model for various conversational tasks and different dialog systems. Extensive experiments have shown that UniPCM is robust to input prompts and capable of various dialog-related tasks. Moreover, UniPCM has strong transfer ability and excels at low resource scenarios, achieving SOTA results on 9 different datasets ranging from task-oriented dialog to open-domain conversation. Furthermore, we are amazed to find that TAP can generate prompts on par with those collected with crowdsourcing. The code is released with the paper

    Bilateral Personalized Dialogue Generation with Dynamic Persona-Aware Fusion

    Full text link
    Generating personalized responses is one of the major challenges in natural human-robot interaction. Current researches in this field mainly focus on generating responses consistent with the robot's pre-assigned persona, while ignoring the user's persona. Such responses may be inappropriate or even offensive, which may lead to the bad user experience. Therefore, we propose a bilateral personalized dialogue generation (BPDG) method with dynamic persona-aware fusion via multi-task transfer learning to generate responses consistent with both personas. The proposed method aims to accomplish three learning tasks: 1) an encoder is trained with dialogue utterances added with corresponded personalized attributes and relative position (language model task), 2) a dynamic persona-aware fusion module predicts the persona presence to adaptively fuse the contextual and bilateral personas encodings (persona prediction task) and 3) a decoder generates natural, fluent and personalized responses (dialogue generation task). To make the generated responses more personalized and bilateral persona-consistent, the Conditional Mutual Information Maximum (CMIM) criterion is adopted to select the final response from the generated candidates. The experimental results show that the proposed method outperforms several state-of-the-art methods in terms of both automatic and manual evaluations.Comment: 14 pages, 6 figure

    Evaluating Human-Language Model Interaction

    Full text link
    Many real-world applications of language models (LMs), such as writing assistance and code autocomplete, involve human-LM interaction. However, most benchmarks are non-interactive in that a model produces output without human involvement. To evaluate human-LM interaction, we develop a new framework, Human-AI Language-based Interaction Evaluation (HALIE), that defines the components of interactive systems and dimensions to consider when designing evaluation metrics. Compared to standard, non-interactive evaluation, HALIE captures (i) the interactive process, not only the final output; (ii) the first-person subjective experience, not just a third-party assessment; and (iii) notions of preference beyond quality (e.g., enjoyment and ownership). We then design five tasks to cover different forms of interaction: social dialogue, question answering, crossword puzzles, summarization, and metaphor generation. With four state-of-the-art LMs (three variants of OpenAI's GPT-3 and AI21 Labs' Jurassic-1), we find that better non-interactive performance does not always translate to better human-LM interaction. In particular, we highlight three cases where the results from non-interactive and interactive metrics diverge and underscore the importance of human-LM interaction for LM evaluation.Comment: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI

    A Descriptive Case Study of the Perceptions and Use of Adventist EDGE : An Initiative Developed in Response to the North American Division of Seventh-day Adventists\u27 Document, Journey to Excellence

    Get PDF
    Problem. The Southern Union started the Adventist EDGE initiative as an action plan in response to the North American Division\u27s document, Journey to Excellence . The Adventist EDGE became a comprehensive educational reform initiative. However, there were different ideas on how the innovation should look when inaction in the schools, and these differences became obvious during the initial EDGE school validation visit, resulting in hurt feelings and confusion. Thus, the need for my study to clarify EDGE became critical for the survival of the initiative. Purpose. The purpose of my study was to develop two operational definitions or Innovation Configurations for the Adventist EDGE teacher and the Adventist EDGE School. This would identify the core components of the Adventist EDGE and provide descriptions of behaviors ranging from Ideal to Unacceptable within each component. Method. My study was a qualitative case study, specifically an Innovation Configuration study. It involved eight states in the Southeast that make up the Southern Union Conference of Seventh-day Adventists. There were 42 participants from the eight conferences within the Southern Union Conference representing 20 developers, seven expert users, and 12 users of various levels of use, which included representation of all grade-level teachers K through 12. Results . Two operational definitions or Innovation Configurations were developed, one was for the EDGE Teacher, and the other was for the EDGE School. Key components were identified for both the teacher and the school. The teacher Innovation Configuration has six core components. Under each component are several elements with a continuum of behaviors grouped into three categories: ideal, acceptable, and unacceptable. The school Innovation Configuration has five core components. Under each of those components are several elements with a continuum of behaviors grouped into four categories: ideal, progressing, emerging, and unacceptable. These two innovations define behaviors present in an Adventist EDGE School or Adventist EDGE Teacher. Conclusions. Prior to my study, the Southern Union had no clear definition of specific behaviors for the Adventist EDGE School or Adventist EDGE Teacher. Everyone had his or her own ideas of what EDGE should and should not look like. Using the Innovation Configuration Tool from the Concerns-Based Adoption Model helped to unify the Southern Union Developers of Adventist EDGE. Through a collaborative process, it clarified what an Adventist EDGE Teacher and an Adventist EDGE School looks like when implemented in the classroom or school.The development of the Adventist EDGE Innovation Configuration-Teacher Components and the Adventist EDGE Innovation Configuration-School Components has helped to pull the different viewpoints and ideas of everyone into a focused picture where key players have all agreed. These two Innovation Configurations now provide direction, increasing the chances of sustaining the Adventist EDGE initiative. This study provides a baseline for a host of further studies. Some of those studies might include developing the Innovation Configurations for the conference and union levels. Conducting a comparison study between atypical, good Seventh-day Adventist school and an Adventist EDGE School of Excellence could help determine if the EDGE program is making a difference. Conducting longitudinal studies of student achievement in Adventist EDGE Schools of Excellence and determining if the Adventist EDGE is meeting the needs of Seventh-day Adventist education for the 21st century as outlined in the North American Division\u27s Journey to Excellence are just a few of the studies that can now be conducted

    Representation learning for dialogue systems

    Full text link
    Cette thèse présente une série de mesures prises pour étudier l’apprentissage de représentations (par exemple, l’apprentissage profond) afin de mettre en place des systèmes de dialogue et des agents de conversation virtuels. La thèse est divisée en deux parties générales. La première partie de la thèse examine l’apprentissage des représentations pour les modèles de dialogue génératifs. Conditionnés sur une séquence de tours à partir d’un dialogue textuel, ces modèles ont la tâche de générer la prochaine réponse appropriée dans le dialogue. Cette partie de la thèse porte sur les modèles séquence-à-séquence, qui est une classe de réseaux de neurones profonds génératifs. Premièrement, nous proposons un modèle d’encodeur-décodeur récurrent hiérarchique ("Hierarchical Recurrent Encoder-Decoder"), qui est une extension du modèle séquence-à-séquence traditionnel incorporant la structure des tours de dialogue. Deuxièmement, nous proposons un modèle de réseau de neurones récurrents multi-résolution ("Multiresolution Recurrent Neural Network"), qui est un modèle empilé séquence-à-séquence avec une représentation stochastique intermédiaire (une "représentation grossière") capturant le contenu sémantique abstrait communiqué entre les locuteurs. Troisièmement, nous proposons le modèle d’encodeur-décodeur récurrent avec variables latentes ("Latent Variable Recurrent Encoder-Decoder"), qui suivent une distribution normale. Les variables latentes sont destinées à la modélisation de l’ambiguïté et l’incertitude qui apparaissent naturellement dans la communication humaine. Les trois modèles sont évalués et comparés sur deux tâches de génération de réponse de dialogue: une tâche de génération de réponses sur la plateforme Twitter et une tâche de génération de réponses de l’assistance technique ("Ubuntu technical response generation task"). La deuxième partie de la thèse étudie l’apprentissage de représentations pour un système de dialogue utilisant l’apprentissage par renforcement dans un contexte réel. Cette partie porte plus particulièrement sur le système "Milabot" construit par l’Institut québécois d’intelligence artificielle (Mila) pour le concours "Amazon Alexa Prize 2017". Le Milabot est un système capable de bavarder avec des humains sur des sujets populaires à la fois par la parole et par le texte. Le système consiste d’un ensemble de modèles de récupération et de génération en langage naturel, comprenant des modèles basés sur des références, des modèles de sac de mots et des variantes des modèles décrits ci-dessus. Cette partie de la thèse se concentre sur la tâche de sélection de réponse. À partir d’une séquence de tours de dialogues et d’un ensemble des réponses possibles, le système doit sélectionner une réponse appropriée à fournir à l’utilisateur. Une approche d’apprentissage par renforcement basée sur un modèle appelée "Bottleneck Simulator" est proposée pour sélectionner le candidat approprié pour la réponse. Le "Bottleneck Simulator" apprend un modèle approximatif de l’environnement en se basant sur les trajectoires de dialogue observées et le "crowdsourcing", tout en utilisant un état abstrait représentant la sémantique du discours. Le modèle d’environnement est ensuite utilisé pour apprendre une stratégie d’apprentissage du renforcement par le biais de simulations. La stratégie apprise a été évaluée et comparée à des approches concurrentes via des tests A / B avec des utilisateurs réel, où elle démontre d’excellente performance.This thesis presents a series of steps taken towards investigating representation learning (e.g. deep learning) for building dialogue systems and conversational agents. The thesis is split into two general parts. The first part of the thesis investigates representation learning for generative dialogue models. Conditioned on a sequence of turns from a text-based dialogue, these models are tasked with generating the next, appropriate response in the dialogue. This part of the thesis focuses on sequence-to-sequence models, a class of generative deep neural networks. First, we propose the Hierarchical Recurrent Encoder-Decoder model, which is an extension of the vanilla sequence-to sequence model incorporating the turn-taking structure of dialogues. Second, we propose the Multiresolution Recurrent Neural Network model, which is a stacked sequence-to-sequence model with an intermediate, stochastic representation (a "coarse representation") capturing the abstract semantic content communicated between the dialogue speakers. Third, we propose the Latent Variable Recurrent Encoder-Decoder model, which is a variant of the Hierarchical Recurrent Encoder-Decoder model with latent, stochastic normally-distributed variables. The latent, stochastic variables are intended for modelling the ambiguity and uncertainty occurring naturally in human language communication. The three models are evaluated and compared on two dialogue response generation tasks: a Twitter response generation task and the Ubuntu technical response generation task. The second part of the thesis investigates representation learning for a real-world reinforcement learning dialogue system. Specifically, this part focuses on the Milabot system built by the Quebec Artificial Intelligence Institute (Mila) for the Amazon Alexa Prize 2017 competition. Milabot is a system capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language retrieval and generation models, including template-based models, bag-of-words models, and variants of the models discussed in the first part of the thesis. This part of the thesis focuses on the response selection task. Given a sequence of turns from a dialogue and a set of candidate responses, the system must select an appropriate response to give the user. A model-based reinforcement learning approach, called the Bottleneck Simulator, is proposed for selecting the appropriate candidate response. The Bottleneck Simulator learns an approximate model of the environment based on observed dialogue trajectories and human crowdsourcing, while utilizing an abstract (bottleneck) state representing high-level discourse semantics. The learned environment model is then employed to learn a reinforcement learning policy through rollout simulations. The learned policy has been evaluated and compared to competing approaches through A/B testing with real-world users, where it was found to yield excellent performance
    • …
    corecore