SaFeRDialogues: Taking Feedback Gracefully after Conversational Safety Failures
Current open-domain conversational models can easily be made to talk in
inadequate ways. Online learning from conversational feedback given by the
conversation partner is a promising avenue for a model to improve and adapt, so
as to generate fewer of these safety failures. However, current
state-of-the-art models tend to react to feedback with defensive or oblivious
responses. This makes for an unpleasant experience and may discourage
conversation partners from giving feedback in the future. This work proposes
SaFeRDialogues, a task and dataset of graceful responses to conversational
feedback about safety failures. We collect a dataset of 10k dialogues
demonstrating safety failures, feedback signaling them, and a response
acknowledging the feedback. We show how fine-tuning on this dataset results in
conversations that human raters deem considerably more likely to lead to a
civil conversation, without sacrificing engagingness or general conversational
ability.
Comment: Accepted at ACL 2022
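The fine-tuning step described in the abstract lends itself to a short sketch. Below is a minimal, illustrative version assuming the 10k dialogues are flattened into (context, graceful response) pairs in a JSONL file; the file name and field names are hypothetical, and a small HuggingFace BlenderBot checkpoint stands in for the paper's actual model and training stack.

```python
import json

from torch.utils.data import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

MODEL = "facebook/blenderbot_small-90M"  # stand-in for the paper's model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)

class FeedbackDataset(Dataset):
    """Maps a dialogue context ending in safety feedback to a graceful response."""
    def __init__(self, path):
        with open(path) as f:
            self.rows = [json.loads(line) for line in f]

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, i):
        row = self.rows[i]
        enc = tokenizer(row["context"], truncation=True, max_length=256)
        enc["labels"] = tokenizer(row["graceful_response"],
                                  truncation=True, max_length=64)["input_ids"]
        return enc

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="saferdialogues-ft",
                                  per_device_train_batch_size=8,
                                  num_train_epochs=3),
    train_dataset=FeedbackDataset("saferdialogues_train.jsonl"),  # hypothetical file
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```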
Detecting Inspiring Content on Social Media
Inspiration moves a person to see new possibilities and transforms the way
they perceive their own potential. Inspiration has received little attention in
psychology, and has not previously been studied in the NLP community. To the
best of our knowledge, this work is the first to study inspiration through
machine learning methods. We aim to automatically detect inspiring content from
social media data. To this end, we analyze social media posts to tease out what
makes a post inspiring and what topics are inspiring. We release a dataset of
5,800 inspiring and 5,800 non-inspiring English-language public posts (as unique
post ids), collected from a third-party dump of public Reddit posts, and we use
linguistic heuristics to automatically detect which English-language social
media posts are inspiring.
Comment: Accepted at ACII 2021
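As a concrete illustration of the binary detection task, here is a minimal baseline sketch: a TF-IDF bag-of-n-grams classifier over the rehydrated posts. The file name and field names are hypothetical, and the paper's own linguistic heuristics and features are not reproduced here.

```python
import json

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Hypothetical file: one JSON object per rehydrated post, with a binary label.
with open("inspiring_posts.jsonl") as f:
    rows = [json.loads(line) for line in f]
texts = [r["text"] for r in rows]
labels = [r["inspiring"] for r in rows]  # 1 = inspiring, 0 = non-inspiring

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=0)

# Word uni/bigram TF-IDF features feeding a linear classifier.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=2, sublinear_tf=True)
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(X_train), y_train)

print(classification_report(y_test, clf.predict(vectorizer.transform(X_test))))
```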
Linguistic calibration through metacognition: aligning dialogue agent responses with expected correctness
Open-domain dialogue agents have vastly improved, but still confidently
hallucinate knowledge or express doubt when asked straightforward questions. In
this work, we analyze whether state-of-the-art chit-chat models can express
metacognition capabilities through their responses: does a verbalized
expression of doubt (or confidence) match the likelihood that the model's
answer is incorrect (or correct)? We find that these models are poorly
calibrated in this sense, yet we show that the representations within the
models can be used to accurately predict likelihood of correctness. By
incorporating these correctness predictions into the training of a controllable
generation model, we obtain a dialogue agent with greatly improved linguistic
calibration.
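The correctness-prediction idea admits a compact sketch: fit a probe from cached answer representations to binary correctness labels, then read off a confidence estimate. The array files below are hypothetical stand-ins; the paper's actual models, features, and controllable-generation step are more involved.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical cached features: one hidden-state vector per (question, answer)
# pair, plus a 0/1 label for whether the model's answer was actually correct.
reps = np.load("answer_representations.npy")  # shape (N, d)
correct = np.load("answer_correctness.npy")   # shape (N,)

X_tr, X_te, y_tr, y_te = train_test_split(
    reps, correct, test_size=0.2, random_state=0)

# A linear probe from representations to correctness.
probe = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)
p_correct = probe.predict_proba(X_te)[:, 1]
print("probe AUC:", roc_auc_score(y_te, p_correct))

# p_correct can then steer generation, e.g. by bucketing it into discrete
# confidence levels used as control tokens during fine-tuning, so the model
# verbalizes doubt when predicted correctness is low.
```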