Chatbots as Advisers: the Effects of Response Variability and Reply Suggestion Buttons
As chatbots gain popularity across a variety of applications, from
investment to health, they employ an increasing number of features
that can influence the perception of the system. Since chatbots often provide advice or guidance, we ask: do these features affect the
user’s decision to follow their advice? We focus on two chatbot
features that can influence user perception: 1) response variability
in answers and delays and 2) reply suggestion buttons. We report
on a between-subject study where participants made investment
decisions on a simulated social trading platform by interacting with
a chatbot providing advice. Performance-based study incentives
made the consequences of following the advice tangible to participants. We measured how often and to what extent participants
followed the chatbot’s advice compared to an alternative source
of information. Results indicate that both response variability and
reply suggestion buttons significantly increased the inclination to
follow the advice of the chatbot.
End-to-End Autoregressive Retrieval via Bootstrapping for Smart Reply Systems
Reply suggestion systems represent a staple component of many instant
messaging and email systems. However, the requirement to produce sets of
replies, rather than individual replies, makes the task poorly suited for
out-of-the-box retrieval architectures, which only consider individual
message-reply similarity. As a result, these systems often rely on additional
post-processing modules to diversify the outputs. However, these approaches are
ultimately bottlenecked by the performance of the initial retriever, which in
practice struggles to present a sufficiently diverse range of options to the
downstream diversification module, leading to the suggestions being less
relevant to the user. In this paper, we consider a novel approach that
radically simplifies this pipeline through an autoregressive text-to-text
retrieval model that learns the smart reply task end-to-end from a dataset of
(message, reply set) pairs obtained via bootstrapping. Empirical results show
this method consistently outperforms a range of state-of-the-art baselines
across three datasets, corresponding to a 5.1%-17.9% improvement in relevance,
and a 0.5%-63.1% improvement in diversity compared to the best baseline
approach. We make our code publicly available.
Comment: FINDINGS-EMNLP 202
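The bootstrapping step the abstract describes, turning ordinary (message, reply) pairs into (message, reply set) training targets that an autoregressive model can emit as one sequence, might be sketched as follows. This is a toy illustration, not the paper's implementation: the `jaccard` similarity, the `<sep>` separator token, and the `bootstrap_reply_sets` helper are all assumptions standing in for a learned retriever and the authors' actual serialization scheme.

```python
# Hypothetical sketch: build (message, serialized reply set) training
# examples via bootstrapping. A real system would rank candidate replies
# with a learned retriever; here a toy word-overlap similarity stands in.

SEP = " <sep> "  # assumed separator token for serializing reply sets


def jaccard(a: str, b: str) -> float:
    """Toy word-overlap similarity (placeholder for a learned scorer)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def bootstrap_reply_sets(pairs, k=3):
    """For each message, rank all observed replies by similarity to its
    gold reply and serialize the top-k (gold first, deduplicated) into a
    single target string, so a text-to-text model can learn to generate
    the whole reply set end-to-end."""
    replies = [r for _, r in pairs]
    dataset = []
    for msg, gold in pairs:
        ranked = sorted(replies, key=lambda r: jaccard(gold, r), reverse=True)
        reply_set, seen = [], set()
        for r in [gold] + ranked:          # gold reply always leads the set
            if r not in seen:
                seen.add(r)
                reply_set.append(r)
            if len(reply_set) == k:
                break
        dataset.append((msg, SEP.join(reply_set)))
    return dataset


pairs = [
    ("are you free for lunch?", "sure, what time?"),
    ("are you coming tonight?", "sure, see you there"),
    ("can you review my PR?", "yes, I'll take a look"),
]
data = bootstrap_reply_sets(pairs, k=2)
```

Each target sequence then serves as supervision for a standard seq2seq model, which removes the need for a separate post-hoc diversification module since diversity is baked into the serialized targets themselves.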