150 research outputs found
Towards Open Domain Chatbots - A GRU Architecture for Data Driven Conversations.
Understanding of textual content, such as topic and intent recognition, is a critical part of chatbots, allowing the chatbot to provide relevant responses. Although successful in several narrow domains, the potential diversity of content in broader and more open domains renders traditional pattern recognition techniques inaccurate. In this paper, we propose a novel deep learning architecture for content recognition that consists of multiple levels of gated recurrent units (GRUs). The architecture is designed to capture complex sentence structure at multiple levels of abstraction, seeking content recognition for very wide domains, through a distributed scalable representation of content. To evaluate our architecture, we have compiled 10 years of questions and answers from a youth information service, 200 083 questions spanning a wide range of content, altogether 289 topics, involving law, health, and social issues. Despite the relatively open domain data set, our architecture is able to accurately categorize the 289 intents and topics. Indeed, it provides roughly an order of magnitude higher accuracy compared to more classical content recognition techniques, such as SVM, Naive Bayes, random forest, and K-nearest neighbor, which all seem to fail on this challenging open domain dataset
Motivations, Classification and Model Trial of Conversational Agents for Insurance Companies
Advances in artificial intelligence have renewed interest in conversational
agents. So-called chatbots have reached maturity for industrial applications.
German insurance companies are interested in improving their customer service
and digitizing their business processes. In this work we investigate the
potential use of conversational agents in insurance companies by determining
which classes of agents are of interest to insurance companies, finding
relevant use cases and requirements, and developing a prototype for an
exemplary insurance scenario. Based on this approach, we derive key findings
for conversational agent implementation in insurance companies.Comment: 12 pages, 6 figure, accepted for presentation at The International
Conference on Agents and Artificial Intelligence 2019 (ICAART 2019
Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models
Neural conversational models require substantial amounts of dialogue data for
their parameter estimation and are therefore usually learned on large corpora
such as chat forums or movie subtitles. These corpora are, however, often
challenging to work with, notably due to their frequent lack of turn
segmentation and the presence of multiple references external to the dialogue
itself. This paper shows that these challenges can be mitigated by adding a
weighting model into the architecture. The weighting model, which is itself
estimated from dialogue data, associates each training example to a numerical
weight that reflects its intrinsic quality for dialogue modelling. At training
time, these sample weights are included into the empirical loss to be
minimised. Evaluation results on retrieval-based models trained on movie and TV
subtitles demonstrate that the inclusion of such a weighting model improves the
model performance on unsupervised metrics.Comment: Accepted to SIGDIAL 201
A Primer on Seq2Seq Models for Generative Chatbots
The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models has pushed forwards significantly the Natural Language Processing area. The approach has quickly evolved in the last ten years, deeply affecting NLP, from low-level text pre-processing tasks –such as tokenisation or POS tagging– to high-level, complex NLP applications like machine translation and chatbots. This paper examines recent trends in the development of open-domain data-driven generative chatbots, focusing on the Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement and, in the last years, allowed to realise very engaging open-domain chatbots. Not only do these architectures allow to directly output the next turn in a conversation but, to some extent, they also allow to control the style or content of the response. To offer a complete view on the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora to train and evaluate such models and about the current and past chatbot competitions. Finally, we present some insights on possible future directions, given the current research status
- …