150 research outputs found

    Towards Open Domain Chatbots - A GRU Architecture for Data Driven Conversations.

    Get PDF
    Understanding of textual content, such as topic and intent recognition, is a critical part of chatbots, allowing the chatbot to provide relevant responses. Although successful in several narrow domains, the potential diversity of content in broader and more open domains renders traditional pattern recognition techniques inaccurate. In this paper, we propose a novel deep learning architecture for content recognition that consists of multiple levels of gated recurrent units (GRUs). The architecture is designed to capture complex sentence structure at multiple levels of abstraction, seeking content recognition for very wide domains, through a distributed scalable representation of content. To evaluate our architecture, we have compiled 10 years of questions and answers from a youth information service, 200 083 questions spanning a wide range of content, altogether 289 topics, involving law, health, and social issues. Despite the relatively open domain data set, our architecture is able to accurately categorize the 289 intents and topics. Indeed, it provides roughly an order of magnitude higher accuracy compared to more classical content recognition techniques, such as SVM, Naive Bayes, random forest, and K-nearest neighbor, which all seem to fail on this challenging open domain dataset

    Motivations, Classification and Model Trial of Conversational Agents for Insurance Companies

    Full text link
    Advances in artificial intelligence have renewed interest in conversational agents. So-called chatbots have reached maturity for industrial applications. German insurance companies are interested in improving their customer service and digitizing their business processes. In this work we investigate the potential use of conversational agents in insurance companies by determining which classes of agents are of interest to insurance companies, finding relevant use cases and requirements, and developing a prototype for an exemplary insurance scenario. Based on this approach, we derive key findings for conversational agent implementation in insurance companies.Comment: 12 pages, 6 figure, accepted for presentation at The International Conference on Agents and Artificial Intelligence 2019 (ICAART 2019

    Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models

    Full text link
    Neural conversational models require substantial amounts of dialogue data for their parameter estimation and are therefore usually learned on large corpora such as chat forums or movie subtitles. These corpora are, however, often challenging to work with, notably due to their frequent lack of turn segmentation and the presence of multiple references external to the dialogue itself. This paper shows that these challenges can be mitigated by adding a weighting model into the architecture. The weighting model, which is itself estimated from dialogue data, associates each training example to a numerical weight that reflects its intrinsic quality for dialogue modelling. At training time, these sample weights are included into the empirical loss to be minimised. Evaluation results on retrieval-based models trained on movie and TV subtitles demonstrate that the inclusion of such a weighting model improves the model performance on unsupervised metrics.Comment: Accepted to SIGDIAL 201

    A Primer on Seq2Seq Models for Generative Chatbots

    Get PDF
    The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models has pushed forwards significantly the Natural Language Processing area. The approach has quickly evolved in the last ten years, deeply affecting NLP, from low-level text pre-processing tasks –such as tokenisation or POS tagging– to high-level, complex NLP applications like machine translation and chatbots. This paper examines recent trends in the development of open-domain data-driven generative chatbots, focusing on the Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement and, in the last years, allowed to realise very engaging open-domain chatbots. Not only do these architectures allow to directly output the next turn in a conversation but, to some extent, they also allow to control the style or content of the response. To offer a complete view on the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora to train and evaluate such models and about the current and past chatbot competitions. Finally, we present some insights on possible future directions, given the current research status
    • …
    corecore