31 research outputs found

    A Survey of Personality, Persona, and Profile in Conversational Agents and Chatbots

    We present a review of personality in neural conversational agents (CAs), also called chatbots. First, we define Personality, Persona, and Profile. We explain all personality schemes that have been used in CAs, and list models under the scheme(s) they use. Second, we describe 21 datasets that have been developed in recent CA personality research. Third, we define the methods used to embody personality in a CA, and review recent models using them. Fourth, we survey some relevant reviews on CAs, personality, and related topics. Finally, we draw conclusions and identify some research challenges for this important emerging field. Comment: 25 pages, 6 tables, 207 references

    Chatbots: Security, privacy, data protection, and social aspects

    Chatbots are artificial communication systems that are becoming increasingly popular, yet not all of their security questions have been clearly resolved. People use chatbots for assistance with shopping, banking, meal delivery, healthcare, cars, and many other activities. However, this brings additional security risks and creates serious security challenges that have to be handled. Understanding the underlying problems requires defining the crucial security-related steps in the techniques used to design chatbots. Many factors increase security threats and vulnerabilities. All of them are comprehensively studied, and security practices to decrease security weaknesses are presented. Modern chatbots are no longer rule-based models; they employ modern natural language processing and machine learning techniques. Such techniques learn from conversations, which can contain personal information. The paper discusses the circumstances under which such data can be used and how chatbots treat them. Many chatbots operate on a social/messaging platform, which has its own terms and conditions about data. The paper aims to present a comprehensive study of security aspects in communication with chatbots. The article could open a discussion, highlight the problems of storing and using data obtained from user-chatbot communication, and propose some standards to protect the user. Web of Science, art. no. e642

    Arabic Educational Neural Network Chatbot

    Chatbots (machine-based conversational systems) have grown in popularity in recent years. Chatbots powered by artificial intelligence (AI) are sophisticated technologies that replicate human communication in a range of natural languages. A chatbot's primary purpose is to interpret user inquiries and give relevant, contextual responses. Chatbot success has been extensively reported in a number of widely spoken languages; nonetheless, chatbots have not yet reached the predicted degree of success in Arabic. In recent years, several academics have worked to solve the challenges of creating Arabic chatbots. Furthermore, the development of Arabic chatbots is critical to our attempts to increase the use of the language in academic contexts. Our objective is to create and deploy an Arabic chatbot that supports the Arabic language in the field of education. To begin implementing the chatbot, we collected datasets from Arabic educational websites and prepared the data using NLP methods. We then used this data to train a neural network model, producing an Arabic neural network chatbot. Furthermore, we found relevant research, examined earlier investigations, and compared their findings by searching Google Scholar and looking through the linked references. The data was gathered and saved in a JSON file. Finally, we programmed the chatbot and the models in Python. As a result, the Arabic chatbot answers all questions about educational regulations in the United Arab Emirates.

    Representation learning for dialogue systems

    This thesis presents a series of steps taken towards investigating representation learning (e.g. deep learning) for building dialogue systems and conversational agents. The thesis is split into two general parts. The first part of the thesis investigates representation learning for generative dialogue models. 
Conditioned on a sequence of turns from a text-based dialogue, these models are tasked with generating the next appropriate response in the dialogue. This part of the thesis focuses on sequence-to-sequence models, a class of generative deep neural networks. First, we propose the Hierarchical Recurrent Encoder-Decoder model, which is an extension of the vanilla sequence-to-sequence model incorporating the turn-taking structure of dialogues. Second, we propose the Multiresolution Recurrent Neural Network model, which is a stacked sequence-to-sequence model with an intermediate, stochastic representation (a "coarse representation") capturing the abstract semantic content communicated between the dialogue speakers. Third, we propose the Latent Variable Recurrent Encoder-Decoder model, which is a variant of the Hierarchical Recurrent Encoder-Decoder model with latent, stochastic normally-distributed variables. The latent, stochastic variables are intended for modelling the ambiguity and uncertainty occurring naturally in human language communication. The three models are evaluated and compared on two dialogue response generation tasks: a Twitter response generation task and the Ubuntu technical response generation task. The second part of the thesis investigates representation learning for a real-world reinforcement learning dialogue system. Specifically, this part focuses on the Milabot system built by the Quebec Artificial Intelligence Institute (Mila) for the Amazon Alexa Prize 2017 competition. Milabot is a system capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language retrieval and generation models, including template-based models, bag-of-words models, and variants of the models discussed in the first part of the thesis. This part of the thesis focuses on the response selection task. 
Given a sequence of turns from a dialogue and a set of candidate responses, the system must select an appropriate response to give the user. A model-based reinforcement learning approach, called the Bottleneck Simulator, is proposed for selecting the appropriate candidate response. The Bottleneck Simulator learns an approximate model of the environment based on observed dialogue trajectories and human crowdsourcing, while utilizing an abstract (bottleneck) state representing high-level discourse semantics. The learned environment model is then employed to learn a reinforcement learning policy through rollout simulations. The learned policy has been evaluated and compared to competing approaches through A/B testing with real-world users, where it was found to yield excellent performance.

    ANICE: An Artificial Neuro-Linguistic Interactive Computer Entity

    Mental health problems are hard to talk about, especially when the questions asked do not allow the individual to answer freely. That is the case for most questionnaires, where questions usually request very restricted answers such as yes or no. This thesis proposes a chatbot that tries to avoid the problem of restricting users to short answers. The chatbot focuses on people feeling burned out due to stress related to their studies. The chatbot tries to replicate two forms with questions about burnout, which are used as guidelines. Both forms are based on questions written by psychologists. Because rule-based chatbots have a limited vocabulary, natural language understanding (NLU) and neural-based techniques are tested and validated to see whether the chatbot performs well using them. The techniques tested are word2vec and spaCy components. The evaluation results show that it is feasible to implement a chatbot that combines rules with natural language processing techniques. Additionally, the tests indicate that both spaCy and word2vec are strong resources for NLU. Word2vec performs slightly better in specific cases that involve identifying domain-specific intents. Finally, the results from the user experience evaluation show that this is promising work that could help students dealing with burnout.