
    A Personal Conversation Assistant Based on Seq2seq with Word2vec Cognitive Map

    WeChat is one of the social networking applications that connect people most widely. Huge volumes of data are generated when users converse, and this data can be used to enhance their lives. This paper describes how such data is collected and how a personalized chatbot can be developed from personal conversation records. Our system builds a cognitive map based on the word2vec model, which learns and stores the relationships among the words that appear in the chat records. Each word is mapped into a continuous high-dimensional vector space. The sequence-to-sequence (seq2seq) framework is then adopted to learn chatting styles from all pairs of chat sentences, with the traditional one-hot embedding layer replaced by our word2vec embedding layer in the seq2seq model. Furthermore, an autoencoder with a seq2seq architecture is trained to learn a vector representation of each sentence, so that the cosine similarity between a model-generated response and the pre-existing response in the test set can be evaluated, and the distance can be displayed with a principal component analysis (PCA) projection. As a result, our word2vec-embedded seq2seq model significantly outperforms the one-hot-embedded one.
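    To make the central modification concrete, here is a minimal PyTorch sketch (not the authors' code) of initializing a seq2seq encoder's embedding layer from pretrained word2vec vectors instead of a one-hot/random embedding, and of scoring generated vs. reference responses with cosine similarity. All names, sizes, and the random `w2v_vectors` matrix are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, embed_dim, hidden_dim = 5000, 300, 512

# Stand-in for word2vec vectors trained on the personal chat records.
w2v_vectors = torch.randn(vocab_size, embed_dim)

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        # freeze=False lets seq2seq training fine-tune the word2vec space.
        self.embedding = nn.Embedding.from_pretrained(w2v_vectors, freeze=False)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        _, h = self.rnn(self.embedding(token_ids))
        return h.squeeze(0)  # one vector per input sentence

# Cosine similarity between a generated and a reference response vector,
# in the spirit of the paper's autoencoder-based evaluation.
enc = Encoder()
gen = enc(torch.randint(0, vocab_size, (1, 12)))
ref = enc(torch.randint(0, vocab_size, (1, 12)))
print(F.cosine_similarity(gen, ref).item())
```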

    Research on Emotional Conversation Analysis Based on Deep Learning (深層学習に基づく感情会話分析に関する研究)

    The capability of a chatbot to express specific emotions during a conversation is one of the key parts of artificial intelligence, with an intuitive and quantifiable impact on the chatbot's usability and user satisfaction. Enabling machines to recognize emotion in conversation is challenging, mainly because human dialogue conveys emotions through long-term experience, abundant knowledge, context, and the intricate patterns between affective states. Recently, many studies on neural emotional conversational models have been conducted; however, enabling a chatbot to control which emotion to respond with, according to its own character, remains underexplored. At this stage, people are no longer satisfied with using a dialogue system to solve specific tasks and are more eager to achieve emotional communication. If the robot can perceive the user's emotions during a chat and process them accurately, it can greatly enrich the content of the dialogue and make the user empathize. In emotional dialogue, our ultimate goal is to make the machine understand human emotions and give matching responses. Based on these two points, this thesis explores in depth the tasks of emotion recognition in conversation and emotional dialogue generation. Although considerable progress has been made in emotion research in dialogue over the past few years, difficulties and challenges remain owing to the complex nature of human emotions. The key contributions of this thesis are summarized below:
    (1) Researchers have recently paid increasing attention to enhancing natural language models with knowledge graphs, since knowledge graphs accumulate a large amount of systematic knowledge, and many studies have shown that introducing external commonsense knowledge helps enrich feature information. We address the task of emotion recognition in conversations by using external knowledge to enhance semantics. In this work, we employ the external knowledge graph ATOMIC as the knowledge source. We propose KES, a new framework that incorporates elements of external knowledge and conversational semantic role labeling and builds upon them to learn the interactions between the interlocutors participating in a conversation. A conversation is a sequence of coherent and orderly utterances, and capturing long-range context information is a weakness of traditional recurrent networks; we therefore adopt the Transformer, a structure composed of self-attention and feed-forward layers, instead of a traditional RNN, aiming to capture remote context information. We design a self-attention layer specialized for text features semantically enhanced with external commonsense knowledge. Two LSTM-based networks then track the individual internal state and the external context state, respectively. The proposed model is evaluated on three emotion-detection-in-conversation datasets, and the experimental results show that it outperforms state-of-the-art approaches on most of them.
    (2) We propose an emotional dialogue model based on Seq2Seq, improved in three respects (model input, encoder structure, and decoder structure) so that the model can generate responses with rich emotion, diversity, and context. For the model input, emotional information and positional information are added to the word vectors. On the encoder side, the proposed model first encodes the current input together with the sentence sentiment to generate a semantic vector, and additionally encodes the context together with the sentence sentiment to generate a context vector, adding contextual information while preserving the independence of the current input. On the decoder side, attention weights for the two semantic vectors are computed separately before decoding, so that local and global emotional semantic information are fully integrated (see the sketch below). We used seven objective evaluation metrics to assess the model's generated results, context similarity, response diversity, and emotional responses. Experimental results show that the model can generate diverse responses with rich sentiment and contextual associations.
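    As a rough illustration of the decoder-side fusion described in (2), the following PyTorch sketch computes separate attention weights for a local (current input + sentiment) vector and a global (context + sentiment) vector, then fuses the weighted mixture with the decoder state. The module name, shapes, and scoring functions are assumptions for illustration, not the thesis's implementation.

```python
import torch
import torch.nn as nn

class DualAttentionFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score_local = nn.Linear(dim * 2, 1)   # scores local vector vs. decoder state
        self.score_global = nn.Linear(dim * 2, 1)  # scores global vector vs. decoder state
        self.fuse = nn.Linear(dim * 2, dim)

    def forward(self, dec_state, local_vec, global_vec):
        # One scalar score per semantic vector, normalized jointly.
        s_l = self.score_local(torch.cat([dec_state, local_vec], dim=-1))
        s_g = self.score_global(torch.cat([dec_state, global_vec], dim=-1))
        w = torch.softmax(torch.cat([s_l, s_g], dim=-1), dim=-1)
        mixed = w[..., :1] * local_vec + w[..., 1:] * global_vec
        # The fused vector would feed the decoder's next step.
        return torch.tanh(self.fuse(torch.cat([dec_state, mixed], dim=-1)))

dim = 256
fusion = DualAttentionFusion(dim)
out = fusion(torch.randn(2, dim), torch.randn(2, dim), torch.randn(2, dim))
print(out.shape)  # torch.Size([2, 256])
```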

    A Conversational System with Enhanced Emotion Expression by Using Emoji

    WeChat, a Chinese messaging app, is used by over a billion people monthly. Messaging apps like WeChat use chatbot systems to predictively generate accurate conversational responses for users. However, users often prefer to convey the emotional expression necessary for human communication in pictorial form, with emoji. In this paper, I extend an existing response-generation system with sentiment analysis, fusing the primary text-based reply with emoji stickers so as to convey a more accurate emotional response. This is accomplished by carrying out an emotional analysis of the input text, which is then used to select the appropriate emoji stickers for the response; the analysis also accounts for age and emotional consistency in selecting the most appropriate emoji. As a result, the system facilitates richer communication between users, who are able to convey their actual emotions more swiftly. In this paper, several conversation cases are run using the enhanced system and compared with the original system.
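    A toy Python sketch of the selection step described above: detect the emotion of the input text, then pick an emoji sticker consistent with both that emotion and the user's age group. The keyword lexicon, age rules, and sticker table are invented placeholders; the paper's actual analyzer and sticker inventory are not reproduced here.

```python
EMOJI_BY_EMOTION = {
    "joy":     {"child": "😺", "adult": "😄"},
    "sadness": {"child": "😿", "adult": "😢"},
    "anger":   {"child": "💢", "adult": "😠"},
    "neutral": {"child": "🙂", "adult": "🙂"},
}

POSITIVE = {"great", "happy", "love", "thanks"}
NEGATIVE = {"sad", "angry", "hate", "terrible"}

def detect_emotion(text: str) -> str:
    """Crude keyword-based stand-in for the paper's emotional analysis."""
    words = set(text.lower().split())
    if words & {"angry", "hate"}:
        return "anger"
    if words & NEGATIVE:
        return "sadness"
    if words & POSITIVE:
        return "joy"
    return "neutral"

def respond(reply_text: str, user_input: str, age_group: str = "adult") -> str:
    emotion = detect_emotion(user_input)
    # Fuse the text reply with an emotion- and age-consistent emoji sticker.
    return f"{reply_text} {EMOJI_BY_EMOTION[emotion][age_group]}"

print(respond("That sounds wonderful!", "I am so happy today", "adult"))
```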

    Emotion-Aware and Human-Like Autonomous Agents

    In human-computer interaction (HCI), one of the technological goals is to build human-like artificial agents that can think, decide and behave like humans during the interaction. A prime example is a dialogue system, where the agent should converse fluently and coherently with a user and connect with them emotionally. Humanness and emotion-awareness of interactive artificial agents have been shown to improve user experience and help attain application-specific goals more quickly. However, achieving human-likeness in HCI systems is contingent on addressing several philosophical and scientific challenges. In this thesis, I address two such challenges: replicating the human ability to 1) correctly perceive and adopt emotions, and 2) communicate effectively through language. Several research studies in neuroscience, economics, psychology and sociology show that both language and emotional reasoning are essential to the human cognitive deliberation process. These studies establish that any human-like AI should necessarily be equipped with adequate emotional and linguistic cognizance. To this end, I explore the following research directions.
    - I study how agents can reason emotionally in various human-interactive settings for decision-making. I use Bayesian Affect Control Theory, a probabilistic model of human-human affective interactions, to build a decision-theoretic reasoning algorithm about affect. This approach is validated on several applications: two-person social dilemma games, an assistive healthcare device, and robot navigation.
    - I develop several techniques to understand and generate emotions/affect in language. The proposed methods include affect-based feature augmentation of neural conversational models, training regularization using affective objectives (see the sketch after this list), and affectively diverse sequential inference.
    - I devise an active learning technique that elicits user feedback during a conversation. This enables the agent to learn in real time, and to produce natural and coherent language during the interaction.
    - I explore incremental domain adaptation in language classification and generation models. The proposed method seeks to replicate the human ability to continually learn from new environments without forgetting old experiences.
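    For the second direction, here is a minimal sketch of what "training regularization using affective objectives" can look like: the usual token-level cross-entropy is combined with a term rewarding probability mass on affect-bearing words. The affect-strength vector and the exact loss form are illustrative assumptions, not the thesis's objective.

```python
import torch
import torch.nn.functional as F

vocab_size = 1000
# Hypothetical per-word affect strength in [0, 1] (e.g. from a VAD lexicon).
affect_strength = torch.rand(vocab_size)

def affective_loss(logits, targets, lam=0.1):
    """Cross-entropy plus a bonus for probability mass on affective words.

    logits:  (batch, seq_len, vocab_size) decoder outputs
    targets: (batch, seq_len) gold token ids
    """
    ce = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
    probs = logits.softmax(dim=-1)
    # Expected affect of the predicted distribution; maximizing it rewards
    # emotionally expressive word choices.
    expected_affect = (probs * affect_strength).sum(dim=-1).mean()
    return ce - lam * expected_affect

loss = affective_loss(torch.randn(2, 7, vocab_size),
                      torch.randint(0, vocab_size, (2, 7)))
print(loss.item())
```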

    Neural approaches to dialog modeling

    This thesis by article consists of four articles which contribute to the field of deep learning, specifically to understanding and learning neural approaches to dialog systems.
    The first article takes a step towards understanding whether commonly used neural dialog architectures effectively capture the information present in the conversation history. Through a series of perturbation experiments on popular dialog datasets, we find that commonly used neural dialog architectures like recurrent and transformer-based seq2seq models are rarely sensitive to most input context perturbations, such as missing or reordered utterances, shuffled words, etc.
    The second article introduces a simple and cost-effective way to collect large-scale datasets for modeling task-oriented dialog systems. This approach avoids the requirement of a complex argument annotation schema. The initial release of the dataset includes 13,215 task-based dialogs comprising six domains and around 8k unique named entities, almost 8 times more than the popular MultiWOZ dataset.
    The third article proposes to improve response generation quality in open-domain dialog systems by jointly modeling the utterances with the dialog attributes of each utterance. Dialog attributes of an utterance refer to discrete features or aspects associated with it, such as dialog acts, sentiment, emotion, speaker identity, speaker personality, etc.
    The final article introduces an embedding-free method to compute word representations on the fly. This approach significantly reduces the memory footprint, which facilitates deployment on memory-constrained devices. Apart from being independent of the vocabulary size, we find this approach to be inherently resilient to common misspellings.
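    As a rough illustration of the first article's methodology, the Python sketch below applies simple perturbations to a conversation history (dropping the oldest utterance, reordering utterances, shuffling words within utterances); a sensitive dialog model should score the perturbed history noticeably worse. The perturbation names and the example history are assumptions; the model under test is left abstract.

```python
import random

def shuffle_words(utterance: str, rng: random.Random) -> str:
    words = utterance.split()
    rng.shuffle(words)
    return " ".join(words)

def perturb_history(history: list[str], mode: str, seed: int = 0) -> list[str]:
    rng = random.Random(seed)
    if mode == "drop_first":
        return history[1:]
    if mode == "reorder_utterances":
        out = history[:]
        rng.shuffle(out)
        return out
    if mode == "shuffle_words":
        return [shuffle_words(u, rng) for u in history]
    return history

history = ["hi there", "how are you", "fine thanks and you"]
for mode in ["drop_first", "reorder_utterances", "shuffle_words"]:
    # Compare the model's loss/perplexity on the original vs. perturbed
    # history; the article found seq2seq models are often barely affected.
    print(mode, "->", perturb_history(history, mode))
```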