11,311 research outputs found
Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory
Perception and expression of emotion are key factors to the success of
dialogue systems or conversational agents. However, this problem has not been
studied in large-scale conversation generation so far. In this paper, we
propose Emotional Chatting Machine (ECM) that can generate appropriate
responses not only in content (relevant and grammatical) but also in emotion
(emotionally consistent). To the best of our knowledge, this is the first work
that addresses the emotion factor in large-scale conversation generation. ECM
addresses the factor using three new mechanisms that respectively (1) models
the high-level abstraction of emotion expressions by embedding emotion
categories, (2) captures the change of implicit internal emotion states, and
(3) uses explicit emotion expressions with an external emotion vocabulary.
Experiments show that the proposed model can generate responses appropriate not
only in content but also in emotion.Comment: Accepted in AAAI 201
MojiTalk: Generating Emotional Responses at Scale
Generating emotional language is a key step towards building empathetic
natural language processing agents. However, a major challenge for this line of
research is the lack of large-scale labeled training data, and previous studies
are limited to only small sets of human annotated sentiment labels.
Additionally, explicitly controlling the emotion and sentiment of generated
text is also difficult. In this paper, we take a more radical approach: we
exploit the idea of leveraging Twitter data that are naturally labeled with
emojis. More specifically, we collect a large corpus of Twitter conversations
that include emojis in the response, and assume the emojis convey the
underlying emotions of the sentence. We then introduce a reinforced conditional
variational encoder approach to train a deep generative model on these
conversations, which allows us to use emojis to control the emotion of the
generated text. Experimentally, we show in our quantitative and qualitative
analyses that the proposed models can successfully generate high-quality
abstractive conversation responses in accordance with designated emotions
On stopwords, filtering and data sparsity for sentiment analysis of Twitter
Sentiment classification over Twitter is usually affected by the noisy nature (abbreviations, irregular forms) of tweets data. A popular procedure to reduce the noise of textual data is to remove stopwords by using pre-compiled stopword lists or more sophisticated methods for dynamic stopword identification. However, the effectiveness of removing stopwords in the context of Twitter sentiment classification has been debated in the last few years. In this paper we investigate whether removing stopwords helps or hampers the effectiveness of Twitter sentiment classification methods. To this end, we apply six different stopword identification methods to Twitter data from six different datasets and observe how removing stopwords affects two well-known supervised sentiment classification methods. We assess the impact of removing stopwords by observing fluctuations on the level of data sparsity, the size of the classifier’s feature space and its classification performance. Our results show that using pre-compiled lists of stopwords negatively impacts the performance of Twitter sentiment classification approaches. On the other hand, the dynamic generation of stopword lists, by removing those infrequent terms appearing only once in the corpus, appears to be the optimal method to maintaining a high classification performance while reducing the data sparsity and substantially shrinking the feature space
- …