30,393 research outputs found
A retrieval-based dialogue system utilizing utterance and context embeddings
Finding semantically rich and computer-understandable representations for
textual dialogues, utterances and words is crucial for dialogue systems (or
conversational agents), as their performance mostly depends on understanding
the context of conversations. Recent research aims at finding distributed
vector representations (embeddings) for words, such that semantically similar
words are relatively close within the vector-space. Encoding the "meaning" of
text into vectors is a current trend, and text can range from words, phrases
and documents to actual human-to-human conversations. In recent research
approaches, responses have been generated utilizing a decoder architecture,
given the vector representation of the current conversation. In this paper, the
utilization of embeddings for answer retrieval is explored by using
Locality-Sensitive Hashing Forest (LSH Forest), an Approximate Nearest Neighbor
(ANN) model, to find similar conversations in a corpus and rank possible
candidates. Experimental results on the well-known Ubuntu Corpus (in English)
and a customer service chat dataset (in Dutch) show that, in combination with a
candidate selection method, retrieval-based approaches outperform generative
ones and reveal promising future research directions towards the usability of
such a system.Comment: A shorter version is accepted at ICMLA2017 conference;
acknowledgement added; typos correcte
MojiTalk: Generating Emotional Responses at Scale
Generating emotional language is a key step towards building empathetic
natural language processing agents. However, a major challenge for this line of
research is the lack of large-scale labeled training data, and previous studies
are limited to only small sets of human annotated sentiment labels.
Additionally, explicitly controlling the emotion and sentiment of generated
text is also difficult. In this paper, we take a more radical approach: we
exploit the idea of leveraging Twitter data that are naturally labeled with
emojis. More specifically, we collect a large corpus of Twitter conversations
that include emojis in the response, and assume the emojis convey the
underlying emotions of the sentence. We then introduce a reinforced conditional
variational encoder approach to train a deep generative model on these
conversations, which allows us to use emojis to control the emotion of the
generated text. Experimentally, we show in our quantitative and qualitative
analyses that the proposed models can successfully generate high-quality
abstractive conversation responses in accordance with designated emotions
- …