5,338 research outputs found
Attention-Based Capsule Networks with Dynamic Routing for Relation Extraction
A capsule is a group of neurons, whose activity vector represents the
instantiation parameters of a specific type of entity. In this paper, we
explore the capsule networks used for relation extraction in a multi-instance
multi-label learning framework and propose a novel neural approach based on
capsule networks with attention mechanisms. We evaluate our method with
different benchmarks, and it is demonstrated that our method improves the
precision of the predicted relations. Particularly, we show that capsule
networks improve multiple entity pairs relation extraction.Comment: To be published in EMNLP 201
How did the discussion go: Discourse act classification in social media conversations
We propose a novel attention based hierarchical LSTM model to classify
discourse act sequences in social media conversations, aimed at mining data
from online discussion using textual meanings beyond sentence level. The very
uniqueness of the task is the complete categorization of possible pragmatic
roles in informal textual discussions, contrary to extraction of
question-answers, stance detection or sarcasm identification which are very
much role specific tasks. Early attempt was made on a Reddit discussion
dataset. We train our model on the same data, and present test results on two
different datasets, one from Reddit and one from Facebook. Our proposed model
outperformed the previous one in terms of domain independence; without using
platform-dependent structural features, our hierarchical LSTM with word
relevance attention mechanism achieved F1-scores of 71\% and 66\% respectively
to predict discourse roles of comments in Reddit and Facebook discussions.
Efficiency of recurrent and convolutional architectures in order to learn
discursive representation on the same task has been presented and analyzed,
with different word and comment embedding schemes. Our attention mechanism
enables us to inquire into relevance ordering of text segments according to
their roles in discourse. We present a human annotator experiment to unveil
important observations about modeling and data annotation. Equipped with our
text-based discourse identification model, we inquire into how heterogeneous
non-textual features like location, time, leaning of information etc. play
their roles in charaterizing online discussions on Facebook
Deep Short Text Classification with Knowledge Powered Attention
Short text classification is one of important tasks in Natural Language
Processing (NLP). Unlike paragraphs or documents, short texts are more
ambiguous since they have not enough contextual information, which poses a
great challenge for classification. In this paper, we retrieve knowledge from
external knowledge source to enhance the semantic representation of short
texts. We take conceptual information as a kind of knowledge and incorporate it
into deep neural networks. For the purpose of measuring the importance of
knowledge, we introduce attention mechanisms and propose deep Short Text
Classification with Knowledge powered Attention (STCKA). We utilize Concept
towards Short Text (C- ST) attention and Concept towards Concept Set (C-CS)
attention to acquire the weight of concepts from two aspects. And we classify a
short text with the help of conceptual information. Unlike traditional
approaches, our model acts like a human being who has intrinsic ability to make
decisions based on observation (i.e., training data for machines) and pays more
attention to important knowledge. We also conduct extensive experiments on four
public datasets for different tasks. The experimental results and case studies
show that our model outperforms the state-of-the-art methods, justifying the
effectiveness of knowledge powered attention
Listening between the Lines: Learning Personal Attributes from Conversations
Open-domain dialogue agents must be able to converse about many topics while
incorporating knowledge about the user into the conversation. In this work we
address the acquisition of such knowledge, for personalization in downstream
Web applications, by extracting personal attributes from conversations. This
problem is more challenging than the established task of information extraction
from scientific publications or Wikipedia articles, because dialogues often
give merely implicit cues about the speaker. We propose methods for inferring
personal attributes, such as profession, age or family status, from
conversations using deep learning. Specifically, we propose several Hidden
Attribute Models, which are neural networks leveraging attention mechanisms and
embeddings. Our methods are trained on a per-predicate basis to output rankings
of object values for a given subject-predicate combination (e.g., ranking the
doctor and nurse professions high when speakers talk about patients, emergency
rooms, etc). Experiments with various conversational texts including Reddit
discussions, movie scripts and a collection of crowdsourced personal dialogues
demonstrate the viability of our methods and their superior performance
compared to state-of-the-art baselines.Comment: published in WWW'1
- …