How did the discussion go: Discourse act classification in social media conversations
We propose a novel attention based hierarchical LSTM model to classify
discourse act sequences in social media conversations, aimed at mining data
from online discussion using textual meanings beyond sentence level. The very
uniqueness of the task is the complete categorization of possible pragmatic
roles in informal textual discussions, contrary to extraction of
question-answers, stance detection or sarcasm identification which are very
much role-specific tasks. An early attempt was made on a Reddit discussion
dataset. We train our model on the same data, and present test results on two
different datasets, one from Reddit and one from Facebook. Our proposed model
outperformed the previous one in terms of domain independence; without using
platform-dependent structural features, our hierarchical LSTM with word
relevance attention mechanism achieved F1-scores of 71% and 66%, respectively,
in predicting discourse roles of comments in Reddit and Facebook discussions.
We present and analyze the efficiency of recurrent and convolutional
architectures in learning discursive representations on the same task,
with different word and comment embedding schemes. Our attention mechanism
enables us to inquire into relevance ordering of text segments according to
their roles in discourse. We present a human annotator experiment to unveil
important observations about modeling and data annotation. Equipped with our
text-based discourse identification model, we inquire into how heterogeneous
non-textual features like location, time, leaning of information etc. play
their roles in characterizing online discussions on Facebook.
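The word relevance attention mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so this assumes a generic dot-product attention over per-word hidden states; `attention_pool`, `word_states`, and `context` are hypothetical names, and in a real model the hidden states would come from an LSTM and the context vector would be learned.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(word_states, context):
    """Pool per-word hidden states into a single comment vector.

    word_states: list of equal-length float lists, one per word (assumed
                 to be produced by a word-level encoder such as an LSTM).
    context:     float list of the same length (assumed to be learned).
    Returns (pooled_vector, attention_weights).
    """
    # Relevance score of each word: dot product with the context vector.
    scores = [sum(h_k * c_k for h_k, c_k in zip(h, context)) for h in word_states]
    weights = softmax(scores)
    dim = len(context)
    # Weighted sum of the word states gives the comment representation.
    pooled = [sum(w * h[k] for w, h in zip(weights, word_states)) for k in range(dim)]
    return pooled, weights


pooled, weights = attention_pool([[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]], [1.0, 0.0])
# Sorting word indices by weight gives a relevance ordering of the words.
ranking = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
```

Returning the weights alongside the pooled vector is what makes it possible to inspect which text segments the model treats as most relevant.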
Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques
specifically developed for analyzing and understanding the inner workings and
representations acquired by neural models of language. Approaches included:
systematic manipulation of input to neural networks and investigating the
impact on their performance, testing whether interpretable knowledge can be
decoded from intermediate representations acquired by neural networks,
proposing modifications to neural network architectures to make their knowledge
state or generated output more explainable, and examining the performance of
networks on simplified or formal languages. Here we review a number of
representative studies in each category.
A Deep Relevance Matching Model for Ad-hoc Retrieval
In recent years, deep neural networks have led to exciting breakthroughs in
speech recognition, computer vision, and natural language processing (NLP)
tasks. However, there have been few positive results of deep models on ad-hoc
retrieval tasks. This is partially due to the fact that many important
characteristics of the ad-hoc retrieval task have not been well addressed in
deep models yet. Typically, the ad-hoc retrieval task is formalized as a
matching problem between two pieces of text in existing work using deep models,
and treated as equivalent to many NLP tasks such as paraphrase identification,
question answering and automatic conversation. However, we argue that the
ad-hoc retrieval task is mainly about relevance matching while most NLP
matching tasks concern semantic matching, and there are some fundamental
differences between these two matching tasks. Successful relevance matching
requires proper handling of the exact matching signals, query term importance,
and diverse matching requirements. In this paper, we propose a novel deep
relevance matching model (DRMM) for ad-hoc retrieval. Specifically, our model
employs a joint deep architecture at the query term level for relevance
matching. By using matching histogram mapping, a feed forward matching network,
and a term gating network, we can effectively deal with the three relevance
matching factors mentioned above. Experimental results on two representative
benchmark collections show that our model can significantly outperform some
well-known retrieval models as well as state-of-the-art deep matching models.
Comment: CIKM 2016, long pape
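Two of the components named in the abstract above, matching histogram mapping and term gating, can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: it assumes query-document term similarities are cosine values binned over [-1, 1] with the last bin reserved for exact matches, and that gating is a softmax over per-term weights. The function names, the bin count, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def matching_histogram(query_vec, doc_vecs, num_bins=5):
    """Histogram of similarities between one query term and all document terms.

    The first num_bins - 1 bins split [-1, 1) evenly; the last bin counts
    exact matches (similarity equal to 1).
    """
    counts = [0] * num_bins
    width = 2.0 / (num_bins - 1)
    for d in doc_vecs:
        s = cosine(query_vec, d)
        if s >= 1.0 - 1e-9:
            counts[-1] += 1  # exact-match signal kept separate
        else:
            idx = int((s + 1.0) / width)
            counts[min(max(idx, 0), num_bins - 2)] += 1
    return counts


def term_gates(term_weights):
    """Softmax over per-query-term weights, modelling query term importance."""
    m = max(term_weights)
    exps = [math.exp(w - m) for w in term_weights]
    total = sum(exps)
    return [e / total for e in exps]


hist = matching_histogram([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]])
gates = term_gates([2.0, 0.5])
```

In a full model, each query term's histogram would be passed through the feed-forward matching network, and the resulting per-term scores would be combined using the gate values.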
Neural Ranking Models with Weak Supervision
Despite the impressive improvements achieved by unsupervised deep neural
networks in computer vision and NLP tasks, such improvements have not yet been
observed in ranking for information retrieval. The reason may be the complexity
of the ranking problem, as it is not obvious how to learn from queries and
documents when no supervised signal is available. Hence, in this paper, we
propose to train a neural ranking model using weak supervision, where labels
are obtained automatically without human annotators or any external resources
(e.g., click data). To this aim, we use the output of an unsupervised ranking
model, such as BM25, as a weak supervision signal. We further train a set of
simple yet effective ranking models based on feed-forward neural networks. We
study their effectiveness under various learning scenarios (point-wise and
pair-wise models) and using different input representations (i.e., from
encoding query-document pairs into dense/sparse vectors to using word embedding
representation). We train our networks using tens of millions of training
instances and evaluate them on two standard collections: a homogeneous news
collection (Robust) and a heterogeneous large-scale web collection (ClueWeb).
Our experiments indicate that employing proper objective functions and letting
the networks learn the input representation based on weakly supervised data
leads to impressive performance, with over 13% and 35% MAP improvements over
the BM25 model on the Robust and the ClueWeb collections. Our findings also
suggest that supervised neural ranking models can greatly benefit from
pre-training on large amounts of weakly labeled data that can be easily
obtained from unsupervised IR models.
Comment: In proceedings of The 40th International ACM SIGIR Conference on
Research and Development in Information Retrieval (SIGIR2017
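The weak supervision idea in the abstract above, using BM25 output as automatically generated labels, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' pipeline: it uses a common BM25 variant with a non-negative IDF, default parameters `k1=1.2` and `b=0.75`, and turns score differences into pair-wise preference labels. The helper names and the toy corpus are hypothetical, and the corpus is assumed to be non-empty.

```python
import math
from collections import Counter
from itertools import combinations


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the tokenized query with BM25."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query:
            if tf[t] == 0:
                continue
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def weak_pairs(query, docs):
    """Build (i, j, label) pairs; label is 1 if BM25 prefers doc i, else -1."""
    scores = bm25_scores(query, docs)
    pairs = []
    for i, j in combinations(range(len(docs)), 2):
        if scores[i] == scores[j]:
            continue  # ties carry no preference signal
        pairs.append((i, j, 1 if scores[i] > scores[j] else -1))
    return pairs


docs = [
    ["neural", "ranking", "model"],
    ["cooking", "pasta", "recipe"],
    ["neural", "network", "ranking", "ranking"],
]
pairs = weak_pairs(["neural", "ranking"], docs)
```

Pairs like these could serve as training instances for a pair-wise ranking model, with no human annotation involved.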