A Deep Network Model for Paraphrase Detection in Short Text Messages
This paper is concerned with paraphrase detection. The ability to detect
similar sentences written in natural language is crucial for several
applications, such as text mining, text summarization, plagiarism detection,
authorship authentication and question answering. Given two sentences, the
objective is to detect whether they are semantically identical. An important
insight from this work is that existing paraphrase systems perform well when
applied to clean texts, but they do not necessarily deliver good performance
on noisy texts. Challenges with paraphrase detection on user-generated
short texts, such as Twitter posts, include language irregularity and noise. To cope
with these challenges, we propose a novel deep neural network-based approach
that relies on coarse-grained sentence modeling using a convolutional neural
network and a long short-term memory model, combined with a specific
fine-grained word-level similarity matching model. Our experimental results
show that the proposed approach outperforms existing state-of-the-art
approaches on user-generated noisy social media data, such as Twitter texts,
and achieves highly competitive performance on a cleaner corpus.
SemEval-2017 Task 1: semantic textual similarity - multilingual and cross-lingual focused evaluation
Semantic Textual Similarity (STS) measures the meaning similarity of sentences. Applications include machine translation (MT), summarization, generation, question answering (QA), short answer grading, semantic search, dialog and conversational systems. The STS shared task is a venue for assessing the current state-of-the-art. The 2017 task focuses on multilingual and cross-lingual pairs with one sub-track exploring MT quality estimation (MTQE) data. The task obtained strong participation from 31 teams, with 17 participating in all language tracks. We summarize performance and review a selection of well-performing methods. Analysis highlights common errors, providing insight into the limitations of existing models. To support ongoing work on semantic representations, the STS Benchmark is introduced as a new shared training and evaluation set carefully selected from the corpus of English STS shared task data (2012-2017).
Automatic Stance Detection Using End-to-End Memory Networks
We present a novel end-to-end memory network for stance detection, which
jointly (i) predicts whether a document agrees, disagrees, discusses or is
unrelated with respect to a given target claim, and also (ii) extracts snippets
of evidence for that prediction. The network operates at the paragraph level
and integrates convolutional and recurrent neural networks, as well as a
similarity matrix as part of the overall architecture. The experimental
evaluation on the Fake News Challenge dataset shows state-of-the-art
performance.
Comment: NAACL-2018; Stance detection; Fact-Checking; Veracity; Memory
networks; Neural Networks; Distributed Representation
Multi-Perspective Fusion Network for Commonsense Reading Comprehension
Commonsense Reading Comprehension (CRC) is a highly challenging task that
aims to choose the right answer to a question about a narrative
passage, which may require commonsense knowledge inference. Most of the
existing approaches only fuse the interaction information of choice, passage,
and question by simple combination from a union perspective,
which lacks comparison information at a deeper level. Instead, we propose a
Multi-Perspective Fusion Network (MPFN), extending the single fusion method
with multiple perspectives by introducing the difference and
similarity fusion. More
comprehensive and accurate information can be captured through the three types
of fusion. We design several groups of experiments on the MCScript dataset
(Ostermann et al., LREC 2018) to evaluate the effectiveness of the three
types of fusion respectively. From the experimental results, we can conclude
that the difference fusion is comparable with union fusion, and the similarity
fusion needs to be activated by the union fusion. The experimental results also
show that our MPFN model achieves state-of-the-art accuracy of
83.52% on the official test set.
Your Stance is Exposed! Analysing Possible Factors for Stance Detection on Social Media
To what extent can a user's stance towards a given topic be inferred? Most
studies on stance detection have focused on analysing a user's posts on a
given topic to predict the stance. However, stance on social media can be
inferred from a mixture of signals that might reflect a user's beliefs, including
posts and online interactions. This paper examines various online features of
users to detect their stance towards different topics. We compare multiple sets
of features, including on-topic content, network interactions, user
preferences, and online network connections. Our objective is to understand the
online signals that can reveal users' stance. Experiments are run on the
tweet dataset from the SemEval stance detection task, which covers five
topics. Results show that a user's stance can be detected from multiple
signals of their online activity, including their posts on the topic, the
network they interact with or follow, the websites they visit, and the content
they like. The performance of stance modelling using different network
features is comparable with the state-of-the-art reported model that used
textual content only. In addition, combining network and content features leads
to the highest reported performance to date on the SemEval dataset, with an
F-measure of 72.49%. We further present an extensive analysis to show how these
different sets of features can reveal stance. Our findings have distinct privacy
implications: they highlight that stance is strongly embedded in a user's
online social network and that, in principle, individuals can be profiled from
their interactions and connections even when they do not post about the topic.
Comment: Accepted as a full paper at CSCW 2019. Please cite the CSCW version.
Stance detection on social media: State of the art and trends
Stance detection on social media is an emerging opinion mining paradigm for
various social and political applications in which sentiment analysis may be
sub-optimal. There has been growing research interest in developing
effective stance detection methods across multiple
communities, including natural language processing, web science, and social
computing. This paper surveys the work on stance detection within those
communities and situates its usage within current opinion mining techniques in
social media. It presents an exhaustive review of stance detection techniques
on social media, including the task definition, different types of targets in
stance detection, the feature sets used, and the various machine learning approaches
applied. The survey reports state-of-the-art results on the existing benchmark
datasets on stance detection, and discusses the most effective approaches. In
addition, this study explores the emerging trends and different applications of
stance detection on social media. The study concludes by discussing the gaps in
existing research and highlighting possible future directions for
stance detection on social media.
Comment: We sincerely request withdrawal of this article. We will re-edit this
paper. Please withdraw this article before we finish the new version.