Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms
In NLP, convolutional neural networks (CNNs) have benefited less than
recurrent neural networks (RNNs) from attention mechanisms. We hypothesize that
this is because the attention in CNNs has been mainly implemented as attentive
pooling (i.e., it is applied to pooling) rather than as attentive convolution
(i.e., it is integrated into convolution). Convolution is the differentiator of
CNNs in that it can powerfully model the higher-level representation of a word
by taking into account its local fixed-size context in the input text t^x. In
this work, we propose an attentive convolution network, ATTCONV. It extends the
context scope of the convolution operation, deriving higher-level features for
a word not only from local context, but also from information extracted from
nonlocal context by the attention mechanism commonly used in RNNs. This
nonlocal context can come (i) from parts of the input text t^x that are distant
or (ii) from extra (i.e., external) contexts t^y. Experiments on sentence
modeling with zero-context (sentiment analysis), single-context (textual
entailment) and multiple-context (claim verification) demonstrate the
effectiveness of ATTCONV in sentence representation learning with the
incorporation of context. In particular, attentive convolution outperforms
attentive pooling and is a strong competitor to popular attentive RNNs.
Comment: Camera-ready for TACL. 16 page
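The idea of folding attention into the convolution step can be illustrated with a minimal sketch. It assumes dot-product attention scores, a filter width of 3, zero padding, and a tanh nonlinearity; the function names and the parameters `W_local`, `W_ctx`, and `bias` are hypothetical and are not taken from the authors' implementation.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(W, v):
    # W is a list of rows; returns W @ v.
    return [dot(row, v) for row in W]

def attentive_context(query, sources):
    # Attention-weighted average of the source vectors,
    # using dot-product scores (an assumption of this sketch).
    weights = softmax([dot(query, s) for s in sources])
    dim = len(sources[0])
    return [sum(w * s[k] for w, s in zip(weights, sources)) for k in range(dim)]

def attentive_conv(tx, ty, W_local, W_ctx, bias):
    # tx: word vectors of the input text t^x (list of d-dim lists).
    # ty: word vectors of the context text t^y; pass tx itself to
    #     attend to distant parts of the same text.
    # W_local: h x 3d filter over the local window.
    # W_ctx:   h x d filter over the attention context.
    d = len(tx[0])
    pad = [0.0] * d
    padded = [pad] + tx + [pad]
    out = []
    for i in range(len(tx)):
        # Concatenate left neighbor, current word, right neighbor.
        window = padded[i] + padded[i + 1] + padded[i + 2]
        ctx = attentive_context(tx[i], ty)
        local = matvec(W_local, window)
        remote = matvec(W_ctx, ctx)
        out.append([math.tanh(l + r + b) for l, r, b in zip(local, remote, bias)])
    return out
```

Setting `W_ctx` to all zeros reduces this sketch to an ordinary width-3 convolution, which makes the contribution of the nonlocal term easy to isolate.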
Deep Neural Attention for Misinformation and Deception Detection
PhD thesis in Information technology
Social media now influences society so strongly that many people feel life has little meaning without it. This over-reliance gives disruptive actors an opportunity to take undue advantage, and online misinformation and deception are vivid examples of this phenomenon. Misinformation, or fake news, spreads faster and wider than true news [32]. There is an urgent need to identify and curb the spread of misinformation and misleading content automatically and as early as possible.
Researchers have proposed several machine learning models to detect and prevent misinformation and deceptive content. However, these prior works suffer from some limitations. First, they use either feature-engineering-heavy methods or intricate deep neural architectures, which are not transparent in their internal workings and decision making. Second, they do not incorporate and learn the available auxiliary and latent cues and patterns, which can be very useful in forming adequate context for the misinformation. Third, most earlier methods perform poorly on early-detection accuracy measures because they rely on features that are usually absent in the initial stage of a news item or social media post.
In this dissertation, we propose deep neural attention-based solutions to overcome these limitations. For instance, we propose a claim verification model that learns embeddings for latent aspects such as the author and subject of the claim and the domain of the external evidence document. This enables the model to learn important additional context beyond the textual content. In addition, we propose an algorithm to extract evidential snippets from external evidence documents, which serve as explanations of the model’s decisions. Next, we improve this model with an improved claim-driven attention mechanism and also generate a topically diverse and non-redundant multi-document fact-checking summary for the claims, which helps to further interpret the model’s decision making. Subsequently, we introduce a novel method to learn influence and affinity relationships among the social media users present on the propagation paths of news items. By modeling the complex influence relationships among users, in addition to textual content, we learn significant patterns pertaining to the diffusion of a news item on a social network. The evaluation shows that the proposed model outperforms related methods in early detection performance with significant gains.
Next, we propose a headline incongruence detection model based on synthetic headline generation, which uses word-to-word mutual attention-based deep semantic matching between the original and synthetic news headlines to detect incongruence. Further, we investigate and define a new task of incongruence detection in the presence of important cardinal values in headlines. For this new task, we propose a part-of-speech pattern-driven, attention-based method, which learns the requisite context for cardinal values
MUSER: A MUlti-Step Evidence Retrieval Enhancement Framework for Fake News Detection
The ease of spreading false information online enables individuals with
malicious intent to manipulate public opinion and undermine social stability.
Recently, fake news detection based on evidence retrieval has gained popularity
in an effort to identify fake news reliably and reduce its impact. Evidence
retrieval-based methods can improve the reliability of fake news detection by
computing the textual consistency between the evidence and the claim in the
news. In this paper, we propose a framework for fake news detection based on
MUlti-Step Evidence Retrieval enhancement (MUSER), which simulates the steps of
human beings in the process of reading news, summarizing, consulting materials,
and inferring whether the news is true or fake. Our model can explicitly model
dependencies among multiple pieces of evidence, and perform multi-step
associations for the evidence required for news verification through multi-step
retrieval. In addition, our model is able to automatically collect existing
evidence through paragraph retrieval and key evidence selection, which can save
the tedious process of manual evidence collection. We conducted extensive
experiments on real-world datasets in different languages, and the results
demonstrate that our proposed model outperforms state-of-the-art baseline
methods for detecting fake news by at least 3% in F1-Macro and 4% in F1-Micro.
Furthermore, it provides interpretable evidence for end users.
Comment: 12 pages, 5 figures, accepted by KDD '23, ADS trac
Semantic Representation and Inference for NLP
Semantic representation and inference is essential for Natural Language
Processing (NLP). The state of the art for semantic representation and
inference is deep learning, and particularly Recurrent Neural Networks (RNNs),
Convolutional Neural Networks (CNNs), and transformer Self-Attention models.
This thesis investigates the use of deep learning for novel semantic
representation and inference, and makes contributions in the following three
areas: creating training data, improving semantic representations and extending
inference learning. In terms of creating training data, we contribute the
largest publicly available dataset of real-life factual claims for the purpose
of automatic claim verification (MultiFC), and we present a novel inference
model composed of multi-scale CNNs with different kernel sizes that learn from
external sources to infer fact checking labels. In terms of improving semantic
representations, we contribute a novel model that captures non-compositional
semantic indicators. By definition, the meaning of a non-compositional phrase
cannot be inferred from the individual meanings of its composing words (e.g.,
hot dog). Motivated by this, we operationalize the compositionality of a phrase
contextually by enriching the phrase representation with external word
embeddings and knowledge graphs. Finally, in terms of inference learning, we
propose a series of novel deep learning architectures that improve inference by
using syntactic dependencies, ensembling role-guided attention heads,
incorporating gating layers, and concatenating multiple heads in novel and
effective ways. This thesis consists of seven publications (five published and
two under review).
Comment: PhD thesis, the University of Copenhage
WSDMS: Debunk Fake News via Weakly Supervised Detection of Misinforming Sentences with Contextualized Social Wisdom
In recent years, we have witnessed an explosion of false and unconfirmed
information (i.e., rumors) that went viral on social media and shocked the
public. Rumors can trigger versatile, mostly controversial stance expressions
among social media users. Rumor verification and stance detection are different
yet relevant tasks. Fake news debunking primarily focuses on determining the
truthfulness of news articles, which oversimplifies the issue as fake news
often combines elements of both truth and falsehood. Thus, it becomes crucial
to identify specific instances of misinformation within the articles. In this
research, we investigate a novel task in the field of fake news debunking,
which involves detecting sentence-level misinformation. One of the major
challenges in this task is the absence of a training dataset with
sentence-level annotations regarding veracity. Inspired by the Multiple
Instance Learning (MIL) approach, we propose a model called Weakly Supervised
Detection of Misinforming Sentences (WSDMS). This model only requires bag-level
labels for training but is capable of inferring both sentence-level
misinformation and article-level veracity, aided by relevant social media
conversations that are attentively contextualized with news sentences. We
evaluate WSDMS on three real-world benchmarks and demonstrate that it
outperforms existing state-of-the-art baselines in debunking fake news at both
the sentence and article levels.
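The weak-supervision setup described above, where only bag-level (article) labels are available but sentence-level predictions are still wanted, can be sketched with a generic attention-pooling form of Multiple Instance Learning. This is an illustrative assumption rather than the aggregation WSDMS actually uses; `sentence_scores` and `attention_logits` are hypothetical stand-ins for model outputs.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bag_prediction(sentence_scores, attention_logits):
    # Per-sentence probability of being misinforming.
    probs = [sigmoid(s) for s in sentence_scores]
    # Softmax over attention logits gives each sentence's weight in the bag.
    m = max(attention_logits)
    exps = [math.exp(a - m) for a in attention_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Article-level probability is the attention-weighted average.
    bag_prob = sum(w * p for w, p in zip(weights, probs))
    return probs, bag_prob

def bag_loss(bag_prob, bag_label):
    # Binary cross-entropy against the article-level label only;
    # no sentence-level labels are needed.
    eps = 1e-12
    return -(bag_label * math.log(bag_prob + eps)
             + (1 - bag_label) * math.log(1 - bag_prob + eps))
```

Because the loss depends only on the article-level probability, gradients flow back to the sentence scores through the weighted average, so the sentence-level probabilities are learned indirectly.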