Paying Attention to Deflections: Mining Pragmatic Nuances for Whataboutism Detection in Online Discourse
Whataboutism, a potent tool for disrupting narratives and sowing distrust,
remains under-explored in quantitative NLP research. Moreover, past work has
not distinguished its use as a strategy for misinformation and propaganda from
its use as a tool for pragmatic and semantic framing. We introduce new datasets
from Twitter and YouTube, revealing overlaps as well as distinctions between
whataboutism, propaganda, and the tu quoque fallacy. Furthermore, drawing on
recent work in linguistic semantics, we differentiate the 'what about' lexical
construct from whataboutism. Our experiments bring to light unique challenges
in its accurate detection, prompting the introduction of a novel method using
attention weights for negative sample mining. We report significant
improvements of 4% and 10% over previous state-of-the-art methods in our
Twitter and YouTube collections, respectively.
Comment: 14 pages, 5 figures
A Privacy-Preserving Solution for the Bipartite Ranking Problem on the Spark Framework (İki taraflı sıralama problemine Spark çerçevesinde gizliliği koruyan bir çözüm)
Cataloged from PDF version of article.
Thesis (M.S.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2017.
Includes bibliographical references (leaves 50-54).
The bipartite ranking problem is defined as finding a function that ranks positive
instances in a dataset higher than the negative ones. Financial and medical
domains are some of the common application areas of the ranking algorithms.
However, a common concern for such domains is the privacy of individuals or
companies in the dataset. That is, a researcher who wants to discover knowledge
from a dataset extracted from such a domain needs access to the records of
all individuals in the dataset in order to run a ranking algorithm. This privacy
concern limits the use of sensitive personal data for such analysis. We
propose an efficient solution for the privacy-preserving bipartite ranking problem,
where the researcher does not need the raw data of the instances in order to learn
a ranking model from the data.
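As a rough illustration of the ranking objective described above, the quality of a bipartite ranking is commonly measured by the area under the ROC curve (AUC), which equals the fraction of (positive, negative) pairs that the scoring function orders correctly. The sketch below is not taken from the thesis; the function name `pairwise_auc` and the toy scores are assumptions made for illustration.

```python
from itertools import product


def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive
    instance gets the higher score; ties count as half a correct pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    correct = 0.0
    for p, n in product(pos, neg):
        if p > n:
            correct += 1.0
        elif p == n:
            correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data (hypothetical): one positive instance is ranked below a negative one.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(pairwise_auc(scores, labels))  # 3 of 4 pairs are ordered correctly: 0.75
```

A perfect ranking function scores 1.0 and a random one scores about 0.5, which is why a ranking algorithm can be framed as maximizing this quantity.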
The RIMARC (Ranking Instances by Maximizing Area under the ROC Curve)
algorithm solves the bipartite ranking problem by learning a model to rank instances.
As part of the model, it learns a weight for each feature by analyzing the
area under the receiver operating characteristic (ROC) curve. The RIMARC algorithm
has been shown to be more accurate and efficient than its counterparts. Thus, we use
this algorithm as a building block and provide a privacy-preserving version of
the RIMARC algorithm using homomorphic encryption and secure multi-party
computation.
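To make the idea of computing on encrypted data concrete, here is a toy sketch of an additively homomorphic scheme (Paillier). The abstract does not say which scheme the thesis uses, so Paillier is an assumption chosen only because it is a well-known example; the tiny primes and all function names are illustrative and offer no real security.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Toy Paillier key generation. The small primes are for illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                 # standard simple choice of generator
    mu = pow(lam, -1, n)      # with g = n + 1, L(g^lam mod n^2) = lam mod n
    return (n, g), (lam, mu)


def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    while True:               # pick a random r that is coprime to n
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen()
c_sum = add_encrypted(pub, encrypt(pub, 12), encrypt(pub, 30))
print(decrypt(pub, priv, c_sum))  # 42, computed without decrypting the inputs
```

The party that combines the ciphertexts never sees 12 or 30. That property is what would let an untrusted server aggregate per-feature statistics over an encrypted dataset.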
To improve running time on large datasets, we have implemented the
privacy-preserving RIMARC algorithm on Apache Spark, a popular parallelization
framework built on a programming paradigm called Resilient Distributed Datasets.
Our proposed algorithm lets a data owner outsource the storage and processing
of its encrypted dataset to a semi-trusted cloud. Then, a researcher can get
the results of their queries (to learn the ranking function) on the dataset by
interacting with the cloud. During this process, neither the researcher nor the
cloud can access any information about the raw dataset. We prove the security
of the proposed algorithm and show its efficiency via experiments on real data.
By Noushin Salek Faramarzi. M.S.