3 research outputs found
Bullshit, Pragmatic Deception, and Natural Language Processing
Fact checking and fake news detection have garnered increasing interest within the natural language processing (NLP) community in recent years, yet other aspects of misinformation remain unexplored. One such phenomenon is 'bullshit', which different disciplines have tried to define since it first entered academic discussion nearly four decades ago. Fact checking bullshitters is useless, because factual reality typically plays no part in their assertions: where liars deceive about content, bullshitters deceive about their goals. Bullshitting is misleading about language itself, which necessitates identifying the points at which pragmatic conventions are broken with deceptive intent. This paper aims to introduce bullshitology into the field of NLP by tying it to questions in a QUD-based definition, providing two approaches to bullshit annotation, and finally outlining which combinations of NLP methods will be helpful to classify which kinds of linguistic bullshit.
Building robust and modular question answering systems
Over the past few years, significant progress has been made in QA systems due to the availability of annotated datasets on a large scale and the impressive advancements in large-scale pre-trained language models. Despite these successes, the black-box nature of end-to-end trained QA systems makes them hard to interpret and control. When these systems encounter inputs that deviate from their training data distribution or are subjected to adversarial perturbations, their performance tends to deteriorate by a large margin. Furthermore, they may occasionally produce unanticipated results, potentially leading to confusion among users. Additionally, this deficiency in robustness and interpretability poses challenges when deploying such models in real-world scenarios.
In this dissertation, we aim to build robust QA systems by explicitly decomposing various QA tasks into distinct sub-modules, each responsible for a particular aspect of the overall QA process. Through this decomposition, we seek to achieve improved performance in terms of both the system's ability to handle diverse and challenging inputs (robustness) and its capacity to provide transparent and explainable reasoning (interpretability).
We argue that these sub-modules can substantially improve the robustness and interpretability of QA systems. In the first half of this dissertation, we introduce three sub-modules that mitigate the artifacts models learn from datasets. These sub-modules also enable us to examine and exert explicit control over the intermediate outputs. In the first work, to address question answering that requires multi-hop reasoning, we propose a chain extractor, which extracts the reasoning chains necessary for models to derive the final answer. The reasoning chains not only prevent the model from exploiting reasoning shortcuts but also explain how the answer is derived. In the second work, we incorporate an alignment layer between the question and the context before generating the answer. This alignment layer helps us interpret the models' behavior and improves robustness in adversarial settings. In the third work, we add an answer verifier after QA models generate the answer. By drawing on external NLI datasets and models, this verifier can boost QA models' prediction confidence across several domains and help us spot cases where they predict the right answer for the wrong reason.
In the second half of this dissertation, we tackle the problem of complex fact-checking in the real world by treating it as a modularized QA task. We first decompose a complex claim into several yes-no subquestions whose answers directly contribute to the veracity of the claim. Then, each subquestion is fed into a commercial search engine to retrieve relevant documents. We then extract the relevant snippets from the retrieved documents and use a GPT3-based summarizer to generate the core evidence for checking the claim. We show that the decompositions can play an important role in both evidence retrieval and veracity composition of an explainable fact-checking system. We also show that the GPT3-based evidence summarizer generates faithful summaries of documents most of the time, indicating that it can be used as an effective part of the pipeline. Moreover, we annotate a dataset, ClaimDecomp, containing 1,200 complex claims and their decompositions. We believe that this dataset can further promote building explainable fact-checking systems and analyzing complex claims in the real world.
Computer Science
Towards Explainable Fact Checking
The past decade has seen a substantial rise in the amount of mis- and
disinformation online, from targeted disinformation campaigns to influence
politics, to the unintentional spreading of misinformation about public health.
This development has spurred research in the area of automatic fact checking,
from approaches for detecting check-worthy claims and determining the stance of
tweets towards claims, to methods to determine the veracity of claims given
evidence documents. These automatic methods are often content-based, using
natural language processing methods, which in turn utilise deep neural networks
to learn higher-order features from text in order to make predictions. As deep
neural networks are black-box models, their inner workings cannot be easily
explained. At the same time, it is desirable to explain how they arrive at
certain decisions, especially if they are to be used for decision making. While
this has been known for some time, the issues this raises have been exacerbated
by models increasing in size, by EU legislation requiring models that are used
for decision making to provide explanations, and, very recently, by legislation
requiring online platforms operating in the EU to provide transparent reporting
on their services. Despite this, current solutions for explainability are still
lacking in the area of fact checking. This thesis presents my research on
automatic fact checking, including claim check-worthiness detection, stance
detection and veracity prediction. Its contributions go beyond fact checking,
with the thesis proposing more general machine learning solutions for natural
language processing in the area of learning with limited labelled data.
Finally, the thesis presents some first solutions for explainable fact
checking.
Comment: Thesis presented to the University of Copenhagen Faculty of Science
in partial fulfillment of the requirements for the degree of Doctor
Scientiarum (Dr. Scient.)