Improving Commonsense Causal Reasoning by Adversarial Training and Data Augmentation
Determining the plausibility of causal relations between clauses is a
commonsense reasoning task that requires complex inference ability. The general
approach to this task is to train a large pretrained language model on a
specific dataset. However, the available training data for the task is often
scarce, which leads to instability of model training or reliance on the shallow
features of the dataset. This paper presents a number of techniques for making
models more robust in the domain of causal reasoning. Firstly, we perform
adversarial training by generating perturbed inputs through synonym
substitution. Secondly, based on a linguistic theory of discourse connectives,
we perform data augmentation using a discourse parser for detecting causally
linked clauses in large text, and a generative language model for generating
distractors. Both methods boost model performance on the Choice of Plausible
Alternatives (COPA) dataset, as well as on a Balanced COPA dataset, which is a
modified version of the original data that has been developed to avoid
superficial cues, leading to a more challenging benchmark. We show a
statistically significant improvement in performance and robustness on both
datasets, even with only a small number of additionally generated data points.
Comment: 7 pages + pages references, 4 figures, 3 tables, paper accepted at AAAI202
Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering
Recent developments in pre-trained neural language modeling have led to leaps
in accuracy on commonsense question-answering benchmarks. However, there is
increasing concern that models overfit to specific tasks, without learning to
utilize external knowledge or perform general semantic reasoning. In contrast,
zero-shot evaluations have shown promise as a more robust measure of a model's
general reasoning abilities. In this paper, we propose a novel neuro-symbolic
framework for zero-shot question answering across commonsense tasks. Guided by
a set of hypotheses, the framework studies how to transform various
pre-existing knowledge resources into a form that is most effective for
pre-training models. We vary the set of language models, training regimes,
knowledge sources, and data generation strategies, and measure their impact
across tasks. Extending prior work, we devise and compare four constrained
distractor-sampling strategies. We provide empirical results across five
commonsense question-answering tasks with data generated from five external
knowledge resources. We show that, while an individual knowledge graph is
better suited for specific tasks, a global knowledge graph brings consistent
gains across different tasks. In addition, both preserving the structure of the
task and generating fair and informative questions help language models
learn more effectively.
Comment: AAAI 202
Judgment aggregation, discursive dilemma and reflective equilibrium: Neural language models as self-improving doxastic agents
Neural language models (NLMs) are susceptible to producing inconsistent output. This paper proposes a new diagnosis as well as a novel remedy for NLMs' incoherence. We train NLMs on synthetic text corpora that are created by simulating text production in a society. For diagnostic purposes, we explicitly model the individual belief systems of artificial agents (authors) who produce corpus texts. NLMs, trained on those texts, can be shown to aggregate the judgments of individual authors during pre-training according to sentence-wise vote ratios (roughly, reporting frequencies), which inevitably leads to so-called discursive dilemmas: aggregate judgments are inconsistent even though all individual belief states are consistent. As a remedy for such inconsistencies, we develop a self-training procedure, inspired by the concept of reflective equilibrium, that effectively reduces the extent of logical incoherence in a model's belief system, corrects global mis-confidence, and eventually allows the model to settle on a new, epistemically superior belief state. Thus, social choice theory helps to understand why NLMs are prone to produce inconsistencies; epistemology suggests how to get rid of them.
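The discursive dilemma named in this abstract can be illustrated with a minimal sketch. The three agents, the propositions p, q and their conjunction, and all function names are illustrative assumptions rather than anything taken from the paper. Simple majority voting stands in here for the vote-ratio aggregation the abstract describes.

```python
from typing import Dict, List

Belief = Dict[str, bool]  # hypothetical representation of one agent's judgments


def make_agent(p: bool, q: bool) -> Belief:
    """Build a logically consistent belief state over p, q, and (p and q)."""
    return {"p": p, "q": q, "p_and_q": p and q}


def is_consistent(belief: Belief) -> bool:
    """A belief state is consistent if its conjunction judgment matches p and q."""
    return belief["p_and_q"] == (belief["p"] and belief["q"])


def majority(beliefs: List[Belief]) -> Belief:
    """Aggregate proposition by proposition using a strict majority vote."""
    n = len(beliefs)
    return {prop: 2 * sum(b[prop] for b in beliefs) > n for prop in beliefs[0]}


# Three individually consistent agents (illustrative data).
agents = [make_agent(True, True), make_agent(True, False), make_agent(False, True)]
aggregate = majority(agents)

print(all(is_consistent(a) for a in agents))  # every agent is consistent
print(aggregate)                              # majority accepts p and q, rejects p_and_q
print(is_consistent(aggregate))               # the aggregate is not consistent
```

Each agent accepts the conjunction only when it accepts both conjuncts, yet the proposition-wise majority accepts p and q while rejecting their conjunction.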
COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs
Recent years have brought about a renewed interest in commonsense
representation and reasoning in the field of natural language understanding.
The development of new commonsense knowledge graphs (CSKG) has been central to
these advances as their diverse facts can be used and referenced by machine
learning models for tackling new and challenging tasks. At the same time, there
remain questions about the quality and coverage of these resources due to the
massive scale required to comprehensively encompass general commonsense
knowledge.
In this work, we posit that manually constructed CSKGs will never achieve the
coverage necessary to be applicable in all situations encountered by NLP
agents. Therefore, we propose a new evaluation framework for testing the
utility of KGs based on how effectively implicit knowledge representations can
be learned from them.
With this new goal, we propose ATOMIC 2020, a new CSKG of general-purpose
commonsense knowledge containing knowledge that is not readily available in
pretrained language models. We evaluate its properties in comparison with other
leading CSKGs, performing the first large-scale pairwise study of commonsense
knowledge resources. Next, we show that ATOMIC 2020 is better suited for
training knowledge models that can generate accurate, representative knowledge
for new, unseen entities and events. Finally, through human evaluation, we show
that the few-shot performance of GPT-3 (175B parameters), while impressive,
remains ~12 absolute points lower than a BART-based knowledge model trained on
ATOMIC 2020 despite using over 430x fewer parameters.
Generative Language Models for Personalized Information Understanding
A major challenge in information understanding stems from the diverse nature of the audience, where individuals possess varying preferences, experiences, and educational and cultural backgrounds. Consequently, adopting a one-size-fits-all approach to provide information may prove suboptimal. While prior research has predominantly focused on delivering pre-existing content to users with potential interests, this thesis explores generative language models for personalized information understanding. By harnessing the potential of generative language models, our objective is to generate novel personalized content for individual users. As a result, users from diverse backgrounds can be provided with content that is tailored to their needs and better aligns with their interests. The crux of this research hinges on addressing the following two aspects: 1. Personalized Content: How to harness user profiles to create tailored content for individual users; 2. Effective Communication: How to engage with users in order to proficiently convey information. For the first aspect, i.e. personalized content, we explored personalized news headline generation. By analyzing users' reading history, our proposed framework identifies perspectives that users are interested in, which can further guide the generation of news headlines that are attractive to users. For the second aspect, i.e. effective communication, we developed a personalized reading-assistive agent, which helps users understand complex information in news articles or academic documents through conversations. Compared to reading, obtaining information through conversations is more interactive and requires a shorter attention span. We further incorporate the above aspects in personalized information systems in a real-life scenario, i.e. patient education. Specifically, we propose a novel after-visit summaries (AVS) writing assistant.
After-visit summary notes are documents given to patients to help them understand their clinical visits and disease self-management. Our approach not only automatically generates AVS drafts, but also detects potential errors in the generated drafts, allowing physicians to revise and produce AVS notes with higher efficiency and accuracy. Moreover, we present PaniniQA, a patient-centric interactive question answering system designed to help patients understand their discharge instructions. PaniniQA first identifies important clinical content from patients’ discharge instructions and then formulates personalized educational questions for individual patients. In addition, PaniniQA is equipped with answer verification functionality to provide timely feedback to correct patients’ misunderstandings. Overall, we aspire to contribute to the advancement of information dissemination techniques, promoting a more inclusive and effective means of communication in our information-driven world.