1,210 research outputs found
Causal schema induction for knowledge discovery
Making sense of familiar yet new situations typically involves making
generalizations about causal schemas, stories that help humans reason about
event sequences. Reasoning about events includes identifying cause and effect
relations shared across event instances, a process we refer to as causal schema
induction. Statistical schema induction systems may leverage structural
knowledge encoded in discourse or in the causal graphs associated with event
meaning; however, resources for studying such causal structure are few in number and
limited in size. In this work, we investigate how to apply schema induction
models to the task of knowledge discovery for enhanced search of
English-language news texts. To tackle the problem of data scarcity, we present
Torquestra, a manually curated dataset of text-graph-schema units integrating
temporal, event, and causal structures. We benchmark our dataset on three
knowledge discovery tasks, building and evaluating models for each. Results
show that systems that harness causal structure are effective at identifying
texts sharing similar causal meaning components rather than relying on lexical
cues alone. We make our dataset and models available for research purposes.
Comment: 8 pages, appendix
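The abstract's core claim, that causal structure identifies related texts where lexical cues fail, can be illustrated with a minimal sketch. This is not Torquestra's actual model: the documents, the extracted `(cause, effect)` edges, and the Jaccard comparison below are all hypothetical, showing only why graph overlap and word overlap can diverge.

```python
# Illustrative sketch (not the paper's system): comparing two texts by
# overlap of causal-graph edges versus overlap of surface tokens.

def jaccard(a, b):
    """Jaccard similarity between two collections, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def lexical_similarity(text_a, text_b):
    # Naive token overlap -- the "lexical cues alone" baseline.
    return jaccard(text_a.lower().split(), text_b.lower().split())

def causal_similarity(edges_a, edges_b):
    # Edges are (cause, effect) pairs produced by some upstream extractor.
    return jaccard(edges_a, edges_b)

# Two reports of the same causal schema with almost no shared vocabulary:
doc_a = "Heavy rainfall caused the river to flood, displacing residents."
doc_b = "Persistent storms led to flooding; families were forced to evacuate."
edges_a = {("rainfall", "flood"), ("flood", "displacement")}
edges_b = {("rainfall", "flood"), ("flood", "displacement")}

print(lexical_similarity(doc_a, doc_b))   # low: few shared tokens
print(causal_similarity(edges_a, edges_b))  # 1.0: identical causal structure
```

The pair scores near zero lexically but identically on causal structure, which is the kind of match a schema-aware retrieval system can surface and a bag-of-words system cannot.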
Penguins Don't Fly: Reasoning about Generics through Instantiations and Exceptions
Generics express generalizations about the world (e.g., birds can fly) that
are not universally true (e.g., newborn birds and penguins cannot fly).
Commonsense knowledge bases, used extensively in NLP, encode some generic
knowledge but rarely enumerate such exceptions, and knowing when a generic
statement does or does not hold true is crucial for developing a comprehensive
understanding of generics. We present a novel framework informed by linguistic
theory to generate exemplars -- specific cases when a generic holds true or
false. We generate ~19k exemplars for ~650 generics and show that our framework
outperforms a strong GPT-3 baseline by 12.8 precision points. Our analysis
highlights the importance of linguistic theory-based controllability for
generating exemplars, the insufficiency of knowledge bases as a source of
exemplars, and the challenges exemplars pose for the task of natural language
inference.
Comment: EACL 202
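The distinction the abstract draws, a generic paired with cases where it holds (instantiations) and cases where it fails (exceptions), can be sketched as a small data structure. The class and field names below are illustrative, not the paper's framework, and the point is only the three-way outcome: a generic alone cannot decide an unenumerated instance.

```python
# Hypothetical sketch of exemplars for a generic statement; names are
# illustrative and do not reflect the paper's actual framework.
from dataclasses import dataclass, field

@dataclass
class Generic:
    statement: str                            # e.g., "birds can fly"
    instantiations: set = field(default_factory=set)  # cases where it holds
    exceptions: set = field(default_factory=set)      # cases where it fails

    def holds_for(self, instance):
        """True/False for a known exemplar, None when unenumerated."""
        if instance in self.exceptions:
            return False
        if instance in self.instantiations:
            return True
        return None  # the generic alone cannot decide

birds_fly = Generic(
    "birds can fly",
    instantiations={"sparrow", "eagle"},
    exceptions={"penguin", "newborn bird"},
)

print(birds_fly.holds_for("penguin"))  # False: an enumerated exception
print(birds_fly.holds_for("sparrow"))  # True: an enumerated instantiation
print(birds_fly.holds_for("ostrich"))  # None: not enumerated either way
```

The `None` branch is exactly the gap the abstract attributes to commonsense knowledge bases: without enumerated exemplars, downstream inference cannot tell an exception from an ordinary instance.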
Graphene Transport at High Carrier Densities using a Polymer Electrolyte Gate
We report the study of graphene devices in Hall-bar geometry, gated with a
polymer electrolyte. Carrier densities of 6 × 10^13 cm^-2 are
consistently reached, significantly higher than with conventional back-gating.
The mobility follows an inverse dependence on density, which can be attributed
to dominant scattering from weak scatterers. Furthermore, our measurements
show a Bloch-Grüneisen regime up to 100 K (at 6.2 × 10^13 cm^-2),
consistent with an increase of the density. Ubiquitous in our experiments is a
small upturn in resistivity around 3 × 10^13 cm^-2, whose origin is
discussed. We identify two potential causes for the upturn: the renormalization
of the Fermi velocity and an electrochemically enhanced scattering rate.
Comment: 13 pages, 4 figures, Published Version
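The inverse dependence of mobility on density has a simple consequence worth making explicit: if mu = C / n for some constant C, then the conductivity sigma = n · e · mu = e · C is independent of carrier density, the behaviour expected when weak (short-range) scatterers dominate. A minimal numeric check, with a hypothetical constant C and arbitrary but consistent units (not the paper's fitted values):

```python
# Numeric check (illustrative, not the paper's analysis): an inverse
# mobility law mu = C / n yields a density-independent conductivity
# sigma = n * e * mu = e * C.

E_CHARGE = 1.602e-19  # elementary charge in coulombs

def mobility(n, C=1.0e17):
    """Hypothetical inverse law mu = C / n (C chosen arbitrarily)."""
    return C / n

def conductivity(n):
    # Drude-form conductivity sigma = n * e * mu.
    return n * E_CHARGE * mobility(n)

densities = [1e17, 3e17, 6e17]
sigmas = [conductivity(n) for n in densities]
print(sigmas)  # all (numerically) equal: sigma does not depend on n
```

A density-independent conductivity at high n is thus the fingerprint that lets the authors correlate the mobility trend with weak scatterers rather than charged-impurity scattering, which would give a different density dependence.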
Adposition and Case Supersenses v2.5: Guidelines for English
This document offers a detailed linguistic description of SNACS (Semantic
Network of Adposition and Case Supersenses; Schneider et al., 2018), an
inventory of 50 semantic labels ("supersenses") that characterize the use of
adpositions and case markers at a somewhat coarse level of granularity, as
demonstrated in the STREUSLE corpus (https://github.com/nert-gu/streusle/;
version 4.3 tracks guidelines version 2.5). Though the SNACS inventory aspires
to be universal, this document is specific to English; documentation for other
languages will be published separately.
Version 2 is a revision of the supersense inventory proposed for English by
Schneider et al. (2015, 2016) (henceforth "v1"), which in turn was based on
previous schemes. The present inventory was developed after extensive review of
the v1 corpus annotations for English, plus previously unanalyzed genitive case
possessives (Blodgett and Schneider, 2018), as well as consideration of
adposition and case phenomena in Hebrew, Hindi, Korean, and German. Hwang et
al. (2017) present the theoretical underpinnings of the v2 scheme. Schneider et
al. (2018) summarize the scheme, its application to English corpus data, and an
automatic disambiguation task.
COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements
Warning: This paper contains content that may be offensive or upsetting.
Understanding the harms and offensiveness of statements requires reasoning
about the social and situational context in which statements are made. For
example, the utterance "your English is very good" may implicitly signal an
insult when uttered by a white man to a non-white colleague, but uttered by an
ESL teacher to their student would be interpreted as a genuine compliment. Such
contextual factors have been largely ignored by previous approaches to toxic
language detection. We introduce COBRA frames, the first context-aware
formalism for explaining the intents, reactions, and harms of offensive or
biased statements grounded in their social and situational context. We create
COBRACORPUS, a dataset of 33k potentially offensive statements paired with
machine-generated contexts and free-text explanations of offensiveness, implied
biases, speaker intents, and listener reactions. To study the contextual
dynamics of offensiveness, we train models to generate COBRA explanations, with
and without access to the context. We find that explanations generated by
context-agnostic models are significantly worse than those by context-aware ones,
especially in situations where the context inverts the statement's
offensiveness (29% accuracy drop). Our work highlights the importance and
feasibility of contextualized NLP by modeling social factors.
Comment: Accepted to Findings of ACL 202
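The abstract's own example, the same utterance read as a microaggression or a compliment depending on who says it to whom, can be sketched as a context-aware frame. The field names below approximate the dimensions the abstract lists (intent, implied bias, listener reaction, offensiveness); the actual COBRA schema may differ.

```python
# Illustrative frame for context-dependent offensiveness; field names are
# an approximation of the dimensions named in the abstract, not the
# official COBRA schema.
from dataclasses import dataclass

@dataclass
class ContextFrame:
    statement: str
    speaker: str
    listener: str
    intent: str
    implied_bias: str
    listener_reaction: str
    offensive: bool

# The identical statement, framed in two different social contexts:
microaggression = ContextFrame(
    statement="Your English is very good.",
    speaker="white colleague",
    listener="non-white colleague",
    intent="surface-level compliment",
    implied_bias="assumes the listener is a non-native speaker",
    listener_reaction="feels othered",
    offensive=True,
)

compliment = ContextFrame(
    statement="Your English is very good.",
    speaker="ESL teacher",
    listener="their student",
    intent="encourage the student's progress",
    implied_bias="none",
    listener_reaction="feels encouraged",
    offensive=False,
)

assert microaggression.statement == compliment.statement
print(microaggression.offensive, compliment.offensive)  # True False
```

A context-agnostic model sees only `statement` and must give both frames the same label, which is exactly the failure mode behind the reported 29% accuracy drop when context inverts offensiveness.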