Investigating Novel Verb Learning in BERT: Selectional Preference Classes and Alternation-Based Syntactic Generalization
Previous studies investigating the syntactic abilities of deep learning
models have not targeted the relationship between the strength of the
grammatical generalization and the amount of evidence to which the model is
exposed during training. We address this issue by deploying a novel
word-learning paradigm to test BERT's few-shot learning capabilities for two
aspects of English verbs: alternations and classes of selectional preferences.
For the former, we fine-tune BERT on a single frame in a verbal-alternation
pair and ask whether the model expects the novel verb to occur in its sister
frame. For the latter, we fine-tune BERT on an incomplete selectional network
of verbal objects and ask whether it expects unattested but plausible
verb/object pairs. We find that BERT makes robust grammatical generalizations
after just one or two instances of a novel word in fine-tuning. For the verbal
alternation tests, we find that the model displays behavior that is consistent
with a transitivity bias: verbs seen few times are expected to take direct
objects, but verbs seen with direct objects are not expected to occur
intransitively.
Comment: Accepted to BlackboxNLP 202
Knowledge Management Culture Audit: Capturing Tacit Perceptions and Barriers
A firm's capacity to efficiently create value from knowledge held by employees and embedded in processes is a key strategic resource. Knowledge Management (KM) seeks to systematically improve that capacity. The first critical step for implementing KM in organizations is the Knowledge Audit. Current audit practices use interviews and questionnaires to understand the KM processes the organization currently holds and the improved KM processes it wishes to implement, and to explore the organizational culture. In this paper we introduce the concept of capturing tacit cultural perceptions to identify cultural barriers that may interfere with a KM initiative. For this purpose, an analysis instrument was developed and used during the KM audit in a large international software development organization.
On the Effect of Anticipation on Reading Times
Over the past two decades, numerous studies have demonstrated how less
predictable (i.e., higher surprisal) words take more time to read. In general,
these studies have implicitly assumed the reading process is purely responsive:
Readers observe a new word and allocate time to process it as required. We
argue that prior results are also compatible with a reading process that is at
least partially anticipatory: Readers could make predictions about a future
word and allocate time to process it based on their expectation. In this work,
we operationalize this anticipation as a word's contextual entropy. We assess
the effect of anticipation on reading by comparing how well surprisal and
contextual entropy predict reading times on four naturalistic reading datasets:
two self-paced and two eye-tracking. Experimentally, across datasets and
analyses, we find substantial evidence for effects of contextual entropy over
surprisal on a word's reading time (RT): in fact, entropy is sometimes better
than surprisal in predicting a word's RT. Spillover effects, however, are
generally not captured by entropy, but only by surprisal. Further, we
hypothesize four cognitive mechanisms through which contextual entropy could
impact RTs -- three of which we are able to design experiments to analyze.
Overall, our results support a view of reading that is not just responsive, but
also anticipatory.
Comment: This is a pre-MIT Press publication version of the paper. Code is available at https://github.com/rycolab/anticipation-on-reading-time
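The abstract above operationalizes anticipation as a word's contextual entropy, i.e. the expected surprisal over the next-word distribution, in contrast to the surprisal of the word actually observed. A minimal sketch of both quantities, using a made-up toy next-word distribution (not data from the paper):

```python
import math

def surprisal(dist, word):
    """Surprisal of `word`: its negative log-probability (in bits) given the context."""
    return -math.log2(dist[word])

def contextual_entropy(dist):
    """Contextual entropy: the expected surprisal over the next-word distribution."""
    return sum(p * -math.log2(p) for p in dist.values() if p > 0)

# Hypothetical next-word distribution after some context, e.g. "the cat sat on the ..."
dist = {"mat": 0.5, "sofa": 0.25, "roof": 0.25}

print(surprisal(dist, "mat"))    # 1.0 bit: predictable word, read faster
print(surprisal(dist, "roof"))   # 2.0 bits: less predictable word, read more slowly
print(contextual_entropy(dist))  # 1.5 bits: known before the word is even seen
```

The contrast the paper draws falls out of the last line: surprisal can only be computed after the word is observed (responsive), whereas contextual entropy is a property of the context alone, so a reader could allocate processing time before the word arrives (anticipatory).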
Testing the Predictions of Surprisal Theory in 11 Languages
A fundamental result in psycholinguistics is that less predictable words take
a longer time to process. One theoretical explanation for this finding is
Surprisal Theory (Hale, 2001; Levy, 2008), which quantifies a word's
predictability as its surprisal, i.e. its negative log-probability given a
context. While evidence supporting the predictions of Surprisal Theory has
been replicated widely, most studies have focused on a very narrow slice of data:
native English speakers reading English texts. Indeed, no comprehensive
multilingual analysis exists. We address this gap in the current literature by
investigating the relationship between surprisal and reading times in eleven
different languages, distributed across five language families. Deriving
estimates from language models trained on monolingual and multilingual corpora,
we test three predictions associated with surprisal theory: (i) whether
surprisal is predictive of reading times; (ii) whether expected surprisal, i.e.
contextual entropy, is predictive of reading times; and (iii) whether the
linking function between surprisal and reading times is linear. We find that
all three predictions are borne out crosslinguistically. By focusing on a more
diverse set of languages, we argue that these results offer the most robust
link to date between information theory and incremental language processing across languages.
Comment: This is a pre-MIT Press publication version of the paper.
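Prediction (iii) above concerns the shape of the linking function: under a linear link, each additional bit of surprisal adds a constant number of milliseconds to reading time. A minimal sketch of testing that with ordinary least squares, using made-up illustrative values rather than measurements from the paper:

```python
def ols_fit(xs, ys):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

# Hypothetical per-word data: surprisal in bits, reading time in ms.
surprisals = [1.0, 2.0, 3.0, 4.0]
reading_times = [210.0, 230.0, 250.0, 270.0]

intercept, slope = ols_fit(surprisals, reading_times)
print(intercept, slope)  # here: 190.0 ms baseline, +20.0 ms per bit
```

In practice such analyses compare the fit of a linear predictor against nonlinear alternatives (and control for word length and frequency); the sketch only shows the linear-link arithmetic itself.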