Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques
specifically developed for analyzing and understanding the inner-workings and
representations acquired by neural models of language. Approaches included:
systematic manipulation of input to neural networks and investigating the
impact on their performance, testing whether interpretable knowledge can be
decoded from intermediate representations acquired by neural networks,
proposing modifications to neural network architectures to make their knowledge
state or generated output more explainable, and examining the performance of
networks on simplified or formal languages. Here we review a number of
representative studies in each category.
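One of the categories above, testing whether interpretable knowledge can be decoded from intermediate representations, is commonly operationalized as a probing (diagnostic) classifier: a small model trained to predict a linguistic property from hidden states. A minimal sketch in Python, using synthetic vectors in place of real model states (the dimensions, data, and property are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 200 "hidden states" (dim 16), each labeled with a
# binary linguistic property (say, singular vs. plural). The states are
# synthetic: the property is encoded along one random direction.
dim, n = 16, 200
direction = rng.normal(size=dim)
labels = rng.integers(0, 2, size=n)
states = rng.normal(size=(n, dim)) + np.outer(labels * 2.0 - 1.0, direction)

def train_probe(X, y, lr=0.1, epochs=200):
    """Logistic-regression probe trained by plain gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# Train on the first 150 states, test whether the property is decodable
# from the held-out 50.
w, b = train_probe(states[:150], labels[:150])
preds = (states[150:] @ w + b > 0).astype(int)
accuracy = np.mean(preds == labels[150:])
```

High held-out accuracy is taken as evidence that the representation encodes the property; with real networks the states would come from a trained model's layers rather than being synthesized.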
From simple predicators to clausal functors: The English modals through time and the primitives of modality
The ultimate goal of this paper is to find a representation of modality compatible with some basic conditions on the syntax-semantics interface. Such conditions are anchored, for instance, in Chomsky's (1995) principle of full interpretation (FI). Abstract interpretation of modality, however, even if taken "only" in semantic terms, is already a hard nut to crack, far too vast to be dealt with in any comprehensive way here. What is pursued instead is a case-study-centered analysis. The case in point is the English modals (EM) viewed in their development through time, a locus classicus for a number of linguistic theories and frameworks. The idea is to start out from two lines of research, continuous grammaticalization versus cataclysmic change, and to explain some of their incongruities. The first non-trivial point here consists in deriving more fundamental questions from this research; the second, possibly even less trivial one consists in answering them. Specifically, I will argue that regardless of the actual numerical rate of change, there is an underlying and more structured way to account for the notions of change and continuity within the modal system.
A Type-coherent, Expressive Representation as an Initial Step to Language Understanding
A growing interest in tasks involving language understanding by the NLP
community has led to the need for effective semantic parsing and inference.
Modern NLP systems use semantic representations that do not quite fulfill the
nuanced needs for language understanding: adequately modeling language
semantics, enabling general inferences, and being accurately recoverable. This
document describes underspecified logical forms (ULF) for Episodic Logic (EL),
which is an initial form for a semantic representation that balances these
needs. ULFs fully resolve the semantic type structure while leaving issues such
as quantifier scope, word sense, and anaphora unresolved; they provide a
starting point for further resolution into EL, and enable certain structural
inferences without further resolution. This document also presents preliminary
results of creating a hand-annotated corpus of ULFs for the purpose of training
a precise ULF parser, showing a three-person pairwise interannotator agreement
of 0.88 on confident annotations. We hypothesize that a divide-and-conquer
approach to semantic parsing starting with derivation of ULFs will lead to
semantic analyses that do justice to subtle aspects of linguistic meaning, and
will enable construction of more accurate semantic parsers. Comment: Accepted for publication at the 13th International Conference on Computational Semantics (IWCS 2019).
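The reported three-person pairwise interannotator agreement can be illustrated with a toy computation. Treating it as the mean exact-match rate over all annotator pairs is an assumption here; the paper's exact protocol may differ, and the annotators and labels below are invented:

```python
from itertools import combinations

def pairwise_agreement(annotations):
    """Mean exact-match agreement over all annotator pairs.

    `annotations` maps annotator name -> list of labels, one per item;
    all lists must be aligned and of equal length.
    """
    scores = []
    for a, b in combinations(annotations.values(), 2):
        assert len(a) == len(b)
        scores.append(sum(x == y for x, y in zip(a, b)) / len(a))
    return sum(scores) / len(scores)

# Toy example: three annotators labeling five items.
toy = {
    "ann1": ["NP", "VP", "PP", "NP", "VP"],
    "ann2": ["NP", "VP", "PP", "NP", "NP"],
    "ann3": ["NP", "VP", "NP", "NP", "VP"],
}
agreement = pairwise_agreement(toy)  # (0.8 + 0.8 + 0.6) / 3
```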
Producing power-law distributions and damping word frequencies with two-stage language models
Standard statistical models of language fail to capture one of the most striking properties of natural languages: the power-law distribution in the frequencies of word tokens. We present a framework for developing statistical models that can generically produce power laws, breaking generative models into two stages. The first stage, the generator, can be any standard probabilistic model, while the second stage, the adaptor, transforms the word frequencies of this model to provide a closer match to natural language. We show that two commonly used Bayesian models, the Dirichlet-multinomial model and the Dirichlet process, can be viewed as special cases of our framework. We discuss two stochastic processes, the Chinese restaurant process and its two-parameter generalization based on the Pitman-Yor process, that can be used as adaptors in our framework to produce power-law distributions over word frequencies. We show that these adaptors justify common estimation procedures based on logarithmic or inverse-power transformations of empirical frequencies. In addition, taking the Pitman-Yor Chinese restaurant process as an adaptor justifies the appearance of type frequencies in formal analyses of natural language and improves the performance of a model for unsupervised learning of morphology.
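The two-stage scheme with a Pitman-Yor Chinese restaurant process adaptor can be sketched in a few lines. Here the generator is a trivial stand-in that simply mints fresh type names, and the parameter values are illustrative only; the adaptor decides, token by token, whether to reuse an existing word type or draw a new one from the generator:

```python
import random
from collections import Counter

def pitman_yor_crp(n_tokens, discount=0.5, concentration=1.0, seed=0):
    """Two-stage sketch: the adaptor is the two-parameter (Pitman-Yor)
    Chinese restaurant process; the "generator" just mints fresh types.

    With K types seen and i tokens emitted so far:
      P(new type)        = (concentration + discount * K) / (i + concentration)
      P(existing type w) = (count(w) - discount)          / (i + concentration)
    """
    rng = random.Random(seed)
    tokens, counts = [], Counter()
    for i in range(n_tokens):
        k = len(counts)
        if rng.random() < (concentration + discount * k) / (i + concentration):
            word = f"type{k}"  # generator supplies a fresh type
        else:
            # Reuse type w with probability proportional to count(w) - discount.
            r = rng.random() * (i - discount * k)
            for word, c in counts.items():
                r -= c - discount
                if r <= 0:
                    break
        counts[word] += 1
        tokens.append(word)
    return tokens, counts

tokens, counts = pitman_yor_crp(5000)
freqs = sorted(counts.values(), reverse=True)
```

The rich-get-richer reuse rule is what damps the generator's uniform behavior into a heavy-tailed, roughly power-law distribution of type frequencies.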
Counting the Bugs in ChatGPT's Wugs: A Multilingual Investigation into the Morphological Capabilities of a Large Language Model
Large language models (LLMs) have recently reached an impressive level of
linguistic capability, prompting comparisons with human language skills.
However, there have been relatively few systematic inquiries into the
linguistic capabilities of the latest generation of LLMs, and those studies
that do exist (i) ignore the remarkable ability of humans to generalize, (ii)
focus only on English, and (iii) investigate syntax or semantics and overlook
other capabilities that lie at the heart of human language, like morphology.
Here, we close these gaps by conducting the first rigorous analysis of the
morphological capabilities of ChatGPT in four typologically varied languages
(specifically, English, German, Tamil, and Turkish). We apply a version of
Berko's (1958) wug test to ChatGPT, using novel, uncontaminated datasets for
the four examined languages. We find that ChatGPT massively underperforms
purpose-built systems, particularly in English. Overall, our results -- through
the lens of morphology -- cast a new light on the linguistic capabilities of
ChatGPT, suggesting that claims of human-like language skills are premature and
misleading. Comment: EMNLP 202
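A wug-style evaluation of the kind described reduces to exact-match scoring of a model's inflections of nonce words against gold forms. A minimal sketch with invented items (these are illustrative examples, not drawn from the paper's datasets):

```python
# Nonce singulars paired with gold plurals; all items here are invented
# for illustration, including the irregular-style gold form.
items = [
    ("wug", "wugs"),
    ("blicket", "blickets"),
    ("tass", "tasses"),
    ("heaf", "heaves"),
]

def score(model_answers, gold_items):
    """Fraction of nonce words the model inflects exactly as the gold form."""
    correct = sum(
        model_answers.get(singular) == plural for singular, plural in gold_items
    )
    return correct / len(gold_items)

# A hypothetical baseline that always appends "-s": it gets the two
# regular items right and misses "tasses" and "heaves".
naive = {singular: singular + "s" for singular, _ in items}
accuracy = score(naive, items)
```

In the study itself the answers would come from prompting an LLM with each nonce word, with items controlled so they cannot appear in the model's training data.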
Learning Input Strictly Local Functions: Comparing Approaches with Catalan Adjectives
Backtracking Counterfactuals Revisited
I discuss three observations about backtracking counterfactuals not predicted by existing theories, and then motivate a theory of counterfactuals that does predict them. On my theory, counterfactuals quantify over a suitably restricted set of historical possibilities from some contextually relevant past time. I motivate each feature of the theory relevant to predicting our three observations about backtracking counterfactuals. The paper concludes with replies to three potential objections.
A semantic theory of a subset of qualifying "as" phrases in English
Landman (1989) introduced contemporary linguistics to the as-phrase. An as-phrase
is a qualifier, introduced in English by "as." "John is corrupt as a judge," for instance,
contains the as-phrase "as a judge." Philosophical discourse is full of examples of
as-phrase sentences. Their presence can make it difficult to distinguish valid from
invalid arguments, a perennial concern for philosophers. Landman proposed the first
formal semantic theory of as-phrases, based on a set of seven intuitively-valid patterns
of inference involving as-phrases. Szabó (2003), Jaeger (2003), and Asher (2011) each
attempt to improve upon Landman's theory.
Chapter 1 reviews and criticizes a temporal account of as-phrase semantics,
while tracing some precedents and motivations for my approach. Chapters 2-3 criticize
Szabó's and Asher's theories. Szabó's theory has problems handling the future
tense and intensional contexts. Asher's complex theory solves these problems, but
resorts to the obscure notions of relative identity and bare particulars.
Chapter 4 argues that neither Szabó's nor Asher's theory is clearly superior, because
implicitly, they focus on different classes of sentences, which I call "Type A" and
"Type B." From John Bowers' syntactic research, I argue that the element common
to Type A and Type B is Pr, a predication head pronounced "as" in some contexts.
Chapter 5 develops a formal semantic theory tailored to Type A sentences that solves
the problems of Szabó's theory while avoiding Asher's assumptions. On my approach,
the semantic properties of Type A sentences resolve into an interaction among generic
quantifiers, determiner-phrase interpretation, and one core quantifier based on a principal
ultrafilter. It is the interaction effects of these elements that give rise to the many
unusual readings we find in these as-phrase sentences. This result supports my motivating
view that linguistic research helps to solve semantic problems of philosophical
interest.