
    Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop

    The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques specifically developed for analyzing and understanding the inner workings and representations acquired by neural models of language. Approaches included: systematically manipulating the input to neural networks and investigating the impact on their performance; testing whether interpretable knowledge can be decoded from intermediate representations acquired by neural networks; proposing modifications to neural network architectures to make their knowledge state or generated output more explainable; and examining the performance of networks on simplified or formal languages. Here we review a number of representative studies in each category.
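    The second of these approaches, decoding interpretable knowledge from intermediate representations, is often implemented with a probing (diagnostic) classifier. The sketch below is illustrative only, with hypothetical data, dimensions, and labels rather than any specific workshop paper's setup: a simple classifier is trained to predict a linguistic property from a model's hidden states, and above-chance accuracy is taken as evidence that the property is decodable from those representations.

    # Probing-classifier sketch (illustrative; data, dimensions, and labels are hypothetical).
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Stand-ins for hidden states extracted from some layer of a neural model,
    # paired with gold linguistic labels (e.g. part-of-speech tags) per token.
    hidden_states = rng.normal(size=(1000, 768))
    pos_labels = rng.integers(0, 5, size=1000)

    X_train, X_test, y_train, y_test = train_test_split(
        hidden_states, pos_labels, test_size=0.2, random_state=0)

    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    # Compare against a majority-class baseline; with random data this stays near chance.
    print("probe accuracy:", probe.score(X_test, y_test))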

    From simple predicators to clausal functors: The English modals through time and the primitives of modality

    The ultimate goal of this paper is to find a representation of modality compatible with some basic conditions on the syntax-semantics interface. Such conditions are anchored, for instance, in Chomsky's (1995) principle of full interpretation (FI). The abstract interpretation of modality, however, even "only" in semantic terms, is already a hard nut to crack, far too vast to be dealt with in any comprehensive way here. What is pursued instead is a case-study-centered analysis. The case in point is the English modals (EM), viewed in their development through time, a locus classicus for a number of linguistic theories and frameworks. The idea is to start out from two lines of research, continuous grammaticalization vs. cataclysmic change, and to explain some of their incongruities. The first non-trivial point here consists in deriving more fundamental questions from this research. The second, possibly even less trivial one, consists in answering them. Specifically, I will argue that regardless of the actual numerical rate of change, there is an underlying and more structured way to account for the notions of change and continuity within the modal system.

    A Type-coherent, Expressive Representation as an Initial Step to Language Understanding

    A growing interest in tasks involving language understanding by the NLP community has led to the need for effective semantic parsing and inference. Modern NLP systems use semantic representations that do not quite fulfill the nuanced needs of language understanding: adequately modeling language semantics, enabling general inferences, and being accurately recoverable. This document describes underspecified logical forms (ULF) for Episodic Logic (EL), an initial form for a semantic representation that balances these needs. ULFs fully resolve the semantic type structure while leaving issues such as quantifier scope, word sense, and anaphora unresolved; they provide a starting point for further resolution into EL and enable certain structural inferences without further resolution. This document also presents preliminary results of creating a hand-annotated corpus of ULFs for the purpose of training a precise ULF parser, showing a three-person pairwise interannotator agreement of 0.88 on confident annotations. We hypothesize that a divide-and-conquer approach to semantic parsing starting with derivation of ULFs will lead to semantic analyses that do justice to subtle aspects of linguistic meaning, and will enable construction of more accurate semantic parsers. Comment: Accepted for publication at the 13th International Conference on Computational Semantics (IWCS 2019).
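    The reported figure can be illustrated with a toy computation of mean pairwise exact-match agreement among three annotators. This is only a sketch under assumed data and an assumed metric; the paper's actual agreement measure and annotations are not reproduced here.

    # Toy three-person pairwise agreement (illustrative; data and metric are assumptions).
    from itertools import combinations

    annotations = {
        "annotator_1": ["a", "b", "a", "c", "a"],
        "annotator_2": ["a", "b", "a", "c", "b"],
        "annotator_3": ["a", "b", "b", "c", "a"],
    }

    def exact_match(x, y):
        # Fraction of items on which two annotators chose the same label.
        return sum(xi == yi for xi, yi in zip(x, y)) / len(x)

    pairwise = [exact_match(annotations[p], annotations[q])
                for p, q in combinations(annotations, 2)]
    print("mean pairwise agreement:", sum(pairwise) / len(pairwise))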

    Producing power-law distributions and damping word frequencies with two-stage language models

    Standard statistical models of language fail to capture one of the most striking properties of natural languages: the power-law distribution in the frequencies of word tokens. We present a framework for developing statistical models that can generically produce power laws, breaking generative models into two stages. The first stage, the generator, can be any standard probabilistic model, while the second stage, the adaptor, transforms the word frequencies of this model to provide a closer match to natural language. We show that two commonly used Bayesian models, the Dirichlet-multinomial model and the Dirichlet process, can be viewed as special cases of our framework. We discuss two stochastic processes, the Chinese restaurant process and its two-parameter generalization based on the Pitman-Yor process, that can be used as adaptors in our framework to produce power-law distributions over word frequencies. We show that these adaptors justify common estimation procedures based on logarithmic or inverse-power transformations of empirical frequencies. In addition, taking the Pitman-Yor Chinese restaurant process as an adaptor justifies the appearance of type frequencies in formal analyses of natural language and improves the performance of a model for unsupervised learning of morphology.
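    The generator-plus-adaptor idea can be sketched in a short simulation: the generator proposes word types, and a Pitman-Yor Chinese restaurant process adaptor decides, token by token, whether to reuse an existing "table" (and hence its word) or to seat a new one labelled by a fresh draw from the generator. The generator below is a toy uniform lexicon and the parameter values are illustrative, not the models or settings from the paper.

    # Two-stage sketch: a trivial generator adapted by a Pitman-Yor Chinese restaurant process.
    import random
    from collections import Counter

    random.seed(0)

    def generator():
        # Stage 1: any probabilistic model over word types; here a uniform toy lexicon.
        return "w" + str(random.randrange(10000))

    def pitman_yor_tokens(n_tokens, discount=0.8, concentration=1.0):
        # Stage 2: the adaptor. Existing tables are reused with probability
        # proportional to (count - discount); a new table opens with probability
        # proportional to (concentration + discount * number_of_tables).
        table_words, table_counts, tokens, total = [], [], [], 0
        for _ in range(n_tokens):
            p_new = (concentration + discount * len(table_words)) / (concentration + total)
            if not table_words or random.random() < p_new:
                table_words.append(generator())      # new table labelled by a generator draw
                table_counts.append(1)
                tokens.append(table_words[-1])
            else:
                weights = [c - discount for c in table_counts]
                k = random.choices(range(len(table_words)), weights=weights)[0]
                table_counts[k] += 1
                tokens.append(table_words[k])
            total += 1
        return tokens

    freqs = Counter(pitman_yor_tokens(20000))
    print(freqs.most_common(5))  # a few very frequent types and a long tail of rare ones

    Plotting the resulting rank-frequency curve on log-log axes shows the heavy-tailed, roughly power-law behaviour that the adaptor is designed to produce.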

    Counting the Bugs in ChatGPT's Wugs: A Multilingual Investigation into the Morphological Capabilities of a Large Language Model

    Large language models (LLMs) have recently reached an impressive level of linguistic capability, prompting comparisons with human language skills. However, there have been relatively few systematic inquiries into the linguistic capabilities of the latest generation of LLMs, and those studies that do exist (i) ignore the remarkable ability of humans to generalize, (ii) focus only on English, and (iii) investigate syntax or semantics and overlook other capabilities that lie at the heart of human language, like morphology. Here, we close these gaps by conducting the first rigorous analysis of the morphological capabilities of ChatGPT in four typologically varied languages (specifically, English, German, Tamil, and Turkish). We apply a version of Berko's (1958) wug test to ChatGPT, using novel, uncontaminated datasets for the four examined languages. We find that ChatGPT massively underperforms purpose-built systems, particularly in English. Overall, our results, viewed through the lens of morphology, cast a new light on the linguistic capabilities of ChatGPT, suggesting that claims of human-like language skills are premature and misleading. Comment: EMNLP 202
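    Berko-style wug items test whether a speaker, or a model, can productively inflect nonce words it cannot have memorized. The snippet below shows one way such prompts might be constructed for an English plural and past-tense test; the nonce words and prompt wording are invented for illustration and are not the paper's uncontaminated datasets.

    # Toy wug-test prompt builder (nonce words and wording are hypothetical).
    NONCE_NOUNS = ["wug", "blick", "torp"]
    NONCE_VERBS = ["glorp", "spling", "dax"]

    def plural_prompt(noun):
        return (f"This is a {noun}. Now there are two of them. "
                f"There are two ___. Answer with one word.")

    def past_prompt(verb):
        return (f"Every day I {verb}. Yesterday I did the same thing: yesterday I ___. "
                f"Answer with one word.")

    prompts = [plural_prompt(n) for n in NONCE_NOUNS] + [past_prompt(v) for v in NONCE_VERBS]
    for p in prompts:
        print(p)
    # Model completions can then be scored against the forms predicted by productive
    # morphological rules (e.g. "wugs", "glorped").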

    Backtracking Counterfactuals Revisited

    I discuss three observations about backtracking counterfactuals not predicted by existing theories, and then motivate a theory of counterfactuals that does predict them. On my theory, counterfactuals quantify over a suitably restricted set of historical possibilities from some contextually relevant past time. I motivate each feature of the theory relevant to predicting our three observations about backtracking counterfactuals. The paper concludes with replies to three potential objections.

    A semantic theory of a subset of qualifying "as" phrases in English

    Landman (1989) introduced contemporary linguistics to the as-phrase. An as-phrase is a qualifier, introduced in English by "as." "John is corrupt as a judge," for instance, contains the as-phrase "as a judge." Philosophical discourse is full of examples of as-phrase sentences. Their presence can make it difficult to distinguish valid from invalid arguments, a perennial concern for philosophers. Landman proposed the first formal semantic theory of as-phrases, based on a set of seven intuitively valid patterns of inference involving as-phrases. Szabó (2003), Jaeger (2003), and Asher (2011) each attempt to improve upon Landman's theory. Chapter 1 reviews and criticizes a temporal account of as-phrase semantics, while tracing some precedents and motivations for my approach. Chapters 2-3 criticize Szabó's and Asher's theories. Szabó's theory has problems handling the future tense and intensional contexts. Asher's complex theory solves these problems, but resorts to the obscure notions of relative identity and bare particulars. Chapter 4 argues that neither Szabó's nor Asher's theory is clearly superior, because, implicitly, they focus on different classes of sentences, which I call "Type A" and "Type B." Drawing on John Bowers' syntactic research, I argue that the element common to Type A and Type B is Pr, a predication head pronounced "as" in some contexts. Chapter 5 develops a formal semantic theory tailored to Type A sentences that solves the problems of Szabó's theory while avoiding Asher's assumptions. On my approach, the semantic properties of Type A sentences resolve into an interaction among generic quantifiers, determiner-phrase interpretation, and one core quantifier based on a principal ultrafilter. It is the interaction effects of these elements that give rise to the many unusual readings we find in these as-phrase sentences. This result supports my motivating view that linguistic research helps to solve semantic problems of philosophical interest.
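    For reference, the "principal ultrafilter" mentioned above is a standard notion: the ultrafilter over a domain generated by a single individual. A textbook formulation, not the dissertation's own definition, is:

    % Principal ultrafilter over a domain D generated by the individual a:
    % U_a collects exactly the subsets of D that contain a.
    U_a = \{\, X \subseteq D : a \in X \,\}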