An Inheritance-Based Theory of the Lexicon in Combinatory Categorial Grammar

Abstract

Institute for Communicating and Collaborative Systems

This thesis proposes an extended version of the Combinatory Categorial Grammar (CCG) formalism, with the following features:

1. grammars incorporate inheritance hierarchies of lexical types, defined over a simple, feature-based constraint language
2. CCG lexicons are, or at least can be, functions from forms to these lexical types

This formalism, which I refer to as ‘inheritance-driven’ CCG (I-CCG), is conceptualised as a partially model-theoretic system, involving a distinction between category descriptions and their underlying category models, with these two notions being related by logical satisfaction. I argue that the I-CCG formalism retains all the advantages of both the core CCG framework and proposed generalisations involving such things as multiset categories, unary modalities or typed feature structures. In addition, I-CCG:

1. provides non-redundant lexicons for human languages
2. captures a range of well-known implicational word order universals in terms of an acquisition-based preference for shorter grammars

This thesis proceeds as follows:

Chapter 2 introduces the ‘baseline’ CCG formalism, which incorporates just the essential elements of category notation, without any of the proposed extensions.

Chapter 3 reviews parts of the CCG literature dealing with linguistic competence in its most general sense, showing how the formalism predicts a number of language universals in terms of either its restricted generative capacity or the prioritisation of simpler lexicons.

Chapter 4 analyses the first motivation for generalising the baseline category notation, demonstrating how certain fairly simple implicational word order universals are not formally predicted by baseline CCG, although they intuitively do involve considerations of grammatical economy.

Chapter 5 examines the second motivation underlying many of the customised CCG category notations: to reduce lexical redundancy, thus allowing for the construction of lexicons which assign (each sense of) open class words and morphemes to no more than one lexical category, itself denoted by a non-composite lexical type.

Chapter 6 defines the I-CCG formalism, incorporating into the notion of a CCG grammar both a type hierarchy of saturated category symbols and an inheritance hierarchy of constrained lexical types. The constraint language is a simple, feature-based, highly underspecified notation, interpreted against an underlying notion of category models. This latter point is crucial, since it allows us to abstract away from any particular inference procedure and focus on the category notation itself. I argue that the partially model-theoretic I-CCG formalism solves the lexical redundancy problem fairly definitively, thereby subsuming all the other proposed variant category notations.

Chapter 7 demonstrates that the I-CCG formalism also provides the beginnings of a theory of the CCG lexicon in a stronger sense: with just a small number of substantive assumptions about types, it can be shown to formally predict many implicational word order universals in terms of an acquisition-based preference for simpler lexical inheritance hierarchies, i.e. those with fewer types and fewer constraints.

Chapter 8 concludes the thesis.
