Lambek vs. Lambek: Functorial Vector Space Semantics and String Diagrams for Lambek Calculus
The Distributional Compositional Categorical (DisCoCat) model is a
mathematical framework that provides compositional semantics for meanings of
natural language sentences. It consists of a computational procedure for
constructing meanings of sentences, given their grammatical structure in terms
of compositional type-logic, and given the empirically derived meanings of
their words. For the particular case that the meaning of words is modelled
within a distributional vector space model, its experimental predictions,
derived from real large scale data, have outperformed other empirically
validated methods that could build vectors for a full sentence. This success
can be attributed to a conceptually motivated mathematical underpinning, by
integrating qualitative compositional type-logic and quantitative modelling of
meaning within a category-theoretic mathematical framework.
The type-logic used in the DisCoCat model is Lambek's pregroup grammar.
Pregroup types form a posetal compact closed category, which can be passed, in
a functorial manner, on to the compact closed structure of vector spaces,
linear maps and tensor product. The diagrammatic versions of the equational
reasoning in compact closed categories can be interpreted as the flow of word
meanings within sentences. Pregroups simplify Lambek's previous type-logic, the
Lambek calculus, which has been extensively used to formalise and reason about
various linguistic phenomena. The apparent reliance of the DisCoCat model on
pregroups has been seen as a shortcoming. This paper addresses this concern, by
pointing out that one may as well realise a functorial passage from the
original type-logic of Lambek, a monoidal bi-closed category, to vector spaces,
or to any other model of meaning organised within a monoidal bi-closed
category. The corresponding string diagram calculus, due to Baez and Stay, now
depicts the flow of word meanings.
Comment: 29 pages, pending publication in Annals of Pure and Applied Logic
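The functorial passage from pregroup types to vector spaces described above can be illustrated with a minimal sketch. It assumes the usual pregroup typing of a transitive sentence (noun type n for subject and object, n^r s n^l for the verb), so the verb becomes a tensor in N (x) S (x) N and the two grammatical reductions become contractions with the subject and object vectors. The function name `sentence_meaning` and all numbers are hypothetical toy values, not data from the paper.

```python
def sentence_meaning(subject, verb, obj):
    """Contract a transitive verb tensor with subject and object vectors.

    verb[i][j][k] is a component of a tensor in N (x) S (x) N, where N is the
    noun space and S the sentence space. The result is a vector in S.
    """
    sentence_dim = len(verb[0])
    result = [0.0] * sentence_dim
    for i, s_i in enumerate(subject):
        for j in range(sentence_dim):
            for k, o_k in enumerate(obj):
                # Contracting index i with the subject and index k with the
                # object mirrors the two pregroup reductions n . n^r and n^l . n.
                result[j] += s_i * verb[i][j][k] * o_k
    return result


# Toy example (assumed values): 2-dimensional noun and sentence spaces.
subject = [1.0, 0.0]
obj = [0.0, 1.0]
verb = [
    [[0.1, 0.9], [0.5, 0.5]],
    [[0.3, 0.7], [0.2, 0.8]],
]
meaning = sentence_meaning(subject, verb, obj)
```

With basis vectors as subject and object, the contraction simply reads off the entries verb[0][j][1], which makes the "flow" of word meanings through the verb easy to check by hand.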
Grammatical structures and logical deductions
The three essays presented here concern natural connections between grammatical derivations and structures provided by certain standard grammar formalisms, on the one hand, and deductions in logical systems, on the other hand. In the first essay we analyse the adequacy of Polish notation for higher-order languages. The Ajdukiewicz algorithm (Ajdukiewicz 1935) is discussed in terms of generalized MP-deductions. We exhibit a failure in Ajdukiewicz’s original version of the algorithm and give a correct one; we prove that generalized MP-deductions have the frontier property, which is essential for the plausibility of Polish notation. The second essay deals with logical systems corresponding to different grammar formalisms, such as Finite State Acceptors, Context-Free Grammars, Categorial Grammars, and others. We show how logical methods can be used to establish certain linguistically significant properties of formal grammars. The third essay discusses the interplay between Natural Deduction proofs in grammar-oriented logics and semantic structures expressible by typed lambda terms and combinators.
Reasoning with Polarity in Categorial Type Logic
The research presented in this thesis follows the parsing as deduction approach to
linguistics. We use the tools of Categorial Type Logic (CTL) to study the interface of
natural language syntax and semantics. Our aim is to investigate the mathematical
structure of CTL and explore the possibilities it offers for analyzing natural language
structures and their interpretation.
The thesis is divided into three parts. Each of them has an introductory chapter.
In Chapter 1, we introduce the background assumptions of the categorial approach in
linguistics, and we sketch the developments that have led to the introduction of CTL.
We discuss the motivation for using logical methods in linguistic analysis. In Chapter 3,
we propose our view on the use of unary modalities as 'logical features'. In Chapter 5,
we set up a general notion of grammatical composition taking into account the form
and the meaning dimensions of linguistic expressions. We develop a logical theory of
licensing and antilicensing relations that cross-cuts the form and meaning dimensions.
Throughout the thesis we focus attention on polarity. This term refers both to the
polarity of the logical operators of CTL and to the polarity items one finds in natural
language, which, furthermore, are closely connected to natural reasoning. Therefore,
the title of this thesis, Reasoning with Polarity in Categorial Type Logic, is intended to
express three meanings.
Firstly, we reason with the polarity of the logical operators of CTL and study their
derivability patterns. In Chapter 2, we explore the algebraic principles that govern
the behavior of the type-forming operations of the Lambek calculus. We extend the
categorial vocabulary with downward entailing unary operations obtaining the full
toolkit that we use in the rest of the thesis. We employ unary operators to encode and
compute monotonicity information (Chapter 4), to account for the different ways of scope
taking of generalized quantifiers (Chapter 6), and to model licensing and antilicensing
relations (Chapter 7).
Secondly, in Chapter 4, we model natural reasoning inferences drawn from structures
suitable for negative polarity item occurrences. In particular, we describe a system
of inference based on CTL. By decorating functional types with unary operators we
encode the semantic distinction between upward and downward monotone functions.
Moreover, we study the advantages of this encoding by exploring the contribution of
monotone functions to the study of natural reasoning and to the analysis of the syntactic
distribution of negative polarity items.
Thirdly, in Chapter 7, we study the distribution of polarity-sensitive expressions. We
show how our theory of licensing and antilicensing relations successfully differentiates
between negative polarity items, which are 'attracted' by their triggers, and positive
polarity items, which are 'repelled' by them. We investigate these compatibility and
incompatibility relations from a cross-linguistic perspective, and show how we reduce
distributional differences between polarity-sensitive items in Dutch, Greek and Italian
to differences in the lexical type assignments of these languages.
A Study on Learnability for Rigid Lambek Grammars
We present the basic notions of Gold's "learnability in the limit" paradigm, first presented in 1967, a formalization of the cognitive process by which native speakers come to grasp the underlying grammar of their native language by being exposed to well-formed sentences generated by that grammar. Then we present Lambek grammars, a formalism derived from categorial grammars which, although not as expressive as needed for a full formalization of natural languages, is particularly well suited to implementing a natural interface between syntax and semantics. In the last part of this work, we present a learnability result for Rigid Lambek grammars from structured examples.
Many Valued Generalised Quantifiers for Natural Language in the DisCoCat Model
DisCoCat refers to the Categorical compositional distributional model of natural language, which combines the statistical vector space models of words with the compositional logic-based models of grammar. It is fair to say that despite existing work on incorporating notions of entailment, quantification, and coordination in this setting, a uniform modelling of logical operations is still an open problem. In this report, we take a step towards an answer. We show how one can generalise our previous DisCoCat model of generalised quantifiers from the category of sets and relations to the category of sets and many valued relations. As a result, we get a fuzzy version of these quantifiers. Our aim is to extend this model to all other logical connectives and develop a fuzzy logic for DisCoCat. The main contributions are showing that the category of many valued relations is compact closed, defining appropriate bialgebra structures over it, and demonstrating how one can compute within this setting many valued meanings for quantified sentences.
EPSRC Career Acceleration Fellowship EP/J002607/
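To make composition in a category of many valued relations concrete, here is a minimal sketch. It assumes truth values in the interval [0, 1], with min as conjunction and max as disjunction (the common max-min composition of fuzzy relations); the report may work with a different structure of truth values. The name `compose` and the sample relations are hypothetical.

```python
def compose(r, s):
    """Max-min composition of two many valued relations.

    Each relation is a dict mapping a pair (a, b) to a degree in [0, 1].
    The degree of (a, c) in the composite is the maximum, over all middle
    elements b, of min(r[(a, b)], s[(b, c)]).
    """
    out = {}
    for (a, b), v in r.items():
        for (b2, c), w in s.items():
            if b == b2:
                out[(a, c)] = max(out.get((a, c), 0.0), min(v, w))
    return out


# Toy relations (assumed values).
r = {("x", "y"): 0.8, ("x", "z"): 0.3}
s = {("y", "w"): 0.5, ("z", "w"): 0.9}
composite = compose(r, s)
```

When every degree is either 0 or 1, this reduces to the ordinary composition of relations, which is why the many valued setting can be read as a generalisation of the sets-and-relations model.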