
    Lambek vs. Lambek: Functorial Vector Space Semantics and String Diagrams for Lambek Calculus

    The Distributional Compositional Categorical (DisCoCat) model is a mathematical framework that provides compositional semantics for meanings of natural language sentences. It consists of a computational procedure for constructing meanings of sentences, given their grammatical structure in terms of compositional type-logic, and given the empirically derived meanings of their words. For the particular case that the meaning of words is modelled within a distributional vector space model, its experimental predictions, derived from real large scale data, have outperformed other empirically validated methods that could build vectors for a full sentence. This success can be attributed to a conceptually motivated mathematical underpinning, by integrating qualitative compositional type-logic and quantitative modelling of meaning within a category-theoretic mathematical framework. The type-logic used in the DisCoCat model is Lambek's pregroup grammar. Pregroup types form a posetal compact closed category, which can be passed, in a functorial manner, on to the compact closed structure of vector spaces, linear maps and tensor product. The diagrammatic versions of the equational reasoning in compact closed categories can be interpreted as the flow of word meanings within sentences. Pregroups simplify Lambek's previous type-logic, the Lambek calculus, which has been extensively used to formalise and reason about various linguistic phenomena. The apparent reliance of DisCoCat on pregroups has been seen as a shortcoming. This paper addresses this concern by pointing out that one may as well realise a functorial passage from the original type-logic of Lambek, a monoidal bi-closed category, to vector spaces, or to any other model of meaning organised within a monoidal bi-closed category. The corresponding string diagram calculus, due to Baez and Stay, now depicts the flow of word meanings. Comment: 29 pages, pending publication in Annals of Pure and Applied Logic

    A Generalised Quantifier Theory of Natural Language in Categorical Compositional Distributional Semantics with Bialgebras

    Categorical compositional distributional semantics is a model of natural language; it combines the statistical vector space models of words with the compositional models of grammar. We formalise in this model the generalised quantifier theory of natural language, due to Barwise and Cooper. The underlying setting is a compact closed category with bialgebras. We start from a generative grammar formalisation and develop an abstract categorical compositional semantics for it, then instantiate the abstract setting to sets and relations and to finite dimensional vector spaces and linear maps. We prove the equivalence of the relational instantiation to the truth theoretic semantics of generalised quantifiers. The vector space instantiation formalises the statistical usages of words and enables us to, for the first time, reason about quantified phrases and sentences compositionally in distributional semantics.
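
The truth-theoretic reading of generalised quantifiers mentioned in the abstract above can be illustrated with a minimal sketch. It treats each determiner as a relation between two sets (the noun's extension and the verb phrase's extension). The function names and the toy sets are hypothetical and chosen only for illustration; this is not the paper's bialgebra construction.

```python
# Generalised quantifiers as relations between sets (toy illustration).
# All names and data here are assumptions made for this sketch.

def every(restrictor: set, scope: set) -> bool:
    """'every A B' holds when A is a subset of B."""
    return restrictor <= scope

def some(restrictor: set, scope: set) -> bool:
    """'some A B' holds when A and B share at least one element."""
    return bool(restrictor & scope)

def no(restrictor: set, scope: set) -> bool:
    """'no A B' holds when A and B are disjoint."""
    return not (restrictor & scope)

def most(restrictor: set, scope: set) -> bool:
    """'most A B' holds when more As are Bs than are not."""
    return len(restrictor & scope) > len(restrictor - scope)

# Toy extensions for a noun and an intransitive verb.
humans = {"alice", "bob", "carol"}
sleepers = {"alice", "bob", "rex"}

def sentence(quantifier, noun: set, verb: set) -> bool:
    """Truth value of a sentence of the form 'Q noun verb'."""
    return quantifier(noun, verb)

if __name__ == "__main__":
    for q in (every, some, no, most):
        print(q.__name__, sentence(q, humans, sleepers))
```

In this toy model "some humans sleep" and "most humans sleep" are true, while "every human sleeps" and "no human sleeps" are false.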

    A Labelled Analytic Theorem Proving Environment for Categorial Grammar

    We present a system for the investigation of computational properties of categorial grammar parsing based on a labelled analytic tableaux theorem prover. This proof method allows us to take a modular approach, in which the basic grammar can be kept constant, while a range of categorial calculi can be captured by assigning different properties to the labelling algebra. The theorem proving strategy is particularly well suited to the treatment of categorial grammar, because it allows us to distribute the computational cost between the algorithm which deals with the grammatical types and the algebraic checker which constrains the derivation. Comment: 11 pages, LaTeX2e, uses examples.sty and a4wide.sty

    Mathematical Foundations for a Compositional Distributional Model of Meaning

    We propose a mathematical framework for a unification of the distributional theory of meaning in terms of vector space models, and a compositional theory for grammatical types, for which we rely on the algebra of Pregroups, introduced by Lambek. This mathematical framework enables us to compute the meaning of a well-typed sentence from the meanings of its constituents. Concretely, the type reductions of Pregroups are 'lifted' to morphisms in a category, a procedure that transforms meanings of constituents into a meaning of the (well-typed) whole. Importantly, meanings of whole sentences live in a single space, independent of the grammatical structure of the sentence. Hence the inner product can be used to compare meanings of arbitrary sentences, as it is for comparing the meanings of words in the distributional model. The mathematical structure we employ admits a purely diagrammatic calculus which exposes how the information flows between the words in a sentence in order to make up the meaning of the whole sentence. A variation of our 'categorical model' which involves constraining the scalars of the vector spaces to the semiring of Booleans results in a Montague-style Boolean-valued semantics. Comment: to appear
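
The procedure described in the abstract above (reduce the pregroup types, then apply the matching contractions to the word meanings) can be sketched in a few lines. The encoding is an assumption of this sketch: a simple type is a pair `(base, z)` with `z = -1` for a left adjoint, `0` for a plain type and `1` for a right adjoint, and the greedy stack reduction only covers contractions of adjacent pairs. The word vectors and the verb tensor are invented toy data, not empirical values.

```python
# Pregroup type reduction plus the corresponding tensor contraction
# for a transitive sentence. Encoding and data are illustrative assumptions.

def reduce_types(types):
    """Greedily cancel adjacent pairs a^(z) a^(z+1), e.g. n n^r or n^l n."""
    stack = []
    for base, z in types:
        if stack and stack[-1] == (base, z - 1):
            stack.pop()              # contraction a^(z) . a^(z+1) -> 1
        else:
            stack.append((base, z))
    return stack

NOUN = [("n", 0)]
TRANSITIVE_VERB = [("n", 1), ("s", 0), ("n", -1)]   # n^r s n^l

def sentence_vector(subj, verb, obj):
    """Contract subject and object with a verb tensor indexed verb[i][s][j]."""
    s_dim = len(verb[0])
    return [
        sum(subj[i] * verb[i][s][j] * obj[j]
            for i in range(len(subj))
            for j in range(len(obj)))
        for s in range(s_dim)
    ]

def inner(u, v):
    """Inner product used to compare two sentence meanings."""
    return sum(a * b for a, b in zip(u, v))

# Toy two-dimensional noun space and two-dimensional sentence space.
dogs = [1.0, 0.0]
cats = [0.0, 1.0]
chase = [
    [[0, 1], [0, 0]],   # chase[0][s][j]: subject component 0
    [[0, 0], [1, 0]],   # chase[1][s][j]: subject component 1
]

if __name__ == "__main__":
    print(reduce_types(NOUN + TRANSITIVE_VERB + NOUN))
    print(sentence_vector(dogs, chase, cats))
    print(sentence_vector(cats, chase, dogs))
```

Both sentences reduce to the single type `s`, so their vectors live in the same space and can be compared with `inner` regardless of which words they contain.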

    Comparing and evaluating extended Lambek calculi

    Lambek's Syntactic Calculus, commonly referred to as the Lambek calculus, was innovative in many ways, notably as a precursor of linear logic. But it also showed that we could treat our grammatical framework as a logic (as opposed to a logical theory). However, though it was successful in giving at least a basic treatment of many linguistic phenomena, it was also clear that a slightly more expressive logical calculus was needed for many other cases. Therefore, many extensions and variants of the Lambek calculus have been proposed, since the eighties and up until the present day. As a result, there is now a large class of calculi, each with its own empirical successes and theoretical results, but also each with its own logical primitives. This raises the question: how do we compare and evaluate these different logical formalisms? To answer this question, I present two unifying frameworks for these extended Lambek calculi. Both are proof net calculi with graph contraction criteria. The first calculus is a very general system: you specify the structure of your sequents and it gives you the connectives and contractions which correspond to it. The calculus can be extended with structural rules, which translate directly into graph rewrite rules. The second calculus is first-order (multiplicative intuitionistic) linear logic, which turns out to have several other, independently proposed extensions of the Lambek calculus as fragments. I will illustrate the use of each calculus in building bridges between analyses proposed in different frameworks, in highlighting differences and in helping to identify problems. Comment: Empirical advances in categorial grammars, Aug 2015, Barcelona, Spain. 201

    Coalgebraic completeness-via-canonicity for distributive substructural logics

    We prove strong completeness of a range of substructural logics with respect to a natural poset-based relational semantics using a coalgebraic version of completeness-via-canonicity. By formalizing the problem in the language of coalgebraic logics, we develop a modular theory which covers a wide variety of different logics under a single framework, and lends itself to further extensions. Moreover, we believe that the coalgebraic framework provides a systematic and principled way to study the relationship between resource models on the semantics side, and substructural logics on the syntactic side. Comment: 36 pages

    Context Update for Lambdas and Vectors

    Vector models of language are based on the contextual aspects of words and how they co-occur in text. Truth conditional models focus on the logical aspects of language, the denotations of phrases, and their compositional properties. In the latter approach the denotation of a sentence determines its truth conditions and can be taken to be a truth value, a set of possible worlds, a context change potential, or similar. In this short paper, we develop a vector semantics for language based on the simply typed lambda calculus. Our semantics uses techniques familiar from the truth conditional tradition and is based on a form of dynamic interpretation inspired by Heim's context updates.

    A Polynomial-Time Algorithm for the Lambek Calculus with Brackets of Bounded Order

    Lambek calculus is a logical foundation of categorial grammar, a linguistic paradigm of grammar as logic and parsing as deduction. Pentus (2010) gave a polynomial-time algorithm for determining provability of bounded depth formulas in L*, the Lambek calculus with empty antecedents allowed. Pentus' algorithm is based on tabularisation of proof nets. Lambek calculus with brackets is a conservative extension of Lambek calculus with bracket modalities, suitable for the modeling of syntactical domains. In this paper we give an algorithm for provability in Lb*, the Lambek calculus with brackets allowing empty antecedents. Our algorithm runs in polynomial time when both the formula depth and the bracket nesting depth are bounded. It combines a Pentus-style tabularisation of proof nets with an automata-theoretic treatment of bracketing.
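
To make "parsing as deduction" concrete, here is a minimal sketch of a provability check for the product-free Lambek calculus by naive cut-free sequent proof search. It is not the proof-net tabularisation of Pentus or the algorithm of this paper: it has no bracket modalities and its worst-case running time is exponential. The formula encoding (`over`, `under`) and the `allow_empty` flag, which switches between L and L*, are assumptions of this sketch.

```python
from functools import lru_cache

# Formulas: an atom is a string; over(a, b) is a/b; under(a, b) is a\b.
def over(a, b):
    return ("/", a, b)

def under(a, b):
    return ("\\", a, b)

@lru_cache(maxsize=None)
def provable(antecedent, goal, allow_empty=False):
    """Decide the sequent `antecedent => goal` by cut-free backward search."""
    # Axiom: p => p for an atom p.
    if len(antecedent) == 1 and antecedent[0] == goal and isinstance(goal, str):
        return True
    # Plain L forbids empty antecedents; L* (allow_empty=True) permits them.
    if not antecedent and not allow_empty:
        return False
    # Right rules.
    if isinstance(goal, tuple):
        op, a, b = goal
        if op == "/" and provable(antecedent + (b,), a, allow_empty):
            return True
        if op == "\\" and provable((a,) + antecedent, b, allow_empty):
            return True
    # Left rules: pick a complex formula and the argument segment it consumes.
    for i, formula in enumerate(antecedent):
        if not isinstance(formula, tuple):
            continue
        op, a, b = formula
        if op == "/":    # a/b consumes a segment to its right that proves b
            for j in range(i + 1, len(antecedent) + 1):
                rest = antecedent[:i] + (a,) + antecedent[j:]
                if (provable(antecedent[i + 1:j], b, allow_empty)
                        and provable(rest, goal, allow_empty)):
                    return True
        else:            # a\b consumes a segment to its left that proves a
            for j in range(i + 1):
                rest = antecedent[:j] + (b,) + antecedent[i + 1:]
                if (provable(antecedent[j:i], a, allow_empty)
                        and provable(rest, goal, allow_empty)):
                    return True
    return False

# A transitive verb typed (np\s)/np.
TV = over(under("np", "s"), "np")

if __name__ == "__main__":
    print(provable(("np", TV, "np"), "s"))
    print(provable((), over("np", "np"), allow_empty=True))
```

The search terminates because every premise contains strictly fewer connectives than its conclusion; memoisation with `lru_cache` only avoids repeating identical subgoals and does not give a polynomial bound.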