    Complexity of Grammar Induction for Quantum Types

    Most categorical models of meaning use a functor from the syntactic category to the semantic category. When semantic information is available, the problem of grammar induction can therefore be defined as finding preimages of the semantic types under this forgetful functor, lifting the information flow from the semantic level to a valid reduction at the syntactic level. We study the complexity of grammar induction, and show that for a variety of type systems, including pivotal and compact closed categories, the grammar induction problem is NP-complete. Our approach could be extended to linguistic type systems such as autonomous or bi-closed categories. Comment: In Proceedings QPL 2014, arXiv:1412.810

    Lambek vs. Lambek: Functorial Vector Space Semantics and String Diagrams for Lambek Calculus

    The Distributional Compositional Categorical (DisCoCat) model is a mathematical framework that provides compositional semantics for meanings of natural language sentences. It consists of a computational procedure for constructing meanings of sentences, given their grammatical structure in terms of compositional type-logic, and given the empirically derived meanings of their words. For the particular case that the meaning of words is modelled within a distributional vector space model, its experimental predictions, derived from real large scale data, have outperformed other empirically validated methods that could build vectors for a full sentence. This success can be attributed to a conceptually motivated mathematical underpinning, by integrating qualitative compositional type-logic and quantitative modelling of meaning within a category-theoretic mathematical framework. The type-logic used in the DisCoCat model is Lambek's pregroup grammar. Pregroup types form a posetal compact closed category, which can be passed, in a functorial manner, on to the compact closed structure of vector spaces, linear maps and tensor product. The diagrammatic versions of the equational reasoning in compact closed categories can be interpreted as the flow of word meanings within sentences. Pregroups simplify Lambek's previous type-logic, the Lambek calculus, which has been extensively used to formalise and reason about various linguistic phenomena. The apparent reliance of the DisCoCat on pregroups has been seen as a shortcoming. This paper addresses this concern, by pointing out that one may as well realise a functorial passage from the original type-logic of Lambek, a monoidal bi-closed category, to vector spaces, or to any other model of meaning organised within a monoidal bi-closed category. The corresponding string diagram calculus, due to Baez and Stay, now depicts the flow of word meanings. Comment: 29 pages, pending publication in Annals of Pure and Applied Logic

    Mathematical Foundations for a Compositional Distributional Model of Meaning

    We propose a mathematical framework for a unification of the distributional theory of meaning in terms of vector space models, and a compositional theory for grammatical types, for which we rely on the algebra of Pregroups, introduced by Lambek. This mathematical framework enables us to compute the meaning of a well-typed sentence from the meanings of its constituents. Concretely, the type reductions of Pregroups are 'lifted' to morphisms in a category, a procedure that transforms meanings of constituents into a meaning of the (well-typed) whole. Importantly, meanings of whole sentences live in a single space, independent of the grammatical structure of the sentence. Hence the inner-product can be used to compare meanings of arbitrary sentences, as it is for comparing the meanings of words in the distributional model. The mathematical structure we employ admits a purely diagrammatic calculus which exposes how the information flows between the words in a sentence in order to make up the meaning of the whole sentence. A variation of our 'categorical model' which involves constraining the scalars of the vector spaces to the semiring of Booleans results in a Montague-style Boolean-valued semantics. Comment: to appear
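The computation described in this abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a transitive sentence typed as n, nʳ·s·nˡ, n, whose pregroup reduction to s is realised by contracting the verb tensor with the subject and object vectors. The two-dimensional noun and sentence spaces, the toy vectors, and the names `sentence_meaning` and `inner` are all hypothetical choices made for illustration.

```python
# Toy sketch (assumed data, not from the paper): noun space N and
# sentence space S are both 2-dimensional, represented as nested lists.

def sentence_meaning(subj, verb, obj):
    """Contract a verb tensor in N (x) S (x) N with subject and object vectors.

    verb[i][s][j] is the coefficient on (noun basis i, sentence basis s,
    noun basis j). The result is a vector in the sentence space S.
    """
    dim_s = len(verb[0])
    return [
        sum(subj[i] * verb[i][s][j] * obj[j]
            for i in range(len(subj))
            for j in range(len(obj)))
        for s in range(dim_s)
    ]


def inner(u, v):
    """Inner product used to compare two meanings living in the same space."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical noun vectors on an orthonormal basis.
dogs = [1, 0]
cats = [0, 1]

# Hypothetical verb tensors: sentence basis 0 plays the role of "true".
# "chase" relates dogs (subject) to cats (object) only.
chase = [[[0, 1], [0, 0]],
         [[0, 0], [0, 0]]]
# "like" relates dogs to cats and cats to dogs.
like = [[[0, 1], [0, 0]],
        [[1, 0], [0, 0]]]

dogs_chase_cats = sentence_meaning(dogs, chase, cats)
cats_chase_dogs = sentence_meaning(cats, chase, dogs)
dogs_like_cats = sentence_meaning(dogs, like, cats)

# Sentences with different verbs still live in S, so they are comparable.
similarity = inner(dogs_chase_cats, dogs_like_cats)
```

Because every sentence meaning lands in the same space S regardless of which verb produced it, `inner` can compare any two sentences, which is the point the abstract makes about the inner product.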

    Abstract Tensor Systems as Monoidal Categories

    The primary contribution of this paper is to give a formal, categorical treatment to Penrose's abstract tensor notation, in the context of traced symmetric monoidal categories. To do so, we introduce a typed, sum-free version of an abstract tensor system and demonstrate the construction of its associated category. We then show that the associated category of the free abstract tensor system is in fact the free traced symmetric monoidal category on a monoidal signature. A notable consequence of this result is a simple proof for the soundness and completeness of the diagrammatic language for traced symmetric monoidal categories. Comment: Dedicated to Joachim Lambek on the occasion of his 90th birthday

    A Generalised Quantifier Theory of Natural Language in Categorical Compositional Distributional Semantics with Bialgebras

    Categorical compositional distributional semantics is a model of natural language; it combines the statistical vector space models of words with the compositional models of grammar. We formalise in this model the generalised quantifier theory of natural language, due to Barwise and Cooper. The underlying setting is a compact closed category with bialgebras. We start from a generative grammar formalisation and develop an abstract categorical compositional semantics for it, then instantiate the abstract setting to sets and relations and to finite dimensional vector spaces and linear maps. We prove the equivalence of the relational instantiation to the truth theoretic semantics of generalised quantifiers. The vector space instantiation formalises the statistical usages of words and enables us to, for the first time, reason about quantified phrases and sentences compositionally in distributional semantics.

    Command injection attacks, continuations, and the Lambek calculus

    This paper shows connections between command injection attacks, continuations, and the Lambek calculus: certain command injections, such as the tautology attack on SQL, are shown to be a form of control effect that can be typed using the Lambek calculus, generalizing the double-negation typing of continuations. Lambek's syntactic calculus is a logic with two implicational connectives taking their arguments from the left and right, respectively. These connectives describe how strings interact with their left and right contexts when building up syntactic structures. The calculus is a form of propositional logic without structural rules, and so a forerunner of substructural logics like Linear Logic and Separation Logic. Comment: In Proceedings WoC 2015, arXiv:1606.0583
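The tautology attack named in this abstract can be shown with a small sketch using Python's standard `sqlite3` module. It only illustrates how an injected string interacts with the query text to its left and right. It does not implement the paper's Lambek-calculus typing. The table layout, the sample rows, and the function names `make_db`, `lookup_unsafe`, and `lookup_safe` are assumptions made for illustration.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a1"), ("bob", "b2")],
    )
    return conn


def lookup_unsafe(conn, name):
    """Splice the input between a left and a right query context (vulnerable)."""
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, name):
    """Bind the input as a single value, so it cannot alter the query syntax."""
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = make_db()

# The injected string closes the left quote and relies on the right context's
# closing quote, turning the WHERE clause into
#   name = 'x' OR '1'='1'
# which is true for every row.
attack = "x' OR '1'='1"

leaked = lookup_unsafe(conn, attack)
blocked = lookup_safe(conn, attack)
```

In the unsafe version the input escapes its intended syntactic position and changes the parse of the whole query, while the parameterised version keeps it as one string value.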