390 research outputs found
A Proof-Theoretic Approach to Scope Ambiguity in Compositional Vector Space Models
We investigate the extent to which compositional vector space models can be
used to account for scope ambiguity in quantified sentences (of the form "Every
man loves some woman"). Such sentences containing two quantifiers introduce two
readings, a direct scope reading and an inverse scope reading. This ambiguity
has been treated in a vector space model using bialgebras by (Hedges and
Sadrzadeh, 2016) and (Sadrzadeh, 2016), though without an explanation of the
mechanism by which the ambiguity arises. We combine a polarised focussed
sequent calculus for the non-associative Lambek calculus NL, as described in
(Moortgat and Moot, 2011), with the vector based approach to quantifier scope
ambiguity. In particular, we establish a procedure for obtaining a vector space
model for quantifier scope ambiguity in a derivational way.Comment: This is a preprint of a paper to appear in: Journal of Language
Modelling, 201
Lambek vs. Lambek: Functorial Vector Space Semantics and String Diagrams for Lambek Calculus
The Distributional Compositional Categorical (DisCoCat) model is a
mathematical framework that provides compositional semantics for meanings of
natural language sentences. It consists of a computational procedure for
constructing meanings of sentences, given their grammatical structure in terms
of compositional type-logic, and given the empirically derived meanings of
their words. For the particular case that the meaning of words is modelled
within a distributional vector space model, its experimental predictions,
derived from real large scale data, have outperformed other empirically
validated methods that could build vectors for a full sentence. This success
can be attributed to a conceptually motivated mathematical underpinning, by
integrating qualitative compositional type-logic and quantitative modelling of
meaning within a category-theoretic mathematical framework.
The type-logic used in the DisCoCat model is Lambek's pregroup grammar.
Pregroup types form a posetal compact closed category, which can be passed, in
a functorial manner, on to the compact closed structure of vector spaces,
linear maps and tensor product. The diagrammatic versions of the equational
reasoning in compact closed categories can be interpreted as the flow of word
meanings within sentences. Pregroups simplify Lambek's previous type-logic, the
Lambek calculus, which has been extensively used to formalise and reason about
various linguistic phenomena. The apparent reliance of the DisCoCat on
pregroups has been seen as a shortcoming. This paper addresses this concern, by
pointing out that one may as well realise a functorial passage from the
original type-logic of Lambek, a monoidal bi-closed category, to vector spaces,
or to any other model of meaning organised within a monoidal bi-closed
category. The corresponding string diagram calculus, due to Baez and Stay, now
depicts the flow of word meanings.Comment: 29 pages, pending publication in Annals of Pure and Applied Logi
Lexical and Derivational Meaning in Vector-Based Models of Relativisation
Sadrzadeh et al (2013) present a compositional distributional analysis of
relative clauses in English in terms of the Frobenius algebraic structure of
finite dimensional vector spaces. The analysis relies on distinct type
assignments and lexical recipes for subject vs object relativisation. The
situation for Dutch is different: because of the verb final nature of Dutch,
relative clauses are ambiguous between a subject vs object relativisation
reading. Using an extended version of Lambek calculus, we present a
compositional distributional framework that accounts for this derivational
ambiguity, and that allows us to give a single meaning recipe for the relative
pronoun reconciling the Frobenius semantics with the demands of Dutch
derivational syntax.Comment: 10 page version to appear in Proceedings Amsterdam Colloquium,
updated with appendi
A Generalised Quantifier Theory of Natural Language in Categorical Compositional Distributional Semantics with Bialgebras
Categorical compositional distributional semantics is a model of natural
language; it combines the statistical vector space models of words with the
compositional models of grammar. We formalise in this model the generalised
quantifier theory of natural language, due to Barwise and Cooper. The
underlying setting is a compact closed category with bialgebras. We start from
a generative grammar formalisation and develop an abstract categorical
compositional semantics for it, then instantiate the abstract setting to sets
and relations and to finite dimensional vector spaces and linear maps. We prove
the equivalence of the relational instantiation to the truth theoretic
semantics of generalised quantifiers. The vector space instantiation formalises
the statistical usages of words and enables us to, for the first time, reason
about quantified phrases and sentences compositionally in distributional
semantics
A Frobenius Algebraic Analysis for Parasitic Gaps
The interpretation of parasitic gaps is an ostensible case of non-linearity
in natural language composition. Existing categorial analyses, both in the
typelogical and in the combinatory traditions, rely on explicit forms of
syntactic copying. We identify two types of parasitic gapping where the
duplication of semantic content can be confined to the lexicon. Parasitic gaps
in adjuncts are analysed as forms of generalized coordination with a
polymorphic type schema for the head of the adjunct phrase. For parasitic gaps
affecting arguments of the same predicate, the polymorphism is associated with
the lexical item that introduces the primary gap. Our analysis is formulated in
terms of Lambek calculus extended with structural control modalities. A
compositional translation relates syntactic types and derivations to the
interpreting compact closed category of finite dimensional vector spaces and
linear maps with Frobenius algebras over it. When interpreted over the
necessary semantic spaces, the Frobenius algebras provide the tools to model
the proposed instances of lexical polymorphism.Comment: SemSpace 2019, to appear in Journal of Applied Logic
Mathematical Foundations for a Compositional Distributional Model of Meaning
We propose a mathematical framework for a unification of the distributional
theory of meaning in terms of vector space models, and a compositional theory
for grammatical types, for which we rely on the algebra of Pregroups,
introduced by Lambek. This mathematical framework enables us to compute the
meaning of a well-typed sentence from the meanings of its constituents.
Concretely, the type reductions of Pregroups are `lifted' to morphisms in a
category, a procedure that transforms meanings of constituents into a meaning
of the (well-typed) whole. Importantly, meanings of whole sentences live in a
single space, independent of the grammatical structure of the sentence. Hence
the inner-product can be used to compare meanings of arbitrary sentences, as it
is for comparing the meanings of words in the distributional model. The
mathematical structure we employ admits a purely diagrammatic calculus which
exposes how the information flows between the words in a sentence in order to
make up the meaning of the whole sentence. A variation of our `categorical
model' which involves constraining the scalars of the vector spaces to the
semiring of Booleans results in a Montague-style Boolean-valued semantics.Comment: to appea
- …