15 research outputs found
A Generalised Quantifier Theory of Natural Language in Categorical Compositional Distributional Semantics with Bialgebras
Categorical compositional distributional semantics is a model of natural
language; it combines the statistical vector space models of words with the
compositional models of grammar. We formalise in this model the generalised
quantifier theory of natural language, due to Barwise and Cooper. The
underlying setting is a compact closed category with bialgebras. We start from
a generative grammar formalisation and develop an abstract categorical
compositional semantics for it, then instantiate the abstract setting to sets
and relations and to finite dimensional vector spaces and linear maps. We prove
the equivalence of the relational instantiation to the truth theoretic
semantics of generalised quantifiers. The vector space instantiation formalises
the statistical usages of words and enables us to, for the first time, reason
about quantified phrases and sentences compositionally in distributional
semantics.
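As a rough illustration of the compositional mechanism this line of work builds on (a toy sketch with made-up dimensions, not the paper's quantifier construction): in a compact closed category of vector spaces, a transitive verb lives in the tensor space N ⊗ S ⊗ N, and sentence meaning is obtained by contracting it against the subject and object vectors.

```python
import numpy as np

# Toy sketch: noun space N and sentence space S with arbitrary small
# dimensions; the "cups" of the compact closed structure become
# tensor contractions.
rng = np.random.default_rng(0)
dim_n, dim_s = 4, 2                      # illustrative dimensions

subject = rng.random(dim_n)              # vector in N
verb = rng.random((dim_n, dim_s, dim_n)) # tensor in N (x) S (x) N
obj = rng.random(dim_n)                  # vector in N

# Contract subject and object indices, leaving a sentence-space vector.
sentence = np.einsum('i,isj,j->s', subject, verb, obj)
print(sentence.shape)
```

The bialgebra structure of the paper adds copying and merging maps on top of this basic contraction; the sketch above shows only the underlying compositional pipeline.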
A Proof-Theoretic Approach to Scope Ambiguity in Compositional Vector Space Models
We investigate the extent to which compositional vector space models can be
used to account for scope ambiguity in quantified sentences (of the form "Every
man loves some woman"). Such sentences containing two quantifiers introduce two
readings, a direct scope reading and an inverse scope reading. This ambiguity
has been treated in a vector space model using bialgebras by (Hedges and
Sadrzadeh, 2016) and (Sadrzadeh, 2016), though without an explanation of the
mechanism by which the ambiguity arises. We combine a polarised focussed
sequent calculus for the non-associative Lambek calculus NL, as described in
(Moortgat and Moot, 2011), with the vector based approach to quantifier scope
ambiguity. In particular, we establish a procedure for obtaining a vector space
model for quantifier scope ambiguity in a derivational way.
Comment: This is a preprint of a paper to appear in: Journal of Language Modelling, 201
Many Valued Generalised Quantifiers for Natural Language in the DisCoCat Model
DisCoCat refers to the categorical compositional distributional model of natural language, which combines the statistical vector space models of words with the compositional logic-based models of grammar. It is fair to say that despite existing work on incorporating notions of entailment, quantification, and coordination in this setting, a uniform modelling of logical operations is still an open problem. In this report, we take a step towards an answer. We show how one can generalise our previous DisCoCat model of generalised quantifiers from the category of sets and relations to the category of sets and many-valued relations. As a result, we get a fuzzy version of these quantifiers. Our aim is to extend this model to all other logical connectives and develop a fuzzy logic for DisCoCat. The main contributions are showing that the category of many-valued relations is compact closed, defining appropriate bialgebra structures over it, and demonstrating how one can compute within this setting many-valued meanings for quantified sentences.
EPSRC Career Acceleration Fellowship EP/J002607/
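For intuition on the move from relations to many-valued relations (my illustration, not the report's definitions): a many-valued relation between finite sets can be represented as a matrix with entries in [0, 1], and relational composition generalises to the standard max-min composition of fuzzy relations.

```python
import numpy as np

def compose(R, S):
    """Max-min composition of fuzzy relations R: A -> B and S: B -> C.

    (R ; S)(a, c) = max over b of min(R(a, b), S(b, c)),
    which reduces to ordinary relational composition on 0/1 matrices.
    """
    return np.max(np.minimum(R[:, :, None], S[None, :, :]), axis=1)

R = np.array([[0.2, 0.9],
              [0.7, 0.4]])
S = np.array([[0.5, 1.0],
              [0.8, 0.3]])
print(compose(R, S))
```

On Boolean (0/1) matrices this recovers the ordinary category of sets and relations, which is why the fuzzy setting is a genuine generalisation.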
Translating and Evolving: Towards a Model of Language Change in DisCoCat
The categorical compositional distributional (DisCoCat) model of meaning
developed by Coecke et al. (2010) has been successful in modeling various
aspects of meaning. However, it fails to model the fact that language can
change. We give an approach to DisCoCat that allows us to represent language
models and translations between them, enabling us to describe translations from
one language to another, or changes within the same language. We unify the
product space representation given in (Coecke et al., 2010) and the functorial
description in (Kartsaklis et al., 2013), in a way that allows us to view a
language as a catalogue of meanings. We formalize the notion of a lexicon in
DisCoCat, and define a dictionary of meanings between two lexicons. All this is
done within the framework of monoidal categories. We give examples of how to
apply our methods, and give a concrete suggestion for compositional translation
in corpora.
Comment: In Proceedings CAPNS 2018, arXiv:1811.0270
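A hypothetical miniature of the lexicon-and-dictionary idea (the words, vectors, and map below are invented for illustration): a lexicon assigns each word a meaning vector, and a dictionary between two lexicons is a linear map carrying source meanings to target meanings, which is what lets translation be described functorially.

```python
import numpy as np

# Toy lexicons: word -> meaning vector (hand-chosen, not learned).
english = {"cat": np.array([1.0, 0.0]), "dog": np.array([0.0, 1.0])}
french = {"chat": np.array([2.0, 0.0]), "chien": np.array([0.0, 3.0])}

# A toy "dictionary": a linear map from the English meaning space to
# the French one, aligning the two lexicons.
T = np.diag([2.0, 3.0])

for en, fr in [("cat", "chat"), ("dog", "chien")]:
    print(en, "->", fr, T @ english[en])
```

In practice such a map would be estimated from aligned corpora rather than written down by hand; the point of the sketch is only the shape of the data: two lexicons plus a structure-preserving map between them.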
Quantization, Frobenius and Bi Algebras from the Categorical Framework of Quantum Mechanics to Natural Language Semantics
Compact closed categories, Frobenius algebras, and bialgebras have been applied to model and reason about quantum protocols. The same constructions have also been applied to reason about natural language semantics under the name “categorical distributional compositional” semantics, or in short, the “DisCoCat” model. This model combines the statistical vector models of word meaning with the compositional models of grammatical structure. It has been applied to natural language tasks such as disambiguation, paraphrasing and entailment of phrases and sentences. The passage from the grammatical structure to vectors is provided by a functor, similar to the quantization functor of Quantum Field Theory. The original DisCoCat model only used compact closed categories. Later, Frobenius algebras were added to it to model long distance dependencies such as relative pronouns. Recently, bialgebras have been added to the pack to reason about quantifiers. This paper reviews these constructions and their application to natural language semantics. We go over the theory and present some of the core experimental results.
Sentence entailment in compositional distributional semantics
Distributional semantic models provide vector representations for words by
gathering co-occurrence frequencies from corpora of text. Compositional
distributional models extend these from words to phrases and sentences. In
categorical compositional distributional semantics, phrase and sentence
representations are functions of their grammatical structure and
representations of the words therein. In this setting, grammatical structures
are formalised by morphisms of a compact closed category and meanings of words
are formalised by objects of the same category. These can be instantiated in
the form of vectors or density matrices. This paper concerns the applications
of this model to phrase and sentence level entailment. We argue that
entropy-based distances of vectors and density matrices provide a good
candidate to measure word-level entailment, show the advantage of density
matrices over vectors for word level entailments, and prove that these
distances extend compositionally from words to phrases and sentences. We
exemplify our theoretical constructions on real data and a toy entailment
dataset and provide preliminary experimental evidence.
Comment: 8 pages, 1 figure, 2 tables, short version presented in the International Symposium on Artificial Intelligence and Mathematics (ISAIM), 201
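One standard entropy-based distance on density matrices, which is a natural candidate for the kind of graded entailment measure the abstract describes (whether it is exactly the paper's measure is an assumption here), is the quantum relative entropy S(ρ‖σ) = tr(ρ(log ρ − log σ)).

```python
import numpy as np

def mat_log(m):
    """Matrix logarithm of a symmetric positive-definite matrix,
    computed via its eigendecomposition."""
    w, v = np.linalg.eigh(m)
    return v @ np.diag(np.log(w)) @ v.T

def relative_entropy(rho, sigma):
    """Quantum relative entropy S(rho || sigma) = tr(rho (log rho - log sigma))."""
    return float(np.trace(rho @ (mat_log(rho) - mat_log(sigma))))

# Toy full-rank density matrices (symmetric, positive, trace 1).
rho = np.diag([0.7, 0.3])
sigma = np.diag([0.5, 0.5])
print(relative_entropy(rho, sigma))
```

The quantity is non-negative and vanishes exactly when ρ = σ, which is what makes it usable as an asymmetric, entailment-like distance.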
Idioms and the syntax/semantics interface of descriptive content vs. reference
The syntactic literature on idioms contains some proposals that are surprising from a compositional perspective. For example, there are proposals that, in the case of verb-object idioms, the verb combines directly with the noun inside its DP complement, and the determiner is introduced higher up in the syntactic structure, or is late-adjoined. This seems to violate compositionality insofar as it is generally assumed that the semantic role of the determiner is to convert a noun to the appropriate semantic type to serve as the argument to the function denoted by the verb. In this paper, we establish a connection between this line of analysis and lines of work in semantics that have developed outside of the domain of idioms, particularly work on incorporation and work that combines formal and distributional semantic modelling. This semantic work separates the composition of descriptive content from that of discourse referent introducing material; our proposal shows that this separation offers a particularly promising way to handle the compositional difficulties posed by idioms, including certain patterns of variation in intervening determiners and modifiers.
Functional Distributional Semantics: Learning Linguistically Informed Representations from a Precisely Annotated Corpus
The aim of distributional semantics is to design computational techniques that can automatically learn the meanings of words from a body of text. The twin challenges are: how do we represent meaning, and how do we learn these representations? The current state of the art is to represent meanings as vectors – but vectors do not correspond to any traditional notion of meaning. In particular, there is no way to talk about truth, a crucial concept in logic and formal semantics.
In this thesis, I develop a framework for distributional semantics which answers this challenge. The meaning of a word is not represented as a vector, but as a function, mapping entities (objects in the world) to probabilities of truth (the probability that the word is true of the entity). Such a function can be interpreted both in the machine learning sense of a classifier, and in the formal semantic sense of a truth-conditional function. This simultaneously allows both the use of machine learning techniques to exploit large datasets, and also the use of formal semantic techniques to manipulate the learnt representations. I define a probabilistic graphical model, which incorporates a probabilistic generalisation of model theory (allowing a strong connection with formal semantics), and which generates semantic dependency graphs (allowing it to be trained on a corpus). This graphical model provides a natural way to model logical inference, semantic composition, and context-dependent meanings, where Bayesian inference plays a crucial role.
I demonstrate the feasibility of this approach by training a model on WikiWoods, a parsed version of the English Wikipedia, and evaluating it on three tasks. The results indicate that the model can learn information not captured by vector space models.
Schiff Fund Studentshi
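The core idea of treating a word meaning as a classifier can be sketched in a few lines (the features and weights below are hand-chosen for illustration, not the thesis's trained model): a word denotes a function from an entity's features to a probability that the word is true of that entity.

```python
import math

def sigmoid(x):
    """Squash a real score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def make_word(weights, bias):
    """Return a truth-conditional function:
    entity features -> P(word is true of the entity)."""
    def prob_true(entity):
        score = sum(w * f for w, f in zip(weights, entity)) + bias
        return sigmoid(score)
    return prob_true

# Hypothetical 'cat' classifier over two toy entity features.
cat = make_word(weights=[2.0, -1.0], bias=-0.5)
felix = [1.0, 0.2]    # toy feature vector for one entity
print(cat(felix))     # probability that 'cat' is true of this entity
```

The same object supports both readings the abstract mentions: it is literally a classifier, and it is a (probabilistic) truth-conditional function that formal-semantic machinery can manipulate.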