1,804 research outputs found
A Corpus-based Toy Model for DisCoCat
The categorical compositional distributional (DisCoCat) model of meaning
rigorously connects distributional semantics and pregroup grammars, and has
found a variety of applications in computational linguistics. From a more
abstract standpoint, the DisCoCat paradigm predicates the construction of a
mapping from syntax to categorical semantics. In this work we present a
concrete construction of one such mapping, from a toy model of syntax for
corpora annotated with constituent structure trees, to categorical semantics
taking place in a category of free R-semimodules over an involutive commutative
semiring R.
Comment: In Proceedings SLPCS 2016, arXiv:1608.0101
"Not not bad" is not "bad": A distributional account of negation
With the increasing empirical success of distributional models of
compositional semantics, it is timely to consider the types of textual logic
that such models are capable of capturing. In this paper, we address
shortcomings in the ability of current models to capture logical operations
such as negation. As a solution we propose a tripartite formulation for a
continuous vector space representation of semantics and subsequently use this
representation to develop a formal compositional notion of negation within such
models.
Comment: 9 pages, to appear in Proceedings of the 2013 Workshop on Continuous
Vector Space Models and their Compositionality
A Study of Entanglement in a Categorical Framework of Natural Language
In both quantum mechanics and corpus linguistics based on vector spaces, the
notion of entanglement provides a means for the various subsystems to
communicate with each other. In this paper we examine a number of
implementations of the categorical framework of Coecke, Sadrzadeh and Clark
(2010) for natural language, from an entanglement perspective. Specifically,
our goal is to better understand in what way the level of entanglement of the
relational tensors (or the lack of it) affects the compositional structures in
practical situations. Our findings reveal that a number of proposals for verb
construction lead to almost separable tensors, a fact that considerably
simplifies the interactions between the words. We examine the ramifications of
this fact, and we show that the use of Frobenius algebras mitigates the
potential problems to a great extent. Finally, we briefly examine a machine
learning method that creates verb tensors exhibiting a sufficient level of
entanglement.
Comment: In Proceedings QPL 2014, arXiv:1412.810
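To illustrate what "separable" means in the abstract above, here is a minimal sketch, assuming a verb is stored as a matrix (an order-2 tensor) over a toy two-dimensional noun space. The abstract does not say how entanglement was measured; the helper names `outer`, `add` and `is_separable` are hypothetical, and the exact rank-1 test below is only the simplest possible check, not the authors' method. A matrix is separable exactly when it is an outer product of two vectors, that is, when every 2x2 minor vanishes.

```python
def outer(u, v):
    """Outer product of two vectors, returned as a list of rows."""
    return [[a * b for b in v] for a in u]


def add(m, n):
    """Entry-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(m, n)]


def is_separable(m, tol=1e-9):
    """True if every 2x2 minor vanishes, i.e. the matrix has rank at most 1."""
    rows, cols = len(m), len(m[0])
    for i in range(rows):
        for j in range(i + 1, rows):
            for k in range(cols):
                for l in range(k + 1, cols):
                    if abs(m[i][k] * m[j][l] - m[i][l] * m[j][k]) > tol:
                        return False
    return True


# A single outer product is separable by construction.
single = outer([1.0, 2.0], [3.0, 0.5])

# A sum of two outer products along different directions is entangled.
mixed = add(outer([1.0, 0.0], [1.0, 0.0]), outer([0.0, 1.0], [0.0, 1.0]))

print(is_separable(single))  # True
print(is_separable(mixed))   # False
```

When a verb matrix is separable, composing it with its arguments factors into independent inner products, which is the kind of simplification of word interactions the abstract refers to.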
Distributional Sentence Entailment Using Density Matrices
The categorical compositional distributional model of Coecke et al. (2010)
suggests a way to combine grammatical composition of the formal, type logical
models with the corpus based, empirical word representations of distributional
semantics. This paper contributes to the project by expanding the model to also
capture entailment relations. This is achieved by extending the representations
of words from points in meaning space to density operators, which are
probability distributions on the subspaces of the space. A symmetric measure of
similarity and an asymmetric measure of entailment are defined, where lexical
entailment is measured using von Neumann entropy, the quantum variant of
Kullback-Leibler divergence. Lexical entailment, combined with the composition
map on word representations, provides a method to obtain entailment relations
on the level of sentences. Truth theoretic and corpus-based examples are
provided.
Comment: 11 pages
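The asymmetric measure mentioned in the abstract above can be sketched numerically. The quantity usually called the quantum analogue of Kullback-Leibler divergence is the quantum relative entropy, N(rho || sigma) = tr(rho (log rho - log sigma)); the sketch below assumes this is the measure intended. It is restricted to 2x2 real symmetric, full-rank density matrices so that the matrix logarithm has a closed form, and the names `sym_log`, `trace_product` and `relative_entropy`, as well as the example matrices, are hypothetical.

```python
import math


def sym_log(m):
    """Matrix logarithm of a 2x2 real symmetric positive-definite matrix."""
    (a, b), (_, c) = m
    # Rotation angle that diagonalises the matrix (Jacobi rotation).
    theta = 0.5 * math.atan2(2 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    lam1 = a * cs * cs + 2 * b * cs * sn + c * sn * sn
    lam2 = a * sn * sn - 2 * b * cs * sn + c * cs * cs
    l1, l2 = math.log(lam1), math.log(lam2)
    off = (l1 - l2) * cs * sn
    return [[l1 * cs * cs + l2 * sn * sn, off],
            [off, l1 * sn * sn + l2 * cs * cs]]


def trace_product(m, n):
    """Trace of the matrix product m @ n."""
    return sum(m[i][j] * n[j][i] for i in range(2) for j in range(2))


def relative_entropy(rho, sigma):
    """Quantum relative entropy tr(rho (log rho - log sigma))."""
    log_rho, log_sigma = sym_log(rho), sym_log(sigma)
    diff = [[log_rho[i][j] - log_sigma[i][j] for j in range(2)]
            for i in range(2)]
    return trace_product(rho, diff)


# Equal mixture of the projectors onto (1, 0) and (1, 1)/sqrt(2).
rho = [[0.75, 0.25], [0.25, 0.25]]
# A diagonal density matrix concentrated on the first basis direction.
sigma = [[0.9, 0.0], [0.0, 0.1]]

print(relative_entropy(rho, sigma))
print(relative_entropy(sigma, rho))
```

The two printed values differ, which is what makes the measure usable as a directed notion of entailment, whereas a similarity measure would be symmetric in its arguments. For rank-deficient density matrices the relative entropy can be infinite, a case this sketch does not handle.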