17 research outputs found
Investigating the Role of Prior Disambiguation in Deep-learning Compositional Models of Meaning
This paper explores the effect of prior disambiguation on neural
network-based compositional models, with the aim of producing better semantic
representations for text compounds. We disambiguate the input word vectors
before they are fed into a compositional deep net. A series of evaluations
shows the positive effect of prior disambiguation for such deep models.
Comment: NIPS 201
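The disambiguation step described above can be sketched minimally: pick, for an ambiguous word, the sense vector closest to the surrounding context, and pass that (rather than the mixed ambiguous vector) to the composition network. This is an illustrative sketch, not the paper's implementation; the sense vectors and context here are hypothetical toy data.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def disambiguate(sense_vectors, context_vectors):
    """Pick the sense vector closest (by cosine) to the mean context vector."""
    context = np.mean(context_vectors, axis=0)
    return max(sense_vectors, key=lambda s: cosine(s, context))

# Toy example: two hypothetical senses of one ambiguous word, plus a context.
sense_a = np.array([1.0, 0.0, 0.0])    # e.g. "financial" sense (hypothetical)
sense_b = np.array([0.0, 1.0, 0.0])    # e.g. "river" sense (hypothetical)
context = [np.array([0.9, 0.1, 0.0])]  # context leaning toward the first sense

# The chosen sense vector, not the ambiguous mixture, would then be
# fed into the compositional network.
chosen = disambiguate([sense_a, sense_b], context)
```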
A Study of Entanglement in a Categorical Framework of Natural Language
In both quantum mechanics and corpus linguistics based on vector spaces, the
notion of entanglement provides a means for the various subsystems to
communicate with each other. In this paper we examine a number of
implementations of the categorical framework of Coecke, Sadrzadeh and Clark
(2010) for natural language, from an entanglement perspective. Specifically,
our goal is to better understand in what way the level of entanglement of the
relational tensors (or the lack of it) affects the compositional structures in
practical situations. Our findings reveal that a number of proposals for verb
construction lead to almost separable tensors, a fact that considerably
simplifies the interactions between the words. We examine the ramifications of
this fact, and we show that the use of Frobenius algebras mitigates the
potential problems to a great extent. Finally, we briefly examine a machine
learning method that creates verb tensors exhibiting a sufficient level of
entanglement.
Comment: In Proceedings QPL 2014, arXiv:1412.810
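The near-separability the abstract reports can be quantified with a standard check: matricize the relational tensor across a subsystem split and look at its singular-value spectrum. A separable (rank-1) tensor has a single nonzero singular value, and the entropy of the normalized spectrum measures entanglement. A minimal sketch, assuming numpy; the toy tensors are illustrative, not taken from the paper.

```python
import numpy as np

def entanglement_entropy(tensor, split):
    """Entropy of the singular-value spectrum of a matricization of
    `tensor` across the given index split; 0 means fully separable."""
    m = tensor.reshape(int(np.prod(tensor.shape[:split])), -1)
    s = np.linalg.svd(m, compute_uv=False)
    p = (s ** 2) / np.sum(s ** 2)   # normalised Schmidt coefficients
    p = p[p > 1e-12]                # drop numerical zeros
    return float(-np.sum(p * np.log(p)))

# A separable (rank-1) "verb tensor": outer product of two vectors.
u, v = np.array([1.0, 2.0]), np.array([3.0, 1.0])
separable = np.outer(u, v)

# A maximally entangled one: identity-like matrix.
entangled = np.eye(2) / np.sqrt(2)

# entanglement_entropy(separable, 1) is ~0.0;
# entanglement_entropy(entangled, 1) is ~log(2).
```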
Categorical Vector Space Semantics for Lambek Calculus with a Relevant Modality (Extended Abstract)
We develop a categorical compositional distributional semantics for Lambek
Calculus with a Relevant Modality, which has a limited version of the
contraction and permutation rules. The categorical part of the semantics is a
monoidal biclosed category with a coalgebra modality as defined on Differential
Categories. We instantiate this category to finite dimensional vector spaces
and linear maps via quantisation functors and work with three concrete
interpretations of the coalgebra modality. We apply the model to construct
categorical and concrete semantic interpretations for the motivating example of
this extended calculus: the derivation of a phrase with a parasitic gap. The
effectiveness of the concrete interpretations is evaluated via a
disambiguation task, on an extension of a sentence disambiguation dataset to
parasitic gap phrases, using BERT, Word2Vec, and FastText vectors and
relational tensors.
Comment: In Proceedings ACT 2020, arXiv:2101.07888. arXiv admin note: substantial text overlap with arXiv:2005.0307
A Generalised Quantifier Theory of Natural Language in Categorical Compositional Distributional Semantics with Bialgebras
Categorical compositional distributional semantics is a model of natural
language; it combines the statistical vector space models of words with the
compositional models of grammar. We formalise in this model the generalised
quantifier theory of natural language, due to Barwise and Cooper. The
underlying setting is a compact closed category with bialgebras. We start from
a generative grammar formalisation and develop an abstract categorical
compositional semantics for it, then instantiate the abstract setting to sets
and relations and to finite dimensional vector spaces and linear maps. We prove
the equivalence of the relational instantiation to the truth theoretic
semantics of generalised quantifiers. The vector space instantiation formalises
the statistical usages of words and enables us to, for the first time, reason
about quantified phrases and sentences compositionally in distributional
semantics.
Compositional Distributional Semantics with Syntactic Dependencies and Selectional Preferences
This article describes a compositional model based on syntactic dependencies, designed to build contextualized word vectors by following linguistic principles related to the concept of selectional preferences. The compositional strategy proposed in this work has been evaluated on a syntactically controlled, multilingual dataset and compared with Transformer BERT-like models such as Sentence BERT, the state of the art in sentence similarity. For this purpose, we created two new test datasets for Portuguese and Spanish on the basis of the one defined for English, containing expressions with noun-verb-noun transitive constructions. The results show that the linguistics-based compositional approach is competitive with Transformer models.
This work has received financial support from the DOMINO project (PGC2018-102041-B-I00, MCIU/AEI/FEDER, UE), the eRisk project (RTI2018-093336-B-C21), the Consellería de Cultura, Educación e Ordenación Universitaria (accreditation 2016-2019, ED431G/08; Groups of Reference: ED431C 2020/21; ERDF 2014-2020: Call ED431G 2019/04), and the European Regional Development Fund (ERDF).
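The abstract does not spell out the composition function, but the general idea of dependency-based contextualization can be sketched with a simple additive baseline: shift a head's vector toward its dependents, then combine the contextualized vectors into a sentence vector. This is a hedged sketch of the general technique, explicitly not the paper's selectional-preference model; all vectors and the `alpha` parameter are hypothetical.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def contextualize(head, dependent, alpha=0.5):
    """Shift a head vector toward one dependent: a simple additive
    contextualization, NOT the paper's selectional-preference model."""
    return (1 - alpha) * head + alpha * dependent

def compose_svo(subj, verb, obj):
    """Compose a subject-verb-object triple into one sentence vector by
    contextualizing the verb with both arguments, then averaging."""
    v_ctx = contextualize(contextualize(verb, subj), obj)
    return np.mean([subj, v_ctx, obj], axis=0)

# Toy one-hot vectors (hypothetical)
cat, chase, mouse = (np.array(x, dtype=float) for x in
                     ([1, 0, 0], [0, 1, 0], [0, 0, 1]))
s1 = compose_svo(cat, chase, mouse)
s2 = compose_svo(mouse, chase, cat)  # same bag of words, different roles
```

Because the verb is contextualized asymmetrically by its subject and object, the two sentences get different vectors even though they contain the same words, which is the kind of role sensitivity the noun-verb-noun test sets probe.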