Composition in distributional models of semantics
Distributional models of semantics have proven themselves invaluable both in cognitive
modelling of semantic phenomena and in practical applications. For example,
they have been used to model judgments of semantic similarity (McDonald,
2000) and association (Denhière and Lemaire, 2004; Griffiths et al., 2007) and have
been shown to achieve human-level performance on synonymy tests (Landauer and
Dumais, 1997; Griffiths et al., 2007) such as those included in the Test of English as a
Foreign Language (TOEFL). This ability has been put to practical use in automatic thesaurus
extraction (Grefenstette, 1994). However, while there has been a considerable
amount of research directed at the most effective ways of constructing representations
for individual words, the representation of larger constructions, e.g., phrases and sentences,
has received relatively little attention. In this thesis we examine this issue of
how to compose meanings within distributional models of semantics to form representations
of multi-word structures.
Natural language data typically consists of such complex structures, rather than
just individual isolated words. Thus, a model of composition, in which individual
word meanings combine to form phrases and phrases combine to form sentences,
is of central importance in modelling this data. Commonly, however, distributional
representations are combined in terms of addition (Landauer and Dumais, 1997; Foltz
et al., 1998), without any empirical evaluation of alternative choices. Constructing
effective distributional representations of phrases and sentences requires that we have
both a theoretical foundation to direct the development of models of composition and
also a means of empirically evaluating those models.
The approach we take is to first consider the general properties of semantic composition
and from that basis define a comprehensive framework for
the composition of distributional representations. The framework subsumes existing
proposals, such as addition and tensor products, but also allows us to define novel
composition functions. We then show that the effectiveness of these models can be evaluated on three empirical tasks.
The first of these tasks involves modelling similarity judgements for short phrases
gathered in human experiments. Distributional representations of individual words are
commonly evaluated on tasks based on their ability to model semantic similarity relations,
e.g., synonymy or priming. Thus, it seems appropriate to evaluate phrase representations
in a similar manner. We then apply compositional models to language modelling,
demonstrating that the issue of composition has practical consequences, and
also providing an evaluation based on large amounts of natural data. In our third task,
we use these language models in an analysis of reading times from an eye-movement
study. This allows us to investigate the relationship between the composition of distributional
representations and the processes involved in comprehending phrases and
sentences.
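As a rough illustration of the first task, a model's cosine similarities for composed phrase pairs can be correlated against the human ratings. The harness below is a hypothetical sketch: the data structures and the use of Spearman correlation are assumptions, not the thesis's code.

```python
# Hypothetical evaluation loop: compare model similarities of composed
# phrase vectors against human similarity ratings via Spearman correlation.
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# phrase_pairs: list of ((u1, u2), (v1, v2)) word-vector pairs, one tuple
# per rated phrase pair; human_ratings: the matched elicited scores.
def evaluate(phrase_pairs, human_ratings, compose):
    model_scores = [cosine(compose(u1, u2), compose(v1, v2))
                    for (u1, u2), (v1, v2) in phrase_pairs]
    rho, _ = spearmanr(model_scores, human_ratings)
    return rho

# e.g. evaluate(pairs, ratings, compose=lambda u, v: u * v)
```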
We find that these tasks do indeed allow us to evaluate and differentiate the proposed
composition functions and that the results show a reasonable consistency across
tasks. In particular, a simple multiplicative model is best for a semantic space based
on word co-occurrence, whereas an additive model is better for the topic-based model
we consider. More generally, employing compositional models to construct representations
of multi-word structures typically yields improvements in performance over
non-compositional models, which represent only individual words.
Distributional Sentence Entailment Using Density Matrices
The categorical compositional distributional model of Coecke et al. (2010)
suggests a way to combine the grammatical composition of formal, type-logical
models with the corpus-based, empirical word representations of distributional
semantics. This paper contributes to the project by expanding the model to also
capture entailment relations. This is achieved by extending the representations
of words from points in meaning space to density operators, which are
probability distributions on the subspaces of the space. A symmetric measure of
similarity and an asymmetric measure of entailment are defined, where lexical
entailment is measured using von Neumann entropy, the quantum variant of
Kullback-Leibler divergence. Lexical entailment, combined with the composition
map on word representations, provides a method to obtain entailment relations
on the level of sentences. Truth-theoretic and corpus-based examples are
provided.
Comment: 11 pages
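A hedged sketch of the core construction: a word's density operator can be mixed from rank-1 projectors over its context vectors, and entailment scored with quantum relative entropy, S(rho || sigma) = Tr(rho (log rho - log sigma)). The eigenvalue clipping below is a smoothing assumption; the paper treats support conditions more carefully.

```python
# Sketch: words as density matrices, entailment via quantum relative entropy.
import numpy as np

def density(context_vectors):
    """Mix rank-1 projectors from a word's context vectors; normalise trace."""
    rho = sum(np.outer(v, v) for v in context_vectors)
    return rho / np.trace(rho)

def _logm_psd(m, eps=1e-12):
    # Matrix log of a positive semi-definite matrix via eigendecomposition,
    # clipping near-zero eigenvalues (a smoothing assumption, not the
    # paper's handling of support conditions).
    w, v = np.linalg.eigh(m)
    return v @ np.diag(np.log(np.clip(w, eps, None))) @ v.T

def quantum_relative_entropy(rho, sigma):
    """S(rho || sigma): asymmetric, non-negative, zero iff rho == sigma."""
    return float(np.trace(rho @ (_logm_psd(rho) - _logm_psd(sigma))))
```

On one common reading, a small S(rho_w || rho_v) indicates that w's contexts are contained within v's, suggesting that w entails v.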
Category-Theoretic Quantitative Compositional Distributional Models of Natural Language Semantics
This thesis is about the problem of compositionality in distributional
semantics. Distributional semantics presupposes that the meanings of words are
a function of their occurrences in textual contexts. It models words as
distributions over these contexts and represents them as vectors in high
dimensional spaces. The problem of compositionality for such models concerns
itself with how to produce representations for larger units of text by
composing the representations of smaller units of text.
This thesis focuses on a particular approach to this compositionality
problem, namely using the categorical framework developed by Coecke, Sadrzadeh,
and Clark, which combines syntactic analysis formalisms with distributional
semantic representations of meaning to produce syntactically motivated
composition operations. This thesis shows how this approach can be
theoretically extended and practically implemented to produce concrete
compositional distributional models of natural language semantics. It
furthermore demonstrates that such models can perform on par with, or better
than, other competing approaches in the field of natural language processing.
There are three principal contributions to computational linguistics in this
thesis. The first is to extend the DisCoCat framework on both the syntactic and
semantic fronts, incorporating a number of syntactic analysis formalisms and
providing learning procedures allowing for the generation of concrete
compositional distributional models. The second contribution is to evaluate the
models developed from the procedures presented here, showing that they
outperform other compositional distributional models present in the literature.
The third contribution is to show how using category theory to solve linguistic
problems forms a sound basis for research, illustrated by examples of work on
this topic that also suggest directions for future research.
Comment: DPhil Thesis, University of Oxford, Submitted and accepted in 201
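As a toy illustration of the kind of concrete model the thesis builds (not its actual code), DisCoCat-style composition assigns a transitive verb a third-order tensor and produces a sentence vector by contracting it with the subject and object vectors. The dimensionality and random tensors below are illustrative.

```python
# Toy DisCoCat-style composition: sentence = contraction of subject,
# verb tensor, and object. Values are random stand-ins, not learned.
import numpy as np

rng = np.random.default_rng(1)
n = 4                         # noun-space dimensionality (illustrative)

subject = rng.random(n)       # e.g. vector for "dogs"
obj = rng.random(n)           # e.g. vector for "cats"
verb = rng.random((n, n, n))  # e.g. "chase" as a tensor in N ⊗ S ⊗ N, S = N

# sentence_s = sum_{i,k} subject_i * verb_{i,s,k} * obj_k
sentence = np.einsum('i,isk,k->s', subject, verb, obj)
print(sentence.shape)  # (4,)
```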
"Not not bad" is not "bad": A distributional account of negation
With the increasing empirical success of distributional models of
compositional semantics, it is timely to consider the types of textual logic
that such models are capable of capturing. In this paper, we address
shortcomings in the ability of current models to capture logical operations
such as negation. As a solution we propose a tripartite formulation for a
continuous vector space representation of semantics and subsequently use this
representation to develop a formal compositional notion of negation within such
models.
Comment: 9 pages, to appear in Proceedings of the 2013 Workshop on Continuous Vector Space Models and their Compositionality
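One plausible reading of the tripartite formulation, sketched below with illustrative block sizes: partition each vector into domain, value, and remainder blocks, and let negation preserve the domain while inverting the value. The pure sign flip is an assumption; the paper's operator is more nuanced, which is what lets "not not bad" come out different from "bad".

```python
# Sketch of tripartite negation: flip only the "value" block of a vector.
# Block sizes and the exact sign-flip operator are illustrative assumptions.
import numpy as np

d_dom, d_val = 3, 3           # domain and value block sizes (illustrative)

def negate(v):
    dom, val, rest = np.split(v, [d_dom, d_dom + d_val])
    return np.concatenate([dom, -val, rest])   # keep domain, invert value

bad = np.array([1.0, 0.5, 0.2, -0.8, -0.6, -0.4, 0.1, 0.0])
not_bad = negate(bad)
not_not_bad = negate(not_bad)

# A pure sign flip makes double negation the identity, so this prints True;
# a damped or weighted flip would keep "not not bad" distinct from "bad",
# matching the intuition in the paper's title.
print(np.allclose(not_not_bad, bad))
```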
Don't Blame Distributional Semantics if it can't do Entailment
Distributional semantics has had enormous empirical success in Computational
Linguistics and Cognitive Science in modeling various semantic phenomena, such
as semantic similarity, and distributional models are widely used in
state-of-the-art Natural Language Processing systems. However, the theoretical
status of distributional semantics within a broader theory of language and
cognition is still unclear: What does distributional semantics model? Can it
be, on its own, a fully adequate model of the meanings of linguistic
expressions? The standard answer is that distributional semantics is not fully
adequate in this regard, because it falls short on some of the central aspects
of formal semantic approaches: truth conditions, entailment, reference, and
certain aspects of compositionality. We argue that this standard answer rests
on a misconception: These aspects do not belong in a theory of expression
meaning, they are instead aspects of speaker meaning, i.e., communicative
intentions in a particular context. In a slogan: words do not refer, speakers
do. Clearing this up enables us to argue that distributional semantics on its
own is an adequate model of expression meaning. Our proposal sheds light on the
role of distributional semantics in a broader theory of language and cognition,
its relationship to formal semantics, and its place in computational models.
Comment: To appear in Proceedings of the 13th International Conference on Computational Semantics (IWCS 2019), Gothenburg, Sweden
Semantic Composition via Probabilistic Model Theory
Semantic composition remains an open problem for vector space models of semantics. In this paper, we explain how the probabilistic graphical model used in the framework of Functional Distributional Semantics can be interpreted as a probabilistic version of model theory. Building on this, we explain how various semantic phenomena can be recast in terms of conditional probabilities in the graphical model. This connection between formal semantics and machine learning is helpful in both directions: it gives us an explicit mechanism for modelling context-dependent meanings (a challenge for formal semantics), and also gives us well-motivated techniques for composing distributed representations (a challenge for distributional semantics). We present results on two datasets that go beyond word similarity, showing how these semantically-motivated techniques improve on the performance of vector models.
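A loose sketch of the probabilistic-model-theory reading (the Gaussian entity prior, logistic semantic functions, and random weights are illustrative assumptions, not the paper's trained model): each predicate maps a latent entity vector to a probability of truth, so a question like "how likely is an entity that is a dog to be an animal?" becomes a conditional probability under the prior.

```python
# Sketch: predicates as probabilistic truth functions over latent entities;
# semantic questions become conditional probabilities, estimated by sampling.
import numpy as np

rng = np.random.default_rng(2)
dim = 4                      # latent entity-space dimensionality (illustrative)

def semantic_function(weights, bias):
    """Predicate as P(true | entity): a logistic function of the entity vector."""
    return lambda x: 1.0 / (1.0 + np.exp(-(x @ weights + bias)))

dog = semantic_function(rng.normal(size=dim), -1.0)
animal = semantic_function(rng.normal(size=dim), 0.0)

# Monte Carlo estimate of P(animal(x) = true | dog(x) = true) under a
# standard-normal prior over entities x.
xs = rng.normal(size=(100_000, dim))
p_dog = dog(xs)
p_both = p_dog * animal(xs)
print(p_both.mean() / p_dog.mean())
```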