Category-Theoretic Quantitative Compositional Distributional Models of Natural Language Semantics
This thesis is about the problem of compositionality in distributional
semantics. Distributional semantics presupposes that the meanings of words are
a function of their occurrences in textual contexts. It models words as
distributions over these contexts and represents them as vectors in high
dimensional spaces. The problem of compositionality for such models concerns
itself with how to produce representations for larger units of text by
composing the representations of smaller units of text.
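As a rough illustration of the distributional setting described above, the sketch below builds context-count vectors from a toy corpus and composes two of them by simple addition. The corpus, the window size, and the helper names `context_vector` and `compose_add` are assumptions made for this example; vector addition is only a common baseline, not the categorical composition method the thesis develops.

```python
from collections import Counter


def context_vector(target, sentences, window=2):
    """Count the words that occur within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def compose_add(u, v):
    """Baseline composition: sum the two context-count vectors."""
    return u + v


# Toy corpus, invented for illustration only.
corpus = ["the dog chased the cat", "the cat chased the mouse"]
cat = context_vector("cat", corpus)
dog = context_vector("dog", corpus)
phrase = compose_add(cat, dog)
```

Addition ignores word order and syntax entirely, which is the limitation that syntactically motivated composition operations aim to address.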
This thesis focuses on a particular approach to this compositionality
problem, namely using the categorical framework developed by Coecke, Sadrzadeh,
and Clark, which combines syntactic analysis formalisms with distributional
semantic representations of meaning to produce syntactically motivated
composition operations. This thesis shows how this approach can be
theoretically extended and practically implemented to produce concrete
compositional distributional models of natural language semantics. It
furthermore demonstrates that such models can perform on par with, or better
than, other competing approaches in the field of natural language processing.
There are three principal contributions to computational linguistics in this
thesis. The first is to extend the DisCoCat framework on both the syntactic
and semantic fronts, incorporating a number of syntactic analysis formalisms and
providing learning procedures allowing for the generation of concrete
compositional distributional models. The second contribution is to evaluate the
models developed from the procedures presented here, showing that they
outperform other compositional distributional models present in the literature.
The third contribution is to show how using category theory to solve linguistic
problems forms a sound basis for research, illustrated by examples of work on
this topic, which also suggest directions for future research.
Comment: DPhil Thesis, University of Oxford, Submitted and accepted in 201
Compositional Distributional Semantics with Compact Closed Categories and Frobenius Algebras
This thesis contributes to ongoing research related to the categorical
compositional model for natural language of Coecke, Sadrzadeh and Clark in
three ways: Firstly, I propose a concrete instantiation of the abstract
framework based on Frobenius algebras (joint work with Sadrzadeh). The theory
addresses shortcomings of previous proposals, extends the coverage of the
language, and is supported by experimental work that improves existing results.
The proposed framework describes a new class of compositional models that find
intuitive interpretations for a number of linguistic phenomena. Secondly, I
propose and evaluate in practice a new compositional methodology which
explicitly deals with the different levels of lexical ambiguity (joint work
with Pulman). A concrete algorithm is presented, based on the separation of
vector disambiguation from composition in an explicit prior step. Extensive
experimental work shows that the proposed methodology indeed results in more
accurate composite representations for the framework of Coecke et al. in
particular and every other class of compositional models in general. As a last
contribution, I formalize the explicit treatment of lexical ambiguity in the
context of the categorical framework by resorting to categorical quantum
mechanics (joint work with Coecke). In the proposed extension, the concept of a
distributional vector is replaced with that of a density matrix, which
compactly represents a probability distribution over the potential different
meanings of the specific word. Composition takes the form of quantum
measurements, leading to interesting analogies between quantum physics and
linguistics.
Comment: Ph.D. Dissertation, University of Oxfor
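The idea of a density matrix as a probability distribution over word meanings can be sketched as a weighted sum of outer products of sense vectors. Everything concrete below is an assumption for illustration: the two-dimensional sense vectors, the probabilities, and the helper names `density_matrix` and `trace` are invented, and the composition step via quantum measurement is not shown.

```python
def density_matrix(senses):
    """Build rho = sum_i p_i * v_i v_i^T from (probability, unit vector) pairs."""
    dim = len(senses[0][1])
    rho = [[0.0] * dim for _ in range(dim)]
    for prob, vec in senses:
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += prob * vec[i] * vec[j]
    return rho


def trace(matrix):
    """Sum of the diagonal entries."""
    return sum(matrix[i][i] for i in range(len(matrix)))


# Hypothetical ambiguous word with two orthogonal senses.
# The vectors and probabilities are made up for this example.
bank_senses = [
    (0.7, [1.0, 0.0]),  # assumed "financial institution" sense
    (0.3, [0.0, 1.0]),  # assumed "river side" sense
]
rho_bank = density_matrix(bank_senses)
```

When the sense vectors have unit length and the probabilities sum to one, the trace of the resulting matrix is one, which is the normalisation condition for a density matrix.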
Metafictional anaphora: A comparison of different accounts
I argue that pronominal anaphora across mixed parafictional/metafictional discourse (e.g. "In The Lord of the Rings, Frodo_i goes through an immense mental struggle. He_i is an intriguing fictional character!") poses a problem for a workspace account. I evaluate different possible solutions based on a descriptivist approach, Zalta's logic of abstract objects and Recanati's dot-object theory.
Mathematical linguistics
but in fact this is still an early draft, version 0.56, August 1 2001. Please d