20 research outputs found
Vector Space Semantics for Lambek Calculus with Soft Subexponentials
We develop a vector space semantics for Lambek Calculus with Soft
Subexponentials, apply the calculus to construct compositional vector
interpretations for parasitic gap noun phrases and discourse units with
anaphora and ellipsis, and experiment with the constructions in a
distributional sentence similarity task. As opposed to previous work, which
used Lambek Calculus with a Relevant Modality, the calculus used in this paper
uses a bounded version of the modality and is decidable. The vector space
semantics of this new modality allows us to meaningfully define contraction as
projection and to provide a linear theory behind what we could previously only
achieve via nonlinear maps.
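The contrast between a nonlinear copying operation and a linear one can be seen in a small numerical sketch. This is an illustrative toy, not the paper's actual construction: it contrasts the nonlinear "doubling" map v ↦ v ⊗ v with a Frobenius-style copying map Δ(v) = Σᵢ vᵢ (eᵢ ⊗ eᵢ), which places a vector on the diagonal of its tensor square and is genuinely linear.

```python
import numpy as np

def nonlinear_copy(v):
    # v (x) v: quadratic in the entries of v, hence not a linear map
    return np.outer(v, v)

def linear_copy(v):
    # Delta(v): embeds v on the diagonal of V (x) V; this map IS linear
    return np.diag(v)

v, w = np.array([1.0, 2.0]), np.array([0.5, -1.0])

# Linearity check: Delta(v + w) == Delta(v) + Delta(w)
assert np.allclose(linear_copy(v + w), linear_copy(v) + linear_copy(w))

# The nonlinear map fails the same check
assert not np.allclose(nonlinear_copy(v + w),
                       nonlinear_copy(v) + nonlinear_copy(w))
```

Defining contraction via the diagonal rather than via v ↦ v ⊗ v is what keeps the interpretation inside the category of linear maps.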
A Compositional Vector Space Model of Ellipsis and Anaphora.
PhD Thesis.
This thesis discusses research in compositional distributional semantics: if words
are defined by their use in language and represented as high-dimensional vectors
reflecting their co-occurrence behaviour in textual corpora, how should words be
composed to produce a similar numerical representation for sentences, paragraphs
and documents? Neural methods learn a task-dependent composition by generalising
over large datasets, whereas type-driven approaches stipulate that composition
is given by a functional view on words, leaving open the question of what those
functions should do, concretely.
We take on the type-driven approach to compositional distributional semantics
and focus on the categorical framework of Coecke, Grefenstette, and Sadrzadeh
[CGS13], which models composition as an interpretation of syntactic structures as
linear maps on vector spaces using the language of category theory, as well as the
two-step approach of Muskens and Sadrzadeh [MS16], where syntactic structures
map to lambda logical forms that are instantiated by a concrete composition model.
We develop the theory behind these approaches to cover phenomena not dealt with
in previous work, evaluate the models in sentence-level tasks, and implement a tensor
learning method that generalises to arbitrary sentences.
This thesis reports three main contributions. The first, theoretical in nature, discusses
the ability of categorical and lambda-based models of compositional distributional
semantics to model ellipsis, anaphora, and parasitic gaps, phenomena that
challenge the linearity of previous compositional models. Second, we perform an
evaluation study on verb phrase ellipsis in which we introduce three novel sentence
evaluation datasets and compare algebraic, neural, and tensor-based composition
models, showing that models that resolve ellipsis achieve higher correlation with humans.
Finally, we generalise the skipgram model [Mik+13] to a tensor-based setting
and implement it for transitive verbs, showing that neural methods for learning tensor
representations of words can outperform previous tensor-based methods on compositional
tasks.
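In tensor-based composition of the kind described above, a transitive verb is typically represented as a higher-order tensor that consumes its subject and object vectors. The sketch below is a hypothetical toy (random values, tiny dimension), meant only to show the shape of the composition: contracting an order-3 verb tensor with two noun vectors yields a sentence vector.

```python
import numpy as np

d = 4  # toy embedding dimension (illustrative; real models use hundreds)
rng = np.random.default_rng(0)

subj = rng.standard_normal(d)          # subject noun vector
obj = rng.standard_normal(d)           # object noun vector
verb = rng.standard_normal((d, d, d))  # transitive verb as an order-3 tensor

# Compose "subj verb obj" by contracting the verb tensor with both
# noun arguments, yielding a sentence vector in R^d:
sentence = np.einsum('i,ijk,k->j', subj, verb, obj)
assert sentence.shape == (d,)
```

A tensor-based skipgram would learn the entries of `verb` from corpus co-occurrences instead of sampling them at random, but the contraction pattern is the same.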
A Frobenius Algebraic Analysis for Parasitic Gaps
The interpretation of parasitic gaps is an ostensible case of non-linearity
in natural language composition. Existing categorial analyses, both in the
typelogical and in the combinatory traditions, rely on explicit forms of
syntactic copying. We identify two types of parasitic gapping where the
duplication of semantic content can be confined to the lexicon. Parasitic gaps
in adjuncts are analysed as forms of generalized coordination with a
polymorphic type schema for the head of the adjunct phrase. For parasitic gaps
affecting arguments of the same predicate, the polymorphism is associated with
the lexical item that introduces the primary gap. Our analysis is formulated in
terms of Lambek calculus extended with structural control modalities. A
compositional translation relates syntactic types and derivations to the
interpreting compact closed category of finite dimensional vector spaces and
linear maps with Frobenius algebras over it. When interpreted over the
necessary semantic spaces, the Frobenius algebras provide the tools to model
the proposed instances of lexical polymorphism.
Comment: SemSpace 2019, to appear in Journal of Applied Logic.
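The Frobenius algebras used in such analyses live over finite dimensional vector spaces with a fixed basis: the comultiplication copies basis vectors and the multiplication merges them. The sketch below is an illustrative numerical rendering of these basis-dependent maps, verifying the Frobenius condition and the fact that the multiplication computes pointwise products.

```python
import numpy as np

d = 3
delta3 = np.zeros((d, d, d))
for i in range(d):
    delta3[i, i, i] = 1.0  # nonzero only when all three indices agree

# Copying (comultiplication) Delta: V -> V (x) V, e_i |-> e_i (x) e_i
Delta = delta3  # axes: (input i, output j, output k)
# Merging (multiplication) mu: V (x) V -> V, e_i (x) e_j |-> delta_ij e_i
mu = delta3     # axes: (input i, input j, output k)

# Frobenius condition: Delta o mu == (mu (x) id) o (id (x) Delta),
# both read as maps V (x) V -> V (x) V
lhs = np.einsum('abk,kcd->abcd', mu, Delta)
rhs = np.einsum('bxd,axc->abcd', Delta, mu)
assert np.allclose(lhs, rhs)

# mu computes pointwise (Hadamard) products of vectors:
v, w = np.array([1., 2., 3.]), np.array([4., 5., 6.])
assert np.allclose(np.einsum('i,j,ijk->k', v, w, mu), v * w)
```

It is exactly this copying behaviour on basis vectors that lets the duplication of semantic content be confined to the lexicon while every map stays linear.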
Categorical Vector Space Semantics for Lambek Calculus with a Relevant Modality (Extended Abstract)
We develop a categorical compositional distributional semantics for Lambek
Calculus with a Relevant Modality, which has a limited version of the
contraction and permutation rules. The categorical part of the semantics is a
monoidal biclosed category with a coalgebra modality as defined on Differential
Categories. We instantiate this category to finite dimensional vector spaces
and linear maps via quantisation functors and work with three concrete
interpretations of the coalgebra modality. We apply the model to construct
categorical and concrete semantic interpretations for the motivating example of
this extended calculus: the derivation of a phrase with a parasitic gap. The
effectiveness of the concrete interpretations is evaluated via a
disambiguation task, on an extension of a sentence disambiguation dataset to
parasitic gap phrases, using BERT, Word2Vec, and FastText vectors and
relational tensors.
Comment: In Proceedings ACT 2020, arXiv:2101.07888. arXiv admin note:
substantial text overlap with arXiv:2005.0307.
Categorical Vector Space Semantics for Lambek Calculus with a Relevant Modality
We develop a categorical compositional distributional semantics for Lambek
Calculus with a Relevant Modality !L*, which has a limited version of the
contraction and permutation rules. The categorical part of the semantics is a
monoidal biclosed category with a coalgebra modality, very similar to the
structure of a Differential Category. We instantiate this category to finite
dimensional vector spaces and linear maps via "quantisation" functors and work
with three concrete interpretations of the coalgebra modality. We apply the
model to construct categorical and concrete semantic interpretations for the
motivating example of !L*: the derivation of a phrase with a parasitic gap. The
effectiveness of the concrete interpretations is evaluated via a
disambiguation task, on an extension of a sentence disambiguation dataset to
parasitic gap phrases, using BERT, Word2Vec, and FastText vectors and
relational tensors.
Lambda-calculus and formal language theory
Formal and symbolic approaches have offered computer science many fields of application. The rich and fruitful connection between logic, automata, and algebra is one such approach. It has been used to model natural languages as well as in program verification. In the mathematics of language it can model phenomena ranging from syntax to phonology, while in verification it yields model-checking algorithms for a wide family of programs. This thesis extends this approach to the simply typed lambda-calculus by providing a natural extension of recognizability to programs that are representable by simply typed terms. This notion is then applied to both the mathematics of language and program verification. In the case of the mathematics of language, it is used to generalize parsing algorithms and to propose high-level methods for describing languages. Concerning program verification, it is used to describe methods for verifying the behavioural properties of higher-order programs. In both cases, the link drawn between finite-state methods and denotational semantics provides the means to combine powerful tools from the two worlds.
Head-Driven Phrase Structure Grammar
Head-Driven Phrase Structure Grammar (HPSG) is a constraint-based or declarative approach to linguistic knowledge, which analyses all descriptive levels (phonology, morphology, syntax, semantics, pragmatics) with feature value pairs, structure sharing, and relational constraints. In syntax it assumes that expressions have a single relatively simple constituent structure. This volume provides a state-of-the-art introduction to the framework. Various chapters discuss basic assumptions and formal foundations, describe the evolution of the framework, and go into the details of the main syntactic phenomena. Further chapters are devoted to non-syntactic levels of description. The book also considers related fields and research areas (gesture, sign languages, computational linguistics) and includes chapters comparing HPSG with other frameworks (Lexical Functional Grammar, Categorial Grammar, Construction Grammar, Dependency Grammar, and Minimalism)
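The feature-value pairs and structure sharing mentioned above can be illustrated with a small toy, which is not the actual HPSG formalism: nested dictionaries stand in for feature structures, and structure sharing (an HPSG tag like [1]) is modelled by two paths pointing at the same object rather than at equal copies.

```python
# Hypothetical mini-example of feature structures with structure sharing.
agreement = {"NUMBER": "sg", "PERSON": "3"}  # a shared value

# Subject-verb agreement via structure sharing: both paths reference
# the SAME agreement object, as an HPSG tag [1] would indicate.
sentence = {
    "SUBJ": {"HEAD": {"AGR": agreement}},
    "HEAD": {"AGR": agreement},
}

# Token identity, not mere equality, models structure sharing:
assert sentence["SUBJ"]["HEAD"]["AGR"] is sentence["HEAD"]["AGR"]

# A constraint resolved on the shared structure is visible on both paths:
agreement["NUMBER"] = "pl"
assert sentence["SUBJ"]["HEAD"]["AGR"]["NUMBER"] == "pl"
```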
Proof nets for linguistic analysis
This book investigates the possible linguistic applications of proof nets, redundancy-free
representations of proofs, which were introduced by Girard for linear logic.
We will adapt the notion of proof net to allow the formulation of a proof net calculus which is sound and complete for the multimodal Lambek calculus.
Finally, we will investigate the computational and complexity-theoretic consequences of this calculus and give an introduction to a practical grammar development tool based on proof nets.