20 research outputs found

    Vector Space Semantics for Lambek Calculus with Soft Subexponentials

    We develop a vector space semantics for Lambek Calculus with Soft Subexponentials, apply the calculus to construct compositional vector interpretations for parasitic gap noun phrases and discourse units with anaphora and ellipsis, and experiment with the constructions in a distributional sentence similarity task. As opposed to previous work, which used Lambek Calculus with a Relevant Modality, the calculus used in this paper uses a bounded version of the modality and is decidable. The vector space semantics of this new modality allows us to meaningfully define contraction as projection and provide a linear theory behind what we could previously only achieve via nonlinear maps.

    A Compositional Vector Space Model of Ellipsis and Anaphora.

    PhD Thesis. This thesis discusses research in compositional distributional semantics: if words are defined by their use in language and represented as high-dimensional vectors reflecting their co-occurrence behaviour in textual corpora, how should words be composed to produce a similar numerical representation for sentences, paragraphs and documents? Neural methods learn a task-dependent composition by generalising over large datasets, whereas type-driven approaches stipulate that composition is given by a functional view on words, leaving open the question of what those functions should do, concretely. We take on the type-driven approach to compositional distributional semantics and focus on the categorical framework of Coecke, Grefenstette, and Sadrzadeh [CGS13], which models composition as an interpretation of syntactic structures as linear maps on vector spaces using the language of category theory, as well as the two-step approach of Muskens and Sadrzadeh [MS16], where syntactic structures map to lambda logical forms that are instantiated by a concrete composition model. We develop the theory behind these approaches to cover phenomena not dealt with in previous work, evaluate the models in sentence-level tasks, and implement a tensor learning method that generalises to arbitrary sentences. This thesis reports three main contributions. The first, theoretical in nature, discusses the ability of categorical and lambda-based models of compositional distributional semantics to model ellipsis, anaphora, and parasitic gaps, phenomena that challenge the linearity of previous compositional models. Secondly, we perform an evaluation study on verb phrase ellipsis where we introduce three novel sentence evaluation datasets and compare algebraic, neural, and tensor-based composition models to show that models that resolve ellipsis achieve higher correlation with humans. Finally, we generalise the skipgram model [Mik+13] to a tensor-based setting and implement it for transitive verbs, showing that neural methods to learn tensor representations for words can outperform previous tensor-based methods on compositional tasks.

    A Frobenius Algebraic Analysis for Parasitic Gaps

    The interpretation of parasitic gaps is an ostensible case of non-linearity in natural language composition. Existing categorial analyses, both in the typelogical and in the combinatory traditions, rely on explicit forms of syntactic copying. We identify two types of parasitic gapping where the duplication of semantic content can be confined to the lexicon. Parasitic gaps in adjuncts are analysed as forms of generalized coordination with a polymorphic type schema for the head of the adjunct phrase. For parasitic gaps affecting arguments of the same predicate, the polymorphism is associated with the lexical item that introduces the primary gap. Our analysis is formulated in terms of Lambek calculus extended with structural control modalities. A compositional translation relates syntactic types and derivations to the interpreting compact closed category of finite dimensional vector spaces and linear maps with Frobenius algebras over it. When interpreted over the necessary semantic spaces, the Frobenius algebras provide the tools to model the proposed instances of lexical polymorphism. Comment: SemSpace 2019, to appear in Journal of Applied Logic

    Categorical Vector Space Semantics for Lambek Calculus with a Relevant Modality (Extended Abstract)

    We develop a categorical compositional distributional semantics for Lambek Calculus with a Relevant Modality, which has a limited version of the contraction and permutation rules. The categorical part of the semantics is a monoidal biclosed category with a coalgebra modality as defined on Differential Categories. We instantiate this category to finite dimensional vector spaces and linear maps via quantisation functors and work with three concrete interpretations of the coalgebra modality. We apply the model to construct categorical and concrete semantic interpretations for the motivating example of this extended calculus: the derivation of a phrase with a parasitic gap. The effectiveness of the concrete interpretations is evaluated via a disambiguation task, on an extension of a sentence disambiguation dataset to parasitic gap phrases, using BERT, Word2Vec, and FastText vectors and Relational tensors. Comment: In Proceedings ACT 2020, arXiv:2101.07888. arXiv admin note: substantial text overlap with arXiv:2005.0307

    Categorical Vector Space Semantics for Lambek Calculus with a Relevant Modality

    We develop a categorical compositional distributional semantics for Lambek Calculus with a Relevant Modality !L*, which has a limited version of the contraction and permutation rules. The categorical part of the semantics is a monoidal biclosed category with a coalgebra modality, very similar to the structure of a Differential Category. We instantiate this category to finite dimensional vector spaces and linear maps via "quantisation" functors and work with three concrete interpretations of the coalgebra modality. We apply the model to construct categorical and concrete semantic interpretations for the motivating example of !L*: the derivation of a phrase with a parasitic gap. The effectiveness of the concrete interpretations is evaluated via a disambiguation task, on an extension of a sentence disambiguation dataset to parasitic gap phrases, using BERT, Word2Vec, and FastText vectors and Relational tensors.

    Lambda-calculus and formal language theory

    Formal and symbolic approaches have offered computer science many application fields. The rich and fruitful connection between logic, automata and algebra is one such approach. It has been used to model natural languages as well as in program verification. In the mathematics of language it is able to model phenomena ranging from syntax to phonology, while in verification it gives model checking algorithms to a wide family of programs. This thesis extends this approach to simply typed lambda-calculus by providing a natural extension of recognizability to programs that are representable by simply typed terms. This notion is then applied to both the mathematics of language and program verification. In the case of the mathematics of language, it is used to generalize parsing algorithms and to propose high-level methods to describe languages. Concerning program verification, it is used to describe methods for verifying the behavioral properties of higher-order programs. In both cases, the link that is drawn between finite state methods and denotational semantics provides the means to mix powerful tools coming from the two worlds.

    Head-Driven Phrase Structure Grammar

    Head-Driven Phrase Structure Grammar (HPSG) is a constraint-based or declarative approach to linguistic knowledge, which analyses all descriptive levels (phonology, morphology, syntax, semantics, pragmatics) with feature value pairs, structure sharing, and relational constraints. In syntax it assumes that expressions have a single relatively simple constituent structure. This volume provides a state-of-the-art introduction to the framework. Various chapters discuss basic assumptions and formal foundations, describe the evolution of the framework, and go into the details of the main syntactic phenomena. Further chapters are devoted to non-syntactic levels of description. The book also considers related fields and research areas (gesture, sign languages, computational linguistics) and includes chapters comparing HPSG with other frameworks (Lexical Functional Grammar, Categorial Grammar, Construction Grammar, Dependency Grammar, and Minimalism)

    Proof nets for linguistic analysis

    This book investigates the possible linguistic applications of proof nets, redundancy-free representations of proofs, which were introduced by Girard for linear logic. We will adapt the notion of proof net to allow the formulation of a proof net calculus which is sound and complete for the multimodal Lambek calculus. Finally, we will investigate the computational and complexity theoretic consequences of this calculus and give an introduction to a practical grammar development tool based on proof nets.
