300 research outputs found

    Simple K-star Categorial Dependency Grammars and their Inference

    We propose a novel subclass in the family of Categorial Dependency Grammars (CDG), based on a syntactic criterion on the categorial types associated with words in the lexicon, and study its learnability. This proposal relies on a linguistic principle and relates to an earlier non-constructive condition on iterated dependencies. We show that the projective CDG in this subclass are incrementally learnable in the limit from dependency structures. In contrast to previous proposals, our criterion is syntactic and does not impose a (rigidity) bound on the number of categorial types associated with a word.

    On Pregroups, Freedom, and (Virtual) Conceptual Necessity

    Pregroups were introduced by Lambek (1999) and provide a foundation for a particularly simple syntactic calculus. Buszkowski (2001) showed that free pregroup grammars generate exactly the ε-free context-free languages. Here we characterize the class of languages generable by all pregroups, which will be shown to be the entire class of recursively enumerable languages. To show this result, we rely on the well-known representation of recursively enumerable languages as the homomorphic image of the intersection of two context-free languages (Ginsburg et al., 1967). We define an operation of cross-product over grammars (so called because of its behaviour on the types), and show that the cross-product of any two free pregroup grammars generates exactly the intersection of their respective languages. The representation theorem applies once we show that allowing ‘empty categories’ (i.e. lexical items without overt phonological content) allows us to mimic the effects of any string homomorphism.
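As a rough illustration of the "particularly simple syntactic calculus" mentioned in the abstract above, the following minimal sketch checks whether a string of pregroup types contracts to a sentence type. It is not taken from the paper: the toy lexicon, the names `LEXICON`, `sentence_type`, `contractions` and `reduces_to`, and the encoding of a simple type as a `(base, adjoint_order)` pair are all assumptions made for this example. Only contractions of the form x^(n) x^(n+1) → 1 are modeled, with a trivial order on basic types.

```python
from functools import lru_cache

# Hypothetical encoding: a simple type is (base, n), where n is the adjoint
# order. n = 0 is the plain type, n = -1 a left adjoint, n = +1 a right adjoint.
N, S = ("n", 0), ("s", 0)
N_LEFT, N_RIGHT = ("n", -1), ("n", 1)

# Toy lexicon (an assumption for illustration, not from the paper).
LEXICON = {
    "John": (N,),
    "Mary": (N,),
    "likes": (N_RIGHT, S, N_LEFT),  # n^r s n^l, a transitive verb
}

def sentence_type(words):
    """Concatenate the lexical types of the words into one tuple."""
    return tuple(t for w in words for t in LEXICON[w])

def contractions(types):
    """Yield every result of a single contraction x^(n) x^(n+1) -> 1."""
    for i in range(len(types) - 1):
        (a, n), (b, m) = types[i], types[i + 1]
        if a == b and m == n + 1:
            yield types[:i] + types[i + 2:]

@lru_cache(maxsize=None)
def reduces_to(types, target):
    """True if the tuple of simple types contracts to exactly (target,)."""
    if types == (target,):
        return True
    # Exhaustive search over all contraction orders; fine for short inputs.
    return any(reduces_to(rest, target) for rest in contractions(types))

if __name__ == "__main__":
    print(reduces_to(sentence_type(["John", "likes", "Mary"]), S))
    print(reduces_to(sentence_type(["likes", "John", "Mary"]), S))
```

The search tries every contraction order rather than reducing greedily, because a greedy left-to-right pass can commit to the wrong pair when a type could contract with either neighbour.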

    Derivation and structure in categorial grammar


    On Internal Merge


    Assessing the Unitary RNN as an End-to-End Compositional Model of Syntax

    We show that both an LSTM and a unitary-evolution recurrent neural network (URN) can achieve encouraging accuracy on two types of syntactic patterns: context-free long-distance agreement, and mildly context-sensitive cross-serial dependencies. This work extends recent experiments on deeply nested context-free long-distance dependencies, with similar results. URNs differ from LSTMs in that they avoid non-linear activation functions, and they apply matrix multiplication to word embeddings encoded as unitary matrices. This permits them to retain all information in the processing of an input string over arbitrary distances. It also causes them to satisfy strict compositionality. URNs constitute a significant advance in the search for explainable models in deep learning applied to NLP.
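The claim that unitary word embeddings retain information over arbitrary distances rests on a basic fact: a product of unitary matrices is itself unitary, so composing a string never shrinks or inflates the state. The sketch below illustrates only that fact with 2×2 matrices in pure Python. It is not the architecture from the paper; the `unitary` parametrization, the `EMBEDDINGS` table and its angles, and the `encode` function are hypothetical choices made for this example.

```python
import cmath
import math

def unitary(theta, phi):
    """A 2x2 unitary matrix: a rotation by theta times a diagonal phase."""
    c, s = math.cos(theta), math.sin(theta)
    p, q = cmath.exp(1j * phi), cmath.exp(-1j * phi)
    return [[c * p, -s * q], [s * p, c * q]]

def matmul(a, b):
    """Plain 2x2 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_unitary(m, tol=1e-9):
    """Check that the conjugate transpose of m times m is the identity."""
    dagger = [[m[j][i].conjugate() for j in range(2)] for i in range(2)]
    prod = matmul(dagger, m)
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))

# Hypothetical word embeddings; the angles are arbitrary illustration values.
EMBEDDINGS = {
    "the": unitary(0.3, 0.2),
    "dogs": unitary(1.1, -0.5),
    "bark": unitary(-0.7, 0.9),
}

def encode(words):
    """Compose a word sequence by multiplying its matrices in order."""
    state = [[1, 0], [0, 1]]  # identity: the empty string
    for w in words:
        state = matmul(EMBEDDINGS[w], state)
    return state

if __name__ == "__main__":
    print(is_unitary(encode(["the", "dogs", "bark"])))
```

Because the representation of a string is just the product of the representations of its words, the composition is strictly compositional in the sense the abstract describes, and since matrix multiplication does not commute in general, word order can still affect the result.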

    Implementing the Process Tracing Technique using Combinatory Categorial Grammars: An Application to the Analysis of Economic Coordination within Firms

    This paper describes a method for analyzing the evolutionary path of a complex, dynamic, and contingent social phenomenon. Given empirical evidence of a surprising or anomalous fact that contradicts a widely acknowledged theory, the aim is to create a plausible explanation based on its context of occurrence, taking a holistic and historical point of view. The procedure begins by translating theoretical propositions into grammar rules that describe patterns of sequences of either individual actions or interactions carried out by a stable community of actors, such as types of decision-making events. Subsequently, applying a process tracing technique based on the logic of retroduction creates an extension of this initial process category, relying on configurations of contextual conditions that acknowledge the surprising fact as a new event outcome in a specific empirical setting. Finally, a structural comparison between pairs of representative instances may lead to the refinement of the theory.

    Category-Theoretic Quantitative Compositional Distributional Models of Natural Language Semantics

    This thesis is about the problem of compositionality in distributional semantics. Distributional semantics presupposes that the meanings of words are a function of their occurrences in textual contexts. It models words as distributions over these contexts and represents them as vectors in high-dimensional spaces. The problem of compositionality for such models concerns how to produce representations for larger units of text by composing the representations of smaller units of text. This thesis focuses on a particular approach to this compositionality problem, namely using the categorical framework developed by Coecke, Sadrzadeh, and Clark, which combines syntactic analysis formalisms with distributional semantic representations of meaning to produce syntactically motivated composition operations. This thesis shows how this approach can be theoretically extended and practically implemented to produce concrete compositional distributional models of natural language semantics. It furthermore demonstrates that such models can perform on par with, or better than, other competing approaches in the field of natural language processing. There are three principal contributions to computational linguistics in this thesis. The first is to extend the DisCoCat framework on both the syntactic and semantic fronts, incorporating a number of syntactic analysis formalisms and providing learning procedures allowing for the generation of concrete compositional distributional models. The second contribution is to evaluate the models developed from the procedures presented here, showing that they outperform other compositional distributional models present in the literature. The third contribution is to show how using category theory to solve linguistic problems forms a sound basis for research, illustrated by examples of work on this topic that also suggest directions for future research. Comment: DPhil Thesis, University of Oxford, Submitted and accepted in 201