32 research outputs found

    Prevalence and recoverability of syntactic parameters in sparse distributed memories

    Get PDF
    We propose a new method, based on sparse distributed memory, for studying dependence relations between syntactic parameters in the Principles and Parameters model of Syntax. By storing data of syntactic structures of world languages in a Kanerva network and checking recoverability of corrupted data from the network, we identify two different effects: an overall underlying relation between the prevalence of parameters across languages and their degree of recoverability, and a finer effect that makes some parameters more easily recoverable beyond what their prevalence would indicate. The latter can be seen as an indication of the existence of dependence relations, through which a given parameter can be determined using the remaining uncorrupted data

    Syntactic Structures and Code Parameters

    Get PDF
    We assign binary and ternary error-correcting codes to the data of syntactic structures of world languages and we study the distribution of code points in the space of code parameters. We show that, while most codes populate the lower region approximating a superposition of Thomae functions, there is a substantial presence of codes above the Gilbert-Varshamov bound and even above the asymptotic bound and the Plotkin bound. We investigate the dynamics induced on the space of code parameters by spin glass models of language change, and show that, in the presence of entailment relations between syntactic parameters the dynamics can sometimes improve the code. For large sets of languages and syntactic data, one can gain information on the spin glass dynamics from the induced dynamics in the space of code parameters.Comment: 14 pages, LaTeX, 12 png figure

    Syntactic Parameters and a Coding Theory Perspective on Entropy and Complexity of Language Families

    Get PDF
    We present a simple computational approach to assigning a measure of complexity and information/entropy to families of natural languages, based on syntactic parameters and the theory of error correcting codes. We associate to each language a binary string of syntactic parameters and to a language family a binary code, with code words the binary string associated to each language. We then evaluate the code parameters (rate and relative minimum distance) and the position of the parameters with respect to the asymptotic bound of error correcting codes and the Gilbert–Varshamov bound. These bounds are, respectively, related to the Kolmogorov complexity and the Shannon entropy of the code and this gives us a computationally simple way to obtain estimates on the complexity and information, not of individual languages but of language families. This notion of complexity is related, from the linguistic point of view to the degree of variability of syntactic parameter across languages belonging to the same (historical) family

    Phylogenetics of Indo-European Language Families via an Algebro-Geometric Analysis of Their Syntactic Structures

    Get PDF
    Using Phylogenetic Algebraic Geometry, we analyze computationally the phylogenetic tree of subfamilies of the Indo-European language family, using data of syntactic structures. The two main sources of syntactic data are the SSWL database and Longobardi’s recent data of syntactic parameters. We compute phylogenetic invariants and estimates of the Euclidean distance functions for two sets of Germanic languages, a set of Romance languages, a set of Slavic languages and a set of early Indo-European languages, and we compare the results with what is known through historical linguistics
    corecore