
    More or Less Unnatural: Semantic Similarity Shapes the Learnability and Cross-Linguistic Distribution of Unnatural Syncretism in Morphological Paradigms

    Morphological systems often reuse the same forms in different functions, creating what is known as syncretism. While syncretism varies greatly, certain cross-linguistic tendencies are apparent. Patterns where all syncretic forms share a morphological feature value (e.g., first person, or plural number) are most common cross-linguistically, and this preference is mirrored in results from learning experiments. While this suggests a general bias towards natural (featurally homogeneous) over unnatural (featurally heterogeneous) patterns, little is yet known about gradients in learnability and distributions of different kinds of unnatural patterns. In this paper we assess apparent cross-linguistic asymmetries between different types of unnatural patterns in person-number verbal agreement paradigms and test their learnability in an artificial language learning experiment. We find that the cross-linguistic recurrence of unnatural patterns of syncretism in person-number paradigms is proportional to the number of shared feature values (i.e., similarity) amongst the syncretic forms. Our experimental results further suggest that the learnability of syncretic patterns also mirrors the paradigm’s degree of feature-value similarity. We propose that this gradient in learnability reflects a general bias towards similarity-based structure in morphological learning, which previous literature has shown to play a crucial role in word learning as well as in category and concept learning more generally. Rather than a dichotomous natural/unnatural distinction, our results thus support a more nuanced view of (un)naturalness in morphological paradigms and suggest that a preference for similarity-based structure during language learning might shape the worldwide transmission and typological distribution of patterns of syncretism.
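
    As a rough illustration of the similarity notion at stake (a sketch with invented cell labels and an invented scoring scheme, not the authors' materials), the number of shared feature values among syncretic cells of a 3x2 person-number paradigm can be computed as follows:

        # Minimal sketch: score the feature-value similarity of a syncretic pattern
        # in a person-number paradigm. Cell labels and the averaging scheme are
        # illustrative assumptions, not the paper's metric.
        from itertools import combinations

        CELLS = {  # each cell is a (person, number) feature bundle
            "1SG": ("1", "SG"), "2SG": ("2", "SG"), "3SG": ("3", "SG"),
            "1PL": ("1", "PL"), "2PL": ("2", "PL"), "3PL": ("3", "PL"),
        }

        def shared_values(a, b):
            """Number of feature values two cells have in common (0, 1, or 2)."""
            return sum(x == y for x, y in zip(CELLS[a], CELLS[b]))

        def pattern_similarity(syncretic_cells):
            """Mean number of shared feature values over all pairs of syncretic cells."""
            pairs = list(combinations(syncretic_cells, 2))
            return sum(shared_values(a, b) for a, b in pairs) / len(pairs)

        print(pattern_similarity(["1PL", "2PL", "3PL"]))  # 1.0: natural (number shared throughout)
        print(pattern_similarity(["1SG", "1PL", "2PL"]))  # ~0.67: unnatural but partially similar
        print(pattern_similarity(["1SG", "2PL"]))         # 0.0: unnatural, no shared value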

    Learners restrict their linguistic generalizations using preemption but not entrenchment: evidence from artificial language learning studies with adults and children

    A central goal of research into language acquisition is explaining how, when learners generalize to new cases, they appropriately RESTRICT their generalizations (e.g., to avoid producing ungrammatical utterances such as *The clown laughed the man). The past 30 years have seen an unresolved debate between STATISTICAL PREEMPTION and ENTRENCHMENT as explanations. Under preemption, the use of a verb in a particular construction (e.g., *The clown laughed the man) is probabilistically blocked by hearing that verb in other constructions WITH SIMILAR MEANINGS ONLY (e.g., The clown made the man laugh). Under entrenchment, such errors (e.g., *The clown laughed the man) are probabilistically blocked by hearing ANY utterance that includes the relevant verb (e.g., by The clown made the man laugh AND The man laughed). Across five artificial-language-learning studies, we designed a training regime such that learners received evidence for the (under the relevant hypothesis) ungrammaticality of a particular unattested verb/noun+particle combination (e.g., *chila+kem; *squeako+kem) via either preemption only or entrenchment only. Across all five studies, participants in the preemption condition (as per our preregistered prediction) rated unattested verb/noun+particle combinations as less acceptable for restricted verbs/nouns, which appeared during training, than for unrestricted, novel-at-test verbs/nouns, which did not appear during training; i.e., strong evidence for preemption. Participants in the entrenchment condition showed no evidence for such an effect (and in 3/5 experiments, positive evidence for the null). We conclude that a successful model of learning linguistic restrictions must instantiate competition between different forms only where they express the same (or similar) meanings.
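
    As a toy illustration of what separates the two hypotheses (an invented corpus and invented meaning tags, not the authors' analysis), the difference lies in which attested evidence is allowed to count against an unattested combination such as *chila+kem:

        # Minimal sketch: blocking evidence under preemption vs. entrenchment.
        # The 'corpus' and meaning labels below are made up for illustration.
        corpus = [
            ("chila", "pum", "MEANING_X"),  # same stem, competing particle, similar meaning
            ("chila", "pum", "MEANING_X"),
            ("chila", "tid", "MEANING_Y"),  # same stem, unrelated construction/meaning
        ]
        target = ("chila", "kem", "MEANING_X")  # never attested during training

        # Preemption: only attested uses of the stem with a similar meaning block the target.
        preemption_evidence = sum(1 for stem, prt, m in corpus
                                  if stem == target[0] and m == target[2] and prt != target[1])

        # Entrenchment: any attested use of the stem blocks the target, whatever its meaning.
        entrenchment_evidence = sum(1 for stem, prt, m in corpus if stem == target[0])

        print(preemption_evidence, entrenchment_evidence)  # 2 3
        # The studies reported above found that only the first kind of evidence
        # changed learners' acceptability ratings.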

    A cognitively plausible model for grammar induction


    The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE)


    Exposure and Emergence in Usage-Based Grammar: Computational Experiments in 35 Languages

    This paper uses computational experiments to explore the role of exposure in the emergence of construction grammars. While usage-based grammars are hypothesized to depend on a learner's exposure to actual language use, the mechanisms of such exposure have only been studied in a few constructions in isolation. This paper experiments with (i) the growth rate of the constructicon, (ii) the convergence rate of grammars exposed to independent registers, and (iii) the rate at which constructions are forgotten when they have not been recently observed. These experiments show that the lexicon grows more quickly than the grammar and that the growth rate of the grammar is not dependent on the growth rate of the lexicon. At the same time, register-specific grammars converge onto more similar constructions as the amount of exposure increases. This means that the influence of specific registers becomes less important as exposure increases. Finally, the rate at which constructions are forgotten when they have not been recently observed mirrors the growth rate of the constructicon. This paper thus presents a computational model of usage-based grammar that includes both the emergence and the unentrenchment of constructions.
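
    A minimal sketch of the kind of mechanism at issue (the threshold and decay rate are arbitrary assumptions, not the paper's implementation) is an exposure-driven constructicon in which constructions emerge once observed often enough and decay when not recently observed:

        # Minimal sketch: emergence and forgetting ('unentrenchment') of constructions.
        class Constructicon:
            def __init__(self, emergence_threshold=3.0, decay=0.9):
                self.activation = {}            # construction -> activation level
                self.threshold = emergence_threshold
                self.decay = decay

            def observe(self, constructions_in_utterance):
                # Decay constructions that were not just observed (forgetting) ...
                for c in self.activation:
                    if c not in constructions_in_utterance:
                        self.activation[c] *= self.decay
                # ... and boost those that were (exposure).
                for c in constructions_in_utterance:
                    self.activation[c] = self.activation.get(c, 0.0) + 1.0

            def grammar(self):
                # Only constructions whose activation crossed the threshold count as emerged.
                return {c for c, a in self.activation.items() if a >= self.threshold}

        cx = Constructicon()
        for utterance in [["NP-V-NP"], ["NP-V-NP"], ["NP-V-NP", "V-PRT"], ["NP-V-NP"]]:
            cx.observe(utterance)
        print(cx.grammar())  # {'NP-V-NP'} has emerged; 'V-PRT' has not, and is decaying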

    Entangled Parametric Hierarchies: Problems for an Overspecified Universal Grammar

    This study addresses the feasibility of the classical notion of parameter in linguistic theory from the perspective of parametric hierarchies. A novel program-based analysis is implemented in order to show certain empirical problems related to these hierarchies. The program was developed on the basis of an enriched database spanning 23 contemporary and 5 ancient languages. The empirical issues uncovered cast doubt on classical parametric models of language acquisition as well as on the conceptualization of an overspecified Universal Grammar that has parameters among its primitives. Pinpointing these issues (i) leads to the proposal that the (bio)logical problem of language acquisition does not amount to a process of triggering innately pre-wired values of parameters and (ii) paves the way for viewing epigenetic ('parametric') language variation as an externalization-related epiphenomenon, whose learning component may be more important than is sometimes assumed.
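
    As an illustration of the kind of check such a program can run (hypothetical parameters, values, and dependencies, not the study's database), an implicational hierarchy predicts that a dependent parameter is only defined when its parent parameter takes a particular value; counterexamples are the entangled cases:

        # Minimal sketch: detect violations of an implicational parametric hierarchy.
        # Languages, parameters, and dependencies are invented for illustration.
        LANGUAGES = {
            "LangA": {"P1": 1, "P2": 1, "P3": 0},
            "LangB": {"P1": 0, "P2": 1, "P3": None},  # P2 defined although P1 = 0
            "LangC": {"P1": 1, "P2": 0, "P3": None},
        }
        DEPENDENCIES = {"P2": ("P1", 1), "P3": ("P2", 1)}  # child -> (parent, required value)

        def violations(languages, dependencies):
            out = []
            for lang, params in languages.items():
                for child, (parent, required) in dependencies.items():
                    if params.get(child) is not None and params.get(parent) != required:
                        out.append((lang, child))
            return out

        print(violations(LANGUAGES, DEPENDENCIES))  # [('LangB', 'P2')]: an entangled setting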

    Adaptive Structure, Cultural Transmission & Language

    Over the past 20 years, the study of language evolution has made significant leaps towards becoming a serious scientific endeavour.

    Testing a computational model of causative overgeneralizations: Child judgment and production data from English, Hebrew, Hindi, Japanese and K'iche'.

    How do language learners avoid the production of verb argument structure overgeneralization errors (*The clown laughed the man, cf. The clown made the man laugh), while retaining the ability to apply such generalizations productively when appropriate? This question has long been seen as one that is both particularly central to acquisition research and particularly challenging. Focussing on causative overgeneralization errors of this type, a previous study reported a computational model that learns, on the basis of corpus data and human-derived verb-semantic-feature ratings, to predict adults' by-verb preferences for less- versus more-transparent causative forms (e.g., *The clown laughed the man vs The clown made the man laugh) across English, Hebrew, Hindi, Japanese and K'iche' Mayan. Here, we tested the ability of this model (and an expanded version with multiple hidden layers) to explain binary grammaticality judgment data from children aged 4;0-5;0, and elicited-production data from children aged 4;0-5;0 and 5;6-6;6 (N=48 per language). In general, the model successfully simulated both children's judgment and production data, with correlations of r=0.5-0.6 and r=0.75-0.85, respectively, and also generalized to unseen verbs. Importantly, learners of all five languages showed some evidence of making the types of overgeneralization errors - in both judgments and production - previously observed in naturalistic studies of English (e.g., *I'm dancing it). Together with previous findings, the present study demonstrates that a simple learning model can explain (a) adults' continuous judgment data, (b) children's binary judgment data and (c) children's production data (with no training on these datasets), and therefore constitutes a plausible mechanistic account of the acquisition of verbs' argument structure restrictions.
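
    The general shape of such a model (the features, weights, and verbs below are invented; the published model is trained on corpus data and rating norms) is a mapping from a verb's semantic-feature ratings to a graded preference for the transitive over the periphrastic causative:

        # Minimal sketch: predict a by-verb preference for the transitive causative
        # (*The clown laughed the man) over the periphrastic causative (The clown
        # made the man laugh) from semantic-feature ratings. All values are illustrative.
        import math

        VERB_FEATURES = {  # hypothetical ratings: [direct_causation, internal_causation]
            "break": [0.9, 0.1],
            "laugh": [0.2, 0.9],
            "dance": [0.3, 0.8],
        }
        WEIGHTS, BIAS = [2.5, -3.0], 0.5  # direct causation favours the transitive form

        def p_transitive_causative(verb):
            z = BIAS + sum(w * x for w, x in zip(WEIGHTS, VERB_FEATURES[verb]))
            return 1 / (1 + math.exp(-z))  # logistic link

        for verb in VERB_FEATURES:
            print(verb, round(p_transitive_causative(verb), 2))
        # break scores high (transitive causative acceptable); laugh and dance score low
        # (periphrastic preferred) -- the kind of by-verb gradient the model has to capture.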

    Formal Syntax and Deep History

    We show that, contrary to long-standing assumptions, syntactic traits, modeled here within the generative biolinguistic framework, provide insights into deep-time language history. To support this claim, we have encoded the diversity of nominal structures using 94 universally definable binary parameters, set in 69 languages spanning up to 13 traditionally irreducible Eurasian families. We found a phylogenetic signal that distinguishes all such families and matches the family-internal tree topologies that are safely established through classical etymological methods and datasets. We have retrieved “near-perfect” phylogenies, which are essentially immune to homoplastic disruption and only moderately influenced by horizontal convergence, two factors that instead severely affect more externalized linguistic features, like sound inventories. This result allows us to draw some preliminary inferences about plausible/implausible cross-family classifications; it also provides a new source of evidence for testing the representation of diversity in syntactic theories.
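
    The core computation behind such parameter-based phylogenies can be sketched as follows (toy vectors for four languages, not the study's 94-parameter dataset): languages are coded as binary parameter vectors, pairwise distances are computed, and a standard tree-building method (e.g., neighbour joining) is then run on the distance matrix:

        # Minimal sketch: normalized Hamming distances over binary syntactic-parameter
        # vectors. Languages and settings below are invented for illustration.
        from itertools import combinations

        PARAM_VECTORS = {
            "Italian": [1, 1, 0, 1, 0, 1],
            "Spanish": [1, 1, 0, 1, 1, 1],
            "Hindi":   [0, 1, 1, 0, 1, 0],
            "Marathi": [0, 1, 1, 0, 0, 0],
        }

        def hamming(a, b):
            """Proportion of parameters on which two languages disagree."""
            return sum(x != y for x, y in zip(a, b)) / len(a)

        for l1, l2 in combinations(PARAM_VECTORS, 2):
            print(f"{l1}-{l2}: {hamming(PARAM_VECTORS[l1], PARAM_VECTORS[l2]):.2f}")
        # Within-family pairs (Italian-Spanish, Hindi-Marathi) come out closer than
        # cross-family pairs, the kind of phylogenetic signal the paper reports.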