73 research outputs found

    An efficient communication analysis of morpho-syntactic grammatical features

    Grammatical features vary widely across languages and this variation has been studied in detail. The functions of grammatical features, however, are not entirely clear and a number of puzzles remain. For example, why do some languages have rich feature inventories but others have few if any grammatical features? Why do many languages have features that appear to encode semantic information (e.g. animacy) that is already known to the listener? We present a computational framework that addresses questions like these by formalizing one way in which grammatical features aid communication. We use the model to illustrate how morpho-syntactic feature inventories help to solve the problem of communicating semantic structures under cognitive pressures

    Logical word learning: The case of kinship

    We examine the conceptual development of kinship through the lens of program induction. We present a computational model for the acquisition of kinship term concepts, resulting in the first computational model of kinship learning that is closely tied to developmental phenomena. We demonstrate that our model can learn several kinship systems of varying complexity using cross-linguistic data from English, Pukapuka, Turkish, and Yanomamö. More importantly, the behavioral patterns observed in children learning kinship terms, under-extension and over-generalization, fall out naturally from our learning model. We then conducted interviews to simulate realistic learning environments and demonstrate that the characteristic-to-defining shift is a consequence of our learning model in naturalistic contexts containing abstract and concrete features. We use model simulations to understand the influence of logical simplicity and children’s learning environment on the order of acquisition of kinship terms, providing novel predictions for the learning trajectories of these words. We conclude with a discussion of how this model framework generalizes beyond kinship terms, as well as a discussion of its limitations

    How Data Drive Early Word Learning: A Cross-Linguistic Waiting Time Analysis

    The extent to which word learning is delayed by maturation as opposed to accumulating data is a longstanding question in language acquisition. Further, the precise way in which data influence learning on a large scale is unknown—experimental results reveal that children can rapidly learn words from single instances as well as by aggregating ambiguous information across multiple situations. We analyze Wordbank, a large cross-linguistic dataset of word acquisition norms, using a statistical waiting time model to quantify the role of data in early language learning, building off Hidaka (2013). We find that the model both fits and accurately predicts the shape of children’s growth curves. Further analyses of model parameters suggest a primarily data-driven account of early word learning. The parameters of the model directly characterize both the amount of data required and the rate at which informative data occurs. With high statistical certainty, words require on the order of ∼ 10 learning instances, which occur on average once every two months. Our method is extremely simple, statistically principled, and broadly applicable to modeling data-driven learning effects in development

    Contrast perception as a visual heuristic in the formulation of referential expressions

    We hypothesize that contrast perception works as a visual heuristic, such that when speakers perceive a significant degree of contrast in a visual context, they tend to produce the corresponding adjective to describe a referent. The contrast perception heuristic supports efficient audience design, allowing speakers to produce referential expressions with minimum expenditure of cognitive resources, while facilitating the listener's visual search for the referent. We tested the perceptual contrast hypothesis in three language-production experiments. Experiment 1 revealed that speakers overspecify color adjectives in polychrome displays, whereas in monochrome displays they overspecified other properties that were contrastive. Further support for the contrast perception hypothesis comes from a re-analysis of previous work, which confirmed that color contrast elicits color overspecification when detected in a given display, but not when detected across monochrome trials. Experiment 2 revealed that even atypical colors (which are often overspecified) are only mentioned if there is color contrast. In Experiment 3, participants named a target color faster in monochrome than in polychrome displays, suggesting that the effect of color contrast is not analogous to ease of production. We conclude that the tendency to overspecify color in polychrome displays is not a bottom-up effect driven by the visual salience of color as a property, but possibly a learned communicative strategy. We discuss the implications of our account for pragmatic theories of referential communication and models of audience design, challenging the view that overspecification is a form of egocentric behavior

    What did I sign? A study of the impenetrability of legalese in contracts

    Legal documents, in the form of terms of service agreements and other private contracts, are now an increasingly prevalent part of everyday life. While legal documents have long been acknowledged to be difficult to understand without training, it remains an open question whether the ever-increasing exposure to contracts might have mitigated this difficulty. Moreover, insofar as this difficulty has persisted, there remains no systematic analysis of which linguistic structures contribute most heavily to the processing difficulty of legal texts, nor whether this difficulty is heightened for those with less language experience. Here, we investigate these issues, and in a well-powered experiment find evidence that (a) both recall and comprehension of legal propositions in a contract are hindered by use of a legal register relative to plain-English translations; (b) certain linguistic structures, such as center-embedding, hinder recall to a greater degree than others, such as passive voice; and (c) language experience influences comprehension of legal propositions. Surprisingly, language experience did not influence recall, nor was there an interaction between legal register and language experience on recall or comprehension. These findings suggest that legal language poses heightened difficulties for those with less language experience--who tend to be of lower socioeconomic status and with diminished access to the justice system--and that eliminating complex features of legalese would benefit those of all reading levels