60 research outputs found

    The cross-linguistic performance of word segmentation models over time.

    We select three word segmentation models with psycholinguistic foundations (transitional probabilities, the diphone-based segmenter, and PUDDLE), which track phoneme co-occurrence and positional frequencies in input strings and, in the case of PUDDLE, build lexical and diphone inventories. The models are evaluated on caregiver utterances in 132 CHILDES corpora representing 28 languages and 11.9 million words. PUDDLE shows the best performance overall, albeit with wide cross-linguistic variation. We explore the reasons for this variation, fitting regression models to performance scores with linguistic properties which capture lexico-phonological characteristics of the input: word length, utterance length, diversity in the lexicon, the frequency of one-word utterances, the regularity of phoneme patterns at word boundaries, and the distribution of diphones in each language. These properties together explain four-tenths of the observed variation in segmentation performance, a strong outcome and a solid foundation for studying further variables which make the segmentation task difficult.
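    As a rough illustration of the transitional-probability idea mentioned in this abstract, the sketch below estimates forward transitional probabilities between adjacent symbols and places a word boundary wherever that probability falls below a cutoff. The function names `train_tp` and `segment`, the fixed `threshold` parameter, and the toy input are assumptions made for illustration; the paper's own models and boundary criteria may differ.

```python
from collections import Counter


def train_tp(utterances):
    """Estimate forward transitional probabilities P(b | a) from unsegmented strings."""
    first_counts = Counter()   # how often each symbol starts a bigram
    bigram_counts = Counter()  # how often each adjacent pair occurs
    for utt in utterances:
        for a, b in zip(utt, utt[1:]):
            first_counts[a] += 1
            bigram_counts[(a, b)] += 1
    return {pair: n / first_counts[pair[0]] for pair, n in bigram_counts.items()}


def segment(utt, tp, threshold):
    """Split utt wherever the transitional probability drops below threshold.

    The absolute threshold is an illustrative assumption, not the paper's rule.
    Unseen pairs are treated as having probability 0.
    """
    words, start = [], 0
    for i in range(1, len(utt)):
        if tp.get((utt[i - 1], utt[i]), 0.0) < threshold:
            words.append(utt[start:i])
            start = i
    words.append(utt[start:])
    return words


# Toy usage: "a" is followed by "b" twice and by "c" once.
tp = train_tp(["ab", "ab", "ac"])
frequent = segment("ab", tp, 0.5)
rare = segment("ac", tp, 0.5)
```

    In this toy input the frequent pair stays together while the rarer pair is split, which is the intuition behind using low transitional probability as a boundary cue.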

    What have we learned from 15 years of research on cross-situational word learning? A focused review

    In 2007 and 2008, Yu and Smith published their seminal studies on cross-situational word learning (CSWL) in adults and infants, showing that word-object mappings can be acquired from distributed statistics despite in-the-moment uncertainty. Since then, the CSWL paradigm has been used extensively to better understand (statistical) word learning in different language learners and under different learning conditions. The goal of this review is to provide an entry-level overview of findings and themes that have emerged in 15 years of research on CSWL across three topic areas (mechanisms of CSWL, CSWL across different learner and task characteristics) and to highlight the questions that remain to be answered.
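    The claim that mappings can be recovered from distributed statistics despite ambiguity on each trial can be sketched with a simple co-occurrence counter. This is a minimal illustration under stated assumptions, not the procedure of any study covered by the review: the function name `learn_mappings`, the trial format, and the made-up words and objects are all hypothetical.

```python
from collections import Counter, defaultdict


def learn_mappings(trials):
    """Map each word to the object it co-occurred with most often across trials.

    Each trial is a (words, objects) pair; within a trial every word is
    ambiguous between all objects present (an illustrative assumption).
    """
    counts = defaultdict(Counter)
    for words, objects in trials:
        for word in words:
            for obj in objects:
                counts[word][obj] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


# Hypothetical trials: no single trial identifies any referent on its own.
trials = [
    (["bosa", "gasser"], ["dog", "ball"]),
    (["bosa", "manu"], ["dog", "cup"]),
    (["gasser", "manu"], ["ball", "cup"]),
]
mappings = learn_mappings(trials)
```

    Each word appears with its intended object on two trials and with every competitor on only one, so the aggregated counts separate the referents even though every individual trial is ambiguous.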

    Computational and Robotic Models of Early Language Development: A Review

    We review computational and robotics models of early language learning and development. We first explain why and how these models are used to better understand how children learn language. We argue that they provide concrete theories of language learning as a complex dynamic system, complementing traditional methods in psychology and linguistics. We review different modeling formalisms, grounded in techniques from machine learning and artificial intelligence such as Bayesian and neural network approaches. We then discuss their role in understanding several key mechanisms of language development: cross-situational statistical learning, embodiment, situated social interaction, intrinsically motivated learning, and cultural evolution. We conclude by discussing future challenges for research, including modeling of large-scale empirical data about language acquisition in real-world environments. Keywords: Early language learning, Computational and robotic models, machine learning, development, embodiment, social interaction, intrinsic motivation, self-organization, dynamical systems, complexity. Comment: to appear in International Handbook on Language Development, ed. J. Horst and J. von Koss Torkildsen, Routledge.

    The infant’s view redefines the problem of referential uncertainty in early word learning

    The learning of first object names is deemed a hard problem due to the uncertainty inherent in mapping a heard name to the intended referent in a cluttered and variable world. However, human infants readily solve this problem. Despite considerable theoretical discussion, relatively little is known about the uncertainty infants face in the real world. We used head-mounted eye tracking during parent–infant toy play and quantified the uncertainty by measuring the distribution of infant attention to the potential referents when a parent named both familiar and unfamiliar toy objects. The results show that infant gaze upon hearing an object name is often directed to a single referent which is equally likely to be a wrong competitor or the intended target. This bimodal gaze distribution clarifies and redefines the uncertainty problem and constrains possible solutions.

    The Social Network Dynamics Of Category Formation

    Category systems are remarkably consistent across societies. Stable partitions for concepts relating to flora, geometry, emotion, color, and kinship have been repeatedly discovered across diverse cultures. Canonical theories in cognitive science argue that this form of convergence across independent populations, referred to as ‘cross-cultural convergence’, is evidence of innate human categories that exist independently of social interaction. However, a number of studies have shown that even individuals from the same population can vary substantially in how they categorize novel and ambiguous phenomena. Contrary to findings on cross-cultural convergence, this individual variation in categorization processes suggests that independent populations should evolve highly divergent category systems (as is often predicted by theories of social constructivism). These puzzling findings raise new questions about the origins of cross-cultural convergence. In this dissertation, I develop a new mathematical approach to cultural processes of category formation, which shows that whether or not independent populations create similar category systems is a function of population size. Specifically, my model shows that small populations frequently diverge in their category systems, whereas in large populations, a subset of categories consistently reaches critical mass and spreads, leading to convergent cultural trajectories. I test and confirm this prediction in a large-scale online social network experiment where I study how small and large social networks construct original category systems for a continuum of novel and ambiguous stimuli. I conclude by discussing the implications of these results for networked crowdsourcing, which harnesses coordination in communication networks to enhance content management and generation across a wide range of domains, including content moderation over social media and scientific classification in citizen science.

    The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE)
