The cross-linguistic performance of word segmentation models over time.
We select three word segmentation models with psycholinguistic foundations (transitional probabilities, the diphone-based segmenter, and PUDDLE), which track phoneme co-occurrence and positional frequencies in input strings, and in the case of PUDDLE build lexical and diphone inventories. The models are evaluated on caregiver utterances in 132 CHILDES corpora representing 28 languages and 11.9 million words. PUDDLE shows the best performance overall, albeit with wide cross-linguistic variation. We explore the reasons for this variation, fitting regression models to performance scores with linguistic properties which capture lexico-phonological characteristics of the input: word length, utterance length, diversity in the lexicon, the frequency of one-word utterances, the regularity of phoneme patterns at word boundaries, and the distribution of diphones in each language. These properties together explain four-tenths of the observed variation in segmentation performance, a strong outcome and a solid foundation for studying further variables which make the segmentation task difficult.
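To make the transitional-probability idea in this abstract concrete, here is a minimal sketch. It is not the authors' implementation: it assumes characters stand in for phonemes, uses forward transitional probabilities only, and places a boundary wherever the probability falls below a fixed `threshold`, which is an illustrative choice. The function names `train_tp` and `segment` and the toy corpus are hypothetical.

```python
from collections import Counter


def train_tp(utterances):
    """Estimate forward transitional probabilities P(b | a) from symbol strings."""
    firsts, bigrams = Counter(), Counter()
    for utt in utterances:
        for a, b in zip(utt, utt[1:]):
            firsts[a] += 1          # times `a` is followed by any symbol
            bigrams[(a, b)] += 1    # times `a` is followed by `b`
    return {pair: n / firsts[pair[0]] for pair, n in bigrams.items()}


def segment(utt, tp, threshold):
    """Split `utt` wherever the TP between adjacent symbols is below `threshold`.

    Unseen symbol pairs are treated as having probability 0 (an assumption).
    """
    if not utt:
        return []
    words, current = [], [utt[0]]
    for a, b in zip(utt, utt[1:]):
        if tp.get((a, b), 0.0) < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Toy corpus built from three made-up "words": ab, cd, ef.
corpus = ["abcd", "abef", "cdab", "cdef", "efab", "efcd"]
tp = train_tp(corpus)
print(segment("abcdef", tp, threshold=0.75))
```

In this toy corpus, within-word transitions always occur (probability 1.0) while cross-word transitions are split between two continuations (probability 0.5), so any threshold between those values recovers the three words.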
What have we learned from 15 years of research on cross-situational word learning? A focused review
In 2007 and 2008, Yu and Smith published their seminal studies on cross-situational word learning (CSWL) in adults and infants, showing that word-object mappings can be acquired from distributed statistics despite in-the-moment uncertainty. Since then, the CSWL paradigm has been used extensively to better understand (statistical) word learning in different language learners and under different learning conditions. The goal of this review is to provide an entry-level overview of findings and themes that have emerged in 15 years of research on CSWL across three topic areas (mechanisms of CSWL, CSWL across different learner and task characteristics) and to highlight the questions that remain to be answered.
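The claim that mappings can be recovered from distributed statistics despite per-trial ambiguity can be illustrated with a simple co-occurrence counter. This is only one possible associative account, not the model from the studies cited in the abstract; the function name `learn_mappings`, the pseudo-words, and the trials are invented for illustration.

```python
from collections import defaultdict


def learn_mappings(trials):
    """Pick, for each word, the object it co-occurred with most often.

    Each trial is (words, objects); within a trial every word is paired
    with every object, so no single trial reveals the correct mapping.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for words, objects in trials:
        for w in words:
            for o in objects:
                counts[w][o] += 1
    return {w: max(objs, key=objs.get) for w, objs in counts.items()}


# Three ambiguous trials with hypothetical pseudo-words and objects.
trials = [
    (["bosa", "gasser"], ["ball", "dog"]),
    (["bosa", "manu"], ["ball", "cup"]),
    (["gasser", "manu"], ["dog", "cup"]),
]
mappings = learn_mappings(trials)
print(mappings)
```

Every individual trial is ambiguous, but each word co-occurs with its intended object twice and with each competitor only once, so the aggregate counts single out one referent per word.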
Computational and Robotic Models of Early Language Development: A Review
We review computational and robotics models of early language learning and development. We first explain why and how these models are used to understand better how children learn language. We argue that they provide concrete theories of language learning as a complex dynamic system, complementing traditional methods in psychology and linguistics. We review different modeling formalisms, grounded in techniques from machine learning and artificial intelligence such as Bayesian and neural network approaches. We then discuss their role in understanding several key mechanisms of language development: cross-situational statistical learning, embodiment, situated social interaction, intrinsically motivated learning, and cultural evolution. We conclude by discussing future challenges for research, including modeling of large-scale empirical data about language acquisition in real-world environments.
Keywords: Early language learning, Computational and robotic models, machine learning, development, embodiment, social interaction, intrinsic motivation, self-organization, dynamical systems, complexity.
Comment: to appear in International Handbook on Language Development, ed. J. Horst and J. von Koss Torkildsen, Routledge
The infant’s view redefines the problem of referential uncertainty in early word learning
The learning of first object names is deemed a hard problem due to the uncertainty inherent in mapping a heard name to the intended referent in a cluttered and variable world. However, human infants readily solve this problem. Despite considerable theoretical discussion, relatively little is known about the uncertainty infants face in the real world. We used head-mounted eye tracking during parent–infant toy play and quantified the uncertainty by measuring the distribution of infant attention to the potential referents when a parent named both familiar and unfamiliar toy objects. The results show that infant gaze upon hearing an object name is often directed to a single referent which is equally likely to be a wrong competitor or the intended target. This bimodal gaze distribution clarifies and redefines the uncertainty problem and constrains possible solutions.
The Social Network Dynamics Of Category Formation
Category systems are remarkably consistent across societies. Stable partitions for concepts relating to flora, geometry, emotion, color, and kinship have been repeatedly discovered across diverse cultures. Canonical theories in cognitive science argue that this form of convergence across independent populations, referred to as ‘cross-cultural convergence’, is evidence of innate human categories that exist independently of social interaction. However, a number of studies have shown that even individuals from the same population can vary substantially in how they categorize novel and ambiguous phenomena. Contrary to findings on cross-cultural convergence, this individual variation in categorization processes suggests that independent populations should evolve highly divergent category systems (as is often predicted by theories of social constructivism). These puzzling findings raise new questions about the origins of cross-cultural convergence. In this dissertation, I develop a new mathematical approach to cultural processes of category formation, which shows that whether or not independent populations create similar category systems is a function of population size. Specifically, my model shows that small populations frequently diverge in their category systems, whereas in large populations, a subset of categories consistently reaches critical mass and spreads, leading to convergent cultural trajectories. I test and confirm this prediction in a large-scale online social network experiment where I study how small and large social networks construct original category systems for a continuum of novel and ambiguous stimuli. I conclude by discussing the implications of these results for networked crowdsourcing, which harnesses coordination in communication networks to enhance content management and generation across a wide range of domains, including content moderation over social media and scientific classification in citizen science.