158 research outputs found
Image Free-Viewing as Intrinsically-Motivated Exploration: Estimating the Learnability of Center-of-Gaze Image Samples in Infants and Adults
We propose that free viewing of natural images in human infants can be understood and analyzed as the product of intrinsically-motivated visual exploration. We examined this idea by first generating five sets of center-of-gaze (COG) image samples, which were derived by presenting a series of natural images to groups of both real observers (i.e., 9-month-olds and adults) and artificial observers (i.e., an image-saliency model, an image-entropy model, and a random-gaze model). In order to assess the sequential learnability of the COG samples, we paired each group of samples with a simple recurrent network, which was trained to reproduce the corresponding sequence of COG samples. We then asked whether an intrinsically-motivated artificial agent would learn to identify the most successful network. In Simulation 1, the agent was rewarded for selecting the observer group and network with the lowest prediction errors, while in Simulation 2 the agent was rewarded for selecting the observer group and network with the largest rate of improvement. Our prediction was that if visual exploration in infants is intrinsically-motivated, and more specifically if the goal of exploration is to learn to produce sequentially-predictable gaze patterns, then the agent would show a preference for the COG samples produced by the infants over the other four observer groups. The results from both simulations supported our prediction. We conclude by highlighting the implications of our approach for understanding visual development in infants, and discussing how the model can be elaborated and improved.
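As a rough illustration of the two reward schemes described in this abstract, the sketch below pairs each observer group with a toy recurrent predictor and has an agent choose a group either by lowest current prediction error (Simulation 1) or by largest improvement (Simulation 2). It is a minimal sketch, not the authors' implementation: the COG samples are replaced by random feature sequences, and the network sizes, learning rule, and variable names are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of error-based vs. progress-based
# group selection. Random sequences stand in for the real COG samples.
import numpy as np

rng = np.random.default_rng(0)

class TinySRN:
    """Elman-style recurrent network predicting the next feature vector."""
    def __init__(self, n_in, n_hid, lr=0.05):
        self.Wxh = rng.normal(0, 0.1, (n_hid, n_in))
        self.Whh = rng.normal(0, 0.1, (n_hid, n_hid))
        self.Why = rng.normal(0, 0.1, (n_in, n_hid))
        self.lr = lr

    def step(self, seq):
        """One pass over a sequence; returns mean squared prediction error."""
        h = np.zeros(self.Whh.shape[0])
        errors = []
        for t in range(len(seq) - 1):
            h = np.tanh(self.Wxh @ seq[t] + self.Whh @ h)
            pred = self.Why @ h
            err = seq[t + 1] - pred
            errors.append(np.mean(err ** 2))
            # Delta-rule update of the output weights only: a crude stand-in
            # for full backpropagation through time.
            self.Why += self.lr * np.outer(err, h)
        return float(np.mean(errors))

# Five observer groups, each paired with its own network (as in the abstract);
# the sequences here are placeholders for the real COG samples.
groups = {name: rng.normal(size=(50, 8)) for name in
          ["infants", "adults", "saliency", "entropy", "random"]}
nets = {name: TinySRN(n_in=8, n_hid=16) for name in groups}

history = {name: [] for name in groups}
for epoch in range(20):
    for name, seq in groups.items():
        history[name].append(nets[name].step(seq))

# Simulation 1: reward the group whose network has the lowest current error.
sim1_choice = min(history, key=lambda n: history[n][-1])
# Simulation 2: reward the group with the largest rate of improvement.
sim2_choice = max(history, key=lambda n: history[n][0] - history[n][-1])
# With placeholder data the choices are arbitrary; the point is the structure.
print(sim1_choice, sim2_choice)
```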
Incorporating Memory Processes in the Study of Early Language Acquisition
Critical to the learning of any language is the learning of the words in that language. Therefore, an extensive amount of research in language development has examined how infants learn the words of their language so rapidly. In particular, research on statistical learning has suggested that sequential statistics may play a vital role in the discovery of candidate words that become available to be mapped to meaning. One important limitation of this previous research is the lack of attention given to the memory processes involved in statistical word learning. Thus, the current set of experiments examines the availability of statistically defined words as object labels after a delay. To examine whether statistics found in speech support infants' memory for label-object associations, in Experiment 1, 22- to 24-month-old infants were presented with 12 Italian sentences that contained 2 high transitional probability (HTP) words and 2 low transitional probability (LTP) words. Ten minutes after familiarization, using a Looking-While-Listening procedure (Fernald et al., 2008), infants were trained and tested on 2 HTP and 2 LTP label-object associations. Results revealed that infants were able to learn HTP but not LTP words, suggesting that HTP words make better labels for objects after a minimal delay. Experiment 2 examined infants' memory for meaning representations that are statistically defined or not. Stimuli and procedure were identical to those of Experiment 1, except that the 10-minute delay was implemented after the referent training phase instead of after the familiarization phase. Infants in Experiment 2 were able to remember both HTP and LTP words when tested following a 10-minute delay. Together, the findings suggest that statistical learning facilitates future word learning.
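For readers unfamiliar with the statistic, the following minimal sketch shows how forward transitional probabilities of the kind distinguishing HTP from LTP words can be computed over a syllable stream; the syllables and stream are invented for illustration and are not the Italian materials used in the experiments.

```python
# Minimal sketch: TP(x -> y) = count(xy) / count(x) over a syllable stream.
from collections import Counter

def transitional_probabilities(syllables):
    """Return forward TPs between adjacent syllables in a stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(x, y): c / first_counts[x] for (x, y), c in pair_counts.items()}

# Toy stream: "bi da ku" always appears intact (high internal TPs), whereas
# "go la tu" is broken up, giving its internal transitions lower probabilities.
stream = "bi da ku go la bi da ku tu go bi da ku la tu".split()
tps = transitional_probabilities(stream)
print(tps[("bi", "da")])   # 1.0 -> a high-TP transition
print(tps[("go", "la")])   # 0.5 -> a lower-TP transition
```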
Non-adjacent dependency learning in infancy, and its link to language development
To acquire language, infants must learn how to identify words and linguistic structure in speech. Statistical learning has been suggested to assist both of these tasks. However, infants' capacity to use statistics to discover words and structure together remains unclear. Further, it is not yet known how infants' statistical learning ability relates to their language development. We trained 17-month-old infants on an artificial language comprising non-adjacent dependencies, and examined their looking times on tasks assessing sensitivity to words and structure using an eye-tracked head-turn-preference paradigm. We measured infants' vocabulary size using a Communicative Development Inventory (CDI) concurrently and at 19, 21, 24, 25, 27, and 30 months to relate performance to language development. Infants could segment the words from speech, demonstrated by a significant difference in looking times to words versus part-words. Infants' segmentation performance was significantly related to their vocabulary size (receptive and expressive) both concurrently and over time (receptive until 24 months, expressive until 30 months), but was not related to the rate of vocabulary growth. The data also suggest infants may have developed sensitivity to generalised structure, indicating similar statistical learning mechanisms may contribute to the discovery of words and structure in speech, but this was not related to vocabulary size.
The role of auditory perceptual gestalts on the processing of phrase structure
Hierarchical centre embeddings (HCEs) in natural language have been taken as evidence that language is not processed as a finite state system (Chomsky, 1957). While phrase structure may be necessary to produce HCEs, finite state, sequential processing may underlie their comprehension (Frank, Bod, & Christiansen, 2012). Under this account, listeners employ surface-level cues (e.g. semantic content) to determine the dependencies within an utterance, instead of processing the words in a hierarchy. The acoustic structure of speech reflects the speaker's syntactic representation during production (Cooper, Paccia & Lapointe, 1978). In comprehension, temporal (Snedeker & Trueswell, 2003) and pitch (Watson, Tanenhaus, & Gunlogson, 2008) cues rapidly influence processing. Therefore, temporal and pitch variation in speech could contain cues to dependencies. We examine whether grouping behaviour may be driven by Gestalt principles. Temporal proximity suggests that individuals group sequential words that occur closer together in time. Pitch similarity states that individuals group sequential words that are similar in pitch. In this thesis, I examine whether these Gestalts support dependency detection in speech, providing a mechanism through which hierarchical structure can be processed non-hierarchically. In Chapter 3, we assessed whether temporal proximity and pitch similarity explicitly relate to the structure of a corpus of spontaneously produced active and passive relative clauses. This was the case for actives; the embedded clause was preceded by a lengthened pause and a large pitch reduction. For passives, a longer pause and pitch reduction occurred after the verb phrase of the embedded clause, counter to prediction. The results for actives suggest that temporal proximity and pitch similarity cues could be used to group the phrases of the embedded clause, obviating the need to process hierarchically structured speech hierarchically. Two artificial grammar learning studies assessed whether pitch similarity and temporal proximity cues support the acquisition of phrase structure grammar. Chapter 4 emphasised temporal proximity cues, while Chapter 5 emphasised pitch similarity cues. In Chapter 5, pitch similarity cues improved classification performance for structures with two levels of embedding. In both studies, participants did not benefit from temporal proximity cues. However, the results of a cross-species meta-analysis of artificial grammar learning studies (Chapter 2) raised the possibility that reflection-based measures (e.g. grammaticality judgements) are not well suited for assessing processing-based learning, such as online speech processing (Christiansen, 2018). Properly assessing the role of Gestalt cues in speech processing therefore requires processing-based measures. To assess the influence of auditory Gestalts on online speech processing, in Chapter 6 we analysed participants' gaze behaviour in response to pitch similarity and temporal proximity cues using the visual world paradigm. Participants heard speech-synthesised active-object and passive relative clauses, whilst viewing four potential targets. Each sentence had a prosodic structure consistent with its syntactic form (Chapter 3), or one of two control prosodic structures. Pitch similarity results indicated that these cues facilitated processing. Temporal proximity cues consistent with syntactic structure did not facilitate processing; instead, the results suggested a general benefit of increased processing time.
Overall, these studies suggest that participants can use the pitch similarity Gestalt to group together syntactically dependent phrases in hierarchical speech, offering a mechanism through which individuals could process hierarchical structures non-hierarchically. The results of Chapters 4, 5, and 6 suggest temporal proximity cues did not facilitate performance to the same extent. Thus, we suggest that unfilled pauses in isolation may be insufficient to facilitate groupings on the basis of temporal proximity.
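One way to make the two Gestalt cues concrete is to treat them as thresholds on inter-word pauses and pitch jumps, as in the hedged sketch below; the word sequence, pause and pitch values, and threshold settings are illustrative assumptions rather than parameters from the thesis.

```python
# Minimal sketch: group adjacent words unless the pause between them
# (temporal proximity) or the change in pitch (pitch similarity) is large.

def gestalt_groups(words, pauses_ms, pitches_hz,
                   max_pause_ms=250, max_pitch_jump_hz=40):
    """Split a word sequence into groups at large pauses or pitch jumps.

    pauses_ms[i] is the silence between words[i] and words[i + 1];
    pitches_hz[i] is the mean F0 of words[i].
    """
    groups, current = [], [words[0]]
    for i in range(1, len(words)):
        long_pause = pauses_ms[i - 1] > max_pause_ms
        pitch_jump = abs(pitches_hz[i] - pitches_hz[i - 1]) > max_pitch_jump_hz
        if long_pause or pitch_jump:
            groups.append(current)
            current = []
        current.append(words[i])
    groups.append(current)
    return groups

words = ["the", "dog", "that", "the", "cat", "chased", "barked"]
pauses = [60, 300, 80, 70, 90, 320]            # ms between adjacent words
pitches = [210, 205, 160, 158, 162, 155, 208]  # mean F0 per word (Hz)
print(gestalt_groups(words, pauses, pitches))
# -> [['the', 'dog'], ['that', 'the', 'cat', 'chased'], ['barked']]
```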
Connectionist modelling of lexical segmentation and vocabulary acquisition
Adults typically hear sentences in their native language as a sequence of separate words, and we might therefore assume that words in speech are physically separated in the way that they are perceived. However, when listening to an unfamiliar language we no longer experience sequences of discrete words, but rather hear a continuous stream of speech with boundaries separating individual sentences or utterances. Theories of how adult listeners segment the speech stream into words emphasise the role that knowledge of individual words plays in the segmentation of speech. However, since words cannot be learnt until the speech stream can be segmented, it seems unlikely that infants will be able to use word recognition to segment connected speech. For this reason, researchers have proposed a variety of strategies and cues that infants could use to identify word boundaries without being able to recognise the words that these boundaries delimit. This chapter describes some computational simulations proposing ways in which these cues and strategies for the acquisition of lexical segmentation can be integrated with infants' acquisition of the meanings of words. The simulations reported here describe simple computational mechanisms and knowledge sources that may support these different aspects of language acquisition.
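As a concrete illustration of one general strategy in this family (positing word boundaries where the next segment becomes hard to predict), the sketch below uses a bigram phoneme model as a stand-in for a trained connectionist network; it is offered only as an example of the approach, not as the specific simulations reported in the chapter, and the corpus strings are toy data.

```python
# Minimal sketch: score each within-utterance position by how unpredictable
# the next phoneme is; high scores suggest likely word boundaries.
from collections import Counter

def boundary_scores(utterances):
    """Estimate next-phoneme unpredictability from bigram counts."""
    bigrams, unigrams = Counter(), Counter()
    for u in utterances:
        unigrams.update(u[:-1])
        bigrams.update(zip(u, u[1:]))
    scores = []
    for u in utterances:
        scores.append([1.0 - bigrams[(a, b)] / unigrams[a]
                       for a, b in zip(u, u[1:])])
    return scores  # high score = likely word boundary after that phoneme

corpus = ["lookatthedoggy", "thedoggybarks", "lookatthat"]
for u, s in zip(corpus, boundary_scores(corpus)):
    print(u, [round(x, 2) for x in s])
```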
Domain-specificity in the Acquisition of Non-adjacent Dependencies
At the forefront of investigations into the cognitive underpinnings of language acquisition is the question of domain-specificity, i.e. whether the processes involved in learning language are unique to language. Recent investigations suggest that the mechanisms employed in language learning are also involved in sequential learning of non-linguistic stimuli and are therefore domain-general.
Non-adjacent dependencies are an important feature of natural languages. They describe relationships between two elements separated by an arbitrary number of intervening items, and thus potentially pose a challenge for learners. As a hallmark of natural languages they are ubiquitous, an example from English being subject-verb agreement: The socks on the floor are red. Here, learners are required to track the dependency between the two elements 'socks' and 'are' across an intervening prepositional phrase. Importantly, it has been shown that non-adjacent dependencies can be learned in the linguistic (Gómez, 2002) and non-linguistic (Creel, Newport & Aslin, 2004) domain.
The majority of work presented in this thesis is based on Gómez's (2002) artificial language learning experiment involving non-adjacent dependencies, adapted to directly compare adults' learning in the linguistic and non-linguistic domain, in order to build a comprehensive map showing factors and conditions that enhance or inhibit the learnability of non-adjacencies. Experiment 1 shows that the Gestalt Principle of Similarity is not a requirement for the detection of non-adjacent dependencies in the linguistic domain. Experiment 2 aims to explore the robustness of the ability to track non-adjacent regularities between linguistic elements by removing cues that indicate the correct level of analysis (i.e. inter-word breaks). Experiments 3 and 4 study domain-specificity in the acquisition of non-adjacencies, and show that non-adjacent dependencies are learnable in the linguistic and non-linguistic domain, provided that the non-linguistic materials are simple and lack internal structure. However, language is rich in internal structure: it is combinatorial on the phonemic/orthographic level in that it recombines elements (phonemes/graphemes) to form larger units. When exposed to non-linguistic stimuli which capture this componential character of language, adult participants fail to detect the non-adjacencies. However, when exposed to non-componential non-linguistic materials, adult participants succeed in learning the non-adjacent dependencies. Experiment 5 looks at modality effects in the acquisition of non-adjacent dependencies across the linguistic and non-linguistic domain. Experiment 6 provides evidence that high familiarity with componential non-linguistic patterns does not result in the correct extraction of non-adjacencies in sequence learning tasks involving these patterns.
Overall, the work presented here demonstrates that the acquisition of non-adjacent dependencies is a domain-general ability, which is guided by stimulus simplicity.
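For context, the sketch below generates the kind of aXb training and test strings used in Gómez-style non-adjacent dependency studies, where each frame word reliably predicts its dependent across a variable middle element; the tokens are invented placeholders, not the stimuli from these experiments.

```python
# Minimal sketch of an aXb artificial language: a_i ... b_i across variable X.
import random

random.seed(1)
frames = [("pel", "rud"), ("vot", "jic"), ("dak", "tood")]   # a_i ... b_i pairs
middles = ["wadim", "kicey", "puser", "fengle", "coomo", "loga"]

def make_string(frame, middle):
    a, b = frame
    return f"{a} {middle} {b}"

# Training set: every frame paired with every middle element.
training = [make_string(f, x) for f in frames for x in middles]

# Test items: grammatical strings keep the a_i ... b_i pairing; ungrammatical
# strings pair a_i with the dependent element from a different frame.
grammatical = [make_string(frames[0], random.choice(middles))]
ungrammatical = [make_string((frames[0][0], frames[1][1]),
                             random.choice(middles))]
print(training[:3], grammatical, ungrammatical)
```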
Early word learning through communicative inference
Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences, 2010. Cataloged from the PDF version of the thesis; includes bibliographical references (p. 109-122). By Michael C. Frank.
How do children learn their first words? Do they do it by gradually accumulating information about the co-occurrence of words and their referents over time, or are words learned via quick social inferences linking what speakers are looking at, pointing to, and talking about? Both of these conceptions of early word learning are supported by empirical data. This thesis presents a computational and theoretical framework for unifying these two different ideas by suggesting that early word learning can best be described as a process of joint inferences about speakers' referential intentions and the meanings of words. Chapter 1 describes previous empirical and computational research on "statistical learning" (the ability of learners to use distributional patterns in their language input to learn about the elements and structure of language) and argues that capturing this ability requires models of learning that describe inferences over structured representations, not just simple statistics. Chapter 2 argues that social signals of speakers' intentions, even eye-gaze and pointing, are at best noisy markers of reference and that in order to take advantage of these signals fully, learners must integrate information across time. Chapter 3 describes the kinds of inferences that learners can make by assuming that speakers are informative with respect to their intended meaning, introducing and testing a formalization of how Grice's pragmatic maxims can be used for word learning. Chapter 4 presents a model of cross-situational intentional word learning that both learns words and infers speakers' referential intentions from labeled corpus data.
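As a simple illustration of the cross-situational co-occurrence account that the abstract contrasts with social inference, the sketch below tallies word-referent co-occurrences across toy situations; it shows only the baseline idea, not the thesis's joint intentional model, and the situations and referent names are invented.

```python
# Minimal sketch of cross-situational co-occurrence learning: count how often
# each word appears with each candidate referent, then read off the best pair.
from collections import defaultdict

situations = [
    (["look", "at", "the", "dog"], ["DOG", "BALL"]),
    (["the", "dog", "runs"],       ["DOG", "TREE"]),
    (["a", "red", "ball"],         ["BALL", "TREE"]),
    (["throw", "the", "ball"],     ["BALL", "DOG"]),
]

counts = defaultdict(lambda: defaultdict(int))
for words, referents in situations:
    for w in words:
        for r in referents:
            counts[w][r] += 1

# Best guess for each content word = the referent it co-occurs with most.
for w in ["dog", "ball"]:
    best = max(counts[w], key=counts[w].get)
    print(w, "->", best, dict(counts[w]))
```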
Learning to predict or predicting to learn?
Humans complete complex commonplace tasks, such as understanding sentences, with striking speed and accuracy. This expertise is dependent on anticipation: predicting upcoming words gets us ahead of the game. But how do we master the game in the first place? To make accurate predictions, children must first learn their language. One possibility is that prediction serves double duty, enabling rapid language learning as well as understanding. Children could master the structures of their language by predicting how speakers will behave and, when those guesses are wrong, revising their linguistic representations. A number of prominent computational models assume that children learn in this way. But is that assumption correct? Here, we lay out the requirements for showing that children use "predictive learning", and review the current evidence for this position. We argue that, despite widespread enthusiasm for the idea, we cannot yet conclude that children "predict to learn".
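To make the "predict, then revise on error" idea concrete, here is a minimal error-driven learner that adjusts next-word expectations in proportion to its prediction error; it is an illustration of the general mechanism, not any specific model reviewed in the paper, and the vocabulary and sentences are toy data.

```python
# Minimal sketch of predictive (error-driven) learning with a delta rule.
from collections import defaultdict

expect = defaultdict(lambda: defaultdict(float))  # expect[w][next] in [0, 1]
LR = 0.2

def update(prev_word, next_word, vocabulary):
    """Raise the expectation for the observed next word and lower it for the
    words that were predicted but did not occur."""
    for w in vocabulary:
        target = 1.0 if w == next_word else 0.0
        error = target - expect[prev_word][w]
        expect[prev_word][w] += LR * error

vocab = ["the", "dog", "cat", "barks"]
for _ in range(50):
    for sentence in (["the", "dog", "barks"], ["the", "cat", "barks"]):
        for prev, nxt in zip(sentence, sentence[1:]):
            update(prev, nxt, vocab)

# After training, "the" is expected to be followed by "dog" or "cat",
# while its expectations for "the" and "barks" stay near zero.
print({w: round(p, 2) for w, p in expect["the"].items()})
```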
- …