43 research outputs found

    Developing a cross-cultural annotation system and metacorpus for studying infants' real world language experience

    Get PDF
    Recent issues around reproducibility, best practices, and cultural bias affect naturalistic observational approaches as much as experimental approaches, but there has been less focus on this area. Here, we present a new approach that leverages cross-laboratory, collaborative, interdisciplinary efforts to examine important psychological questions. We illustrate this approach with a particular project that examines similarities and differences in children's early experiences with language. The project develops a comprehensive start-to-finish analysis pipeline by creating a flexible, systematic annotation system and implementing it across a sampling from a metacorpus of audio recordings of diverse language communities. This resource is publicly available, sensitive to cultural differences, and flexible enough to address a variety of research questions. It is also uniquely suited to the development of tools for automated analysis.
    Authors: Soderstrom, Melanie (University of Manitoba, Canada); Casillas, Marisa (University of Chicago, United States); Bergelson, Elika (Duke University, United States); Rosemberg, Celia Renata (Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Interdisciplinario de Investigaciones en Psicología Matemática y Experimental Dr. Horacio J. A. Rimoldi, Argentina); Alam, Florencia (Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Interdisciplinario de Investigaciones en Psicología Matemática y Experimental Dr. Horacio J. A. Rimoldi, Argentina); Warlaumont, Anne S. (University of California, Los Angeles, United States); Bunce, John (California State University, United States)

    Language origins viewed in spontaneous and interactive vocal rates of human and bonobo infants

    Get PDF
    Funding: The research for this manuscript was funded by Grants R01 DC006099, DC011027, and DC015108 from the National Institute on Deafness and Other Communication Disorders, by NSF BCS-1529127, by ERC-Synergy grant SOMICS 609819, by the James S. McDonnell Foundation, and by the Plough Foundation.
    From the first months of life, human infants produce "protophones," speech-like, non-cry sounds, presumed to be absent or only minimally present in other apes. But there have been no direct quantitative comparisons to support this presumption. In addition, by 2 months, human infants show sustained face-to-face interaction using protophones, a pattern also thought absent or very limited in other apes, but again, without quantitative comparison. Such comparison should provide evidence relevant to determining foundations of language, since substantially flexible vocalization, the inclination to explore vocalization, and the ability to interact socially by means of vocalization are foundations for language. Here we quantitatively compare data on vocalization rates in three captive bonobo (Pan paniscus) mother–infant pairs with various sources of data from our laboratories on human infant vocalization. Both humans and bonobos produced distress sounds (cries/screams) and laughter. The bonobo infants also produced sounds that were neither screams nor laughs and that showed acoustic similarities to the human protophones. These protophone-like sounds confirm that bonobo infants share with humans the capacity to produce vocalizations that appear foundational for language. Still, there were dramatic differences between the species in both quantity and function of the protophone and protophone-like sounds. The bonobo protophone-like sounds were far less frequent than the human protophones, and the human protophones were far less likely to be interpreted as complaints and more likely as vocal play.
Moreover, we found extensive vocal interaction between human infants and mothers, but no vocal interaction in the bonobo mother–infant pairs: while bonobo mothers were physically responsive to their infants, we observed no case of a bonobo mother vocalization directed to her infant. Our cross-species comparison focuses on low- and moderate-arousal circumstances because we reason the roots of language entail vocalization not triggered by excitement, for example, during fighting or intense play. Language appears to be founded in flexible vocalization, used to regulate comfortable social interaction, to share variable affective states at various levels of arousal, and to explore vocalization itself.

    Caregivers' suffix frequencies and suffix acquisition by language impaired, late talking, and typically developing children

    No full text
    Acquisition of regular inflectional suffixes is an integral part of grammatical development in English, and delayed acquisition of certain inflectional suffixes is a hallmark of language impairment. We investigate the relationship between input frequency and grammatical suffix acquisition, analyzing 217 transcripts of mother-child (ages 1;11-6;9) conversations from the CHILDES database. Maternal suffix frequency correlates with previously reported rank orders of acquisition and with child suffix frequency. Percentages of children using a suffix are consistent with frequencies in caregiver speech. Although late talkers acquire suffixes later than typically developing children, the order of acquisition is similar across populations. Furthermore, the third person singular and past tense verb suffixes, weaknesses for children with language impairment, are less frequent in caregiver speech than the plural noun suffix, a relative strength in language impairment. Similar findings hold across typical, specific language impairment (SLI), and late talker populations, suggesting that frequency plays a role in suffix acquisition. © Cambridge University Press 2011
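The core analysis described above relates caregiver suffix frequencies to reported rank orders of acquisition. A minimal sketch of that kind of rank correlation, using invented counts and an invented acquisition order rather than the paper's CHILDES data:

```python
from statistics import mean

# Hypothetical caregiver suffix counts (tokens per 1000 utterances) and a
# hypothetical reported acquisition order (1 = earliest acquired).
maternal_freq = {"plural -s": 42, "progressive -ing": 38,
                 "past -ed": 12, "third person -s": 9}
acquisition_rank = {"progressive -ing": 1, "plural -s": 2,
                    "past -ed": 3, "third person -s": 4}

def ranks(values):
    """Map each value to its descending rank (1 = largest; no ties assumed)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman(xs, ys):
    """Spearman rank correlation, computed as Pearson correlation on ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

suffixes = list(maternal_freq)
freqs = [maternal_freq[s] for s in suffixes]
# Negate ranks so "acquired earlier" aligns with "more frequent".
order = [-acquisition_rank[s] for s in suffixes]
rho = spearman(freqs, order)  # positive rho: frequent suffixes acquired earlier
```

With these invented numbers, rho comes out positive, mirroring the paper's qualitative claim that input frequency tracks acquisition order.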

    Evolution of reflexive signals using a realistic vocal tract model

    No full text
    We introduce a model of the evolution of reflexive primate signals that incorporates a realistic vocal tract model for generating the signals. Signaler neural networks receive signal types as inputs and produce vocal tract muscle activations as outputs. These muscle activations are input to a model of the primate vocal tract, generating real sounds. Receiver neural networks receive spectrograms of these sounds as inputs and produce signal type classifications as outputs. Incorporating a realistic vocal tract has a substantial effect on the types of signals that can evolve. Compared to a model with abstract signals, the realistic model's signals are more similar and have more correlated elements. The realistic, embodied model also exhibits more variability in rate of adaptation, usually adapting more slowly. This may be explained by the more jagged fitness landscapes in the realistic model. The realistic signals also tend to be quiet. Environmental noise results in louder signals but makes the evolutionary process even slower and less robust. These results indicate that signal evolution with a more realistic genotype–phenotype mapping can differ substantially from evolution with abstract signals. Including realistic signal generation mechanisms may enable computational models to provide greater insights into natural signal evolution.
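The signaler-tract-receiver pipeline can be sketched as three chained mappings. Everything below is an illustrative toy: the layer sizes, tanh units, and the stand-in "tract" are assumptions, and the real model uses an articulatory synthesizer and spectrogram inputs rather than these small dense layers.

```python
import math
import random

random.seed(0)

N_SIGNALS = 3   # distinct signal types (invented for illustration)
N_MUSCLES = 4   # vocal tract muscle activations per signal
N_SOUND = 5     # dimensions of the toy "sound" representation

def layer(n_in, n_out):
    """Random weight matrix standing in for an evolved network."""
    return [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]

def forward(weights, x):
    """One dense layer with tanh units."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

signaler = layer(N_SIGNALS, N_MUSCLES)  # signal type -> muscle activations
tract = layer(N_MUSCLES, N_SOUND)       # stand-in for the vocal tract synthesizer
receiver = layer(N_SOUND, N_SIGNALS)    # "spectrogram" -> signal classification

def classify(signal_type):
    """Push a one-hot signal type through signaler, tract, and receiver."""
    x = [1.0 if i == signal_type else 0.0 for i in range(N_SIGNALS)]
    muscles = forward(signaler, x)
    sound = forward(tract, muscles)     # the genotype->phenotype mapping
    scores = forward(receiver, sound)
    return scores.index(max(scores))
```

In the paper's setup, fitness would reward agreement between the input signal type and the receiver's classification; the key point illustrated here is that the sound, not the muscle activations, is what the receiver sees.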

    Learning to Produce Syllabic Speech Sounds via Reward-Modulated Neural Plasticity.

    No full text
    At around 7 months of age, human infants begin to reliably produce well-formed syllables containing both consonants and vowels, a behavior called canonical babbling. Over subsequent months, the frequency of canonical babbling continues to increase. How the infant's nervous system supports the acquisition of this ability is unknown. Here we present a computational model that combines a spiking neural network, reinforcement-modulated spike-timing-dependent plasticity, and a human-like vocal tract to simulate the acquisition of canonical babbling. Like human infants, the model's frequency of canonical babbling gradually increases. The model is rewarded when it produces a sound that is more auditorily salient than sounds it has previously produced. This is consistent with data from human infants indicating that contingent adult responses shape infant behavior and with data from deaf and tracheostomized infants indicating that hearing, including hearing one's own vocalizations, is critical for canonical babbling development. Reward receipt increases the level of dopamine in the neural network. The neural network contains a reservoir with recurrent connections and two motor neuron groups, one agonist and one antagonist, which control the masseter and orbicularis oris muscles, promoting or inhibiting mouth closure. The model learns to increase the number of salient, syllabic sounds it produces by adjusting the base level of muscle activation and increasing their range of activity. Our results support the possibility that through dopamine-modulated spike-timing-dependent plasticity, the motor cortex learns to harness its natural oscillations in activity in order to produce syllabic sounds. It thus suggests that learning to produce rhythmic mouth movements for speech production may be supported by general cortical learning mechanisms. 
The model makes several testable predictions and has implications for our understanding not only of how syllabic vocalizations develop in infancy but also of how they may have evolved.
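The learning rule described above, dopamine-modulated spike-timing-dependent plasticity, can be sketched for a single synapse. The constants and the eligibility-trace formulation below are generic textbook assumptions, not the paper's parameterization, and the full model uses a 1000-neuron reservoir rather than one synapse.

```python
import math

# Invented illustrative constants (ms / dimensionless).
TAU_PLUS = 20.0     # potentiation window
TAU_MINUS = 20.0    # depression window
TAU_ELIG = 1000.0   # eligibility-trace decay
A_PLUS, A_MINUS = 0.01, 0.012

def stdp(dt):
    """STDP kernel for a spike-time difference dt = t_post - t_pre (ms)."""
    if dt >= 0:
        return A_PLUS * math.exp(-dt / TAU_PLUS)    # pre before post: strengthen
    return -A_MINUS * math.exp(dt / TAU_MINUS)      # post before pre: weaken

def run(spike_pairs, reward_times, t_end, dt_step=1.0):
    """Simulate one synapse: spike pairings feed an eligibility trace,
    and the weight changes only while dopamine (reward) is present."""
    w, elig = 0.5, 0.0
    t = 0.0
    while t < t_end:
        elig *= math.exp(-dt_step / TAU_ELIG)       # trace decays over time
        for t_pre, t_post in spike_pairs:
            if abs(max(t_pre, t_post) - t) < dt_step / 2:
                elig += stdp(t_post - t_pre)        # pairing marks the synapse
        if any(abs(tr - t) < dt_step / 2 for tr in reward_times):
            w += elig                               # dopamine gates the update
        t += dt_step
    return w
```

The essential property, matching the mechanism the abstract describes, is that a causal pre-then-post pairing only strengthens the synapse if a reward (here, standing in for a salient vocalization) arrives while the eligibility trace is still nonzero.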

    Self-production facilitates and adult input interferes in a neural network model of infant vowel imitation

    No full text
    It is well known that greater amounts of adult input facilitate a child's language development. Thus, one might expect that increased amounts of adult input would help an infant learn to accurately imitate the vowels of his/her native language. In addition, an infant's own production of sounds during cooing, babbling, etc., is known to be important to the development of speech abilities. We simulate infant vowel development using a neural network that contains a layer of auditory neurons, a layer of motor neurons, and bidirectional connections linking these perceptual and motor layers. During an initial babbling phase, the system produces random motor activations, hears the acoustic consequences of these motor activations, and adjusts the weights between its auditory and motor layers in a Hebbian fashion. In simulations, passive auditory input from an external caregiver is also included during the babbling phase and is used to update existing auditory-motor connections. In a testing phase, the model is given adult vowels as auditory input and asked to imitate them. Results indicate that self-productions do promote the development of the ability to imitate, but, somewhat counterintuitively, the more adult input this model receives during babbling, the less accurate its imitations are during test. Explanations and implications of this finding are discussed.
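The babbling phase described above can be sketched in a few lines. The dimensions, the learning rate, and the near-identity "vocal tract" mapping are all invented for illustration; the point is only the mechanism: random motor activity, heard consequences, and a Hebbian update on co-active motor-auditory pairs.

```python
import random

random.seed(1)
N_MOTOR, N_AUD = 4, 4   # invented layer sizes
ETA = 0.1               # invented learning rate

def hear(motor):
    """Stand-in articulatory-to-auditory mapping: roughly identity plus noise."""
    return [min(1.0, m + random.uniform(-0.05, 0.05)) for m in motor]

# Bidirectional motor <-> auditory weights (motor x auditory).
weights = [[0.0] * N_AUD for _ in range(N_MOTOR)]

for _ in range(500):                        # babbling trials
    motor = [random.random() for _ in range(N_MOTOR)]   # random activation
    aud = hear(motor)                       # acoustic consequence
    for i in range(N_MOTOR):                # Hebbian: co-activity strengthens
        for j in range(N_AUD):
            weights[i][j] += ETA * motor[i] * aud[j]

def imitate(aud_input):
    """Testing phase: drive motor units through the learned weights."""
    return [sum(weights[i][j] * aud_input[j] for j in range(N_AUD))
            for i in range(N_MOTOR)]
```

Because each motor unit is most correlated with its own auditory consequence, the learned mapping concentrates weight on matching motor-auditory pairs, which is what lets an auditory input at test evoke the motor pattern that would have produced it.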

    Vocal Development as a Guide to Modeling the Evolution of Language

    No full text
    Modeling of the evolution and development of language has principally utilized mature units of spoken language, phonemes and words, as both targets and inputs. This approach cannot address the earliest phases of development because young infants are unable to produce such language features. We argue that units of early vocal development (protophones and their primitive illocutionary/perlocutionary forces) should be targeted in evolutionary modeling because they suggest likely units of hominin vocalization/communication shortly after the split from the chimpanzee/bonobo lineage, and because early development of spontaneous vocal capability is a logically necessary step toward vocal language, a root capability without which other crucial steps toward vocal language capability are impossible. Modeling of language evolution/development must account for dynamic change in early communicative units of form/function across time. We argue for interactive contributions of sender/infants and receiver/caregivers in a feedback loop involving both development and evolution, and propose to begin computational modeling at the hominin break from the primate communicative background.

    Synaptic weights after learning.

    No full text
    <p>A: Example of the synapse strengths from each reservoir output neuron to each motor neuron after learning. The left plot shows the synapses for the first simulation of the 200 motor neuron <i>m</i> = 2 model reinforced for high-salience vocalizations. The right plot shows the synapses for the corresponding yoked control simulation. Yellow indicates greater connection strengths; blue indicates weaker synapses. The stronger synapses on the left half of the left plot as compared to the right half of that same plot reflect the greater connection of reservoir neurons to agonist motor neurons promoting mouth closure than to antagonist motor neurons promoting mouth opening. Note that this bias is not present in the connection weights of the yoked control simulation shown on the right. B: Across all simulations of the 200 motor neuron <i>m</i> = 2 model, the total strength of the connections from the reservoir to the agonist motor neurons divided by the total strength of the connections from the reservoir to the antagonist motor neurons. Bar height indicates the mean across the five simulations and the error bars represent 95% confidence intervals. C: Across all simulations of the 200 motor neuron <i>m</i> = 2 model, the standard deviation of the connection strengths from the reservoir to the motor neurons. Bar height indicates the mean standard deviation across the five simulations.</p>

    Overview of the model.

    No full text
    <p>A: Schematic depiction of the groups of neurons in the spiking neural network and how they are connected. There is a reservoir of 1000 recurrently connected neurons, with 200 of those being inhibitory (red) and the rest excitatory (blue and black). 200 of the reservoir’s excitatory neurons are designated as output neurons (black). These output neurons connect to two groups of motor neurons, agonist motor neurons (blue) and antagonist motor neurons (red). The connection weights within the reservoir are set at the start of the simulation to random values and do not change over the course of the simulation. The connection weights from the reservoir output neurons to the motor neurons are initially set to random values and are modified throughout the simulation by dopamine (DA)-modulated STDP. All reservoir and motor neurons receive random input current at each time step (not shown). B: Raster plot of spikes in the reservoir over a 1 s time period. C: Raster plot of spikes in the motor neuron groups over the same 1 s time period. The agonist and antagonist motor neuron spikes are summed at each time step then are smoothed using a 100 ms moving average. The smoothed antagonist activity is subtracted from the smoothed agonist activity, creating a net smoothed muscle activity that is sent to the orbicularis and masseter muscles. D: The smoothed agonist, antagonist, and net activity for the same 1 s as in the raster plots. E: Effects of the orbicularis oris and masseter on the vocal tract’s shape (reprinted with permission from [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0145096#pone.0145096.ref061" target="_blank">61</a>]). Orbicularis oris activity tends to round and close the lips and masseter activity tends to raise the jaw. 
F: Schematic illustration that the vocal tract is modeled as an air-filled tube bounded by walls made up of coupled mass-spring systems (reprinted with permission from [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0145096#pone.0145096.ref061" target="_blank">61</a>]). The orbicularis oris and masseter affect the equilibrium positions at the front parts of the tube. The air pressure over time and space in the tube is calculated, and the air pressure at the lip end of the tube forms the sound waveform. The vocal tract shape is modeled more realistically than depicted here and also contains a nasal cavity that is not depicted. G: The sound synthesized by the vocal tract model is input to an algorithm that estimates auditory salience. The plot shows, for the same 1 s as in B–D, the synthesized vocalization waveform (in cyan) and the salience of that waveform over time (in black). Apart from a peak in salience at the sound’s onset, the most salient portion of the sound is around the place where the sound’s one consonant can be heard. The overall salience of this particular sound is 10.77. If the salience of the sound is above the model’s current threshold, a reward is given, which causes an increase in dopamine concentration in the neural network.</p>
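The motor readout in panels C-D (sum spikes per time step, smooth with a 100 ms moving average, subtract antagonist from agonist) is concrete enough to sketch directly. The spike trains below are invented step functions, not model output:

```python
def moving_average(xs, window=100):
    """Causal moving average over up to `window` most recent 1 ms bins."""
    out = []
    for t in range(len(xs)):
        lo = max(0, t - window + 1)
        out.append(sum(xs[lo:t + 1]) / (t + 1 - lo))
    return out

# Toy spike counts per 1 ms bin for each motor pool (1 s total).
agonist = [1 if 200 <= t < 400 else 0 for t in range(1000)]
antagonist = [1 if 300 <= t < 350 else 0 for t in range(1000)]

# Net muscle drive: smoothed agonist minus smoothed antagonist. In the
# model this net signal is sent to the orbicularis oris and masseter.
net = [a - b for a, b in
       zip(moving_average(agonist), moving_average(antagonist))]
```

Positive net values close the mouth and negative values open it, so the rhythm of the net signal, rather than either pool alone, is what produces syllable-like open-close alternation.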