
    The Self-Organization of Speech Sounds

    The speech code is a vehicle of language: it defines a set of forms used by a community to carry information. Such a code is necessary to support the linguistic interactions that allow humans to communicate. How then may a speech code be formed prior to the existence of linguistic interactions? Moreover, the human speech code is discrete and compositional, shared by all the individuals of a community but different across communities, and phoneme inventories are characterized by statistical regularities. How can a speech code with these properties form? We try to approach these questions in the paper, using the "methodology of the artificial". We build a society of artificial agents, and detail a mechanism that shows the formation of a discrete speech code without pre-supposing the existence of linguistic capacities or of coordinated interactions. The mechanism is based on a low-level model of sensory-motor interactions. We show that the integration of certain very simple and non language-specific neural devices leads to the formation of a speech code that has properties similar to the human speech code. This result relies on the self-organizing properties of a generic coupling between perception and production within agents, and on the interactions between agents. The artificial system helps us to develop better intuitions on how speech might have appeared, by showing how self-organization might have helped natural selection to find speech.
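The kind of mechanism this abstract describes can be illustrated with a minimal sketch, not the authors' actual model: a population of agents, each holding a few preferred positions in a one-dimensional acoustic space, produce noisy vocalizations and attract their nearest perceptual unit toward each sound heard. All function names and parameter values below are illustrative assumptions.

```python
import random

def simulate(n_agents=10, n_units=8, steps=3000, eta=0.1, noise=0.02, seed=0):
    """Sketch of a population whose perception-production units
    self-organize toward a shared set of vocalization targets."""
    rng = random.Random(seed)
    # each agent: a list of preferred positions in a 1-D acoustic space [0, 1]
    agents = [[rng.random() for _ in range(n_units)] for _ in range(n_agents)]
    for _ in range(steps):
        speaker, listener = rng.sample(range(n_agents), 2)
        # speaker produces a vocalization from one of its units, plus motor noise
        target = rng.choice(agents[speaker])
        sound = min(1.0, max(0.0, target + rng.gauss(0.0, noise)))
        # in both agents, the unit closest to the heard sound is attracted to it
        # (a generic perception-production coupling, no explicit coordination)
        for agent in (agents[speaker], agents[listener]):
            i = min(range(n_units), key=lambda k: abs(agent[k] - sound))
            agent[i] += eta * (sound - agent[i])
    return agents

def dispersion(agents):
    """Mean distance between corresponding sorted units across agent pairs:
    a rough measure of how far the population is from a shared code."""
    maps = [sorted(a) for a in agents]
    total, pairs = 0.0, 0
    for i in range(len(maps)):
        for j in range(i + 1, len(maps)):
            total += sum(abs(x - y) for x, y in zip(maps[i], maps[j]))
            pairs += 1
    return total / (pairs * len(maps[0]))
```

With only these local winner-take-all updates, the agents' initially random unit positions tend to drift toward common attractors, so `dispersion` typically shrinks over the run: a toy version of a shared discrete code emerging without any agreed convention.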

    The self-organization of combinatoriality and phonotactics in vocalization systems

    This paper shows how a society of agents can self-organize a shared vocalization system that is discrete, combinatorial and has a form of primitive phonotactics, starting from holistic inarticulate vocalizations. The originality of the system is that: (1) it does not include any explicit pressure for communication; (2) agents do not possess capabilities of coordinated interactions, in particular they do not play language games; (3) agents possess no specific linguistic capacities; and (4) initially there exists no convention that agents can use. As a consequence, the system shows how a primitive speech code may bootstrap in the absence of a communication system between agents, i.e. before the appearance of language.

    From Analogue to Digital Vocalizations

    Sound is a medium used by humans to carry information. The existence of this kind of medium is a pre-requisite for language. It is organized into a code, called speech, which provides a repertoire of forms that is shared in each language community. This code is necessary to support the linguistic interactions that allow humans to communicate. How then may a speech code be formed prior to the existence of linguistic interactions? Moreover, the human speech code is characterized by several properties: speech is digital and compositional (vocalizations are made of units re-used systematically in other syllables); phoneme inventories show precise regularities as well as great diversity across human languages; all the speakers of a language community categorize sounds in the same manner, but each language has its own system of categorization, possibly very different from every other. How can a speech code with these properties form? These are the questions we will approach in this paper. We will study them using the method of the artificial: we will build a society of artificial agents, and study what mechanisms may provide answers. This will not prove directly what mechanisms were used by humans, but rather give ideas about what kinds of mechanism may have been used, allowing us to shape the search space of possible answers, in particular by showing what is sufficient and what is not necessary. The mechanism we present is based on a low-level model of sensory-motor interactions. We show that the integration of certain very simple and non language-specific neural devices allows a population of agents to build a speech code that has the properties mentioned above. The originality is that it pre-supposes neither a functional pressure for communication, nor the ability to have coordinated social interactions (agents do not play language or imitation games). It relies both on the self-organizing properties of a generic coupling between perception and production within agents, and on the interactions between agents.

    Functional organization of human sensorimotor cortex for speech articulation.

    Speaking is one of the most complex actions that we perform, but nearly all of us learn to do it effortlessly. Production of fluent speech requires the precise, coordinated movement of multiple articulators (for example, the lips, jaw, tongue and larynx) over rapid time scales. Here we used high-resolution, multi-electrode cortical recordings during the production of consonant-vowel syllables to determine the organization of speech sensorimotor cortex in humans. We found speech-articulator representations that are arranged somatotopically on ventral pre- and post-central gyri, and that partially overlap at individual electrodes. These representations were coordinated temporally as sequences during syllable production. Spatial patterns of cortical activity showed an emergent, population-level representation, which was organized by phonetic features. Over tens of milliseconds, the spatial patterns transitioned between distinct representations for different consonants and vowels. These results reveal the dynamic organization of speech sensorimotor cortex during the generation of multi-articulator movements that underlies our ability to speak.

    Open challenges in understanding development and evolution of speech forms: The roles of embodied self-organization, motivation and active exploration

    This article discusses open scientific challenges for understanding the development and evolution of speech forms, as a commentary on Moulin-Frier et al. (2015). Based on the analysis of mathematical models of the origins of speech forms, with a focus on their assumptions, we study the fundamental question of how speech can be formed out of non-speech, at both developmental and evolutionary scales. In particular, we emphasize the importance of embodied self-organization, as well as the role of mechanisms of motivation and active curiosity-driven exploration in speech formation. Finally, we discuss an evolutionary-developmental perspective on the origins of speech.

    Computational and Robotic Models of Early Language Development: A Review

    We review computational and robotics models of early language learning and development. We first explain why and how these models are used to understand better how children learn language. We argue that they provide concrete theories of language learning as a complex dynamic system, complementing traditional methods in psychology and linguistics. We review different modeling formalisms, grounded in techniques from machine learning and artificial intelligence such as Bayesian and neural network approaches. We then discuss their role in understanding several key mechanisms of language development: cross-situational statistical learning, embodiment, situated social interaction, intrinsically motivated learning, and cultural evolution. We conclude by discussing future challenges for research, including modeling of large-scale empirical data about language acquisition in real-world environments.
    Keywords: early language learning, computational and robotic models, machine learning, development, embodiment, social interaction, intrinsic motivation, self-organization, dynamical systems, complexity.
    Comment: to appear in the International Handbook on Language Development, ed. J. Horst and J. von Koss Torkildsen, Routledge.
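Cross-situational statistical learning, one of the mechanisms this review discusses, can be illustrated in its simplest form as word-referent co-occurrence counting: no single ambiguous scene identifies a word's referent, but the statistics across scenes do. The scenes and vocabulary below are invented for illustration.

```python
from collections import defaultdict

def cross_situational_learner(scenes):
    """Count word-referent co-occurrences across ambiguous scenes and
    map each word to its most frequently co-occurring referent."""
    counts = defaultdict(lambda: defaultdict(int))
    for words, referents in scenes:
        for w in words:
            for r in referents:
                counts[w][r] += 1
    return {w: max(rs, key=rs.get) for w, rs in counts.items()}

# Each scene pairs an ambiguous utterance with the set of visible objects;
# within any one scene, the word-referent mapping is underdetermined.
scenes = [
    ({"ball", "dog"}, {"BALL", "DOG"}),
    ({"ball", "cup"}, {"BALL", "CUP"}),
    ({"dog", "cup"}, {"DOG", "CUP"}),
]
lexicon = cross_situational_learner(scenes)
```

After these three scenes, each word has co-occurred twice with its true referent and only once with each distractor, so the learner resolves all three mappings.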

    Re-conceptualising the Language Game Paradigm in the Framework of Multi-Agent Reinforcement Learning

    In this paper, we formulate the challenge of re-conceptualising the language game experimental paradigm in the framework of multi-agent reinforcement learning (MARL). If successful, future language game experiments will benefit from the rapid and promising methodological advances in the MARL community, while future MARL experiments on learning emergent communication will benefit from the insights and results gained from language game experiments. We strongly believe that this cross-pollination has the potential to lead to major breakthroughs in the modelling of how human-like languages can emerge and evolve in multi-agent systems.
    Comment: this paper was accepted for presentation at the 2020 AAAI Spring Symposium 'Challenges and Opportunities for Multi-Agent Reinforcement Learning' after a double-blind reviewing process.

    Dissociable Mechanisms of Concurrent Speech Identification in Noise at Cortical and Subcortical Levels

    When two vowels with different fundamental frequencies (F0s) are presented concurrently, listeners often hear two voices producing different vowels on different pitches. Parsing of this simultaneous speech can also be affected by the signal-to-noise ratio (SNR) in the auditory scene. The extraction and interaction of F0 and SNR cues may occur at multiple levels of the auditory system. The major aims of this dissertation are to elucidate the neural mechanisms and time course of concurrent speech perception in clean and degraded listening conditions, and their behavioral correlates. In two complementary experiments, electrical brain activity (EEG) was recorded at cortical (EEG Study #1) and subcortical (FFR Study #2) levels while participants heard double-vowel stimuli whose F0s differed by zero or four semitones (STs), presented in either clean or noise-degraded (+5 dB SNR) conditions. Behaviorally, listeners were more accurate in identifying both vowels for larger F0 separations (i.e., 4 ST, with pitch cues), and this F0 benefit was more pronounced at more favorable SNRs. Time-frequency analysis of cortical EEG oscillations (i.e., brain rhythms) revealed a dynamic time course for concurrent speech processing that depended on both extrinsic (SNR) and intrinsic (pitch) acoustic factors. Early high-frequency activity reflected pre-perceptual encoding of acoustic features (~200 ms) and the quality (i.e., SNR) of the speech signal (~250-350 ms), whereas later-evolving low-frequency rhythms (~400-500 ms) reflected post-perceptual, cognitive operations that covaried with listening effort and task demands. Analysis of subcortical responses indicated that while frequency-following responses (FFRs) provided a high-fidelity representation of the double-vowel stimuli and the spectro-temporal nonlinear properties of the peripheral auditory system, FFR activity largely reflected the neural encoding of stimulus features (exogenous coding) rather than perceptual outcomes, although timbre (F1) cues predicted identification speed in noise conditions. Taken together, the results of this dissertation suggest that subcortical auditory processing reflects mostly exogenous (acoustic) feature encoding, in stark contrast to cortical activity, which reflects perceptual and cognitive aspects of concurrent speech perception. By studying multiple brain indices during an identical task, these studies provide a more comprehensive window into the hierarchy of brain mechanisms and the time course of concurrent speech processing.
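The zero- and four-semitone F0 separations mentioned above follow the standard equal-tempered relation between frequencies, where one octave spans 12 semitones. A short sketch (helper names are ours, not from the dissertation):

```python
import math

def semitones(f_low, f_high):
    """Pitch separation in semitones between two fundamental frequencies."""
    return 12 * math.log2(f_high / f_low)

def shift(f0, st):
    """Frequency obtained by shifting f0 upward by st semitones."""
    return f0 * 2 ** (st / 12)
```

For example, a 4 ST separation above a 100 Hz fundamental places the second vowel's F0 near 126 Hz, while a 0 ST separation leaves both voices on the same pitch, removing the F0 cue entirely.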

    Sounding the body: the role of the Valsalva mechanism in the emergence of the linguistic sign

    The main aim of this study, conducted within STEELS, a gestural theory of the origins of speech, is to set out a proposal as to the possible role of the Valsalva mechanism in the emergence of the linguistic sign. STEELS posits that in the earliest forms of speech developed by Homo, vocomimetic laryngeal resonances of nonlinguistic origin were integrated into LV (laryngeal + vowel) protosyllables referring back to oro-naso-laryngeal (ONL) actions such as breathing, sneezing and coughing. It further posits that these protosyllables were conceptually mapped to non-ONL bodily actions making use of the Valsalva manoeuvre, such as lifting, birthing, and defecating. This claim, which stems from a submorphemic analysis of certain Proto-Indo-European "body-part" roots projected back, within a gestural framework, to the emergence of speech, suggests that the vocomimetic protosyllables posited would have become (self-)referential through a neurocognitive process of recurrent, somatotopically-driven pattern-extraction.