    The evolution of auditory contrast

    This paper reconciles the standpoint that language users do not aim at improving their sound systems with the observation that languages nevertheless seem to improve their sound systems. Computer simulations of inventories of sibilants show that Optimality-Theoretic learners who optimize their perception grammars automatically introduce a so-called prototype effect, i.e., the phenomenon that the learner's preferred auditory realization of a certain phonological category is more peripheral than the average auditory realization of this category in her language environment. In production, however, this prototype effect is counteracted by an articulatory effect that limits the auditory form to something that is not too difficult to pronounce. If the prototype effect and the articulatory effect differ in size, the learner must end up with an auditorily different sound system from that of her language environment. The computer simulations show that, independently of the initial auditory sound system, a stable equilibrium is reached within a small number of generations. In this stable state, the dispersion of the sibilants of the language strikes an optimal balance between articulatory ease and auditory contrast. The important point is that this is derived within a model without any goal-oriented elements such as dispersion constraints.
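    The generational dynamic described here lends itself to a compact simulation. Below is a minimal sketch on a single auditory dimension with two categories; the prototype effect and the articulatory effect are stylized as opposing shifts rather than derived from an Optimality-Theoretic perception grammar as in the paper, and the function and parameter names (`simulate`, `proto`, `artic`, `rest`) are illustrative assumptions.

```python
import numpy as np

def simulate(initial_means, generations=30, proto=0.4, artic=0.15,
             rest=5.0, noise=0.3, tokens=200, seed=0):
    """One-dimensional iterated-learning sketch for two sibilant categories.

    Not the paper's model: the prototype effect is modelled directly as a
    constant peripheral shift away from the competing category, and
    articulatory ease as an attraction toward a neutral value `rest`.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(initial_means, dtype=float)
    for _ in range(generations):
        # The learner estimates each category from the noisy tokens it hears.
        learned = np.array([rng.normal(m, noise, tokens).mean() for m in means])
        # Prototype effect: the preferred realization is more peripheral,
        # i.e. shifted away from the midpoint between the two categories.
        preferred = learned + proto * np.sign(learned - learned.mean())
        # Articulatory effect: production drifts back toward what is easy
        # to pronounce, limiting how peripheral the output can get.
        means = preferred + artic * (rest - preferred)
    return means

# Different initial inventories converge on the same dispersion, mirroring
# the stable equilibrium reached in the paper's simulations:
for start in ([4.0, 6.0], [2.0, 9.0]):
    print(start, "->", simulate(start).round(2))
```

    The equilibrium spacing falls out of the balance between the two opposing shifts; no dispersion constraint or other goal-oriented element appears anywhere in the update rule.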

    Adverse conditions improve distinguishability of auditory, motor and perceptuo-motor theories of speech perception: an exploratory Bayesian modeling study

    Special Issue: Speech Recognition in Adverse Conditions. In this paper, we put forward a computational framework for the comparison of motor, auditory, and perceptuo-motor theories of speech communication. We first recall the basic arguments of these three sets of theories, as applied to either speech perception or speech production. Then we expose a unifying Bayesian model able to express each theory in a probabilistic way. Focusing on speech perception, we demonstrate that under two hypotheses providing perfect conditions for speech communication, concerning communication noise and inter-speaker variability, motor and auditory theories are indistinguishable. We then successively degrade each hypothesis to study the distinguishability of the different theories in "adverse" conditions. We first present simulations on a simplified implementation of the model with mono-dimensional sensory and motor variables, and second we consider a simulation of the human vocal tract providing more realistic auditory and articulatory variables. The simulation results allow us to emphasise the respective roles of motor and auditory knowledge in various adverse conditions of speech perception, and to suggest guidelines for future studies aiming at assessing the role of motor knowledge in speech perception.
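    As a rough illustration of how such a Bayesian comparison can be set up, here is a minimal sketch with a one-dimensional motor variable. The Gaussian production model, the flat motor prior, and all parameter values are my assumptions, not the paper's implementation; as the channel noise `sigma_com` shrinks, the two decoders converge on the same posteriors, while with substantial noise they diverge, echoing the paper's distinguishability argument.

```python
import numpy as np

# Toy 1-D setup (all distributions and values assumed): phoneme o is
# produced with a gesture m ~ N(mu[o], sigma_m); the listener receives
# s = m + channel noise of standard deviation sigma_com.
mu = np.array([-1.0, 1.0])           # motor targets for two phonemes
sigma_m, sigma_com = 0.3, 0.8        # speaker variability, channel noise

def gauss(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

def p_auditory(s):
    """Auditory theory: classify from learned sound distributions p(s|o),
    which implicitly include the channel noise."""
    lik = gauss(s, mu, np.sqrt(sigma_m**2 + sigma_com**2))
    return lik / lik.sum()

def p_motor(s, grid=np.linspace(-4, 4, 801)):
    """Motor theory: invert the channel, p(m|s) ~ p(s|m) under a flat
    prior on m, then classify the recovered gesture via p(o|m)."""
    p_s_given_m = gauss(s, grid, sigma_com)            # channel model
    p_o_given_m = gauss(grid[:, None], mu, sigma_m)    # gesture classifier
    p_o_given_m /= p_o_given_m.sum(axis=1, keepdims=True)
    post = (p_s_given_m[:, None] * p_o_given_m).sum(axis=0)
    return post / post.sum()

for s in (0.2, 1.5):
    print(f"s={s}: auditory={p_auditory(s).round(3)}  motor={p_motor(s).round(3)}")
```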

    Functional and spatial segregation within the inferior frontal and superior temporal cortices during listening, articulation imagery, and production of vowels

    Classical models of language localize speech perception in the left superior temporal cortex and production in the inferior frontal cortex. Nonetheless, neuropsychological, structural and functional studies have questioned such a subdivision, suggesting an interwoven organization of the speech function within these cortices. We tested whether sub-regions within frontal and temporal speech-related areas retain specific phonological representations during both perception and production. Using functional magnetic resonance imaging and multivoxel pattern analysis, we showed functional and spatial segregation across the left fronto-temporal cortex during listening, imagery and production of vowels. In accordance with classical models of language and evidence from functional studies, the inferior frontal and superior temporal cortices discriminated among perceived and produced vowels, respectively, while also engaging in the non-classical, alternative function, i.e. perception in the inferior frontal and production in the superior temporal cortex. Crucially, though, contiguous and non-overlapping sub-regions within these hubs performed either the classical or the non-classical function, the latter also representing non-linguistic sounds (i.e., pure tones). Extending previous results and in line with integration theories, our findings not only demonstrate that sensitivity to speech listening exists in production-related regions and vice versa, but also suggest that this interwoven organisation is built upon low-level perception.
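    For readers unfamiliar with multivoxel pattern analysis, the sketch below shows its generic logic on synthetic data: a cross-validated linear classifier tests whether a patch of voxels discriminates among vowels. It assumes scikit-learn and invented response patterns; it is not the study's pipeline.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

# Synthetic "single-trial" activation patterns: a weak, distributed
# per-vowel signal (assumed) plus trial-by-trial noise.
rng = np.random.default_rng(0)
n_trials, n_voxels, n_vowels = 120, 50, 3
labels = np.repeat(np.arange(n_vowels), n_trials // n_vowels)
signal = rng.normal(0.0, 0.5, (n_vowels, n_voxels))   # pattern per vowel
patterns = signal[labels] + rng.normal(0.0, 1.0, (n_trials, n_voxels))

# Above-chance cross-validated accuracy indicates that the region carries
# vowel information; the study applies this logic separately to listening,
# imagery, and production responses.
acc = cross_val_score(LinearSVC(dual=False), patterns, labels, cv=5).mean()
print(f"decoding accuracy: {acc:.2f} (chance = {1/n_vowels:.2f})")
```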

    Perceptuo-motor biases in the perceptual organization of the height feature in French vowels

    To appear in Acta Acustica. This paper reports on the organization of the perceived vowel space in French. In a previous paper [28], we investigated the implementation of vowel height contrasts along the F1 dimension in French speakers. Here, we present results from perceptual identification tests performed by twelve participants who had taken part in the production experiment reported in the earlier paper. For each subject, the stimuli presented in the identification test were synthesized in two different vowel spaces, corresponding to two different vocal tract lengths. The results showed, first, that perceived French vowels belonging to similar height degrees were aligned on stable F1 values, independent of place of articulation and roundedness, as was the case for produced vowels. Second, the produced F1 distances between height degrees correlated with the perceived F1 distances. This suggests a link between perceptual and motor phonemic prototypes in the human brain. The results are discussed within the framework of the Perception for Action Control (PACT) theory, in which speech units are considered to be gestures shaped by perceptual processes.
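    The reported production-perception link reduces to a per-subject correlation. A hypothetical sketch of that analysis, with invented F1 distances for twelve subjects (not the study's data), would be:

```python
import numpy as np

# Hypothetical F1 distances (Hz) between adjacent height degrees, one
# value per subject, measured once in production and once in perception.
# These numbers are invented for illustration only.
produced_dF1 = np.array([210, 185, 240, 205, 195, 225, 180, 230, 200, 215, 190, 220])
perceived_dF1 = np.array([205, 180, 250, 210, 190, 215, 175, 240, 195, 220, 185, 225])

r = np.corrcoef(produced_dF1, perceived_dF1)[0, 1]
print(f"Pearson r across {len(produced_dF1)} subjects: {r:.2f}")
```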

    Multivariate pattern analysis of input and output representations of speech

    Repeating a word or nonword requires a speaker to map auditory representations of incoming sounds onto learned speech items, maintain those items in short-term memory, interface that representation with the motor output system, and articulate the target sounds. This dissertation seeks to clarify the nature and neuroanatomical localization of speech sound representations in perception and production through multivariate analysis of neuroimaging data. The major portion of this dissertation describes two experiments using functional magnetic resonance imaging (fMRI) to measure responses to the perception and overt production of syllables, and multivariate pattern analysis to localize brain areas containing the associated phonological/phonetic information. The first experiment used a delayed repetition task to permit response estimation for auditory syllable presentation (input) and overt production (output) in individual trials. In input responses, clusters sensitive to vowel identity were found in left inferior frontal sulcus (IFs), while clusters responsive to syllable identity were found in left ventral premotor cortex and left mid superior temporal sulcus (STs). Output-linked responses revealed clusters of vowel information bilaterally in mid/posterior STs. The second experiment was designed to dissociate the phonological content of the auditory stimulus and the vocal target. Subjects were visually presented with two (non)word syllables simultaneously, then aurally presented with one of the syllables. A visual cue informed subjects either to repeat the heard syllable (repeat trials) or to produce the unheard, visually presented syllable (change trials). Results suggest both IFs and STs represent heard syllables; on change trials, representations in frontal areas, but not STs, are updated to reflect the vocal target. Vowel identity covaries with formant frequencies, inviting the question of whether lower-level, auditory representations can support vowel classification in fMRI. The final portion of this work describes a simulation study, in which artificial fMRI datasets were constructed to mimic the overall design of Experiment 1 with voxels assumed to contain either discrete (categorical) or analog (frequency-based) vowel representations. The accuracy of classification models was characterized by type of representation and by the density and strength of responsive voxels. It was shown that classification is more sensitive to sparse, discrete representations than to dense analog representations.
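    The closing simulation contrasts two coding schemes. A toy version of that logic, with all sizes, strengths, and the coding rules assumed (not the dissertation's actual dataset construction), might look like this:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
vowel_f1 = np.array([300.0, 500.0, 700.0])       # nominal F1 (Hz) per vowel
n_trials, n_voxels, noise_sd = 150, 60, 1.0
labels = rng.integers(0, 3, n_trials)

def patterns(kind, strength=0.6):
    """Artificial voxel responses carrying either a discrete (categorical)
    or an analog (frequency-based) vowel code."""
    if kind == "discrete":
        pref = rng.integers(0, 3, n_voxels)       # each voxel prefers one vowel
        clean = strength * (labels[:, None] == pref[None, :])
    else:
        gain = rng.normal(0.0, 1.0, n_voxels)     # response scales with F1
        f1 = (vowel_f1[labels] - 500.0) / 200.0   # standardized frequency
        clean = strength * np.outer(f1, gain)
    return clean + rng.normal(0.0, noise_sd, (n_trials, n_voxels))

# How the two codes compare depends on voxel density and signal strength,
# which is exactly what the simulation study characterizes.
for kind in ("discrete", "analog"):
    acc = cross_val_score(LinearSVC(dual=False), patterns(kind), labels, cv=5).mean()
    print(f"{kind:8s}: accuracy = {acc:.2f} (chance = 0.33)")
```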

    Sensitivity of human auditory cortex to rapid frequency modulation revealed by multivariate representational similarity analysis.

    Functional Magnetic Resonance Imaging (fMRI) was used to investigate the extent, magnitude, and pattern of brain activity in response to rapid frequency-modulated sounds. We examined this by manipulating the direction (rise vs. fall) and the rate (fast vs. slow) of the apparent pitch of iterated rippled noise (IRN) bursts. Acoustic parameters were selected to capture features used in phoneme contrasts; however, the stimuli themselves were not perceived as speech per se. Participants were scanned as they passively listened to sounds in an event-related paradigm. Univariate analyses revealed a greater level and extent of activation in bilateral auditory cortex in response to frequency-modulated sweeps compared to steady-state sounds. This effect was stronger in the left hemisphere. However, no regions showed selectivity for either rate or direction of frequency modulation. In contrast, multivoxel pattern analysis (MVPA) revealed feature-specific encoding for direction of modulation in auditory cortex bilaterally. Moreover, this effect was strongest when analyses were restricted to anatomical regions lying outside Heschl's gyrus. We found no support for feature-specific encoding of frequency modulation rate. The differential findings for modulation rate and direction of modulation are discussed with respect to their relevance to phonetic discrimination.
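    The multivariate method used here is representational similarity analysis. A generic sketch on synthetic patterns (not the study's data) shows the logic: build a neural dissimilarity matrix over the four direction-by-rate conditions and ask which model matrix explains it better.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
conds = [("rise", "fast"), ("rise", "slow"), ("fall", "fast"), ("fall", "slow")]

# Assumed encoding for illustration: voxel patterns differ by direction of
# frequency modulation but not by rate, as the study reports.
base = {"rise": rng.normal(0, 1, 100), "fall": rng.normal(0, 1, 100)}
patterns = np.array([base[d] + rng.normal(0, 0.3, 100) for d, r in conds])

neural_rdm = pdist(patterns, metric="correlation")   # 6 pairwise distances

# Model RDMs: a pair of conditions is dissimilar (1) iff it differs on
# the feature in question.
direction_rdm = pdist(np.array([[0], [0], [1], [1]]), metric="cityblock")
rate_rdm = pdist(np.array([[0], [1], [0], [1]]), metric="cityblock")

for name, model in (("direction", direction_rdm), ("rate", rate_rdm)):
    rho, _ = spearmanr(neural_rdm, model)
    print(f"{name}: Spearman rho = {rho:.2f}")
```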

    The Self-Organization of Speech Sounds

    The speech code is a vehicle of language: it defines a set of forms used by a community to carry information. Such a code is necessary to support the linguistic interactions that allow humans to communicate. How then may a speech code be formed prior to the existence of linguistic interactions? Moreover, the human speech code is discrete and compositional, shared by all the individuals of a community but different across communities, and phoneme inventories are characterized by statistical regularities. How can a speech code with these properties form? We approach these questions in this paper using the "methodology of the artificial". We build a society of artificial agents, and detail a mechanism that shows the formation of a discrete speech code without pre-supposing the existence of linguistic capacities or of coordinated interactions. The mechanism is based on a low-level model of sensory-motor interactions. We show that the integration of certain very simple and non language-specific neural devices leads to the formation of a speech code that has properties similar to the human speech code. This result relies on the self-organizing properties of a generic coupling between perception and production within agents, and on the interactions between agents. The artificial system helps us to develop better intuitions on how speech might have appeared, by showing how self-organization might have helped natural selection to find speech.
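    To make the self-organizing dynamic concrete, here is a heavily simplified sketch in a one-dimensional sound space. It replaces the paper's coupled neural maps with a nearest-target imitation rule, so every detail below is an assumption rather than the paper's mechanism; the point is only that shared, discrete targets can emerge from uncoordinated local updates.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_targets, steps, lr, noise = 10, 20, 20000, 0.05, 0.01

# Each agent starts with random vocalization targets in a 1-D sound space.
agents = rng.random((n_agents, n_targets))

for _ in range(steps):
    speaker = rng.integers(n_agents)
    # The speaker babbles a noisy token from one of its targets.
    sound = agents[speaker, rng.integers(n_targets)] + rng.normal(0, noise)
    for hearer in range(n_agents):
        if hearer == speaker:
            continue
        # Perception-production coupling: each hearer's nearest unit is
        # nudged toward the perceived sound (competitive learning).
        j = np.abs(agents[hearer] - sound).argmin()
        agents[hearer, j] += lr * (sound - agents[hearer, j])

# Targets typically collapse onto a few shared values: a discrete code
# emerges without coordination or any linguistic goal.
print(np.round(np.sort(agents[0]), 2))
print(np.round(np.sort(agents[1]), 2))
```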

    Connectionist perspectives on language learning, representation and processing.

    The field of formal linguistics was founded on the premise that language is mentally represented as a deterministic symbolic grammar. While this approach has captured many important characteristics of the world's languages, it has also led to a tendency to focus theoretical questions on the correct formalization of grammatical rules while de-emphasizing the role of learning and statistics in language development and processing. In this review we present a different approach to language research that has emerged from the parallel distributed processing or 'connectionist' enterprise. In the connectionist framework, mental operations are studied by simulating learning and processing within networks of artificial neurons. With that in mind, we discuss recent progress in connectionist models of auditory word recognition, reading, morphology, and syntactic processing. We argue that connectionist models can capture many important characteristics of how language is learned, represented, and processed, while also providing new insights about the source of these behavioral patterns. Just as importantly, the networks naturally capture irregular (non-rule-like) patterns that are common within languages, something that has been difficult to reconcile with rule-based accounts of language without positing separate mechanisms for rules and exceptions.
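    The closing point about rules and exceptions living in one mechanism is easy to demonstrate in miniature. The sketch below is an assumed toy setup, not a model from the review: one small network learns to map "stems" to "past-tense" forms where most items follow a regular copy-plus-suffix pattern and a few are arbitrary exceptions, with no separate rule and exception routes. With these settings the network typically reproduces both item types.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_verbs, dim, n_irregular = 30, 12, 5

# Stems are random binary "phonological" vectors; the regular mapping
# copies the stem and sets a suffix bit, while exceptions are arbitrary.
stems = rng.integers(0, 2, (n_verbs, dim)).astype(float)
targets = np.hstack([stems, np.ones((n_verbs, 1))])
irregular = rng.choice(n_verbs, n_irregular, replace=False)
targets[irregular, :dim] = rng.integers(0, 2, (n_irregular, dim))
targets[irregular, dim] = 0.0

# A single network trained on regulars and exceptions together.
net = MLPRegressor(hidden_layer_sizes=(40,), max_iter=5000,
                   tol=1e-6, random_state=0)
net.fit(stems, targets)
pred = (net.predict(stems) > 0.5).astype(float)
ok = (pred == targets).all(axis=1)
regular = np.setdiff1d(np.arange(n_verbs), irregular)
print(f"regulars learned:   {ok[regular].mean():.2f}")
print(f"irregulars learned: {ok[irregular].mean():.2f}")
```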