5 research outputs found

    Formant Space Reconstruction From Brain Activity in Frontal and Temporal Regions Coding for Heard Vowels

    Classical studies have isolated a distributed network of temporal and frontal areas engaged in the neural representation of speech perception and production. With modern literature arguing against unique roles for these cortical regions, different theories have favored either neural code-sharing or cortical space-sharing, thus trying to explain the intertwined spatial and functional organization of motor and acoustic components across the fronto-temporal cortical network. In this context, the focus of attention has recently shifted toward specific model fitting, aimed at motor and/or acoustic space reconstruction in brain activity within the language network. Here, we tested a model based on acoustic properties (formants), and one based on motor properties (articulation parameters), where model-free decoding of evoked fMRI activity during perception, imagery, and production of vowels had been successful. Results revealed that phonological information organizes around formant structure during the perception of vowels; interestingly, such a model was reconstructed in a broad temporal region, outside of the primary auditory cortex, but also in the pars triangularis of the left inferior frontal gyrus. Conversely, articulatory features were not associated with brain activity in these regions. Overall, our results call for a degree of interdependence based on acoustic information, between the frontal and temporal ends of the language network.

    Estimation of vocal tract shape trajectory using lossy Kelly-Lochbaum model

    There are theories that during speech perception, the understanding of speech is boosted by knowledge of articulatory gestures based on former speech production experience. By transforming an acoustic speech signal into a hypothesis about the articulatory gestures of the speaker, it is possible to obtain a more accurate, speaker-independent description of speech. This thesis introduces a method of estimating vocal tract trajectories directly from speech signals. Using the theory of speech production, a lossy Kelly-Lochbaum vocal tract model equipped with lip radiation impedance and variable lip rounding length is created. A lookup table consisting of correspondences between spectral qualities of instantaneous speech signals and articulatory shapes is created using this model. The lookup table can be used to perform acoustic-to-articulatory mapping. The obtained model is used to estimate vocal tract shape trajectories in continuous speech. Smooth, minimum-energy trajectories are found with a simple optimization algorithm that minimizes the energy spent on articulation.
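
    The trajectory search described in this abstract can be illustrated with a small sketch. The abstract only mentions "a simple optimization algorithm"; the dynamic-programming search below is a swapped-in technique and not necessarily the one used in the thesis. It assumes that the lookup table has already returned a few candidate vocal tract shapes per speech frame, and that "energy" can be approximated by the squared change in shape between consecutive frames. The names `movement_energy` and `min_energy_trajectory` are hypothetical.

```python
from typing import List, Sequence

Shape = Sequence[float]  # one vocal tract shape, e.g. a list of section areas


def movement_energy(a: Shape, b: Shape) -> float:
    """Assumed energy proxy: squared Euclidean distance between two shapes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def min_energy_trajectory(candidates: List[List[Shape]]) -> List[Shape]:
    """Pick one candidate shape per frame so that total movement energy is minimal.

    candidates[t] holds the shapes a lookup table (assumed to exist) returned
    for frame t. Dynamic programming over frames, Viterbi style.
    """
    if not candidates or any(len(frame) == 0 for frame in candidates):
        raise ValueError("every frame needs at least one candidate shape")

    # cost[j]: minimal cumulative energy of any path ending at candidate j
    cost = [0.0] * len(candidates[0])
    back: List[List[int]] = []  # back[t-1][j]: best predecessor index in frame t-1

    for t in range(1, len(candidates)):
        prev, cur = candidates[t - 1], candidates[t]
        new_cost, pointers = [], []
        for shape in cur:
            best_i = min(
                range(len(prev)),
                key=lambda i: cost[i] + movement_energy(prev[i], shape),
            )
            new_cost.append(cost[best_i] + movement_energy(prev[best_i], shape))
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[t][j] for t, j in enumerate(path)]


# Toy example with one-dimensional "shapes" and two candidates per frame.
frames = [[(0.0,), (10.0,)], [(1.0,), (9.0,)], [(2.0,), (11.0,)]]
trajectory = min_energy_trajectory(frames)
```

    In the toy example the search keeps the candidates near 0, 1 and 2, because that path changes shape less from frame to frame than the path through 10, 9 and 11.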

    Prototype modeling of vowel perception and production in a quantity language

    Vowel prototypes refer to the psychological memory representations of the best exemplars of a vowel category. This thesis examines the role of prototypes in the perception and production of Finnish short and long vowels. A comparison with German as a linguistically different language with a similar vowel system is also made. The thesis reports on a series of four experiments in which prototypes are examined by means of behavioral psychoacoustic measurements and compared with vowel productions in quiet and in noise. In the perception experiments, Finnish and German listeners were asked to identify and evaluate the goodness of synthesized vowels representing either the entire vowel space or selected subareas of the space. In the production experiments, only Finnish speakers were recruited, but earlier reported production data were used for the comparison of Finnish and German. The new concept of the weighted prototype (Pω) is introduced in Study I, and its usability in contrast to absolute prototypes (Pa) and category centroids (Pc) is examined in Study IV. Generally, the results support the finding that vowel categories are not homogenous in quality, but have an internal structure, and that there are significant quality differences between category members in terms of goodness ratings. The results of Studies I, II and III support the identity group interpretation of the Finnish quantity opposition by showing that the differences in the perceived quality and in the produced short and long vowels are not demonstrably dependent on the physical duration of the stimuli, although the production experiments in Studies I and III indicated that the short peripheral vowels, especially /u/ in Study III, are more centralized in the vowel space than the long vowels. 
    On the basis of the results of Study II, the spectral and durational local effective vowel indicators of the initial auditory theory of vowel perception appear to be independent of each other, thus suggesting that the auditory vowel space (AVS) is orthogonal in terms of the measures used in the experiment. Furthermore, the reaction time results of Study II indicate that stimulus typicality in terms of vowel quantity affects the categorization process of quality but not its end result. The noise masking of production in Study III indicated that both of the noise types applied in the experiment, pink noise and babble noise, resulted in a prolongation of all vowel durations, as reported earlier for the Lombard effect. However, the noise masking did not affect the Euclidean distances between the short and long vowels, but caused a minor systematic drift in the F1–F2 space in both vowel types. The minor differences suggest that prototypes act as articulatory targets in a fire-and-forget manner without the auditory feedback affecting the immediate articulation. The results concerning the different prototype measures indicated that the Pa and Pω differ significantly from the Pc, with the Pa being most peripheral. This gives some support to the adaptive dispersion effect in perception. The individual variations of the measures were normally distributed, with some exceptions for Pa in Finnish, and were, in terms of the coefficient of variation (CV), of the order of the difference limen (DL) of frequency. These results suggest that, for normally distributed prototypes, and especially for Pω, which showed the least variation, two thirds of the subjects detected the best category representatives from a subset of stimuli that lie within the limits of DL of frequency from each other in the F1–F2 space. This finding can be regarded as strong evidence for prototype theories; in other words, the best category representatives play a role by acting as templates in vowel perception.
    The listeners were able to recognize quality differences between and within vowel categories, but the majority of them ranked the best category exemplars from a subset of stimuli that were hardly distinguishable from each other. There were some minor differences in the vowel systems of Finnish and German as indicated by the different prototype measures: the absolute prototypes showed the largest differences between the languages in /e/, /ø/ and /u/. This is in line with the earlier investigations on produced vowels in Finnish and German. Generally, the vowel systems of these two linguistically unrelated languages were strikingly similar, especially in the light of the Pω measure. As presented in this thesis, the prototype approach provides a feasible tool for research, and the results lend support to the idea that speech comprehension on the auditory, phonetic, and even phonological processing levels is based on the memory representations of typical speech sounds of one’s native tongue, formed during the early language acquisition phase, and that these representations may be similar for the speakers and listeners of two different languages with comparable vowel systems.
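
    The three prototype measures compared in this abstract can be sketched as follows. The abstract names them but does not give formulas, so the definitions below are assumptions made for illustration: Pa is taken as the stimulus with the highest goodness rating, Pc as the unweighted mean of the stimuli identified as the category, and Pω as the goodness-weighted mean. The function names and the sample data are hypothetical and are not results from the thesis.

```python
from typing import List, Tuple

# One synthesized stimulus identified as a given vowel by one listener:
# (F1 in Hz, F2 in Hz, goodness rating). The rating scale is an assumption.
Stimulus = Tuple[float, float, float]
Point = Tuple[float, float]


def absolute_prototype(stimuli: List[Stimulus]) -> Point:
    """Assumed Pa: the single stimulus with the highest goodness rating."""
    f1, f2, _ = max(stimuli, key=lambda s: s[2])
    return (f1, f2)


def category_centroid(stimuli: List[Stimulus]) -> Point:
    """Assumed Pc: unweighted mean of all stimuli in the category."""
    n = len(stimuli)
    return (sum(s[0] for s in stimuli) / n, sum(s[1] for s in stimuli) / n)


def weighted_prototype(stimuli: List[Stimulus]) -> Point:
    """Assumed Pω: mean of the stimuli weighted by their goodness ratings."""
    total = sum(s[2] for s in stimuli)
    if total == 0:
        raise ValueError("goodness ratings must not sum to zero")
    return (
        sum(s[0] * s[2] for s in stimuli) / total,
        sum(s[1] * s[2] for s in stimuli) / total,
    )


# Made-up ratings for two stimuli in one category.
ratings = [(300.0, 2200.0, 1.0), (320.0, 2300.0, 3.0)]
pa = absolute_prototype(ratings)
pc = category_centroid(ratings)
pw = weighted_prototype(ratings)
```

    Under these assumed definitions, Pω falls between Pc and Pa, pulled toward the better-rated stimuli.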