1,652 research outputs found
Inter-speaker speech variability assessment using statistical deformable models from 3.0 Tesla magnetic resonance images
The morphological and dynamic characterisation of the vocal tract during speech production has been gaining attention, motivated by recent improvements in magnetic resonance (MR) imaging, namely the use of higher magnetic fields such as 3.0 Tesla. In this work, the automatic study of the vocal tract from 3.0 Tesla MR images was assessed through the application of statistical deformable models. The primary goal was the analysis of the shape of the vocal tract during the articulation of European Portuguese sounds, followed by the evaluation of the results of automatic segmentation, i.e. the identification of the vocal tract in new MR images. With regard to speech production, this is the first attempt to automatically characterise and reconstruct the vocal tract shape from 3.0 Tesla MR images by using deformable models, in particular active shape and active appearance models. The results show that these deformable models are adequate and advantageous for the automatic analysis of 3.0 Tesla MR images, both to extract the vocal tract shape and to assess the articulatory movements involved. Such results are needed, for example, for a better understanding of speech production, mainly in patients suffering from articulatory disorders, and to build enhanced speech synthesizer models.
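As a rough illustration of the statistical shape modelling that underlies active shape models, the sketch below computes a mean shape and the first mode of variation from a set of landmark vectors. It is not the authors' implementation: the function names, the flat [x1, y1, x2, y2, ...] landmark layout, and the use of power iteration in place of a full eigen-decomposition are all assumptions made for this example, and the shapes are assumed to be already aligned.

```python
import math


def mean_shape(shapes):
    """Average each landmark coordinate over all training shapes."""
    n = len(shapes)
    return [sum(s[i] for s in shapes) / n for i in range(len(shapes[0]))]


def first_mode(shapes, n_iter=200):
    """Return (mean, mode, eigenvalue) for the dominant mode of variation.

    Each shape is a flat list [x1, y1, x2, y2, ...] of pre-aligned landmarks
    (hypothetical layout). The dominant eigenvector of the sample covariance
    is estimated by power iteration.
    """
    n = len(shapes)
    mean = mean_shape(shapes)
    centered = [[v - m for v, m in zip(s, mean)] for s in shapes]
    dim = len(mean)

    # Start from the largest deviation so the start vector is non-zero.
    start = max(centered, key=lambda row: sum(v * v for v in row))
    norm = math.sqrt(sum(v * v for v in start))
    if norm == 0.0:
        # All shapes are identical: there is no variation to model.
        return mean, [0.0] * dim, 0.0
    vec = [v / norm for v in start]

    for _ in range(n_iter):
        # Covariance times vec, computed without forming the matrix.
        proj = [sum(a * b for a, b in zip(row, vec)) for row in centered]
        new = [
            sum(p * row[i] for p, row in zip(proj, centered)) / (n - 1)
            for i in range(dim)
        ]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            break
        vec = [v / norm for v in new]

    # Variance of the training shapes along the estimated mode.
    proj = [sum(a * b for a, b in zip(row, vec)) for row in centered]
    eigenvalue = sum(p * p for p in proj) / (n - 1)
    return mean, vec, eigenvalue


def synthesise(mean, mode, b):
    """Generate a shape as mean + b * mode, the usual linear shape model."""
    return [m + b * v for m, v in zip(mean, mode)]
```

A new shape can then be generated with synthesise(mean, mode, b), where b is typically limited to a few multiples of the square root of the eigenvalue so that the result stays close to the training shapes.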
Analysis of fuzzy clustering and a generic fuzzy rule-based image segmentation technique
Many fuzzy clustering based techniques, when applied to image segmentation, do not incorporate the spatial relationships of the pixels, while fuzzy rule-based image segmentation techniques are generally application dependent. In addition, for most of these techniques the structure of the membership functions is predefined and the parameters have to be derived either automatically or manually. This paper addresses some of these issues by introducing a new generic fuzzy rule-based image segmentation (GFRIS) technique, which is application independent and can also incorporate the spatial relationships of the pixels. A qualitative comparison is presented between the segmentation results obtained using this method and those of the popular fuzzy c-means (FCM) and possibilistic c-means (PCM) algorithms, using an empirical discrepancy method. The results demonstrate that this approach exhibits significant improvements over these popular fuzzy clustering algorithms for a wide range of differing image types.
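For orientation, the fuzzy c-means (FCM) baseline mentioned in the abstract can be sketched as follows. This is a minimal version of standard FCM applied to a flat list of pixel intensities, not the GFRIS technique and not the authors' code; the function name, the evenly spaced initial centres, and the fixed iteration count are assumptions made for this example. Note that it uses intensity only and ignores where pixels are located, which is the limitation the abstract points out.

```python
def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=50):
    """Cluster scalar intensities with standard fuzzy c-means.

    Returns (centers, memberships), where memberships[i][k] is the degree
    to which values[i] belongs to cluster k; each row sums to 1.
    The fuzzifier m must be greater than 1.
    """
    lo, hi = min(values), max(values)
    # Hypothetical initialisation: centres spread evenly over the range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    exponent = 2.0 / (m - 1.0)

    def compute_memberships():
        rows = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if any(d == 0.0 for d in dists):
                # The value sits exactly on one or more centres:
                # share full membership among those centres.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(hits)
                rows.append([h / total for h in hits])
            else:
                # u_ik is proportional to d_ik ** (-2 / (m - 1)).
                inv = [d ** -exponent for d in dists]
                total = sum(inv)
                rows.append([v / total for v in inv])
        return rows

    for _ in range(n_iter):
        memberships = compute_memberships()
        for k in range(n_clusters):
            # Each centre is the mean of the values weighted by u_ik ** m.
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            if total > 0.0:
                centers[k] = sum(w * x for w, x in zip(weights, values)) / total

    # Recompute so the returned memberships match the final centres.
    return centers, compute_memberships()
```

A hard segmentation can be obtained by assigning each pixel to the cluster with its highest membership value.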
Magnetic resonance imaging of the vocal tract: techniques and applications
Magnetic resonance (MR) imaging has been used to analyse and evaluate the vocal tract shape through different techniques, with promising results in several fields. Our purpose is to demonstrate the relevance of MR and image processing for the study of the vocal tract. The extraction of contours of the air cavities allowed the set-up of a number of 3D reconstruction image stacks by combining orthogonally oriented sets of slices for each articulatory gesture, as a new approach to solving the expected spatial undersampling of the imaging process. As a result, these models give improved information for the visualization of morphological and anatomical aspects and are useful for partial measurements of the vocal tract shape in different situations. Potential uses can be found in medical and therapeutic applications as well as in acoustic articulatory speech modelling.
Magnetic resonance imaging of the brain and vocal tract: Applications to the study of speech production and language learning
The human vocal system is highly plastic, allowing for the flexible expression of language, mood and intentions. However, this plasticity is not stable throughout the life span, and it is well documented that adult learners encounter greater difficulty than children in acquiring the sounds of foreign languages. Researchers have used magnetic resonance imaging (MRI) to interrogate the neural substrates of vocal imitation and learning, and the correlates of individual differences in phonetic “talent”. In parallel, a growing body of work using MR technology to directly image the vocal tract in real time during speech has offered primarily descriptive accounts of phonetic variation within and across languages. In this paper, we review the contribution of neural MRI to our understanding of vocal learning, and give an overview of vocal tract imaging and its potential to inform the field. We propose methods by which our understanding of speech production and learning could be advanced through the combined measurement of articulation and brain activity using MRI. Specifically, we describe a novel paradigm, developed in our laboratory, that uses both MRI techniques to map directly, for the first time, between neural, articulatory and acoustic data in the investigation of vocalisation. This non-invasive, multimodal imaging method could be used to track central and peripheral correlates of spoken language learning and of speech recovery in clinical settings, as well as to provide insights into potential sites for targeted neural interventions.
A generic fuzzy rule based technique for image segmentation
Many fuzzy clustering based techniques do not incorporate the spatial relationships of the pixels, while fuzzy rule-based image segmentation techniques tend to be highly application dependent. In most techniques, the structure of the membership functions is predefined and their parameters are determined either automatically or manually. This paper addresses these problems by introducing a general fuzzy rule-based image segmentation technique, which is application independent and can also incorporate the spatial relationships of the pixels. It also proposes a way of defining the structure of the membership functions automatically. A qualitative comparison is made between the segmentation results of this method and those of the popular fuzzy c-means (FCM) algorithm, applied to two types of image: a light intensity (LI) image and an X-ray of the human vocal tract. The results show that this method exhibits significant improvements over FCM for both types of image.
Analyzing speech in both time and space: generalized additive mixed models can uncover systematic patterns of variation in vocal tract shape in real-time MRI
We present a method of using generalized additive mixed models (GAMMs) to analyze midsagittal vocal tract data obtained from real-time magnetic resonance imaging (rt-MRI) video of speech production. Applied to rt-MRI data, GAMMs allow for observation of factor effects on vocal tract shape throughout two key dimensions: time (vocal tract change over the temporal course of a speech segment) and space (location of change within the vocal tract). Examples of this method are provided for rt-MRI data collected at a temporal resolution of 20 ms and a spatial resolution of 1.41 mm, for 36 native speakers of German. The rt-MRI data were quantified as 28-point semi-polar-grid aperture functions. Three test cases are provided as a way of observing vocal tract differences between: (1) /aː/ and /iː/, (2) /aː/ and /aɪ/, and (3) accentuated and unstressed /aː/. The results for each GAMM are independently validated using functional linear mixed models (FLMMs) constructed from data obtained at 20% and 80% of the vowel interval. In each case, the two methods yield similar results. In light of the method similarities, we propose that GAMMs are a robust, powerful, and interpretable method of simultaneously analyzing both temporal and spatial effects in rt-MRI video of speech