77 research outputs found

    Tracking Articulators in X-ray Movies of the Vocal Tract

    Segmentation of X-ray Image Sequences Showing the Vocal Tract (with tool documentation)

    The tongue, the lips, the palate, and the throat are tracked in X-ray images showing a side view of the vocal tract. Tracking relies on specialized histogram normalization techniques and a new tracking method that is robust against occlusion, noise, and spontaneous, non-linear deformations of objects. Although the segmentation procedure is optimized for X-ray images of the vocal tract, the underlying tracking method can easily be used in other applications.

    A multilinear tongue model derived from speech related MRI data of the human vocal tract

    We present a multilinear statistical model of the human tongue that captures anatomical and tongue-pose-related shape variations separately. The model is derived from 3D magnetic resonance imaging data of 11 speakers sustaining speech-related vocal tract configurations. The extraction uses a minimally supervised method based on an image segmentation approach and a template fitting technique. Furthermore, it uses image denoising to deal with possibly corrupt data, palate surface information reconstruction to handle palatal tongue contacts, and a bootstrap strategy to refine the obtained shapes. Our evaluation concludes that limiting the degrees of freedom for the anatomical and speech-related variations to 5 and 4, respectively, produces a model that can reliably register unknown data while avoiding overfitting effects. Furthermore, we show that it can be used to generate a plausible tongue animation by tracking sparse motion capture data.

    Multimodal acquisition of articulatory data: Geometrical and temporal registration

    Acquisition of dynamic articulatory data is of major importance for studying speech production. One technique alone is often not enough to cover the whole vocal tract correctly at a sufficient sampling rate. Ultrasound (US) imaging has been proposed as a good acquisition technique for the tongue surface because it offers good temporal sampling, does not alter speech production, and is cheap and widely available. However, it cannot be used alone, and this paper describes a multimodal acquisition system which uses electromagnetography sensors to locate the US probe. The paper particularly focuses on the calibration of the ultrasound modality, which is the key point of the system. This approach enables ultrasound data to be merged with other data. The use of the system is illustrated via an experiment that measures the minimal tongue-to-palate distance in order to evaluate and design Magnetic Resonance Imaging protocols well suited for the acquisition of 3D images of the vocal tract. Compared to the manual registration of acquisition modalities that is often used in the acquisition of articulatory data, the approach presented relies on automatic techniques that are well founded from geometrical and mathematical points of view.

    A real-time MRI study of Japanese moraic nasal in utterance-final position

    National Institute for Japanese Language and Linguistics. The moraic nasal of Japanese, often symbolized as /N/, is a nasal segment that has the status of an independent mora. It is widely acknowledged that the place of articulation of /N/ is determined by assimilation to the following consonant; for example, /aNma/, /aNta/, and /aNka/ become [amma], [anta], and [aŋka] respectively. There is, however, a lack of consensus concerning the realization of /N/ in utterance-final position. Places of articulation of utterance-final /N/ hitherto stipulated in the literature include velar [ŋ], uvular [N], and nasalized vowels. A real-time MRI movie database was analyzed to solve this problem. Data of three male speakers revealed consistent results. The location of closure for the final /N/ is highly predictable from the membership of the immediately preceding vowel. Closure locations predicted by a generalized linear mixed-effect model regression analysis showed high correlation (between .887 and .986) with the observed locations.

    Fast upper airway magnetic resonance imaging for assessment of speech production and sleep apnea

    The human upper airway is involved in various functions, including speech, swallowing, and respiration. Magnetic resonance imaging (MRI) can visualize the motion of the upper airway and has been used in scientific studies to understand the dynamics of vocal tract shaping during speech and for assessment of upper airway abnormalities related to obstructive sleep apnea and swallowing disorders. Acceleration technologies in MRI are crucial in improving spatiotemporal resolution or spatial coverage. Recent trends in technical aspects of upper airway MRI are to develop state-of-the-art image acquisition methods for improved dynamic imaging of the upper airway and develop automatic image analysis methods for efficient and accurate quantification of upper airway parameters of interest. This review covers the fast upper airway magnetic resonance (MR) acquisition and reconstruction, MR experimental issues, image analysis techniques, and applications, mainly with respect to studies of speech production and sleep apnea.

    Segmentation of tongue shapes during vowel production in magnetic resonance images based on statistical modelling

    Quantification of the anatomic and functional aspects of the tongue is pertinent to analysing the mechanisms involved in speech production. Speech requires dynamic and complex articulation of the vocal tract organs, and the tongue is one of the main articulators during speech production. Magnetic resonance imaging has been widely used in speech-related studies. Moreover, the segmentation of such images of speech organs is required to extract reliable statistical data. However, standard solutions to analyse a large set of articulatory images have not yet been established. Therefore, this article presents an approach to segment the tongue in two-dimensional magnetic resonance images and statistically model the segmented tongue shapes. The proposed approach assesses the articulator morphology based on an active shape model, which captures the shape variability of the tongue during speech production. To validate this new approach, a dataset of mid-sagittal magnetic resonance images acquired from four subjects was used, and key aspects of the shape of the tongue during the vocal production of relevant European Portuguese vowels were evaluated.

    Real-Time Magnetic Resonance Imaging

    Real‐time magnetic resonance imaging (RT‐MRI) allows for imaging dynamic processes as they occur, without relying on any repetition or synchronization. This is made possible by modern MRI technology such as fast‐switching gradients and parallel imaging. It is compatible with many (but not all) MRI sequences, including spoiled gradient echo, balanced steady‐state free precession, and single‐shot rapid acquisition with relaxation enhancement. RT‐MRI has earned an important role in both diagnostic imaging and image guidance of invasive procedures. Its unique diagnostic value is prominent in areas of the body that undergo substantial and often irregular motion, such as the heart, gastrointestinal system, upper airway vocal tract, and joints. Its value in interventional procedure guidance is prominent for procedures that require multiple forms of soft‐tissue contrast, as well as flow information. In this review, we discuss the history of RT‐MRI, fundamental tradeoffs, enabling technology, established applications, and current trends.

    Augmented Reality Talking Heads as a Support for Speech Perception and Production

    Cardiac magnetic resonance assessment of central and peripheral vascular function in patients undergoing renal sympathetic denervation as predictor for blood pressure response

    Background: Most trials regarding catheter-based renal sympathetic denervation (RDN) describe a proportion of patients without blood pressure response. Recently, we were able to show that arterial stiffness, measured by invasive pulse wave velocity (IPWV), seems to be an excellent predictor for blood pressure response. However, given the invasiveness, IPWV is less suitable as a selection criterion for patients undergoing RDN. Consequently, we aimed to investigate the value of cardiac magnetic resonance (CMR) based measures of arterial stiffness in predicting the outcome of RDN compared to IPWV as reference. Methods: Patients underwent CMR prior to RDN to assess ascending aortic distensibility (AAD), total arterial compliance (TAC), and systemic vascular resistance (SVR). In a second step, central aortic blood pressure was estimated from ascending aortic area change and flow sequences and used to re-calculate total arterial compliance (cTAC). Additionally, IPWV was acquired. Results: Thirty-two patients (24 responders and 8 non-responders) were available for analysis. AAD, TAC and cTAC were higher in responders; IPWV was higher in non-responders. SVR was not different between the groups. Patients with AAD, cTAC or TAC above median and IPWV below median had significantly better BP response. Receiver operating characteristic (ROC) curves predicting blood pressure response for IPWV, AAD, cTAC and TAC revealed areas under the curve of 0.849, 0.828, 0.776 and 0.753 (p = 0.004, 0.006, 0.021 and 0.035). Conclusions: Beyond IPWV, AAD, cTAC and TAC appear as useful outcome predictors for RDN in patients with hypertension. CMR-derived markers of arterial stiffness might serve as non-invasive selection criteria for RDN.