7 research outputs found

    Magnetic resonance imaging of the brain and vocal tract: Applications to the study of speech production and language learning

    The human vocal system is highly plastic, allowing for the flexible expression of language, mood and intentions. However, this plasticity is not stable throughout the life span, and it is well documented that adult learners encounter greater difficulty than children in acquiring the sounds of foreign languages. Researchers have used magnetic resonance imaging (MRI) to interrogate the neural substrates of vocal imitation and learning, and the correlates of individual differences in phonetic “talent”. In parallel, a growing body of work using MR technology to directly image the vocal tract in real time during speech has offered primarily descriptive accounts of phonetic variation within and across languages. In this paper, we review the contribution of neural MRI to our understanding of vocal learning, and give an overview of vocal tract imaging and its potential to inform the field. We propose methods by which our understanding of speech production and learning could be advanced through the combined measurement of articulation and brain activity using MRI. Specifically, we describe a novel paradigm, developed in our laboratory, that uses both MRI techniques to map directly, for the first time, between neural, articulatory and acoustic data in the investigation of vocalisation. This non-invasive, multimodal imaging method could be used to track central and peripheral correlates of spoken language learning and speech recovery in clinical settings, as well as provide insights into potential sites for targeted neural interventions.

    Analyzing speech in both time and space: generalized additive mixed models can uncover systematic patterns of variation in vocal tract shape in real-time MRI

    We present a method of using generalized additive mixed models (GAMMs) to analyze midsagittal vocal tract data obtained from real-time magnetic resonance imaging (rt-MRI) video of speech production. Applied to rt-MRI data, GAMMs allow for observation of factor effects on vocal tract shape throughout two key dimensions: time (vocal tract change over the temporal course of a speech segment) and space (location of change within the vocal tract). Examples of this method are provided for rt-MRI data collected at a temporal resolution of 20 ms and a spatial resolution of 1.41 mm, for 36 native speakers of German. The rt-MRI data were quantified as 28-point semi-polar-grid aperture functions. Three test cases are provided as a way of observing vocal tract differences between: (1) /aː/ and /iː/, (2) /aː/ and /aɪ/, and (3) accentuated and unstressed /aː/. The results for each GAMM are independently validated using functional linear mixed models (FLMMs) constructed from data obtained at 20% and 80% of the vowel interval. In each case, the two methods yield similar results. In light of the method similarities, we propose that GAMMs are a robust, powerful, and interpretable method of simultaneously analyzing both temporal and spatial effects in rt-MRI video of speech.
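    The aperture-function representation described above can be illustrated with a minimal sketch: at each of the 28 semi-polar gridlines, the aperture is the distance between the inner (tongue-side) and outer (palate/pharynx-side) vocal-tract contours. The array names and the toy contours below are invented for illustration; this is not the authors' actual pipeline.

    ```python
    import numpy as np

    def aperture_function(inner, outer):
        """Aperture function: Euclidean distance between inner and outer
        vocal-tract contours sampled at the same semi-polar gridline points.

        inner, outer: arrays of shape (n_points, 2) holding (x, y) coordinates.
        Returns an array of shape (n_points,) of cross-distances (apertures).
        """
        inner = np.asarray(inner, dtype=float)
        outer = np.asarray(outer, dtype=float)
        return np.linalg.norm(outer - inner, axis=1)

    # Toy example: 28 gridlines; the outer contour is a radially offset arc.
    n = 28
    theta = np.linspace(0, np.pi, n)                 # semi-polar gridline angles
    inner = np.stack([np.cos(theta), np.sin(theta)], axis=1)          # unit arc
    outer = np.stack([1.5 * np.cos(theta), 1.5 * np.sin(theta)], axis=1)
    ap = aperture_function(inner, outer)             # constant aperture of 0.5
    ```

    Stacking one such 28-point vector per video frame yields the time-by-space grid over which a GAMM (or FLMM) can then fit smooth effects.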

    Fast upper airway magnetic resonance imaging for assessment of speech production and sleep apnea

    The human upper airway is involved in various functions, including speech, swallowing, and respiration. Magnetic resonance imaging (MRI) can visualize the motion of the upper airway and has been used in scientific studies to understand the dynamics of vocal tract shaping during speech and for assessment of upper airway abnormalities related to obstructive sleep apnea and swallowing disorders. Acceleration technologies in MRI are crucial in improving spatiotemporal resolution or spatial coverage. Recent trends in technical aspects of upper airway MRI are to develop state-of-the-art image acquisition methods for improved dynamic imaging of the upper airway and to develop automatic image analysis methods for efficient and accurate quantification of upper airway parameters of interest. This review covers fast upper airway magnetic resonance (MR) acquisition and reconstruction, MR experimental issues, image analysis techniques, and applications, mainly with respect to studies of speech production and sleep apnea.

    Development of an environment for browsing and analyzing real-time MRI articulatory video data

    Waseda University; National Institute for Japanese Language and Linguistics. Conference: 言語資源活用ワークショップ2021 (Language Resources Utilization Workshop 2021), Venue: online, Dates: September 13–14, 2021, Organizer: Center for Corpus Development, National Institute for Japanese Language and Linguistics. In recent years, it has become possible to image the articulatory movements of speech in real time using an MRI scanner. Real-time MRI (hereafter rtMRI) data capture the entire midsagittal plane of the vocal tract and hold the potential to prompt a reconstruction of articulatory phonetics. However, transcription and analysis methods for the collected data are not yet well established, and simply releasing the data is unlikely to make rtMRI-based articulatory-phonetics research widespread. We therefore previously designed and developed MRI Vuewer, a tool for browsing rtMRI data. That tool provided transcription functions for the phonetic and temporal aspects of rtMRI data, but its functions for transcribing the image-based, spatial aspects were lacking. In this study, we survey recent research using rtMRI data to identify the functions required for image-based, spatial transcription, and report the redesign and reimplementation of MRI Vuewer as an rtMRI data analysis tool.

    Rapid semi-automatic segmentation of real-time magnetic resonance images for parametric vocal tract analysis

    A method of rapid semi-automatic segmentation of real-time magnetic resonance image data for parametric analysis of vocal tract shaping is described. Tissue boundaries are identified by seeking pixel-intensity thresholds along tract-normal grid-lines. Airway contours are constrained with respect to a tract centerline, defined as an optimal path over the graph of all intensity minima between the glottis and lips. The method allows for superimposition of reference boundaries to guide automatic segmentation of anatomical features that are poorly imaged using magnetic resonance (dentition and the hard palate), resulting in more accurate sagittal sections than those produced by fully automatic segmentation. We demonstrate the utility of the technique in the dynamic analysis of tongue shaping in Tamil liquid consonants.
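    The thresholding step can be sketched in a few lines: along each tract-normal gridline, scan the pixel-intensity profile outward and mark the tissue boundary at the first sample whose intensity crosses a threshold (airway dark, tissue bright). The function name, the fixed threshold, and the toy profile below are invented for illustration and are a simplified stand-in for the method described.

    ```python
    import numpy as np

    def boundary_index(profile, threshold):
        """Index of the first sample along a gridline whose intensity reaches
        the threshold (i.e. the airway-to-tissue boundary), or -1 if no
        sample crosses it."""
        profile = np.asarray(profile, dtype=float)
        hits = np.nonzero(profile >= threshold)[0]
        return int(hits[0]) if hits.size else -1

    # Toy intensity profile along one gridline: dark airway, then bright tissue.
    profile = [5, 8, 12, 60, 85, 90]
    idx = boundary_index(profile, threshold=50)   # -> 3
    ```

    Repeating this search over every gridline yields one boundary point per line, which can then be constrained by the centerline and reference boundaries as the abstract describes.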

    The phonetic correlates of pharyngealization and pharyngealization spread patterns in Cairene Arabic: an acoustic and real-time magnetic resonance imaging study

    The major articulatory differences between plain and pharyngealized speech sounds in Arabic are a secondary posterior constriction and a lowered tongue body implicated in the production of the latter type. This articulatory configuration, pharyngealization, affects neighboring segments according to spread patterns that differ across different dialects in both direction and domain (distance). The most prominent acoustic consequence of this articulatory configuration is a lowering of the second formant frequency in surrounding vowels. The extent of the modification in the formant frequency is determined by the length and quality of the vowel. This study uses real-time magnetic resonance imaging (rtMRI) to investigate the acoustic and articulatory correlates of pharyngealization and pharyngealization spread in Cairene Arabic. The articulatory and acoustic correlates of pharyngealization and pharyngealization spread relate to phonetics and phonology, respectively. This study is thus at the interface of phonetics and phonology, presenting phonetic evidence for a phonological phenomenon. Four male native speakers of Cairene Arabic participated in the study. They were trained to repeat a carrier phrase inside the MRI scanner: /ʔal:aha: X ʔalf mar:a/ (‘He told her X one thousand times’, where X is the target word). Target words are monosyllabic minimal pairs of Cairene Arabic in which the plain-pharyngealized contrast occurs at the edges of the word, and in which the vowels immediately adjacent to the plain-pharyngealized contrast are /a:, i:, u:/ and /a, i, u/. The role of both vowel length and vowel quality in the extent of pharyngealization spread was examined, as well as the influence of rightward versus leftward spread of pharyngealization. 
The acquired rtMRI data is reconstructed using the Partial Separability model to achieve high temporal resolution (approximately 100 fps) and high spatial resolution (128 × 128 voxels, each measuring 2.2 mm × 2.2 mm in plane with a through-plane depth of 8.0 mm). Midsagittal MRI frames are extracted at the middle of the consonants and vowels of the target words; they show the lingual and pharyngeal configuration during the articulation of each speech segment. An edge-detection method is applied to identify the contours of the vocal tract from the glottis to the lips. These contours are analyzed in Matlab to examine the articulatory configuration of the sounds of interest. Two articulatory measures, 2D pharyngeal area and 2D oral area, are introduced to quantify the magnitude of the pharyngeal constriction and of the oral cavity, respectively. These provide articulatory measurements of pharyngealization spread across different vowel qualities, different vowel lengths, and different directions. Results suggest that the magnitude of pharyngealization spread differs with respect to these three factors. Parallel acoustic data is acquired from the same four speakers in a sound-attenuating booth and analyzed in Praat to examine the acoustic properties (i.e. the formant frequencies) of the sounds of interest. Results from the articulatory measurements are corroborated by results from acoustic measurements of formant-frequency modifications.
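A 2D area measure like the pharyngeal or oral areas above can be computed from an extracted contour with the shoelace formula; the sketch below assumes a closed polygon of (x, y) contour points and is an illustration, not the study's actual Matlab code.

```python
import numpy as np

def polygon_area(points):
    """Area of a simple closed polygon given its vertices in order
    (shoelace formula); units follow the contour coordinates, e.g. mm^2."""
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# Toy example: a 2 x 3 rectangle traced counter-clockwise.
rect = [(0, 0), (2, 0), (2, 3), (0, 3)]
area = polygon_area(rect)   # -> 6.0
```

Applied to the vocal-tract contour points bounding the pharyngeal or oral cavity in a given frame, this gives a per-frame scalar that can be compared across vowel qualities, lengths, and spread directions.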

    &lt;Full text&gt; Proceedings of 言語資源活用ワークショップ2021 (Language Resources Utilization Workshop 2021)

    Conference: 言語資源活用ワークショップ2021 (Language Resources Utilization Workshop 2021), Venue: online, Dates: September 13–14, 2021, Organizer: Center for Corpus Development, National Institute for Japanese Language and Linguistics