2,558 research outputs found
The Effects of Humming and Pitch on Craniofacial and Craniocervical Morphology Measured Using MRI
Peer reviewed · Preprint
Relationships Between Vocal Structures, the Airway, and Craniocervical Posture Investigated Using Magnetic Resonance Imaging
Peer reviewed · Preprint
Ultrax: An Animated Midsagittal Vocal Tract Display for Speech Therapy
Speech sound disorders (SSDs) are the most common communication impairment in childhood and can hamper social development and learning. Current speech therapy interventions rely predominantly on the auditory skills of the child, as little technology is available to assist in the diagnosis and therapy of SSDs. Real-time visualisation of tongue movements has the potential to bring enormous benefit to speech therapy. Ultrasound scanning offers this possibility, although its display may be hard to interpret. Our ultimate goal is to exploit ultrasound to track tongue movement, while displaying a simplified, diagrammatic vocal tract that is easier for the user to interpret. In this paper, we outline a general approach to this problem, combining a latent space model with a dimensionality-reducing model of vocal tract shapes. We assess the feasibility of this approach using magnetic resonance imaging (MRI) scans to train a model of vocal tract shapes, which is animated using electromagnetic articulography (EMA) data from the same speaker. Index Terms: ultrasound, speech therapy, vocal tract visualisation
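The latent-space idea in this abstract can be sketched in a few lines. The following is a minimal illustration only, not the authors' model: the synthetic contour data, the choice of PCA (via SVD) as the dimensionality-reducing model, and the names `encode` and `decode` are all assumptions made for demonstration.

```python
import numpy as np

# Illustrative sketch: a linear latent-space model of vocal tract shapes.
# All data below is synthetic; dimensions and names are hypothetical.
rng = np.random.default_rng(0)

# Toy "MRI-derived" tongue contours: 200 frames x 40 (x, y) points, flattened.
n_frames, n_points = 200, 40
latent_true = rng.normal(size=(n_frames, 3))       # 3 underlying articulatory factors
basis_true = rng.normal(size=(3, 2 * n_points))
contours = latent_true @ basis_true + 0.01 * rng.normal(size=(n_frames, 2 * n_points))

# Fit a PCA "latent space" via SVD of the centred shape data.
mean_shape = contours.mean(axis=0)
U, S, Vt = np.linalg.svd(contours - mean_shape, full_matrices=False)
k = 3
components = Vt[:k]                                 # k x (2 * n_points) shape basis

def encode(shape):
    """Project a full contour onto the k-dimensional latent space."""
    return (shape - mean_shape) @ components.T

def decode(z):
    """Reconstruct a full contour (for a diagrammatic display) from latent coords."""
    return z @ components + mean_shape

# Round trip: in the paper's setting, sparse sensor data (e.g. EMA) would drive
# the latent coordinates, while decode() animates the simplified display.
recon = decode(encode(contours[0]))
err = np.linalg.norm(recon - contours[0]) / np.linalg.norm(contours[0])
```

The key property exploited here is that a few latent coordinates suffice to reconstruct the full shape, so a low-dimensional measurement stream can plausibly drive a full vocal tract display.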
A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images
Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is, however, limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. Imaging the rapidly moving articulators and the dynamic airway shaping during speech demands high spatio-temporal resolution and robust reconstruction methods. Further, while reconstructed images have been published, to date there is no open dataset providing raw multi-coil RT-MRI data from an optimized speech production experimental setup. Such datasets could enable new and improved methods for dynamic image reconstruction, artifact correction, feature extraction, and direct extraction of linguistically relevant biomarkers. The present dataset offers a unique corpus of 2D sagittal-view RT-MRI videos along with synchronized audio for 75 subjects performing linguistically motivated speech tasks, alongside the corresponding first-ever public-domain raw RT-MRI data. The dataset also includes 3D volumetric vocal tract MRI during sustained speech sounds and high-resolution static anatomical T2-weighted upper airway MRI for each subject.
Comment: 27 pages, 6 figures, 5 tables, submitted to Nature Scientific Data
Rapid dynamic speech imaging at 3 Tesla using combination of a custom vocal tract coil, variable density spirals and manifold regularization
Purpose: To improve dynamic speech imaging at 3 Tesla.
Methods: A novel scheme combining a 16-channel vocal tract coil, variable-density spirals (VDS), and manifold regularization was developed. Spirals with a short readout duration (1.3 ms) were used to minimize sensitivity to off-resonance. The manifold model leveraged similarities between frames sharing similar vocal tract postures, without explicit motion binning. Reconstruction was posed as a SENSE-based non-local, soft-weighted temporal regularization scheme, and the self-navigating capability of VDS was leveraged to learn the structure of the manifold. Our approach was compared against low-rank and temporal finite-difference reconstruction constraints on two volunteers performing repetitive and arbitrary speaking tasks. Blinded image quality evaluation in the categories of alias artifacts, spatial blurring, and temporal blurring was performed by three experts in voice research.
Results: We achieved a spatial resolution of 2.4 mm²/pixel and a temporal resolution of 17.4 ms/frame for single-slice imaging, and 52.2 ms/frame for concurrent three-slice imaging. Implicit motion binning by the manifold scheme was demonstrated for both repetitive and fluent speaking tasks. The manifold scheme provided superior fidelity in modeling articulatory motion compared to the low-rank and temporal finite-difference schemes, reflected in higher image quality scores in the spatial and temporal blurring categories. Our technique exhibited faint alias artifacts, but offered a reduced interquartile range of scores compared to the other methods in the alias-artifact category.
Conclusion: The synergistic combination of a custom vocal tract coil, variable-density spirals, and manifold regularization enables robust dynamic speech imaging at 3 Tesla.
Comment: 30 pages, 10 figures
Diphthong Synthesis Using the Dynamic 3D Digital Waveguide Mesh
Articulatory speech synthesis has the potential to offer more natural-sounding synthetic speech than established concatenative or parametric synthesis methods. Time-domain acoustic models are particularly suited to the dynamic nature of the speech signal, and recent work has demonstrated the potential of dynamic vocal tract models that accurately reproduce the vocal tract geometry. This paper presents a dynamic 3D digital waveguide mesh (DWM) vocal tract model capable of movement to produce diphthongs. The technique is compared to existing dynamic 2D and static 3D DWM models, for both monophthongs and diphthongs. The results indicate that the proposed model provides improved formant accuracy over existing DWM vocal tract models. Furthermore, the computational requirements of the proposed method are significantly lower than those of comparable dynamic simulation techniques. This paper represents another step toward a fully functional articulatory vocal tract model, which will lead to more natural speech synthesis systems for use across society.
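The waveguide principle behind the DWM can be illustrated in its simplest, one-dimensional form: two counter-propagating delay lines with reflections at the ends model a uniform acoustic tube. This is a greatly simplified cousin of the paper's 3D mesh, written as a sketch; the tube length, sample rate, and reflection coefficients below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Illustrative 1D digital waveguide: a uniform tube, closed at the glottis
# and open (lossy, sign-inverting) at the lips. A closed-open tube resonates
# near c / (4 L), i.e. roughly 500 Hz for a 17 cm tract.
fs = 44100.0
c = 343.0                      # speed of sound, m/s (assumed)
length = 0.17                  # tube length, m (typical vocal tract, assumed)
n = int(round(length / c * fs))  # one-way delay-line length in samples

right = np.zeros(n)            # wave travelling glottis -> lips
left = np.zeros(n)             # wave travelling lips -> glottis
n_samples = 8192
out = np.zeros(n_samples)
impulse = 1.0                  # excite once at the glottis end

for t in range(n_samples):
    # Boundary reflections: near-rigid glottis (+0.99), open lips (-0.99).
    glottal_in = 0.99 * left[0] + impulse
    impulse = 0.0
    lip_in = -0.99 * right[-1]
    out[t] = right[-1]         # signal radiated at the lips
    right = np.concatenate(([glottal_in], right[:-1]))
    left = np.concatenate((left[1:], [lip_in]))

# Locate the first resonance (the "formant" of the uniform tube).
spec = np.abs(np.fft.rfft(out))
freqs = np.fft.rfftfreq(n_samples, 1 / fs)
f1 = freqs[np.argmax(spec[freqs < 1500])]
```

The 2D and 3D DWM models compared in the paper generalize this scattering idea to a mesh of such junctions, which is what allows the cross-sectional geometry of the tract, and its movement during a diphthong, to shape the formants.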