52 research outputs found

    Techniques for the enhancement of linear predictive speech coding in adverse conditions

    An investigation of sagittal velar movement and its correlation with lip, tongue, and jaw movement.

    This paper examines the correlation between velar movement and the movement of other articulators in the midsagittal plane. A physiological model is proposed which, while based on common knowledge, is more extensive than any that has been used explicitly to explain observed movements of the velum. The model is used to guide the measurements taken from raw articulatory data provided by an Electromagnetic Articulograph (EMA). The midsagittal velar movement is then examined for four different sentences by separate speakers, and attempts are made to explain the patterns by reference to the model. The data were taken from a database of between 220 and 460 phonetically balanced sentences per speaker. This type of dataset allows general patterns of behaviour to be uncovered. One such observation made and discussed in this paper is velum lowering before oral velar stops.

    A Multi-Channel/Multi-Speaker Articulatory Database for Continuous Speech Recognition Research.

    The goal of this research is to improve the performance of a speaker-independent Automatic Speech Recognition (ASR) system by using directly measured articulatory parameters in the training phase. This paper examines the need for a multi-channel/multi-speaker articulatory database and describes the design of such a database and the processes involved in its creation.

    Towards a 3D Tongue model for parameterising ultrasound data

    This paper describes the process and aims of the manual construction of a 3D mesh modelling the tongue, hyoid and mandible. The mesh building process includes the ability to assign muscles to mesh struts, which can be independently contracted to nominal lengths in order to test how the mesh deforms. In this way the behaviour of the mesh can be easily and quickly observed, and structures can be amended or enhanced. One such mesh is described, which is based on a laminar structure and in which the genioglossus is divided into five functionally independent compartments. The model can be deformed to fit midsagittal MRI data of a wide range of distinct articulations by a single speaker and, by carefully identifying landmark features and orientation, can also be fitted to ultrasound images for that same speaker.

    The main aim of this project is to develop research skills in Further Education lecturers who are involved in both FE and HE delivery. Recent developments in FE have recognised the need to develop research capacity within FE institutions, and a number of networks have responded [1]. (UHI Millennium Institute (UHI) is a partnership of 15 colleges and research institutions in the Highlands and Islands of Northern Scotland. See also the FE Regional Research Network (FERRN) for Fife and the Lothians. The importance of research is well embedded in both networks.) Some recent research has also focussed on the role of FE in furthering the government's continuous improvement agenda.

    Many staff within the colleges that form UHI Millennium Institute (UHI) have to teach at both FE and HE level and are increasingly expected to engage with research. Traditionally, however, FE lecturers have not engaged in research and have therefore not developed the required skills [2].

    This project aims to develop basic research skills through the planning and execution of a small-scale project, linked to a relevant literature search, that relates to one aspect of the individual's teaching practice. It aims to encourage the development of a community of researching practitioners by establishing a mainly online discussion group to support practitioners [3]. The project thus aims to develop research and collaborative skills whilst also developing a greater understanding of teaching and learning; it should therefore encourage reflection on teaching and learning and potentially have an impact on future delivery.

    [1] Cunningham, J. and Doncaster, K. Developing a Research Culture in the FE Sector: a case-study of work-based approach to staff development. Journal of Further and Higher Education, Vol. 26, No. 1, 2002.
    [2] Hillier, Y. and Jameson, J. Empowering Researchers in FE. Trentham. 2003.
    [3] Wenger, E. Communities of Practice: learning, meaning and identity. CUP. 1998.

    A new EPG protocol for assessing DDK accuracy scores in children : a Down's syndrome study

    Recent research has suggested that eliciting diadochokinetic (DDK) rate and accuracy in young children is difficult [1], with analysis being time-consuming. This paper details a new protocol for assessing DDK in young children or children with intellectual impairment (Down's syndrome) and a method for calculating accuracy scores automatically. Accuracy scores were calculated from auditory and electropalatographic analyses and found to correlate in some instances. The children with Down's syndrome presented with similar DDK rates to typically-developing children but reduced accuracy.

    [1] Cohen, W., Waters, D., Hewlett, N., 1998. DDK rates in the paediatric clinic: a methodological minefield. International Journal of Language and Communication Disorders, 33, supplement, 428-433.
    [2] Dodd, B., Hua, Z., Crosbie, S., Holm, A., 2002. Diagnostic Evaluation of Articulation and Phonology. London: The Psychological Corporation.
    [3] Fletcher, S.G., 1978. The Fletcher time-by-count Test of Diadochokinetic Syllable Rate. Austin: PRO-ED.
    [4] Kumin, L., 2006. Speech intelligibility and childhood verbal apraxia in children with Down syndrome. Down Syndrome Research and Practice, 10, 10-22.
    [5] Robbins, J., Klee, T., 1987. Clinical Assessment of Oropharyngeal Motor Development in Young Children. Journal of Speech and Hearing Disorders, 52, 271-277.
    [6] Thoonen, G., Maassen, B., Wit, J., Gabreels, F., Schreuder, R., 1996. The integrated use of maximum performance tasks in differential diagnostic evaluations among children with motor speech disorders. Clinical Linguistics and Phonetics, 10, 311-336.
    [7] Williams, P. and Stackhouse, J., 2000. Rate, accuracy and consistency: diadochokinetic performance of young, normally developing children. Clinical Linguistics & Phonetics, 14, 267-293.

    Very high frame rate ultrasound tongue imaging

    This paper examines the trade-off between temporal and spatial resolution in ultrasound tongue images at fast frame rates. The fastest lingual speech movements are investigated using a variety of echo pulse densities. Benefits and drawbacks of using higher frame rates are considered. Faster frame rates reduce distortion of the shape of the tongue during highly dynamic segments, but it becomes increasingly difficult to discern the detail of that shape. The best temporal and spatial resolution is achieved with shorter distances between the probe and the tongue surface.
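
The trade-off described in this abstract can be illustrated with the standard pulse-echo timing relation: each scanline needs one round trip of sound to the imaging depth, so the frame rate is bounded by the speed of sound divided by twice the depth times the number of scanlines. The sketch below is not taken from the paper. It assumes one transmit pulse per scanline, no processing overhead, and the conventional soft-tissue sound speed of 1540 m/s; the function name `max_frame_rate` and the example depth and scanline counts are hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 1540.0  # conventional soft-tissue value (assumption)


def max_frame_rate(depth_m: float, scanlines: int,
                   c: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Upper bound on frames per second for a simple pulse-echo scan.

    Assumes one pulse per scanline and that the next pulse is only
    fired once the echo from the maximum depth has returned.
    """
    if depth_m <= 0 or scanlines <= 0:
        raise ValueError("depth_m and scanlines must be positive")
    round_trip_s = 2.0 * depth_m / c       # time for one scanline
    return 1.0 / (round_trip_s * scanlines)  # frames per second


if __name__ == "__main__":
    # Hypothetical settings: fewer scanlines or a shallower depth
    # both raise the achievable frame rate.
    for depth_m, lines in [(0.08, 128), (0.08, 64), (0.06, 64)]:
        fps = max_frame_rate(depth_m, lines)
        print(f"depth={depth_m * 100:.0f} cm, scanlines={lines}: {fps:.1f} fps")
```

Under these assumptions, halving the scanline density doubles the frame-rate bound at the cost of lateral detail, while reducing the imaging depth raises the bound without reducing the number of scanlines, which is consistent with the abstract's remark about shorter probe-to-tongue distances.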

    Spatio-temporal inaccuracies of video-based ultrasound images of the tongue.

    This paper focuses on aspects of ultrasound technology that have an impact on the accuracy of this technique as an investigative tool for the study of displacement, timing and movement of the tongue during speech. The paper describes settings and elements in the design of ultrasound systems that can affect spatial and temporal resolution and provides recommendations for how to minimize distortion.

    High-speed Cineloop Ultrasound vs. Video Ultrasound Tongue Imaging: Comparison of Front and Back Lingual Gesture Location and Relative Timing.

    We compare two methods of acquiring ultrasound tongue images. A new system, capable of recording directly from the cineloop image buffer at a high frame rate and more accurately synchronized with audio, is compared with an optimised method of recording images via the NTSC video output of an ultrasound machine. As a focus for this comparison we gathered representative data on English /l/ from a single speaker, using a headset restraint system. Both systems performed well, but while the video system is at its limits, the cineloop system is inherently more accurate and offers greater opportunity for development.

    Beyond the edge: Markerless pose estimation of speech articulators from ultrasound and camera images using DeepLabCut

    Automatic feature extraction from images of speech articulators is currently achieved by detecting edges. Here, we investigate the use of pose estimation deep neural nets with transfer learning to perform markerless estimation of speech articulator keypoints using only a few hundred hand-labelled images as training input. Midsagittal ultrasound images of the tongue, jaw, and hyoid and camera images of the lips were hand-labelled with keypoints, used for training with DeepLabCut, and evaluated on unseen speakers and systems. Tongue surface contours interpolated from estimated and hand-labelled keypoints produced an average mean sum of distances (MSD) of 0.93 mm (s.d. 0.46 mm), compared with 0.96 mm (s.d. 0.39 mm) for two human labellers and 2.3 mm (s.d. 1.5 mm) for the best performing edge detection algorithm. A pilot set of simultaneous electromagnetic articulography (EMA) and ultrasound recordings demonstrated partial correlation among three physical sensor positions and the corresponding estimated keypoints; this requires further investigation. The accuracy of estimating lip aperture from camera video was high, with a mean MSD of 0.70 mm (s.d. 0.56 mm), compared with 0.57 mm (s.d. 0.48 mm) for two human labellers. DeepLabCut was found to be a fast, accurate and fully automatic method of providing unique kinematic data for tongue, hyoid, jaw, and lips. https://doi.org/10.3390/s22031133
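
The abstract above reports contour agreement as a mean sum of distances (MSD) but does not give the formula. As a rough illustration only, the sketch below uses one common symmetric nearest-neighbour formulation: for every point on each contour, take the distance to the closest point on the other contour, then average over all points. Whether the paper uses exactly this definition is an assumption, and the function name `mean_sum_of_distances` and the sample contours are hypothetical.

```python
import math


def mean_sum_of_distances(u, v):
    """Symmetric nearest-neighbour distance between two 2D contours.

    u and v are sequences of (x, y) points in the same units (e.g. mm).
    This is one common formulation of MSD, assumed here for illustration.
    """
    if not u or not v:
        raise ValueError("both contours need at least one point")

    def nearest(p, contour):
        # Distance from point p to the closest point on the other contour.
        return min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in contour)

    total = sum(nearest(p, v) for p in u) + sum(nearest(q, u) for q in v)
    return total / (len(u) + len(v))


if __name__ == "__main__":
    # Hypothetical contours: two parallel lines 1 mm apart.
    estimated = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    labelled = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    print(mean_sum_of_distances(estimated, labelled))
```

Because the measure is symmetric, swapping the estimated and hand-labelled contours gives the same value, and identical contours score zero.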