183 research outputs found

    Data-driven analysis of nasal vowels dynamics and coordination: Results for bilabial context

    One of the distinctive marks of Portuguese is its large inventory of nasals, including 5 nasal vowels and many diphthongs. Acoustic and articulatory studies have shown nasal vowels to have an initial oral part and a short nasal tail, probably related to synchronization between oral and nasal gestures. Previous studies have considered discrete descriptions with EMA flesh points, limiting our grasp of the whole vocal tract, and preliminary work using real-time MRI (RT-MRI) considered a low framerate (14fps) and a reduced number of speakers, affecting both the time resolution available to study an intrinsically dynamic process and the generalization of the outcomes. Recent advances in RT-MRI, with framerates of 50fps, have opened new possibilities for studies that can grasp finer detail of the dynamics of nasals. However, new challenges need to be tackled to deal with the resulting large amount of data and to foster analyses that go beyond qualitative approaches and cover a larger number of speakers. Grounded on a new RT-MRI corpus for European Portuguese (EP), this paper explores the capabilities of recent data-driven methods, proposed for this type of RT-MRI data, to analyze dynamic aspects of nasal vowels and coordination. To this end, we consider data for 11 EP speakers and investigate vocal tract configurations over time and the coordination of velum and lip aperture in bilabial (oral and nasal) contexts.

    Three-dimensional modeling of tongue during speech using MRI data

    The tongue is the most important and dynamic articulator for speech formation, because of its anatomy (particularly the large volume of this muscular organ compared with the surrounding organs of the vocal tract) and also because of the wide range of movements and the flexibility involved. In speech communication research, a variety of techniques have been used for measuring three-dimensional vocal tract shapes. More recently, magnetic resonance imaging (MRI) has become common, mainly because this technique allows the collection of a set of static and dynamic images that can represent the entire vocal tract along any orientation. Over the years, different anatomical organs of the vocal tract have been modelled, namely with 2D and 3D tongue models, using parametric or statistical modelling procedures. Our aims are to present and describe some 3D models reconstructed from MRI data for one subject uttering sustained articulations of some typical Portuguese sounds. Thus, we present a 3D database of the tongue obtained by stack combinations with the subject articulating Portuguese vowels. This 3D knowledge of the speech organs could be very important, especially for clinical purposes (for example, for the assessment of articulatory impairments following tongue surgery in speech rehabilitation), and also for a better understanding of acoustic theory in speech formation.

    On the role of oral configurations in European Portuguese nasal vowels

    The characterisation of nasal vowels is not only a question of studying velar aperture. Recent work shows that oropharyngeal articulatory adjustments enhance the acoustics of nasal coupling or, at least, magnify differences between oral/nasal vowel congeners. Despite preliminary studies on the oral configurations of nasal vowels for European Portuguese (EP), a quantitative analysis is missing, particularly one that can be applied systematically to a desirably large number of speakers. The main objective of this study is to adapt and extend previous methodological advances for the analysis of MRI data to further investigate: how velar changes affect oral configurations; the changes to the articulators and constrictions when compared with oral counterparts; and the closest oral counterpart. High framerate RT-MRI images (50fps) are automatically processed to extract the vocal tract contours and the position/configuration of the different articulators. These data are processed by evolving a quantitative articulatory analysis framework, previously proposed by the authors, extended to include information regarding constrictions (degree and place) and the nasal port. For this study, while the analysis of data for more speakers is ongoing, we considered a set of two EP native speakers and addressed the study of oral and nasal vowels mainly in the context of stop consonants.

    Velum movement detection based on surface electromyography for speech interface

    Conventional speech communication systems do not perform well in the absence of an intelligible acoustic signal. Silent Speech Interfaces enable speech communication to take place with speech-handicapped users and in noisy environments. However, since no acoustic signal is available, information on nasality may be absent, even though nasality is an important and relevant characteristic of several languages, particularly European Portuguese. In this paper we propose a non-invasive method, surface Electromyography (EMG) electrodes positioned in the face and neck regions, to explore the existence of useful information about velum movement. The applied procedure takes advantage of Real-Time Magnetic Resonance Imaging (RT-MRI) data, collected from the same speakers, to interpret and validate the EMG data. By ensuring compatible scenario conditions and proper alignment between the EMG and RT-MRI data, we are able to estimate when the velum moves and the probable type of movement when nasality occurs. Overall results of this experiment revealed interesting and distinct characteristics in the EMG signal when a nasal vowel is uttered, and showed that it is possible to detect velum movement, particularly with sensors positioned below the ear between the mastoid process and the mandible in the upper neck region.

    Towards a silent speech interface for Portuguese: Surface electromyography and the nasality challenge

    A Silent Speech Interface (SSI) aims at performing Automatic Speech Recognition (ASR) in the absence of an intelligible acoustic signal. It can be used as a human-computer interaction modality in high-background-noise environments, such as living rooms, or in aiding speech-impaired individuals, whose numbers increase with ageing. If this interaction modality is made available in users' own native language, with adequate performance, then, since it does not rely on acoustic information, it will be less susceptible to problems related to environmental noise, privacy, information disclosure and exclusion of speech-impaired persons. To contribute to the existence of this promising modality for Portuguese, for which no SSI implementation is known, we are exploring and evaluating the potential of state-of-the-art approaches. One of the major challenges we face in SSI for European Portuguese is the recognition of nasality, a core characteristic of this language's phonetics and phonology. In this paper a silent speech recognition experiment based on Surface Electromyography is presented. Results confirmed recognition problems between minimal pairs of words that differ only in the nasality of one of the phones, causing 50% of the total error and evidencing a degradation in accuracy, which correlates well with existing knowledge.

    Magnetic resonance imaging of the vocal tract: techniques and applications

    Magnetic resonance (MR) imaging has been used to analyse and evaluate the vocal tract shape through different techniques and with promising results in several fields. Our purpose is to demonstrate the relevance of MR and image processing for the study of the vocal tract. The extraction of contours of the air cavities allowed the set-up of a number of 3D reconstruction image stacks by combining orthogonally oriented sets of slices for each articulatory gesture, as a new approach to solve the expected spatial undersampling of the imaging process. As a result, these models give improved information for the visualization of morphologic and anatomical aspects and are useful for partial measurements of the vocal tract shape in different situations. Potential uses can be found in medical and therapeutic applications as well as in acoustic articulatory speech modelling.

    Data-Driven Critical Tract Variable Determination for European Portuguese

    Technologies such as real-time magnetic resonance imaging (RT-MRI) can provide valuable information to evolve our understanding of the static and dynamic aspects of speech by contributing to the determination of which articulators are essential (critical) in producing specific sounds and how (gestures). While a visual analysis and comparison of imaging data or vocal tract profiles can already provide relevant findings, the sheer amount of available data demands, and can strongly profit from, unsupervised data-driven approaches. Recent work in this regard has asserted the possibility of determining critical articulators from RT-MRI data by considering a representation of vocal tract configurations based on landmarks placed on the tongue, lips, and velum, yielding meaningful results for European Portuguese (EP). Advancing this previous work to obtain a characterization of EP sounds grounded on Articulatory Phonology, which is important to explore critical gestures and advance, for example, articulatory speech synthesis, entails the consideration of a novel set of tract variables. To this end, this article explores critical variable determination considering a vocal tract representation aligned with Articulatory Phonology and the Task Dynamics framework. The overall results, obtained considering data for three EP speakers, show the applicability of this approach and are consistent with existing descriptions of EP sounds.

    Can ultrasonic doppler help detecting nasality for silent speech interfaces?: An exploratory analysis based on alignement of the doppler signal with velum aperture information from real-time MRI

    This paper describes an exploratory analysis of the usefulness of the information made available by Ultrasonic Doppler signal data, collected from a single speaker, to detect velum movement associated with European Portuguese nasal vowels. This is directly related to the unsolved problem of detecting nasality in silent speech interfaces. The applied procedure uses Real-Time Magnetic Resonance Imaging (RT-MRI) data, collected from the same speaker, providing a method to interpret the reflected ultrasonic data. By ensuring compatible scenario conditions and proper time alignment between the Ultrasonic Doppler signal data and the RT-MRI data, we are able to accurately estimate the time when the velum moves and the type of movement when a nasal vowel occurs. The combination of these two sources revealed a moderate relation between the average energy of frequency bands around the carrier, indicating a probable presence of velum information in the Ultrasonic Doppler signal.