36 research outputs found

    Fitting a biomechanical model of the folds to high-speed video data through Bayesian estimation

    High-speed video recording of the vocal folds during sustained phonation has become a widespread diagnostic tool, and the development of imaging techniques able to perform automated tracking and analysis of relevant glottal cues, such as fold edge position or glottal area, is an active research field. In this paper, a vocal fold vibration analysis method based on processing visual data through a biomechanical model of the laryngeal dynamics is proposed. The procedure relies on a Bayesian non-stationary estimation of the biomechanical model parameters and state to fit the fold edge position extracted from the high-speed video endoscopic data. This finely tuned dynamical model is then used as a state transition model in a Bayesian setting, which yields a physiologically motivated estimate of the upper and lower vocal fold edge positions. Based on the model prediction, a hypothesis on the lower fold position can be made even under complete fold occlusion, which occurs at the end of the closed phase and the beginning of the open phase of the glottal cycle. To demonstrate the suitability of the procedure, the method is assessed on a set of audiovisual recordings featuring high-speed video endoscopic data from healthy subjects producing sustained voiced phonation with different laryngeal settings.
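The idea of using a dynamical model as the state transition of a Bayesian filter, so that the edge position can still be predicted while the fold is occluded, can be sketched with a linear Kalman filter. This is a deliberately simplified stand-in, not the biomechanical model or the non-stationary estimator of the paper: the transition model is a single damped mass-spring oscillator, and the frame rate, oscillation frequency, damping, noise levels, and all function names are illustrative assumptions.

```python
import numpy as np

def oscillator_transition(f0, zeta, dt):
    """Exact one-step transition matrix for a damped oscillator.

    State is [position, velocity]; f0 in Hz, zeta is the damping ratio.
    Assumes the underdamped case, so the system matrix is diagonalizable.
    """
    w = 2.0 * np.pi * f0
    A = np.array([[0.0, 1.0], [-w * w, -2.0 * zeta * w]])
    vals, vecs = np.linalg.eig(A)
    # Matrix exponential exp(A*dt) via eigendecomposition.
    return ((vecs * np.exp(vals * dt)) @ np.linalg.inv(vecs)).real

def kalman_track(obs, F, H, Q, R, x0, P0):
    """Kalman filter; NaN observations (occluded frames) skip the update."""
    x, P = x0.copy(), P0.copy()
    positions = []
    for z in obs:
        # Predict with the dynamical model.
        x = F @ x
        P = F @ P @ F.T + Q
        if not np.isnan(z):
            # Correct with the observed edge position.
            S = H @ P @ H.T + R
            K = P @ H.T / S
            x = x + (K * (z - H @ x)).ravel()
            P = (np.eye(2) - K @ H) @ P
        positions.append(x[0])
    return np.array(positions)

# Illustrative settings (assumptions, not values from the paper).
dt = 1.0 / 4000.0                      # assumed 4000 frames per second
F = oscillator_transition(f0=120.0, zeta=0.02, dt=dt)
H = np.array([[1.0, 0.0]])             # only position is observed
Q = np.diag([1e-8, 1e-2])              # small process noise
R = 0.05 ** 2                          # observation noise variance

# Simulate a "true" edge trajectory and noisy observations of it.
n = 60
truth = np.empty(n)
x = np.array([1.0, 0.0])
for k in range(n):
    x = F @ x
    truth[k] = x[0]
rng = np.random.default_rng(0)
obs = truth + 0.05 * rng.standard_normal(n)
obs[40:60] = np.nan                    # frames where the edge is occluded

est = kalman_track(obs, F, H, Q, R, x0=np.zeros(2), P0=np.diag([1.0, 1e6]))
gap_error = np.max(np.abs(est[40:60] - truth[40:60]))
print(f"max position error during occlusion: {gap_error:.3f}")
```

During the occluded frames the filter runs on prediction alone, so its accuracy there depends entirely on how well the transition model matches the real dynamics. Here the match is exact by construction, which is the optimistic case.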

    A high-speed laryngoscopic investigation of aryepiglottic trilling

    Six aryepiglottic trills with varied laryngeal parameters were recorded using high-speed laryngoscopy to investigate the nature of the oscillatory behavior of the upper margin of the epilaryngeal tube. Image analysis techniques were applied to extract data about the patterns of aryepiglottic fold oscillation, with a focus on the oscillatory frequencies of the folds. The acoustic impact of aryepiglottic trilling is also considered, along with possible interactions between the aryepiglottic vibration and vocal fold vibration during the voiced trill. Overall, aryepiglottic trilling is deemed to be correctly labeled as a trill in phonetic terms, while also acting as a means to alter the quality of voicing to be auditorily harsh. In terms of its characterization, aryepiglottic vibration is considerably irregular, but it shows indications of contributing quasi-harmonic excitation of the vocal tract, particularly noticeable under conditions of glottal voicelessness. Aryepiglottic vibrations appear to be largely independent of glottal vibration in terms of oscillatory frequency but can be increased in frequency by increasing overall laryngeal constriction. There is evidence that aryepiglottic vibration induces an alternating vocal fold vibration pattern. It is concluded that aryepiglottic trilling, like ventricular phonation, should be regarded as a complex, if highly irregular, sound source.

    Svane, Niels


    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) was founded in 1999 out of a strongly felt need to share know-how, objectives, and results between fields that until then had seemed quite distinct, such as bioengineering, medicine, and singing. MAVEBA covers all aspects of the study of the human voice, with applications ranging from the neonate to the adult and the elderly. Over the years the initial topics have grown and spread into other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

    Harsh voice quality and its association with blackness in popular American media

    Performers use various laryngeal settings to create voices for characters and personas they portray. Although some research demonstrates the sociophonetic associations of laryngeal voice quality, few studies have documented or examined the role of harsh voice quality, particularly with vibration of the epilaryngeal structures (growling). This article qualitatively examines phonetic properties of vocal performances in a corpus of popular American media and evaluates the association of voice qualities in these performances with representations of social identity and stereotype. In several cases, contrasting laryngeal states create sociophonetic contrast, and harsh voice quality is paired with the portrayal of racial stereotypes of black people. These cases indicate exaggerated emotional states and are associated with yelling/shouting modes of expression. Overall, however, the functioning of harsh voice quality as it occurs in the data is broader and may involve aggressive posturing, comedic inversion of aggressiveness, vocal pathology, and vocal homage.

    Glottal opening and closing events investigated by electroglottography and super-high-speed video recordings

    Previous research has suggested that the peaks in the first derivative (dEGG) of the electroglottographic (EGG) signal are good approximate indicators of the events of glottal opening and closing. These findings were based on high-speed video (HSV) recordings with frame rates 10 times lower than the sampling frequencies of the corresponding EGG data. The present study attempts to corroborate these previous findings, utilizing super-HSV recordings. The HSV and EGG recordings (sampled at 27 and 44 kHz, respectively) of an excised canine larynx phonation were synchronized by an external TTL signal to within 0.037 ms. Data were analyzed by means of glottovibrograms, digital kymograms, the glottal area waveform and the vocal fold contact length (VFCL), a new parameter representing the time-varying degree of 'zippering' closure along the anterior-posterior (A-P) glottal axis. The temporal offsets between glottal events (depicted in the HSV recordings) and dEGG peaks in the opening and closing phase of glottal vibration ranged from 0.02 to 0.61 ms, amounting to 0.24-10.88% of the respective glottal cycle durations. All dEGG double peaks coincided with vibratory A-P phase differences. In two out of the three analyzed video sequences, peaks in the first derivative of the VFCL coincided with dEGG peaks, again co-occurring with A-P phase differences. The findings suggest that dEGG peaks do not always coincide with the events of glottal closure and initial opening. Vocal fold contacting and de-contacting do not occur at infinitesimally small instants of time, but extend over a certain interval, particularly under the influence of A-P phase differences.
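The relation between dEGG peaks and cycle timing described above can be illustrated with a minimal sketch. It assumes an EGG polarity in which increasing vocal fold contact is positive, so that the strongest positive dEGG peak in each cycle approximates closing and the strongest negative peak approximates opening. It also assumes a known, constant fundamental frequency. The function names and the synthetic sine signal are illustrative and are not taken from the study.

```python
import numpy as np

def degg_peak_times(egg, fs, f0):
    """Per-cycle times (s) of the strongest positive and negative dEGG peaks.

    egg: 1-D EGG signal, fs: sampling rate in Hz, f0: assumed constant
    fundamental frequency in Hz used to cut the signal into cycles.
    """
    degg = np.gradient(egg) * fs           # first derivative of the EGG
    n_cycle = int(round(fs / f0))          # samples per glottal cycle
    closing, opening = [], []
    for start in range(0, len(degg) - n_cycle + 1, n_cycle):
        seg = degg[start:start + n_cycle]
        closing.append((start + np.argmax(seg)) / fs)
        opening.append((start + np.argmin(seg)) / fs)
    return np.array(closing), np.array(opening)

def offset_percent(offset_ms, f0):
    """Express a temporal offset (ms) as a percentage of the cycle duration."""
    return 100.0 * offset_ms / (1000.0 / f0)

# Synthetic example: a 100 Hz sine standing in for an EGG signal at 44 kHz.
fs, f0 = 44000, 100
t = np.arange(5 * fs // f0) / fs           # exactly five cycles
egg = np.sin(2.0 * np.pi * f0 * t)
closing, opening = degg_peak_times(egg, fs, f0)

# Duration of one video frame at 27 kHz, in milliseconds.
frame_period_ms = 1000.0 / 27000.0
print(closing, opening, round(frame_period_ms, 3))
```

At 27 kHz one video frame lasts 1000/27000 ≈ 0.037 ms, which equals the synchronization tolerance stated in the abstract. Real EGG signals would need filtering and a cycle segmentation that follows the actual fundamental frequency, and double peaks would need separate handling.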

    Universal mechanisms of sound production and control in birds and mammals

    As animals vocalize, their vocal organ transforms motor commands into vocalizations for social communication. In birds, the physical mechanisms by which vocalizations are produced and controlled remain unresolved because of the extreme difficulty in obtaining in vivo measurements. Here, we introduce an ex vivo preparation of the avian vocal organ that allows simultaneous high-speed imaging, muscle stimulation and kinematic and acoustic analyses to reveal the mechanisms of vocal production in birds across a wide range of taxa. Remarkably, we show that all species tested employ the myoelastic-aerodynamic (MEAD) mechanism, the same mechanism used to produce human speech. Furthermore, we show substantial redundancy in the control of key vocal parameters ex vivo, suggesting that in vivo vocalizations may also not be specified by unique motor commands. We propose that such motor redundancy can aid vocal learning and is common to MEAD sound production across birds and mammals, including humans.

    Compensatory Vocal Folds for Source Voice Generation: Computational Modeling of Vocal Folds Function

    This doctoral thesis deals with computational modelling of the function of the human vocal folds and vocal tract using the finite element method (FEM). The voice plays a key role in human communication, so one of the important goals of current medicine is to create artificial vocal folds that could be implanted in patients whose original vocal folds had to be removed. Computational modelling can be used to understand the principles of voice production, to determine the parameters that artificial vocal folds must meet, and to verify their functionality. The first part of the thesis deals with computational modelling of whispered voice production. Using an FEM model of the vocal tract and trachea, the influence of the size of the glottal gap on the distribution of natural frequencies was examined for individual vowels. The thesis then presents a two-dimensional (2D) finite element model of flow-induced self-oscillation of the human vocal folds interacting with the acoustic spaces of the vocal tract. The 2D vocal tract model was built from magnetic resonance images (MRI). The fluid-structure interaction is solved with an explicit coupling scheme that uses separate solvers for the structure and the flow. The computational model includes large deformations of the vocal fold tissue, contact between the vocal folds, fluid-structure interaction, morphing of the air mesh according to the vocal fold motion (Arbitrary Lagrangian-Eulerian method), unsteady viscous compressible or incompressible flow described by the Navier-Stokes equations, and interruption of the airflow during glottal closure. The model is used to study the effects of changes in the stiffness and damping of the individual vocal fold tissue layers (in particular the superficial lamina propria). The analysis also compares vocal fold behaviour for the compressible and incompressible flow models. Videokymograms (VKG) are then generated from the FEM results, which allow the motion to be compared between the individual model variants and with real human vocal folds. The next part of the thesis presents a three-dimensional (3D) finite element model of flow-induced self-oscillation of the human vocal folds, created by extruding the 2D model into the third dimension. With this model, the influence of the compressible and incompressible flow models on vocal fold motion and on the generated sound was again compared using videokymograms and acoustic spectra. The last part of the thesis deals with one option for replacing the missing natural source voice, a reed-based element. Its behaviour was studied on a computational model and on an experimental physical model; the physical model allows the position of the reed relative to the reed stop to be changed and acoustic and optical measurements to be performed.