1,623 research outputs found

    Analysis of a Modern Voice Morphing Approach using Gaussian Mixture Models for Laryngectomees

    This paper proposes a voice morphing system for people who have undergone laryngectomy, the surgical removal of all or part of the larynx (the voice box), performed particularly in cases of laryngeal cancer. A basic method of voice morphing extracts the source speaker's vocal coefficients and converts them into the target speaker's vocal parameters. In this paper, we deploy Gaussian Mixture Models (GMM) to map the coefficients from source to destination. However, the conventional GMM-based mapping approach over-smooths the converted voice. We therefore propose a GMM-based method for efficient voice morphing and conversion that overcomes the over-smoothing of the conventional method. It uses glottal waveform separation and prediction of excitations; the results show that over-smoothing is eliminated and that the transformed vocal tract parameters match the target. Moreover, the synthesized speech is found to be of sufficiently high quality. The proposed GMM approach is critically evaluated using various subjective and objective evaluation parameters. Finally, the paper recommends an application of voice morphing for laryngectomees that deploys this approach. Comment: 6 pages, 4 figures, 4 tables; International Journal of Computer Applications Volume 49, Number 21, July 201
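    As a rough illustration of the conventional GMM-based mapping that the abstract contrasts itself with, the sketch below computes the conditional mean of the target features given the source features under a joint GMM. This is the standard joint-density formulation, assumed here for illustration; it is not the paper's proposed method, and the function names and hand-set parameters are hypothetical. Because the output is a responsibility-weighted average of per-component linear regressions, it tends to be smooth, which may be one way to picture the over-smoothing problem.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal distribution evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

def gmm_convert(x, weights, means, covs, dx):
    """Conditional mean E[y | x] under a joint GMM over z = [x, y].

    weights: mixture weights, one per component
    means:   joint mean vectors, each of length dx + dy
    covs:    joint covariance matrices, each (dx + dy) x (dx + dy)
    dx:      dimension of the source feature vector x
    """
    x = np.asarray(x, dtype=float)
    # Posterior probability of each component given the source vector only.
    resp = np.array([
        w * gaussian_pdf(x, m[:dx], c[:dx, :dx])
        for w, m, c in zip(weights, means, covs)
    ])
    resp /= resp.sum()
    y = np.zeros(means[0].shape[0] - dx)
    for r, m, c in zip(resp, means, covs):
        # Per-component linear regression from x to y, weighted by posterior.
        y += r * (m[dx:] + c[dx:, :dx] @ np.linalg.solve(c[:dx, :dx], x - m[:dx]))
    return y

if __name__ == "__main__":
    # Hypothetical two-component model with one source and one target feature.
    weights = [0.5, 0.5]
    means = [np.array([-1.0, -1.0]), np.array([1.0, 1.0])]
    covs = [np.eye(2), np.eye(2)]
    print(gmm_convert([0.0], weights, means, covs, dx=1))
```

    In practice the mixture parameters would be fitted to time-aligned source and target feature pairs; here they are set by hand only to keep the sketch self-contained.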

    Modeling Vocal Fold Motion with a New Hydrodynamic Semi-Continuum Model

    Vocal fold (VF) motion is a fundamental process in voice production, and is also a challenging problem for direct numerical computation because the VF dynamics depend on the nonlinear coupling of air flow with the response of elastic channels (the VF), which open and close and induce internal flow separation. A traditional modeling approach uses a steady flow approximation or Bernoulli's law, which is known to be invalid during VF opening. We present a new hydrodynamic semi-continuum system for VF motion. The airflow is modeled by a quasi-one-dimensional continuum aerodynamic system, and the VF by a classical lumped two-mass system. The reduced flow system contains Bernoulli's law as a special case, and is derivable from the two-dimensional compressible Navier-Stokes equations. Since we do not make a steady flow approximation, we are able to capture transients and rapid changes of solutions, e.g. the double pressure peaks at the opening and closing stages of VF motion, consistent with experimental data. We demonstrate numerically that our system is robust and models in-vivo VF oscillation more physically. It is also much simpler than a full two-dimensional Navier-Stokes system. Comment: 27 pages, 6 figure

    High Fidelity Computational Modeling and Analysis of Voice Production

    This research aims to improve the fundamental understanding of the multiphysics nature of voice production, particularly the dynamic couplings among glottal flow, vocal fold vibration and airway acoustics, through high-fidelity computational modeling and simulations. Built upon in-house numerical solvers, including an immersed-boundary-method based incompressible flow solver, a finite element method based solid mechanics solver and a hydrodynamic/aerodynamic splitting method based acoustics solver, a fully coupled, continuum mechanics based fluid-structure-acoustics interaction model was developed to simulate flow-induced vocal fold vibrations and sound production in birds and mammals. The model was validated extensively against excised syringeal and laryngeal experiments. The results showed that, driven by realistic representations of physiology and experimental conditions, including the geometries, material properties and boundary conditions, the model agreed excellently with the experiments on the vocal fold vibration patterns, acoustics and intraglottal flow dynamics, demonstrating that the model is able to reproduce realistic phonatory dynamics during voice production. The model was then used to investigate the effect of vocal fold inner structures on voice production. Assuming the human vocal fold to be a three-layer structure, this research focused on the effect of longitudinal variation of layer thickness as well as the cover-body thickness ratio on vocal fold vibrations. The results showed that the longitudinal variation of the cover and ligament layer thicknesses had little effect on the flow rate or the vocal fold vibration amplitude and pattern, but affected the glottal angle in different coronal planes, which also influenced the energy transfer between the glottal flow and the vocal fold. The cover-body thickness ratio had a complex nonlinear effect on vocal fold vibration and voice production. Increasing the cover-body thickness ratio promoted the excitation of the wave-type modes of the vocal fold, which were also higher-eigenfrequency modes, driving the vibrations to higher frequencies. This created complex nonlinear bifurcations. The results of the research have important clinical implications for voice disorder diagnosis and treatment, as voice disorders are often associated with changes in the mechanical status of the vocal fold tissues and their treatment often focuses on restoring the mechanical status of the vocal folds.

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The MAVEBA Workshop proceedings, held on a biannual basis, collect the scientific papers presented as both oral and poster contributions during the conference. The main subjects are: development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and classification of vocal pathologies. The Workshop has the sponsorship of: Ente Cassa Risparmio di Firenze, COST Action 2103, Biomedical Signal Processing and Control Journal (Elsevier Eds.), IEEE Biomedical Engineering Soc. Special Issues of International Journals have been, and will be, published, collecting selected papers from the conference.

    Analysis of a soft bio-Inspired active actuation model for the design of artificial vocal folds

    Phonation results from the passively induced oscillation of the vocal folds in the larynx, creating sound waves that are then articulated by the mouth and nose. Patients undergoing laryngectomy have their vocal folds removed and thus must rely on alternative sources of achieving the desired vibration of artificial vocal folds. Existing solutions, such as voice prostheses and the Electrolarynx, are limited, for instance, in the voice quality they can produce. In this paper, we present a mathematical analysis of a physical model of an active vocal fold prosthesis. The inverse dynamical equation of the system will help to understand whether specific types of soft actuators can produce the force required to generate natural phonations. Hence, this is referred to as the active actuation model. We present the analysis to replicate the vowels /a/, /e/, /i/, and /u/ and the voice qualities of vocal fry, modal, falsetto, breathy, pressed, and whispery. These characteristics would be required as a first step in designing an artificial vocal fold system. Inverse dynamics is used to identify the forces required to change the glottis area and frequencies to achieve sufficient oscillation of artificial vocal folds. Two types of ionic polymer-metal composite (IPMC) actuators are assessed for their ability to produce these forces and for the corresponding activation voltages required. The results of our proposed analysis will enable research into the effects of natural phonation and, further, provide the foundational work for the creation of advanced larynx prostheses.

    A Cervid Vocal Fold Model Suggests Greater Glottal Efficiency in Calling at High Frequencies

    Male Rocky Mountain elk (Cervus elaphus nelsoni) produce loud and high fundamental frequency bugles during the mating season, in contrast to the male European Red Deer (Cervus elaphus scoticus), which produces loud and low fundamental frequency roaring calls. A critical step in understanding vocal communication is to relate sound complexity to anatomy and physiology in a causal manner. Experimentation at the sound source, often difficult in vivo in mammals, is simulated here by a finite element model of the larynx and a wave propagation model of the vocal tract, both based on the morphology and biomechanics of the elk. The model can produce a wide range of fundamental frequencies. Low fundamental frequencies require low vocal fold strain, but large lung pressure and large glottal flow if sound intensity level is to exceed 70 dB at 10 m distance. A high-frequency bugle requires both large muscular effort (to strain the vocal ligament) and high lung pressure (to overcome phonation threshold pressure), but at least 10 dB more intensity level can be achieved. Glottal efficiency, the ratio of radiated sound power to aerodynamic power at the glottis, is higher in elk, suggesting an advantage of high-pitched signaling. This advantage is based on two aspects: first, the lower airflow required for aerodynamic power and, second, an acoustic radiation advantage at higher frequencies. Both signal types are used by the respective males during the mating season and probably serve as honest signals. The two signal types relate differently to physical qualities of the sender. The low-frequency sound (Red Deer call) relates to overall body size via a strong relationship between acoustic parameters and the size of vocal organs and body size. The high-frequency bugle may signal muscular strength and endurance, via a ‘vocalizing at the edge’ mechanism, for which efficiency is critical.
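    The glottal efficiency defined in this abstract can be written out as a short calculation. The sketch below assumes that aerodynamic power at the glottis is approximated as lung pressure times mean glottal flow; that approximation, the function names and the SI units are assumptions made for illustration, not details taken from the paper.

```python
import math

def glottal_efficiency(radiated_power_w, lung_pressure_pa, mean_flow_m3_s):
    """Ratio of radiated sound power to aerodynamic power at the glottis.

    Aerodynamic power is approximated here as lung pressure times mean
    glottal flow (an assumption of this sketch).
    """
    aerodynamic_power_w = lung_pressure_pa * mean_flow_m3_s
    if aerodynamic_power_w <= 0:
        raise ValueError("aerodynamic power must be positive")
    return radiated_power_w / aerodynamic_power_w

def efficiency_gain_db(efficiency_a, efficiency_b):
    """Efficiency of a relative to b, expressed in decibels (power ratio)."""
    return 10.0 * math.log10(efficiency_a / efficiency_b)
```

    Under this approximation, a call that radiates the same sound power with less airflow at the same lung pressure has a proportionally higher efficiency, which is the first of the two aspects the abstract names.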