Analysis of a Modern Voice Morphing Approach using Gaussian Mixture Models for Laryngectomees
This paper proposes a voice morphing system for people who have undergone
laryngectomy, the surgical removal of all or part of the larynx (the voice
box), performed particularly in cases of laryngeal cancer. A basic method of
voice morphing extracts the source speaker's vocal coefficients and converts
them into the target speaker's vocal parameters. In this paper, we deploy
Gaussian Mixture Models (GMM) for mapping the coefficients from source to
destination. However, the conventional GMM-based mapping approach over-smooths
the converted voice. We therefore propose a GMM-based method for efficient
voice morphing and conversion that overcomes the over-smoothing of the
traditional method. It uses glottal waveform separation and prediction of
excitations; the results show that over-smoothing is eliminated and that the
transformed vocal tract parameters match the target. Moreover, the synthesized
speech thus obtained is found to be of a sufficiently high quality. The
proposed GMM approach is critically evaluated using various subjective and
objective evaluation parameters. Further, the paper recommends an application
of voice morphing for laryngectomees that deploys this approach.
Comment: 6 pages, 4 figures, 4 tables; International Journal of Computer
Applications Volume 49, Number 21, July 201
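As a rough illustration of the conventional GMM-based mapping that the abstract describes as over-smoothing, the sketch below applies the commonly used joint-density conversion rule to a single scalar feature. It is not the authors' proposed method. The function names, the dictionary keys, and the toy parameter values are assumptions made for illustration; practical systems work on multi-dimensional spectral vectors with parameters fitted by EM.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_convert(x, components):
    """Map a scalar source feature x to a target feature.

    Each component is a dict (hypothetical layout) with keys:
      weight  - mixture weight
      mu_x    - source mean
      mu_y    - target mean
      var_x   - source variance
      cov_yx  - cross-covariance between target and source
    """
    # Posterior probability of each mixture component given x.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)
    posteriors = [value / total for value in likelihoods]

    # Posterior-weighted sum of per-component linear regressions.
    return sum(
        p * (c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"]))
        for p, c in zip(posteriors, components)
    )


# Toy parameters, chosen only to exercise the function.
toy_components = [
    {"weight": 0.5, "mu_x": -1.0, "mu_y": -2.0, "var_x": 1.0, "cov_yx": 0.0},
    {"weight": 0.5, "mu_x": 1.0, "mu_y": 2.0, "var_x": 1.0, "cov_yx": 0.0},
]
converted = gmm_convert(0.0, toy_components)
```

Because the output is a posterior-weighted average of per-component regressions, fine detail in the target features tends to be averaged away, which is one common explanation for the over-smoothing the abstract mentions.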
Modeling Vocal Fold Motion with a New Hydrodynamic Semi-Continuum Model
Vocal fold (VF) motion is a fundamental process in voice production, and is
also a challenging problem for direct numerical computation because the VF
dynamics depend on nonlinear coupling of air flow with the response of elastic
channels (VF), which undergo opening and closing, and induce internal flow
separation. A traditional modeling approach makes use of a steady flow
approximation or Bernoulli's law, which is known to be invalid during VF
opening. We present a new hydrodynamic semi-continuum system for VF motion. The
airflow is modeled by a quasi-one-dimensional continuum aerodynamic system, and
the VF by a classical lumped two-mass system. The reduced flow system contains
Bernoulli's law as a special case, and is derivable from the two-dimensional
compressible Navier-Stokes equations. Since we do not make a steady flow
approximation, we are able to capture transients and rapid changes of
solutions, e.g. the double pressure peaks at the opening and closing stages of
VF motion, consistent with experimental data. We demonstrate numerically that
our system is robust, and models in-vivo VF oscillation more physically. It is
also much simpler than a full two-dimensional Navier-Stokes system.
Comment: 27 pages, 6 figures
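For orientation, the steady Bernoulli relation that the abstract calls the traditional approach can be sketched as below. This is only the special case the paper says it generalizes, not the semi-continuum system itself. The function name, the default air density, and the assumption of negligible upstream velocity are illustrative choices, not taken from the paper.

```python
def bernoulli_pressure(p_sub, flow, area, rho=1.2):
    """Steady Bernoulli pressure (Pa) at a channel section.

    p_sub : subglottal (upstream) pressure in Pa, where the flow velocity
            is assumed to be negligible
    flow  : volume flow rate in m^3/s, assumed constant along the channel
    area  : cross-sectional area of the section in m^2
    rho   : air density in kg/m^3 (assumed value)
    """
    if area <= 0.0:
        raise ValueError("area must be positive; a closed channel is not handled")
    velocity = flow / area  # incompressible continuity: U = A * v
    return p_sub - 0.5 * rho * velocity ** 2


# Hypothetical values: 800 Pa upstream, 0.2 L/s flow, 10 mm^2 opening.
p_glottis = bernoulli_pressure(800.0, 2e-4, 1e-5)
```

The relation has no time derivative, so it cannot represent the transients at opening and closing that the abstract highlights.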
High Fidelity Computational Modeling and Analysis of Voice Production
This research aims to improve the fundamental understanding of the multiphysics nature of voice production, particularly the dynamic couplings among glottal flow, vocal fold vibration and airway acoustics, through high-fidelity computational modeling and simulations. Built upon in-house numerical solvers, including an immersed-boundary-method based incompressible flow solver, a finite element method based solid mechanics solver and a hydrodynamic/aerodynamic splitting method based acoustics solver, a fully coupled, continuum mechanics based fluid-structure-acoustics interaction model was developed to simulate the flow-induced vocal fold vibrations and sound production in birds and mammals. The model was validated extensively against excised syringeal and laryngeal experiments. The results showed that, driven by realistic representations of physiology and experimental conditions, including the geometries, material properties and boundary conditions, the model was in excellent agreement with the experiments on the vocal fold vibration patterns, acoustics and intraglottal flow dynamics, demonstrating that the model is able to reproduce realistic phonatory dynamics during voice production. The model was then used to investigate the effect of vocal fold inner structures on voice production. Assuming the human vocal fold to be a three-layer structure, this research focused on the effect of longitudinal variation of layer thickness as well as the cover-body thickness ratio on vocal fold vibrations. The results showed that the longitudinal variation of the cover and ligament layer thicknesses had little effect on the flow rate and on the vocal fold vibration amplitude and pattern, but affected the glottal angle in different coronal planes, which also influenced the energy transfer between the glottal flow and the vocal fold. The cover-body thickness ratio had a complex nonlinear effect on the vocal fold vibration and voice production. Increasing the cover-body thickness ratio promoted the excitation of the wave-type modes of the vocal fold, which were also higher-eigenfrequency modes, driving the vibrations to higher frequencies. This created complex nonlinear bifurcations. The results of the research have important clinical implications for voice disorder diagnosis and treatment, as voice disorders are often associated with changes in the mechanical status of the vocal fold tissues and their treatment often focuses on restoring the mechanical status of the vocal folds.
Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy
The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as both oral and poster contributions during the conference. The main subjects are: development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and classification of vocal pathologies. The Workshop is sponsored by: Ente Cassa Risparmio di Firenze, COST Action 2103, Biomedical Signal Processing and Control Journal (Elsevier Eds.), IEEE Biomedical Engineering Soc. Special Issues of International Journals have been, and will be, published, collecting selected papers from the conference.
Analysis of a soft bio-Inspired active actuation model for the design of artificial vocal folds
Phonation results from the passively induced oscillation of the vocal folds in the larynx, creating sound waves that are then articulated by the mouth and nose. Patients undergoing laryngectomy have their vocal folds removed and thus must rely on alternative means of achieving the desired vibration of artificial vocal folds. Existing solutions, such as voice prostheses and the Electrolarynx, are limited in, for instance, the voice quality they produce. In this paper, we present a mathematical analysis of a physical model of an active vocal fold prosthesis. The inverse dynamical equation of the system will help to understand whether specific types of soft actuators can produce the force required to generate natural phonation. Hence, this is referred to as the active actuation model. We present the analysis to replicate the vowels /a/, /e/, /i/, and /u/ and the voice qualities of vocal fry, modal, falsetto, breathy, pressed, and whispery. Reproducing these characteristics is a required first step in designing an artificial vocal fold system. Inverse dynamics is used to identify the forces required to change the glottis area and frequencies to achieve sufficient oscillation of artificial vocal folds. Two types of ionic polymer-metal composite (IPMC) actuators are assessed for their ability to produce these forces and for the corresponding activation voltages required. The results of our proposed analysis will enable research into the effects of natural phonation and, further, provide the foundational work for the creation of advanced larynx prostheses.
A Cervid Vocal Fold Model Suggests Greater Glottal Efficiency in Calling at High Frequencies
Male Rocky Mountain elk (Cervus elaphus nelsoni) produce loud and high fundamental frequency bugles during the mating season, in contrast to the male European Red Deer (Cervus elaphus scoticus), which produces loud and low fundamental frequency roaring calls. A critical step in understanding vocal communication is to relate sound complexity to anatomy and physiology in a causal manner. Experimentation at the sound source, often difficult in vivo in mammals, is simulated here by a finite element model of the larynx and a wave propagation model of the vocal tract, both based on the morphology and biomechanics of the elk. The model can produce a wide range of fundamental frequencies. Low fundamental frequencies require low vocal fold strain, but large lung pressure and large glottal flow if sound intensity level is to exceed 70 dB at 10 m distance. A high-frequency bugle requires both large muscular effort (to strain the vocal ligament) and high lung pressure (to overcome phonation threshold pressure), but at least 10 dB more intensity level can be achieved. Glottal efficiency, the ratio of radiated sound power to aerodynamic power at the glottis, is higher in elk, suggesting an advantage of high-pitched signaling. This advantage is based on two aspects: first, the lower airflow required for aerodynamic power and, second, an acoustic radiation advantage at higher frequencies. Both signal types are used by the respective males during the mating season and probably serve as honest signals. The two signal types relate differently to physical qualities of the sender. The low-frequency sound (Red Deer call) relates to overall body size via a strong relationship between acoustic parameters and the size of vocal organs and body size. The high-frequency bugle may signal muscular strength and endurance, via a ‘vocalizing at the edge’ mechanism, for which efficiency is critical.
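The glottal efficiency defined in the abstract can be illustrated with a short calculation. The sketch below is not the authors' model. It assumes spherical free-field radiation to turn an intensity level at a distance into radiated power, and it approximates aerodynamic power as lung pressure times mean glottal flow. The function names and the pressure and flow values are hypothetical.

```python
import math

REFERENCE_INTENSITY = 1e-12  # W/m^2, standard reference for intensity level


def radiated_power(intensity_level_db, distance_m):
    """Radiated sound power (W), assuming uniform spherical radiation."""
    intensity = REFERENCE_INTENSITY * 10.0 ** (intensity_level_db / 10.0)
    return intensity * 4.0 * math.pi * distance_m ** 2


def glottal_efficiency(sound_power_w, lung_pressure_pa, glottal_flow_m3s):
    """Ratio of radiated sound power to aerodynamic power.

    Aerodynamic power is approximated here as pressure times mean flow.
    """
    aerodynamic_power = lung_pressure_pa * glottal_flow_m3s
    if aerodynamic_power <= 0.0:
        raise ValueError("aerodynamic power must be positive")
    return sound_power_w / aerodynamic_power


# 70 dB at 10 m, as in the abstract; pressure and flow are made-up values.
power = radiated_power(70.0, 10.0)
efficiency = glottal_efficiency(power, 2000.0, 1e-3)
```

Under these assumptions, a call that reaches the same intensity level with less airflow has a proportionally higher efficiency, which matches the first of the two aspects the abstract lists.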