2,782 research outputs found

    Diphthong Synthesis Using the Dynamic 3D Digital Waveguide Mesh

    Articulatory speech synthesis has the potential to offer more natural-sounding synthetic speech than established concatenative or parametric synthesis methods. Time-domain acoustic models are particularly suited to the dynamic nature of the speech signal, and recent work has demonstrated the potential of dynamic vocal tract models that accurately reproduce the vocal tract geometry. This paper presents a dynamic 3D digital waveguide mesh (DWM) vocal tract model, capable of movement to produce diphthongs. The technique is compared to existing dynamic 2D and static 3D DWM models, for both monophthongs and diphthongs. The results indicate that the proposed model provides improved formant accuracy over existing DWM vocal tract models. Furthermore, the computational requirements of the proposed method are significantly lower than those of comparable dynamic simulation techniques. This paper represents another step toward a fully functional articulatory vocal tract model, which will lead to more natural speech synthesis systems for use across society.

    High Fidelity Computational Modeling and Analysis of Voice Production

    This research aims to improve the fundamental understanding of the multiphysics nature of voice production, particularly the dynamic couplings among glottal flow, vocal fold vibration and airway acoustics, through high-fidelity computational modeling and simulations. Built upon in-house numerical solvers, including an immersed-boundary-method based incompressible flow solver, a finite element method based solid mechanics solver and a hydrodynamic/aerodynamic splitting method based acoustics solver, a fully coupled, continuum mechanics based fluid-structure-acoustics interaction model was developed to simulate the flow-induced vocal fold vibrations and sound production in birds and mammals. The model was extensively validated against excised syringeal and laryngeal experiments. The results showed that, driven by realistic representations of physiology and experimental conditions, including the geometries, material properties and boundary conditions, the model agreed very well with the experiments on the vocal fold vibration patterns, acoustics and intraglottal flow dynamics, demonstrating that the model is able to reproduce realistic phonatory dynamics during voice production. The model was then used to investigate the effect of vocal fold inner structures on voice production. Assuming the human vocal fold to be a three-layer structure, this research focused on the effect of longitudinal variation of layer thickness as well as the cover-body thickness ratio on vocal fold vibrations. The results showed that the longitudinal variation of the cover and ligament layer thicknesses had little effect on the flow rate or on the vocal fold vibration amplitude and pattern, but affected the glottal angle in different coronal planes, which also influenced the energy transfer between glottal flow and the vocal fold. The cover-body thickness ratio had a complex nonlinear effect on the vocal fold vibration and voice production. Increasing the cover-body thickness ratio promoted the excitation of the wave-type modes of the vocal fold, which were also higher-eigenfrequency modes, driving the vibrations to higher frequencies. This created complex nonlinear bifurcations. The results of the research have important clinical implications for voice disorder diagnosis and treatment, as voice disorders are often associated with changes in the mechanical status of the vocal fold tissues and their treatment often focuses on restoring the mechanical status of the vocal folds.

    Numerical Modeling of Vocal Control and Patient-specific Surgical Planning of Type 1 Thyroplasty

    This study aims to develop knowledge about the roles of intrinsic laryngeal muscles in voice control in both healthy and disordered conditions through comprehensive computational models. The phonation simulator was built by combining a three-dimensional high-fidelity MRI-based model of the larynx, active muscle mechanics, and a fluid-structure-acoustic interaction model. This enabled the exploration of the underlying mechanisms linking individual and/or group muscle contractions, under both symmetric and asymmetric activations, to vocal fold posture, vocal fold vibration, and voice outcomes during voice production. The first part of this research extensively investigated the effects of cricothyroid and thyroarytenoid muscle activations on voice characteristics through a parametric study. The role of these intrinsic muscles in adjusting the geometrical and mechanical properties of the vocal fold pre-phonatory posture, glottic flow aerodynamics, and acoustics, and how all these components interact, were explored. Results were comprehensively validated, and the link between elements of phonation was described in detail. In the next step, exploiting the model's ability to activate muscles individually, unilateral vocal fold paralysis was simulated, and the characteristics of the disordered voice were analyzed. The voice simulator was then combined with an implant insertion model and a genetic algorithm to build a computational framework for patient-specific surgical planning of type 1 thyroplasty. This surgery is a standard procedure for treating unilateral vocal fold paralysis; however, it is challenging, mainly because of the small size of the implant and the high sensitivity of the voice outcome to the implant shape and position. Therefore, although the patient's voice could be improved, the results might not be as satisfying as expected. Unlike actual surgery, which leaves very little room for trial and error, the presented computational framework allows the implant to be optimized toward the patient's desired voice. Both healthy and diseased cases and the corrected case using the optimized implant were simulated. Results revealed that the optimized implant could restore the aerodynamic and acoustic features of the disordered voice in producing a sustained vowel utterance. Furthermore, the performance of the implant was evaluated in a pitch gliding test, which was simulated using temporal activation of the cricothyroid and thyroarytenoid muscles based on the first part of the study. In the final step, a physics-informed neural network-based algorithm was presented to reconstruct the three-dimensional cyclic vibration of the vocal fold using two-dimensional sparse experimental data and the laws of physics. Key acoustic parameters and vibratory dynamics of the vocal folds, as well as other parameters that are difficult to measure experimentally, such as flow rate, pressure distribution, and contact force, were successfully predicted.

    Adaptive mesh simulations of compressible flows using stabilized formulations

    This thesis investigates numerical methods that approximate the solution of compressible flow equations. The first part of the thesis studies the Variational Multi-Scale (VMS) finite element approximation of several compressible flow equations. In particular, the one-dimensional Burgers equation in Fourier space and the compressible Navier-Stokes equations written in both conservative and primitive variables are considered. The approximations made for the VMS formulation are extensively researched: the design of the matrix of stabilization parameters, the definition of the space where the subscales live, the inclusion of the temporal derivatives of the subscales, and the non-linear tracking of the subscales are analyzed for each of the equations considered. Local artificial diffusion in the form of shock capturing techniques is also included. The accuracy of the formulations is studied for several regimes of compressible flow, from aeroacoustic flows at low Mach numbers to supersonic shocks. The second part of the thesis is devoted to making the solution of the smallest fluctuating scales of the compressible flow affordable. To this end, a novel algorithm for h-refinement of computational physics meshes in a distributed parallel setting is presented, together with the solution of some refinement test cases on supercomputers. An explicit a posteriori error estimator for adaptive mesh refinement simulations of compressible flows is also developed; the proposed methodology employs the variational subscales as a local error estimate that drives the mesh refinement. The numerical methods proposed in this thesis are capable of describing the high-frequency fluctuations of compressible flows, especially those corresponding to complex aeroacoustic applications. In particular, the direct simulation of the fricative [s] sound inside a realistic geometry of the human vocal tract is achieved at the end of the thesis.

    Numerical simulation of aeroacoustics using the variational multiscale method: application to the problem of human phonation

    The solution of the human phonation problem by means of computational mechanics is covered by several research branches, such as Computational Fluid Dynamics (CFD), biomechanics and acoustics, among others. In the present thesis, the problem is approached from the Computational Aeroacoustics (CAA) point of view, and the first main objective consists in developing numerical methods of general application that can take part in the solution of any scenario related to human phonation at a reasonable cost. In this sense, only the compressible Navier-Stokes equations can describe all flow and acoustic scales without any modeling, which is known as Direct Numerical Simulation (DNS), but their computational cost is usually unaffordable. Even in the case of a Large Eddy Simulation (LES), where the small scales are modeled, the cost can still be a handicap due to the complexity of the problem. This drawback gets worse in the low Mach regime due to the large disparity between flow velocity and sound speed, which leads to ill-conditioning of the system of equations, especially for conservative schemes. At this point, it makes sense to move towards the incompressible flow approximation, bearing in mind the low velocities expected in human phonation problems. Incompressible flows do not yield any acoustics, so a second problem describing the propagation of the sound sources needs to be modeled and solved. These are the so-called hybrid methods, which allow a better conditioning of the problem by segregating flow and acoustic scales. Lighthill's analogy has been taken as the starting point for the present work, but its restriction to free-field scenarios has motivated the extension of the method to arbitrary geometries and non-uniform flows. The first development in this direction consists in splitting Lighthill's analogy into a quadrupolar and a dipolar component, which does not change the original problem but allows assessing the contribution of solid boundaries to the generation of sound. The second step consists in the development of a stabilized Finite Element (FEM) formulation for the Acoustic Perturbation Equations (APE), which account for non-uniform flows and perform a complete filtering of the acoustic scales. The final step adopts the compressible approach but omits the energy equation, thus considering both flow and acoustic propagation as isentropic. In this case the solver is unified, and hence a method for applying compatible boundary conditions for flow and acoustics has been developed. Moreover, the whole numerical framework has been extended to dynamic phonation cases, which require using an Arbitrary Lagrangian Eulerian (ALE) reference. Also, a novel remeshing strategy with conservative interpolation between meshes is presented. In the last chapter a challenging case in human phonation has been chosen for testing the developed computational framework: the voiceless fricative phoneme /s/. Unlike vowels, which are voiced sounds defined by a few characteristic frequencies, fricatives cannot be simulated as the propagation of a known analytic solution (glottal pulse) because the sound sources correspond to a wide range of turbulent scales. Therefore, a CFD calculation is mandatory in order to capture all relevant eddies behind the generation of sound. This problem is solved with an LES using the Variational Multiscale (VMS) stabilization method as turbulence model, supplemented with several acoustic formulations when using incompressible flow. The analysis of the results focuses on the numerical representation of turbulence and the acoustic signal at the far field, which has been compared to experimental recordings. Finally, the role of the upper incisors in the generation of the fricative sound has been evaluated. All simulations have been run with the parallel multiphysics FEM code FEMUSS, based on FORTRAN object-oriented programming and the OpenMPI parallel library.

    Computational Investigations of the Fluid-Structure Interaction During Phonation: The Role of Vocal Fold Elasticity and Glottal Flow Unsteadiness

    Human voice production arises from the biomechanical interaction between vocal fold vibrations and airflow dynamics. Changes in vocal fold stiffness can lead to changes in vocal fold vibration patterns and further changes in voice outcomes. A good knowledge of the cause-and-effect relationship between vocal fold stiffness and voice production can not only deepen the understanding of voice production mechanisms but also benefit the treatment of voice disorders associated with vocal fold stiffness changes. This constitutes the first objective of this dissertation. The second objective is to further examine the range of validity of the quasi-steady assumption of glottal flow during phonation. The assumption is of vital importance for phonation modeling since it allows the unsteady aspects of glottal flow to be eliminated, which greatly simplifies the flow modeling. A three-dimensional flow-structure interaction model of voice production is employed to investigate the effects of vocal fold stiffness parameters on voice production. The vocal fold is modeled as a cover-ligament-body structure with a transversely isotropic constitutive relation. Stiffness parameters in both the transverse plane and the longitudinal direction of each layer of the vocal fold are systematically varied. The results show that varying the stiffness parameters has clear monotonic effects on the fundamental frequency, glottal flow rate and glottal opening, but has non-monotonic effects on the glottal divergent angle, open quotient and closing velocity. Compared to the transverse stiffness parameters, the longitudinal stiffness parameters generally have more significant impacts on glottal flows and vocal fold vibrations. Additionally, the sensitivity analysis reveals that the stiffness parameters of the ligament layer have the largest effect on most output measures. Next, flow-structure interaction simulations are carried out to study the effect of fiber orientation in the conus elasticus on voice production. Two continuum vocal fold models with different fiber orientations in the conus elasticus are constructed. The more realistic fiber orientation (caudal-cranial) in the conus elasticus is found to yield smaller structural stiffness and larger deflection at the junction of the conus elasticus and ligament than the anterior-posterior fiber orientation, which facilitates vocal fold vibrations and eventually causes a larger peak flow rate and higher speed quotient. The generated voice is also found to have a lower fundamental frequency and smaller spectral slope. Finally, the validity of the quasi-steady assumption for glottal flow is systematically examined by considering the voice frequency range, the complexity of glottal shapes and air inertia in the vocal tract. The results show that at the normal speech frequency (~100 Hz), the dynamics of the quasi-steady flow closely resemble those of a dynamic flow, and the glottal flow and glottal pressure predicted by the quasi-steady approximation have very small errors. However, the assumption produces very large errors at high frequencies (~500 Hz). In addition, air inertia in the vocal tract can undermine the validity of the assumption via the nonlinear interaction with the unsteady glottal flow. The role of glottal shapes in the validity of the assumption is found to be insignificant.

    The comparison of different acoustic approaches in the simulation of human phonation

    This contribution deals with mathematical modelling and numerical simulation of the human phonation process. This phenomenon is described as a coupled problem composed of three mutually coupled physical fields: the deformation of an elastic body, the fluid flow and the acoustics. For the sake of simplicity, only a two-dimensional model problem is considered in this paper. The fluid-structure interaction problem is described by the incompressible Navier-Stokes equations, by the linear elasticity theory and by the interface conditions. The arbitrary Lagrangian-Eulerian method is used to capture the motion of the fluid domain. A strongly coupled partitioned scheme is used to solve the coupled fluid-structure problem. Acoustic analogies are used to solve the acoustics. Two analogies are compared: the Lighthill analogy and the convected perturbation wave equation. The influence of the acoustic field back on the fluid and on the structure is neglected. The numerical approximation of all three physical domains is performed with the aid of the finite element method. The numerical results present sound propagation through the model of the vocal tract.