Glottal opening and closing events investigated by electroglottography and super-high-speed video recordings
Previous research has suggested that the peaks in the first derivative (dEGG) of the electroglottographic (EGG) signal are good approximate indicators of the events of glottal opening and closing. These findings were based on high-speed video (HSV) recordings with frame rates 10 times lower than the sampling frequencies of the corresponding EGG data. The present study attempts to corroborate these previous findings, utilizing super-HSV recordings. The HSV and EGG recordings (sampled at 27 and 44 kHz, respectively) of an excised canine larynx phonation were synchronized by an external TTL signal to within 0.037 ms. Data were analyzed by means of glottovibrograms, digital kymograms, the glottal area waveform and the vocal fold contact length (VFCL), a new parameter representing the time-varying degree of 'zippering' closure along the anterior-posterior (A-P) glottal axis. The temporal offsets between glottal events (depicted in the HSV recordings) and dEGG peaks in the opening and closing phase of glottal vibration ranged from 0.02 to 0.61 ms, amounting to 0.24-10.88% of the respective glottal cycle durations. All dEGG double peaks coincided with vibratory A-P phase differences. In two out of the three analyzed video sequences, peaks in the first derivative of the VFCL coincided with dEGG peaks, again co-occurring with A-P phase differences. The findings suggest that dEGG peaks do not always coincide with the events of glottal closure and initial opening. Vocal fold contacting and de-contacting do not occur at infinitesimally small instants of time, but extend over a certain interval, particularly under the influence of A-P phase differences
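The quantities in the abstract above can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function names, the synthetic EGG samples, and the sign convention (increasing vocal fold contact gives a positive dEGG) are all assumptions made for illustration. It shows a finite-difference dEGG, the strongest positive and negative peaks, and how a temporal offset is expressed as a percentage of the glottal cycle. Note that one frame period at 27 kHz is about 0.037 ms, the same magnitude as the stated synchronization tolerance.

```python
def first_derivative(samples, fs_hz):
    """Finite-difference approximation of the derivative (signal units per second)."""
    return [(b - a) * fs_hz for a, b in zip(samples, samples[1:])]


def strongest_peaks(degg):
    """Indices of the largest positive and the largest negative dEGG sample.

    Assumed convention: positive peak ~ contacting, negative peak ~ de-contacting.
    """
    closing = max(range(len(degg)), key=lambda i: degg[i])
    opening = min(range(len(degg)), key=lambda i: degg[i])
    return closing, opening


def offset_percent_of_cycle(offset_ms, f0_hz):
    """Express a temporal offset as a percentage of one glottal cycle."""
    cycle_ms = 1000.0 / f0_hz
    return 100.0 * offset_ms / cycle_ms


# Frame period of a 27 kHz video recording, in milliseconds.
HSV_FRAME_PERIOD_MS = 1000.0 / 27000.0

# Hypothetical, idealized EGG samples for one cycle (not measured data).
egg = [0.0, 0.0, 1.0, 1.0, 1.0, 0.5, 0.0, 0.0]
degg = first_derivative(egg, fs_hz=44000.0)
closing_idx, opening_idx = strongest_peaks(degg)
```

A real signal would need filtering and per-cycle segmentation before peak picking; double peaks, which the study links to A-P phase differences, would also need explicit handling.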
The Flow and Pressure Relationships in Different Tubes Commonly Used for Semi-occluded Vocal Tract Exercises
This experimental study investigated the back pressure (pback) versus flow (U) relationship for 10 different tubes commonly used for semi-occluded vocal tract exercises (SOVTE), i.e., 8 straws of different lengths and diameters, a resonance tube and a silicone tube similar to a Lax Vox tube. All tubes were assessed with the free end in air. The resonance tube and silicone tube were further assessed with the free end under water at the depths from 1 to 7 cm in steps of 1 cm. The results showed that relative changes in the diameter of straws affect pback considerably more compared to the same amount of relative change in length. Additionally, once tubes are submerged into water, pback needs to overcome the pressure generated by the water depth before flow can start. Under this condition, only a small increase in pback was observed as the flow was increased. Therefore, the wider tubes submerged into water produced an almost constant pback determined by the water depth, while the thinner straws in air produced relatively large changes to pback as flow was changed. These differences may be taken advantage of when customizing exercises for different users and diagnoses and optimizing the therapy outcome
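The statement that back pressure must first overcome the pressure generated by the water depth can be sketched with the standard hydrostatic relation p = ρ g h. The density and gravity values below are textbook approximations assumed for illustration, not values reported by the study, and the function name is hypothetical.

```python
WATER_DENSITY_KG_M3 = 1000.0  # assumed approximate density of water
GRAVITY_M_S2 = 9.81           # assumed standard gravitational acceleration


def hydrostatic_pressure_pa(depth_cm):
    """Pressure (Pa) at the submerged tube end for a given water depth in cm."""
    return WATER_DENSITY_KG_M3 * GRAVITY_M_S2 * (depth_cm / 100.0)


# Threshold pressures for the depths examined in the study (1 to 7 cm, 1 cm steps).
thresholds = {depth: hydrostatic_pressure_pa(depth) for depth in range(1, 8)}
```

Under these assumptions each centimetre of depth adds roughly 98 Pa to the pressure that must be exceeded before flow can start.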
Measurement of Vocal Doses in Speech: Experimental Procedure and Signal Processing
An experimental method for quantifying the amount of voicing over time is described in a tutorial manner. A new procedure for obtaining calibrated sound pressure levels (SPL) of speech from a head-mounted microphone is offered. An algorithm for voicing detection (kc) and fundamental frequency (F0) extraction from an electroglottographic signal is described. The extracted values of SPL, F0, and kc are used to derive five vocal doses: the time dose (total voicing time), the cycle dose (total number of vocal fold oscillatory cycles), the distance dose (total distance travelled by the vocal folds in an oscillatory path), the energy dissipation dose (total amount of heat energy dissipated in the vocal folds) and the radiated energy dose (total acoustic energy radiated from the mouth). The doses measure the vocal load and can be used for studying the effects of vocal fold tissue exposure to vibration
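The two simplest doses described above follow directly from their definitions and can be sketched as a sum over analysis frames. This is a minimal illustration, not the published algorithm: the frame representation, the function name, and the sample values are assumptions. The distance, energy dissipation, and radiated energy doses are omitted because they depend on empirical rules and SPL calibration that the abstract does not spell out.

```python
def time_and_cycle_dose(frames, frame_s):
    """Accumulate time dose (s) and cycle dose (cycles) over analysis frames.

    frames  : iterable of (voiced, f0_hz) pairs, where voiced is 1 or 0
              (playing the role of the voicing indicator kc in the abstract)
    frame_s : duration of one frame in seconds
    """
    time_dose_s = 0.0
    cycle_dose = 0.0
    for voiced, f0_hz in frames:
        if voiced:
            time_dose_s += frame_s           # total voicing time
            cycle_dose += f0_hz * frame_s    # cycles completed in this frame
    return time_dose_s, cycle_dose


# Hypothetical frames: voiced at 100 Hz, unvoiced, voiced at 200 Hz.
example_frames = [(1, 100.0), (0, 0.0), (1, 200.0)]
time_dose_s, cycle_dose = time_and_cycle_dose(example_frames, frame_s=0.5)
```

With these made-up frames, one second of voicing is accumulated, comprising 150 oscillatory cycles.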
Vocal Dose Measures: Quantifying Accumulated Vibration Exposure in Vocal Fold Tissues
To measure the exposure to self-induced tissue vibration in speech, three vocal doses were defined and described: distance dose, which accumulates the distance that tissue particles of the vocal folds travel in an oscillatory trajectory; energy dissipation dose, which accumulates the total amount of heat dissipated over a unit volume of vocal fold tissues; and time dose, which accumulates the total phonation time. These doses were compared to a previously used vocal dose measure, the vocal loading index, which accumulates the number of vibration cycles of the vocal folds. Empirical rules for viscosity and vocal fold deformation were used to calculate all the doses from the fundamental frequency (F0) and sound pressure level (SPL) values of speech. Six participants were asked to read in normal, monotone, and exaggerated speech and the doses associated with these vocalizations were calculated. The results showed that large F0 and SPL variations in speech affected the dose measures, suggesting that accumulation of phonation time alone is insufficient. The vibration exposure of the vocal folds in normal speech was related to the industrial limits for hand-transmitted vibration, in which the safe distance dose was derived to be about 500 m. This limit was found rather low for vocalization; it was related to a comparable time dose of about 17 min of continuous vocalization, or about 35 min of continuous reading with normal breathing and unvoiced segments. The voicing pauses in normal speech and dialogue effectively prolong the safe time dose. The derived safety limits for vocalization will likely require refinement based on a more detailed knowledge of the differences in hand and vocal fold tissue morphology and their response to vibrational stress, and on the effect of recovery of the vocal fold tissue during voicing pauses
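The limits quoted in the abstract above imply two simple ratios, worked through here as a rough sketch. It assumes the distance dose accumulates at a constant average rate during voicing, which is a simplification introduced for illustration; the constant and function names are hypothetical.

```python
SAFE_DISTANCE_DOSE_M = 500.0     # from the abstract
CONTINUOUS_VOICING_MIN = 17.0    # from the abstract
CONTINUOUS_READING_MIN = 35.0    # from the abstract

# Average distance accumulated per second of voicing (constant-rate assumption).
mean_distance_rate_m_per_s = SAFE_DISTANCE_DOSE_M / (CONTINUOUS_VOICING_MIN * 60.0)

# Fraction of reading time that is voiced, implied by the two time figures.
voiced_fraction_in_reading = CONTINUOUS_VOICING_MIN / CONTINUOUS_READING_MIN


def minutes_to_limit(voiced_fraction):
    """Speaking time (min) needed to reach the limit for a given voiced fraction."""
    return CONTINUOUS_VOICING_MIN / voiced_fraction
```

Under this assumption the vocal folds travel roughly 0.5 m per second of voicing, and a little under half of continuous reading time is voiced. A lower voiced fraction, as in dialogue, lengthens the time needed to reach the limit.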
Observational study of differences in head position for high notes in famous classical and non-classical male singers
Introduction. Differences in classical and non-classical singing are due primarily to aesthetic style requirements. The head position can affect the sound quality. This study aimed at comparing the head position for famous classical and non-classical male singers performing high notes. Method. Images of 39 Western classical and 34 non-classical male singers during live performances were obtained from YouTube. Ten raters evaluated the frontal rotational head position (depression versus elevation) and transverse head position (retraction versus protraction) visually using a visual analogue scale. Results. The results showed a significant difference for frontal rotational head position. Discussion and conclusion. Most non-classical singers in the sample elevated their heads for high notes while the classical singers were observed to keep it around the neutral position. This difference may be attributed to different singing techniques and phonatory system adjustments utilized by each group
Finite-element modeling of vocal fold self-oscillations in interaction with vocal tract: Comparison of incompressible and compressible flow model
Finite-element modeling of self-sustained vocal fold oscillations during voice production has mostly considered the air as incompressible, due to numerical complexity. This study overcomes this limitation and studies the influence of air compressibility on phonatory pressures, flow and vocal fold vibratory characteristics. A two-dimensional finite-element model is used, which incorporates layered vocal fold structure, vocal fold collisions, large deformations of the vocal fold tissue, morphing the fluid mesh according to the vocal fold motion by the arbitrary Lagrangian-Eulerian approach and vocal tract model of Czech vowel [i:] based on data from magnetic resonance images. Unsteady viscous compressible or incompressible airflow is described by the Navier-Stokes equations. An explicit coupling scheme with separated solvers for structure and fluid domain was used for modeling the fluid-structure-acoustic interaction. Results of the simulations show clear differences in the glottal flow and vocal fold vibration waveforms between the incompressible and compressible fluid flow. These results provide the evidence on the existence of the coupling between the vocal tract acoustics and the glottal flow (Level 1 interactions), as well as between the vocal tract acoustics and the vocal fold vibrations (Level 2 interactions)
Finite element modelling of vocal tract changes after voice therapy
Two 3D finite element (FE) models were constructed, based on CT measurements of a subject phonating on [a:]
before and after phonation into a tube. Acoustic analysis was performed by exciting the models with acoustic
flow velocity at the vocal folds. The generated acoustic pressure of the response was computed in front of the
mouth and inside the vocal tract for both FE models. Average amplitudes of the pressure oscillations inside the
vocal tract and in front of the mouth were compared to display the cost-efficiency of sound energy transfer at
different formant frequencies. The formants F1–F3 correspond to classical vibration modes also solvable by a 1D
vocal tract model. However, for higher formants, more complicated transversal modes occur, which require
3D modelling. Special attention is given to the higher frequency range (above 3.5 kHz) where transversal modes
exist between piriform sinuses and valleculae. Comparison of the pressure oscillation inside and outside the vocal
tract showed that formants differ in their efficiency, F4 (at about 3.5 kHz, i.e. at the speaker’s or singer’s formant
region) being the most effective. The higher formants created a clear formant cluster around 4 kHz after the vocal
exercise with the tube. Since the human ear is most sensitive to frequencies between 2 and 4 kHz, concentration of
sound energy in this frequency region (F4–F5) is effective for communication. The results suggest that exercising
using phonation into tubes helps improve vocal economy
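
The remark that F1–F3 are also solvable by a 1D model can be illustrated with the simplest such model: a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1) c / (4 L). This idealization is not the 1D model used in the study, and the tube length and speed of sound below are assumed typical values, not data from the paper.

```python
SPEED_OF_SOUND_M_S = 350.0   # assumed value for warm, humid air
TRACT_LENGTH_M = 0.17        # assumed typical adult vocal tract length


def uniform_tube_formants(length_m, count, c_m_s=SPEED_OF_SOUND_M_S):
    """Resonance frequencies (Hz) of a uniform tube closed at one end.

    Quarter-wavelength resonator: F_n = (2n - 1) * c / (4 * L), n = 1..count.
    """
    return [(2 * n - 1) * c_m_s / (4.0 * length_m) for n in range(1, count + 1)]


formants_hz = uniform_tube_formants(TRACT_LENGTH_M, count=3)
```

With these assumed values the first three resonances fall near 515, 1544 and 2574 Hz. A plane-wave model of this kind cannot represent the transversal modes above about 3.5 kHz, which is why the abstract turns to 3D modelling for the higher formants.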