216 research outputs found
Aspects of voice irregularity measurement in connected speech
Applications of connected speech material to the objective assessment of two primary physical aspects of voice quality are described and discussed. Simple auditory perceptual criteria are employed to guide the choice of analysis parameters for the physical correlate of pitch, and their utility is investigated by measuring the characteristics of particular examples of the normal-speaking voice. This approach is extended to the measurement of vocal fold contact phase control in connected speech, and both techniques are applied to pathological voice data.
Closing and opening phase variability in dysphonia
Four examples of the use of vocal fold contact phase measurement are discussed for unilateral paresis. In each case this aspect of voice quality is of greater importance than the physical measurement of loudness and pitch related parameters. For three of the cases electro-stimulation has been used as a main part of the treatment. Phonation in both connected speech and, for comparison, in sustained sound production has been used, with electro-laryngograph / EGG signals providing the basis for measurement. The main new descriptors that have been found to be useful relate to vocal fold closure and closure duration regularities and distributions, but reference is also made to related measures of peak acoustic amplitude. The new measures described give, in some cases, quite striking results that are of auditory significance and potentially of clinical value.
Contributions of temporal encodings of voicing, voicelessness, fundamental frequency, and amplitude variation to audiovisual and auditory speech perception
Auditory and audio-visual speech perception was investigated using auditory signals of invariant spectral envelope that temporally encoded the presence of voiced and voiceless excitation, variations in amplitude envelope and F0. In experiment 1, the contribution of the timing of voicing was compared in consonant identification to the additional effects of variations in F0 and the amplitude of voiced speech. In audio-visual conditions only, amplitude variation slightly increased accuracy globally and for manner features. F0 variation slightly increased overall accuracy and manner perception in auditory and audio-visual conditions. Experiment 2 examined consonant information derived from the presence and amplitude variation of voiceless speech in addition to that from voicing, F0, and voiced speech amplitude. Binary indication of voiceless excitation improved accuracy overall and for voicing and manner. The amplitude variation of voiceless speech produced only a small increment in place of articulation scores. A final experiment examined audio-visual sentence perception using encodings of voiceless excitation and amplitude variation added to a signal representing voicing and F0. There was a contribution of amplitude variation to sentence perception, but not of voiceless excitation. The timing of voiced and voiceless excitation appears to be the major temporal cue to consonant identity. © 1999 Acoustical Society of America. [S0001-4966(99)01410-1]
Gaussian Process Regression models for the properties of micro-tearing modes in spherical tokamak
Spherical tokamaks (STs) have many desirable features that make them an attractive choice for a future fusion power plant. Power plant viability is intrinsically related to plasma heat and particle confinement, and this is often determined by the level of micro-instability driven turbulence. Accurate calculation of the properties of turbulent micro-instabilities is therefore critical for tokamak design; however, the evaluation of these properties is computationally expensive. The considerable number of geometric and thermodynamic parameters and the high resolutions required to accurately resolve these instabilities make repeated use of direct numerical simulations in integrated modelling workflows extremely computationally challenging and create the need for fast, accurate, reduced-order models. This paper outlines the development of a data-driven reduced-order model, often termed a surrogate model, for the properties of micro-tearing modes (MTMs) across a spherical tokamak reactor-relevant parameter space utilising Gaussian Process Regression (GPR) and classification, techniques from machine learning. These two components are used in an active learning loop to maximise the efficiency of data acquisition, thus minimising computational cost. The high-fidelity gyrokinetic code GS2 is used to calculate the linear properties of the MTMs: the mode growth rate, frequency and normalised electron heat flux, core components of a quasi-linear transport model. Five-fold cross-validation and direct validation on unseen data are used to ascertain the performance of the resulting surrogate models.
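The five-fold cross-validation of a GPR surrogate mentioned in this abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the one-dimensional target `sin(2x)` merely stands in for an expensive simulation output, the squared-exponential kernel and its fixed hyperparameters are assumptions, and the classification and active-learning components are omitted.

```python
import math
import random

def rbf(a, b, length=0.5, var=1.0):
    """Squared-exponential kernel (hyperparameters are assumed, not fitted)."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def gp_predict(x_train, y_train, x_test, noise=1e-4):
    """GP posterior mean: k(x*, X) (K + noise * I)^-1 y, with zero prior mean."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    alpha = solve(K, y_train)
    return [sum(rbf(xs, xt) * a for xt, a in zip(x_train, alpha))
            for xs in x_test]

def five_fold_rmse(xs, ys, k=5, seed=0):
    """Return the held-out RMSE of the GP mean for each of k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        pred = gp_predict([xs[i] for i in train],
                          [ys[i] for i in train],
                          [xs[i] for i in held_out])
        mse = sum((p - ys[i]) ** 2 for p, i in zip(pred, held_out)) / len(held_out)
        scores.append(math.sqrt(mse))
    return scores

# Toy data: 30 points on [0, 3]; sin(2x) is a placeholder for simulation output.
xs = [i / 29 * 3.0 for i in range(30)]
ys = [math.sin(2.0 * x) for x in xs]
scores = five_fold_rmse(xs, ys)
print(scores)
```

Each entry of `scores` is the error on a fold the model never saw during fitting, which is the sense in which cross-validation estimates surrogate accuracy on unseen inputs.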
Accuracy and variability of acoustic measures of voicing onset
Five commonly used methods for determining the onset of voicing of syllable-initial stop consonants were compared. The speech and glottal activity of 16 native speakers of Cantonese with normal voice quality were investigated during the production of consonant-vowel (CV) syllables in Cantonese. Syllables consisted of the initial consonants /ph/, /th/, /kh/, /p/, /t/, and /k/ followed by the vowel /a/. All syllables had a high level tone and were real words in Cantonese. Measurements of voicing onset were made based on the onset of periodicity in the acoustic waveform, and on spectrographic measures of the onset of a voicing bar (f0), the onset of the first formant (F1), second formant (F2), and third formant (F3). These measurements were then compared against the onset of glottal opening as determined by electroglottography. Both accuracy and variability of each measure were calculated. Results suggest that the presence of aspiration in a syllable decreased the accuracy and increased the variability of spectrogram-based measurements, but did not strongly affect measurements made from the acoustic waveform. Overall, the acoustic waveform provided the most accurate estimate of voicing onset; measurements made from the amplitude waveform were also the least variable of the five measures. These results can be explained as a consequence of differences in spectral tilt of the voicing source in breathy versus modal phonation. © 2003 Acoustical Society of America.