Glottal-synchronous speech processing
Glottal-synchronous speech processing is a field of speech science that exploits the pseudoperiodicity of voiced speech. Traditionally, speech processing segments the signal into short frames of predefined length and processes each frame in turn; this may fail to exploit the inherent periodic structure of voiced speech, which glottal-synchronous frames have the potential to harness. Glottal-synchronous frames are often derived from the glottal closure instants (GCIs) and glottal opening instants (GOIs).
The SIGMA algorithm was developed to detect GCIs and GOIs from the electroglottograph (EGG) signal, with a measured accuracy of up to 99.59%. For GCI and GOI detection from speech signals, the YAGA algorithm provides a measured accuracy of up to 99.84%. Multichannel speech-based approaches are shown to be more robust to reverberation than single-channel algorithms.
The GCIs are applied in real-world applications including speech dereverberation, where SNR is improved by up to 5 dB, and prosodic manipulation, where the importance of voicing detection in glottal-synchronous algorithms is demonstrated by subjective testing. The GCIs are further exploited in a new area of data-driven speech modelling, providing new insights into speech production and a set of tools to aid deployment into real-world applications. The technique is shown to be applicable in areas of speech coding, identification and artificial bandwidth extension of telephone speech.
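As a rough illustration of what a glottal-synchronous frame is, the sketch below cuts two-period frames that each run from the GCI before a given closure instant to the GCI after it, and tapers them with a Hann window. This is a minimal sketch under stated assumptions, not the method of the work summarised above: the function name `glottal_synchronous_frames`, the two-period frame length and the Hann taper are illustrative choices, and the GCIs are assumed to be already available as sample indices (for example, from a detector such as SIGMA or YAGA).

```python
import numpy as np


def glottal_synchronous_frames(signal, gcis):
    """Cut two-period frames, one per interior GCI (hypothetical helper).

    Each frame spans from the previous GCI to the next GCI, so it is
    roughly centred on a closure instant, and is tapered with a Hann window.
    """
    signal = np.asarray(signal, dtype=float)
    gcis = np.asarray(gcis, dtype=int)
    frames = []
    # Pair every GCI with the one two positions later: (g[i-1], g[i+1]).
    for prev_gci, next_gci in zip(gcis[:-2], gcis[2:]):
        segment = signal[prev_gci:next_gci]
        frames.append(segment * np.hanning(len(segment)))
    return frames


# Synthetic example: a 100 Hz tone sampled at 8 kHz, with assumed GCIs
# placed exactly one period (80 samples) apart.
fs = 8000
gcis = np.arange(0, 800, 80)
signal = np.sin(2 * np.pi * 100 * np.arange(800) / fs)
frames = glottal_synchronous_frames(signal, gcis)
print(len(frames), len(frames[0]))
```

Unlike fixed-length framing, the frame length here follows the local pitch period, so every frame holds a whole number of glottal cycles.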
A new method to explore the spectral impact of the piriform fossae on the singing voice: Benchmarking using MRI-based 3D-printed vocal tracts
The piriform fossae are the two pear-shaped cavities lateral to the laryngeal vestibule at the lower end of the vocal tract. They act acoustically as side branches to the main tract, resulting in a spectral zero in the output of the human voice. This study investigates their spectral role by comparing numerical and experimental results for MRI-based 3D-printed vocal tracts, for which a new experimental method (based on room acoustics) is introduced. The findings support results in the literature: the piriform fossae create a spectral trough in the region 4–5 kHz and act as formant repellents. Moreover, this study extends those results by demonstrating numerically and perceptually the impact of having large piriform fossae on the sung output.
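The side-branch behaviour can be made concrete with the textbook quarter-wavelength approximation, in which a closed side branch of depth L places a zero near f = c / (4L). The sketch below is only an idealised estimate and not the numerical or experimental method of the study: it assumes a uniform, rigid, closed tube with no end correction and a speed of sound of 350 m/s in warm, humid air, and the function names are illustrative.

```python
def side_branch_zero_hz(depth_m, speed_of_sound=350.0):
    """Quarter-wavelength estimate of the zero frequency of a closed side branch.

    Assumes an idealised uniform tube with no end correction (illustrative only).
    """
    return speed_of_sound / (4.0 * depth_m)


def depth_for_zero_m(freq_hz, speed_of_sound=350.0):
    """Inverse relation: branch depth that would place the zero at freq_hz."""
    return speed_of_sound / (4.0 * freq_hz)


# Depths that would correspond to the 4-5 kHz trough under these assumptions.
for freq_hz in (4000.0, 5000.0):
    depth_cm = 100.0 * depth_for_zero_m(freq_hz)
    print(f"{freq_hz:.0f} Hz -> {depth_cm:.2f} cm")
```

Under these assumptions, a trough at 4–5 kHz corresponds to branch depths of roughly 1.75 to 2.2 cm; real cavities are neither uniform nor rigid, so this is an order-of-magnitude check only.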