This paper reports the optimization of a wavelet-based algorithm for improving
speech intelligibility, along with the full data set and results. The
discrete-time speech signal is split into frequency sub-bands via a multi-level
discrete wavelet transform. Various gains are applied to the sub-band signals
before they are recombined to form a modified version of the speech. The
sub-band gains are adjusted while keeping the overall signal energy unchanged.
The resulting speech intelligibility under various background-interference and
simulated hearing-loss conditions is evaluated objectively and quantitatively
using Google Speech-to-Text transcription. A universal set of sub-band gains
works across a range of noise-to-signal ratios up to 4.8 dB.
For noise-free speech, reallocating the spectral energy toward the
mid-frequency sub-bands improves overall intelligibility: the Google
transcription accuracy increases by 16.9 percentage points on average and by
up to 86.7 percentage points. For speech already corrupted by noise, improving
intelligibility is challenging but still achievable, with transcription
accuracy increasing by 9.5 percentage points on average and by up to 71.4
percentage points. The proposed algorithm is
implementable for real-time speech processing and simpler than
previous algorithms. Potential applications include speech enhancement, hearing
aids, machine listening, and a better understanding of speech intelligibility.

Comment: 16 pages, 7 figures, 4 tables
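
The processing chain described in the abstract (multi-level wavelet split, per-sub-band gains, recombination at constant overall energy) can be illustrated with a minimal sketch. This is not the authors' implementation: the Haar wavelet, the function names, and the gain values are assumptions chosen only to keep the example self-contained, and the abstract does not state which wavelet or gains the paper uses.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(signal, levels):
    """Multi-level Haar DWT (assumed wavelet; not specified in the abstract).

    Returns [approx_L, detail_L, ..., detail_1], coarsest band first.
    The signal length must be divisible by 2**levels.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return [approx] + details[::-1]

def haar_idwt(bands):
    """Inverse of haar_dwt: rebuild the signal from its sub-bands."""
    approx = list(bands[0])
    for detail in bands[1:]:
        rebuilt = []
        for s, d in zip(approx, detail):
            rebuilt.append((s + d) / SQRT2)
            rebuilt.append((s - d) / SQRT2)
        approx = rebuilt
    return approx

def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)

def reweight_subbands(signal, gains):
    """Scale each sub-band by its gain, recombine, and restore the input energy.

    `gains` has one entry per sub-band, ordered like the output of haar_dwt.
    """
    levels = len(gains) - 1
    bands = haar_dwt(signal, levels)
    scaled = [[g * c for c in band] for g, band in zip(gains, bands)]
    modified = haar_idwt(scaled)
    e_out = energy(modified)
    if e_out == 0.0:
        return modified
    # Global rescale so the overall signal energy is unchanged.
    norm = math.sqrt(energy(signal) / e_out)
    return [norm * v for v in modified]

# Toy input and hypothetical gains that favour the middle sub-bands.
signal = [math.sin(0.3 * n) + 0.5 * math.sin(2.1 * n) for n in range(16)]
modified = reweight_subbands(signal, [0.5, 1.5, 1.5, 0.5])
```

Because of the final rescaling, only the ratios between the gains matter in this sketch; multiplying all gains by a common factor leaves the output unchanged.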