2 research outputs found

Multi-stream Acoustic Modelling using Raw Real and Imaginary Parts of the Fourier Transform

Authors: Peter Bell, Zoran Cvetkovic, Erfan Loweimi, Steve Renals, Zhengjun Yue
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 26/01/2023

Statistical Normalisation of Phase-based Feature Representation For Robust Speech Recognition

Authors: J. Barker, T. Hain, E. Loweimi
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/03/2017

In earlier work we proposed a source-filter decomposition of speech through phase-based processing. The decomposition leads to novel speech features that are extracted from the filter component of the phase spectrum. This paper analyses this spectrum and the proposed representation by evaluating statistical properties at various points along the parametrisation pipeline. We show that the speech phase spectrum has a bell-shaped distribution, in contrast to the uniform distribution that is usually assumed. It is demonstrated that the uniform density (which implies that the corresponding sequence is least informative) is an artefact of phase wrapping and not an original characteristic of this spectrum. In addition, we extend the idea of statistical normalisation, usually applied to magnitude-based features, into the phase domain. Based on the statistical structure of the phase-based features, which is shown to be super-Gaussian in the clean condition, three normalisation schemes, namely Gaussianisation, Laplacianisation and table-based histogram equalisation, have been applied to improve robustness. Speech recognition experiments using Aurora-2 show that applying an optimal normalisation scheme at the right stage of the feature extraction process can produce average relative WER reductions of up to 18.6% across the 0-20 dB SNR conditions.
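
To make the idea of Gaussianisation concrete, here is a minimal sketch of one common way to do it: each sample is mapped through the empirical CDF of the data and then through the inverse CDF of a standard normal. This is not taken from the paper and may differ from the scheme the authors used. The function name `gaussianise`, the rank-based CDF estimate, and the Laplace-distributed toy data (standing in for a super-Gaussian feature) are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def gaussianise(x):
    """Map a 1-D feature sequence to an approximately standard normal one.

    Hypothetical helper: uses a rank-based empirical CDF followed by the
    inverse standard-normal CDF. The mapping is monotonic, so the ordering
    of the samples is preserved.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # Rank of each sample in 0..n-1 (ties broken by order of appearance).
    ranks = np.argsort(np.argsort(x, kind="stable"), kind="stable")
    # Empirical CDF values, kept strictly inside (0, 1).
    cdf = (ranks + 0.5) / n
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(p) for p in cdf])


# Toy data (assumption): Laplace samples as a stand-in for a
# super-Gaussian feature trajectory.
rng = np.random.default_rng(0)
feature = rng.laplace(size=2000)
normalised = gaussianise(feature)
```

Swapping the inverse normal CDF for the inverse CDF of a Laplace distribution would give the analogous Laplacianisation mapping under the same assumptions.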

Crossref

White Rose Research Online