Deconvolution of the speech excitation (source) and vocal tract
(filter) components through log-magnitude spectral processing
is well established and has led to the well-known cepstral features
used in a multitude of speech processing tasks. This paper
presents a novel source-filter decomposition based on processing
in the phase domain. We show that separation between
source and filter in the log-magnitude spectra is far from
perfect, leading to loss of vital vocal tract information. It is
demonstrated that the same task can be better performed by
trend and fluctuation analysis of the phase spectrum of the
minimum-phase component of speech, which can be computed
via the Hilbert transform. Trend and fluctuation can be separated
through low-pass filtering of the phase, exploiting the additivity of the
vocal tract and source components in the phase domain. This results in separated
signals which have a clear relation to the vocal tract and
excitation components. The effectiveness of the method is put
to the test in a speech recognition task. The vocal tract component
extracted in this way is used as the basis of a feature extraction
algorithm for speech recognition on the Aurora-2 database.
The recognition results show up to 8.5% absolute improvement
in comparison with MFCC features on average (0-20 dB SNR).
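
As a rough illustration of the decomposition described above, the following is a minimal sketch, not the implementation evaluated in this paper. It assumes the phase of the minimum-phase component is obtained from the log-magnitude spectrum through the cepstral form of the Hilbert transform relation, and that the trend is extracted with a simple moving-average low-pass filter along the frequency axis. The function names, the FFT size, and the filter width are hypothetical choices made for the example.

```python
import numpy as np


def minimum_phase_phase(frame, n_fft=512, eps=1e-10):
    """Phase spectrum of the minimum-phase component of `frame`.

    Uses the Hilbert transform relation between log-magnitude and phase
    of a minimum-phase signal, evaluated through the real cepstrum.
    """
    log_mag = np.log(np.abs(np.fft.fft(frame, n_fft)) + eps)
    cepstrum = np.fft.ifft(log_mag).real  # real (even) cepstrum
    # Odd extension: +1 on positive quefrencies, -1 on negative ones,
    # 0 at quefrency 0 and at the n_fft/2 point.
    sign = np.zeros(n_fft)
    sign[1:n_fft // 2] = 1.0
    sign[n_fft // 2 + 1:] = -1.0
    # The FFT of a real odd sequence is purely imaginary; its imaginary
    # part is the minimum-phase phase spectrum.
    phase = np.fft.fft(cepstrum * sign).imag
    return phase[: n_fft // 2 + 1]  # keep bins 0 .. n_fft/2


def split_trend_fluctuation(phase, width=31):
    """Split the phase into a slow trend and a fast fluctuation.

    Assumption: a moving average stands in for the low-pass filter.
    Edge bins are biased because np.convolve zero-pads the input.
    """
    kernel = np.ones(width) / width
    trend = np.convolve(phase, kernel, mode="same")
    fluctuation = phase - trend  # additivity: phase = trend + fluctuation
    return trend, fluctuation


# Example on a synthetic frame (random noise standing in for speech).
rng = np.random.default_rng(0)
frame = rng.standard_normal(400) * np.hamming(400)
phase = minimum_phase_phase(frame)
trend, fluctuation = split_trend_fluctuation(phase)
```

In this sketch the trend would play the role of the vocal tract related component and the fluctuation that of the excitation related component. The filter type and width are illustrative and would need tuning on real speech.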