Pitch estimation plays a central role in many speech processing applications. In voiced speech, pitch can be objectively defined as the rate of vibration of the vocal folds. As a perceptual attribute, however, pitch is inherently subjective and cannot be measured directly from the speech signal; it is a nonlinear function of the signal’s spectral and temporal energy distribution. A number of methods for pitch estimation have been developed, but none can claim to work accurately in the presence of high levels of additive noise or reverberation. Any system of practical importance must be robust to both, as they are encountered frequently in the operating environments of voice telecommunications systems. In non-intrusive speech quality measurement algorithms, such as ITU-T P.563 and LCQA, pitch is used as a feature for quality assessment. The accuracy of this feature in noisy speech signals is shown to correlate with the accuracy of the objective measure of speech quality. In this paper we evaluate the performance of four established, state-of-the-art pitch estimation algorithms in additive noise and reverberation. Furthermore, we show how the accuracy of pitch estimation influences objective speech quality measurement algorithms.