Statistical voice activity detection using low-variance spectrum estimation and an adaptive threshold

Abstract

Traditionally, voice activity detection algorithms are based on some combination of general speech properties such as temporal energy variations, periodicity, and spectrum. This paper describes a novel statistical method for voice activity detection using a signal-to-noise ratio measure. The method employs a low-variance spectrum estimate and determines an optimal threshold based on the estimated noise statistics. A possible implementation is presented, evaluated over a large test set, and compared with current standardized algorithms. The evaluations indicate promising results, with the proposed scheme performing comparably to or better than the standardized algorithms over the whole test set.
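
The general idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the spectrum estimator, the SNR measure, or the threshold rule, so everything below is an assumption made for illustration. The sketch lowers spectral variance by averaging windowed periodograms over sub-segments of each frame, estimates the noise spectrum from the first few frames (assumed to contain no speech), and sets the threshold to the mean plus k standard deviations of the SNR measure on those noise frames. The names averaged_psd and simple_vad, and the parameters frame_len, n_sub, n_noise_frames, and k, are hypothetical.

```python
import numpy as np


def averaged_psd(frame, n_sub=4):
    """Lower-variance spectrum estimate: average the windowed
    periodograms of n_sub non-overlapping sub-segments of the frame."""
    seg_len = len(frame) // n_sub
    segs = frame[: seg_len * n_sub].reshape(n_sub, seg_len)
    window = np.hanning(seg_len)
    return np.mean(np.abs(np.fft.rfft(segs * window, axis=1)) ** 2, axis=0)


def simple_vad(signal, frame_len=256, n_noise_frames=10, k=3.0):
    """Return one boolean per frame (True = activity detected).

    Assumption: the first n_noise_frames frames contain noise only.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    psds = np.array([averaged_psd(f) for f in frames])

    # Noise spectrum estimated from the leading noise-only frames.
    noise_psd = psds[:n_noise_frames].mean(axis=0) + 1e-12

    # SNR measure (assumed form): mean over frequency bins of the
    # ratio between the frame spectrum and the noise spectrum.
    snr = np.mean(psds / noise_psd, axis=1)

    # Threshold adapted to the statistics of the measure on noise.
    noise_snr = snr[:n_noise_frames]
    threshold = noise_snr.mean() + k * noise_snr.std()
    return snr > threshold
```

Averaging over more sub-segments reduces the variance of the spectrum estimate at the cost of frequency resolution, and a larger k lowers the false-alarm rate at the cost of missed detections. Both choices here are arbitrary defaults rather than values taken from the paper.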
