Speech endpoint detection remains a challenging problem, particularly for speech recognition in noisy environments. In this paper, we address the problem from the point of view of fractals and chaos. By studying recurrence time statistics for chaotic systems, we find that nonstationarity and transience in a time series are due to non-recurrence and a lack of fractal structure in the signal. A Poincaré recurrence metric is designed to detect the changes in stationarity that mark endpoints. We treat the short regions at the beginning and end of an utterance as transient. For nonstationary and transient time series, we expect the average number of Poincaré recurrence points in a small block to differ from block to block. For a stationary time series, by contrast, the average number of recurrence points stays nearly constant across blocks. The resulting recurrence point variability algorithm is shown to be well suited to detecting state transitions in a time series and to be robust to different types of noise, especially at low SNR.
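
The block-wise idea can be illustrated with a minimal sketch. It is not the metric defined in the paper: the delay embedding, the block length, the neighbourhood radius, the median baseline, and the function names `delay_embed`, `mean_recurrence_counts`, and `flag_transient_blocks` are all assumptions chosen for illustration. Each block is embedded in a low-dimensional delay space, the average number of points falling within a fixed radius of each point is taken as the block's recurrence count, and blocks whose count departs strongly from the typical value are marked as candidate transitions.

```python
import numpy as np


def delay_embed(x, dim=3, lag=1):
    """Time-delay embedding: row t is (x[t], x[t+lag], ..., x[t+(dim-1)*lag])."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * lag
    if n <= 0:
        raise ValueError("signal too short for this embedding")
    return np.stack([x[i * lag:i * lag + n] for i in range(dim)], axis=1)


def mean_recurrence_counts(x, block=200, dim=3, lag=1, radius=0.3):
    """Average number of recurrence points per block (illustrative definition).

    For every embedded point in a block, count the other points of the same
    block that lie within `radius` of it, then average over the block.
    """
    x = np.asarray(x, dtype=float)
    counts = []
    for start in range(0, len(x) - block + 1, block):
        pts = delay_embed(x[start:start + block], dim, lag)
        # Pairwise Euclidean distances between embedded points of this block.
        dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        neighbours = (dist < radius).sum(axis=1) - 1  # exclude the point itself
        counts.append(neighbours.mean())
    return np.array(counts)


def flag_transient_blocks(counts, rel_tol=0.5):
    """Indices of blocks whose count deviates from the median by > rel_tol.

    The median baseline and the tolerance are assumptions, not the paper's rule.
    """
    counts = np.asarray(counts, dtype=float)
    baseline = np.median(counts)
    return np.flatnonzero(np.abs(counts - baseline) > rel_tol * baseline)
```

In this sketch the radius must be chosen relative to the noise level, and the median baseline presumes that most blocks contain only background noise; both choices would need tuning on real speech.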