6 research outputs found

    Adaptive framing based similarity measurement between time warped speech signals using Kalman filter

    Similarity measurement between speech signals aims at calculating the degree of similarity using acoustic features, and it has received much interest because of the need to process large volumes of multimedia information. However, dynamic properties of speech signals, such as varying silence segments and the time warping factor, make it more challenging to measure the similarity between speech signals. This manuscript further extends our research on adaptive framing based similarity measurement between speech signals using a Kalman filter. Silence removal is enhanced by integrating multiple features for the detection of voiced and unvoiced speech segments. The adaptive frame size measurement is improved by using the acceleration/deceleration phenomenon of object linear motion. A dominant feature set is used to represent the speech signals, along with pre-calculated model parameters that are set by offline tuning of a Kalman filter. Performance is evaluated on additional datasets to assess the impact of the proposed model and the silence removal approach on time warped speech similarity measurement. Detailed statistical results indicate an overall accuracy improvement from 91% to 98%, which demonstrates the superiority of the extended approach over our previous research on time warped continuous speech similarity measurement.

    Time warped continuous speech signal matching using Kalman filter

    Dynamic speech properties, such as time warping, silence removal and background noise reduction, are the most challenging issues in continuous speech signal matching. Among these, time warped speech signal matching is of great interest and has been a tough challenge for researchers. The literature contains a variety of techniques to measure the similarity between speech utterances; however, these techniques have some limitations. This paper introduces an adaptive framing based continuous speech tracking and similarity measurement approach that uses a Kalman filter (KF) as a robust tracker. The use of a KF is novel for time warped speech signal matching and dynamic time warping. A dynamic state model is presented based on the equations of linear motion. In this model, a fixed-length frame of the input (test) speech signal is treated as a unidirectional moving object that slides along the template speech signal. The state model calculates the best matched position estimate in the template speech (sample number) for the corresponding test frame at the current time. Simultaneously, another position observation is produced by a feature based distance metric. The position estimated by the state model is fused with the observation using the KF, along with the noise variances, to calculate the best estimated frame position in the template speech for the current state. Finally, the noise variances and the template frame size for the next state are forecast according to the KF output. The experimental results demonstrate the robustness of the proposed technique in terms of time warped speech signal matching as well as computation cost.
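
The abstract above does not give the model equations, so the following is only a minimal sketch of the general idea it describes: a frame position that moves along the template under linear motion and is fused with a noisy position observation by a Kalman filter. It is a generic constant-velocity filter, not the authors' adaptive framing model. The function name `kalman_track`, its parameters (`dt`, `q`, `r`, `init_var`) and the simplified diagonal process noise are all assumptions made for illustration, and the observations stand in for whatever position the feature based distance metric would report.

```python
def kalman_track(observations, dt=1.0, q=1e-3, r=1.0,
                 p0=0.0, v0=0.0, init_var=1000.0):
    """Track a frame position with a constant-velocity Kalman filter.

    observations: position measurements (e.g. template sample numbers),
                  one per test frame. Hypothetical stand-in for the
                  distance-metric observation in the abstract.
    dt:       time step between frames.
    q, r:     assumed process and measurement noise variances.
    p0, v0:   initial position and velocity guesses.
    init_var: initial variance on both state components.
    Returns the list of fused position estimates.
    """
    p, v = p0, v0
    # State covariance, stored as four scalars of a 2x2 matrix.
    P00, P01, P10, P11 = init_var, 0.0, 0.0, init_var
    estimates = []

    for z in observations:
        # Predict with linear motion: p' = p + v*dt, v' = v.
        p_pred = p + v * dt
        v_pred = v
        # P' = F P F^T + Q, with F = [[1, dt], [0, 1]], Q = q * I.
        A00 = P00 + dt * (P10 + P01) + dt * dt * P11 + q
        A01 = P01 + dt * P11
        A10 = P10 + dt * P11
        A11 = P11 + q

        # Update: only the position is observed (H = [1, 0]).
        S = A00 + r            # innovation variance
        K0 = A00 / S           # gain for position
        K1 = A10 / S           # gain for velocity
        y = z - p_pred         # innovation

        p = p_pred + K0 * y
        v = v_pred + K1 * y

        # P = (I - K H) P'
        P00 = (1.0 - K0) * A00
        P01 = (1.0 - K0) * A01
        P10 = A10 - K1 * A00
        P11 = A11 - K1 * A01

        estimates.append(p)

    return estimates


if __name__ == "__main__":
    # Test frames that actually advance 100 samples per step, while the
    # filter starts from a wrong velocity guess of 80 samples per step.
    obs = [100.0 * k for k in range(1, 21)]
    for k, est in enumerate(kalman_track(obs, v0=80.0), start=1):
        print(k, round(est, 2))
```

In this sketch the velocity term plays the role of the warping rate: if the test utterance is spoken faster or slower than the template, the filter corrects its velocity estimate from the innovations. The forecasting of noise variances and frame size mentioned in the abstract is not modelled here.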