A TWO PHASE METHOD FOR GENERAL AUDIO SEGMENTATION

Authors: Jacqueline Whalley, Jessie Xin Zhang, Stephen Brooks
Publication date: 20/03/2010

This paper presents a model-free, training-free two-phase method for audio segmentation that separates monophonic heterogeneous audio files into acoustically homogeneous regions, each containing a single sound. In the first phase, a rough segmentation splits the input into audio clips using silence detection in the time domain. In the second phase, a self-similarity matrix is computed from selected frequency-domain audio features to measure the similarity between frames within each clip, and an edge detection method is then applied to the resulting similarity image to find regions that correspond to plausible sounds. The results of the two phases are combined to form the final boundaries for the input audio. The method is evaluated using established methods and a standard non-musical database, and it yields more accurate segmentation results than existing audio segmentation methods. We propose that this approach could be adapted as an efficient preprocessing stage in other audio processing systems, such as audio retrieval, classification, music analysis and summarization.

Index Terms: audio segmentation, similarity map, edge detection
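The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The frame length, the energy threshold, the naive DFT magnitude feature, the cosine similarity measure, and all function names are assumptions chosen for illustration. In particular, the abstract's edge detection step is replaced here by a much simpler check on the similarity of adjacent frames.

```python
import cmath
import math


def frame_signal(samples, frame_len):
    """Split samples into consecutive non-overlapping frames (tail dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def split_on_silence(samples, frame_len=8, energy_threshold=0.01):
    """Phase 1 (rough): return (start, end) sample ranges of non-silent clips.

    A frame is treated as silent when its mean squared amplitude is below
    energy_threshold (an assumed value, not one taken from the paper).
    """
    clips, start = [], None
    frames = frame_signal(samples, frame_len)
    for idx, frame in enumerate(frames):
        energy = sum(x * x for x in frame) / frame_len
        if energy >= energy_threshold and start is None:
            start = idx * frame_len              # a clip begins
        elif energy < energy_threshold and start is not None:
            clips.append((start, idx * frame_len))  # a clip ends at silence
            start = None
    if start is not None:
        clips.append((start, len(frames) * frame_len))
    return clips


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (assumed stand-in feature)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def self_similarity_matrix(clip, frame_len=8):
    """Phase 2 (fine): pairwise similarity between the frames of one clip."""
    features = [magnitude_spectrum(f) for f in frame_signal(clip, frame_len)]
    return [[cosine_similarity(a, b) for b in features] for a in features]


def boundaries_from_matrix(ssm, frame_len=8, max_similarity=0.5):
    """Simplified stand-in for edge detection on the similarity image:
    report a boundary (sample offset within the clip) wherever two
    adjacent frames are less similar than max_similarity."""
    return [i * frame_len for i in range(1, len(ssm))
            if ssm[i - 1][i] < max_similarity]
```

To combine the phases under these assumptions, run `split_on_silence` on the whole signal, build a `self_similarity_matrix` for each returned clip, and add each clip's start offset to the values returned by `boundaries_from_matrix`. A real system would use overlapping windowed frames and an FFT library rather than the quadratic-time DFT shown here.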

CiteSeerX