Audio-Visual VQ Shot Clustering for Video Programs

Abstract

Many post-production video documents, such as movies, sitcoms and cartoons, present well-structured story-lines organized in separate audio-visual scenes. Accurate grouping of shots into these logical video segments could lead to semantic indexing of scenes and events for interactive multimedia retrieval. In this paper we introduce a novel shot-based analysis approach which aims to cluster together shots with similar audio-visual content. We demonstrate that codebooks of audio and visual codewords (generated by a vector quantization process) are an effective way to represent clusters containing shots with similar long-term consistency of chromatic composition and audio. The output clusters, obtained by a simple single-link clustering algorithm, allow the further application of the well-known scene transition graph framework for scene change detection and shot-pattern investigation. Finally, merging the audio and visual results leads to a hierarchical description of the whole video document, useful for multimedia retrieval and summarization purposes.
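
As a rough illustration of the pipeline summarized above, the following minimal sketch builds a small codebook per shot with a toy vector quantizer and then groups shots by single-link clustering. It is not the paper's implementation: the function names (build_codebook, codebook_distance, single_link_clusters), the symmetric nearest-codeword distance, the fixed threshold and the 2-D toy features are all assumptions made for illustration only.

```python
from itertools import combinations
from math import dist


def build_codebook(features, k, iterations=10):
    """Toy vector quantizer (Lloyd / k-means) returning k codewords.

    Assumes len(features) >= k; a real system would use colour or audio features.
    """
    # Deterministic initialization: k evenly spaced feature vectors as seeds.
    step = max(1, len(features) // k)
    codewords = [tuple(features[i * step]) for i in range(k)]
    for _ in range(iterations):
        cells = [[] for _ in codewords]
        for f in features:
            nearest = min(range(k), key=lambda j: dist(f, codewords[j]))
            cells[nearest].append(f)
        # Move each codeword to the centroid of its cell (unchanged if the cell is empty).
        codewords = [
            tuple(sum(c) / len(cell) for c in zip(*cell)) if cell else cw
            for cell, cw in zip(cells, codewords)
        ]
    return codewords


def codebook_distance(a, b):
    """Hypothetical shot dissimilarity: symmetric mean nearest-codeword distance."""
    def one_way(src, dst):
        return sum(min(dist(s, d) for d in dst) for s in src) / len(src)
    return 0.5 * (one_way(a, b) + one_way(b, a))


def single_link_clusters(codebooks, threshold):
    """Single-link clustering cut at `threshold`, implemented with union-find.

    Two shots share a cluster if a chain of pairwise distances <= threshold links them.
    """
    parent = list(range(len(codebooks)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(codebooks)), 2):
        if codebook_distance(codebooks[i], codebooks[j]) <= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(codebooks)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy usage: shots 0 and 1 have similar features, shot 2 does not.
shots = [
    [(0, 0), (0, 1), (10, 10), (10, 11)],
    [(0, 0.5), (0.2, 1), (10, 10.5), (10, 11)],
    [(50, 50), (50, 51), (60, 60), (60, 61)],
]
codebooks = [build_codebook(s, k=2) for s in shots]
print(single_link_clusters(codebooks, threshold=2.0))  # [[0, 1], [2]]
```

In this sketch the same clustering routine could be run once on visual codebooks and once on audio codebooks, with the two groupings merged afterwards; how that merge is done is not specified here.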
