TV commercial classification by using multi-modal textual information

Authors: Jesse S. Jin, Lingyu Duan, Qi Tian, Yantao Zheng
Publication date: 01/01/2006

In this paper, we propose an approach for classifying TV commercial videos by the category of the advertised product or service (e.g. automobiles, healthcare products, etc.). Since automatic speech recognition (ASR) and optical character recognition (OCR) can deliver meaningful textual information related to products or services, TV commercial video classification is formulated as a text categorization problem. However, there are two challenges. Firstly, the background music of TV commercials causes ASR to yield erroneous and deficient transcripts. Secondly, even if ASR and OCR worked perfectly, the limited textual information in TV commercials would not suffice to train a generic, non-overfitting text categorizer. For the first issue, our approach resorts to external resources to expand the deficient ASR and OCR transcripts. The output transcripts of ASR and OCR are parsed to yield a few keywords, on which a Web search is executed to retrieve relevant and semantically informative articles from the World Wide Web (WWW). The retrieved articles are then used to construct textual feature vectors and perform text categorization on behalf of the commercials. For the second issue, a topic-wise document corpus is constructed from public corpora such as Reuters-21578 or from articles manually collected from the WWW for training the text categorizers. Experimental results show that the proposed approach alleviates the negative effects of weak ASR/OCR performance and yields a promising classification accuracy of 80.9%.
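
The pipeline described in the abstract (keywords from a noisy transcript, expansion with retrieved articles, feature vectors, categorization against a topic-wise corpus) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the Web search is replaced by a lookup in a small in-memory article pool, the text categorizer is a simple nearest-centroid rule over term-frequency vectors with cosine similarity, and all names and sample texts (`extract_keywords`, `retrieve_articles`, `ARTICLE_POOL`, `TRAINING_CORPUS`, and so on) are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "with", "is", "on"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def extract_keywords(transcript, k=3):
    """Pick the k most frequent tokens of a (possibly noisy) ASR/OCR transcript."""
    return [word for word, _ in Counter(tokenize(transcript)).most_common(k)]

def retrieve_articles(keywords, article_pool):
    """Stand-in for the Web search step: articles sharing at least one keyword."""
    wanted = set(keywords)
    return [a for a in article_pool if wanted & set(tokenize(a))]

def tf_vector(texts):
    """Bag-of-words term-frequency vector over a list of texts."""
    return Counter(t for text in texts for t in tokenize(text))

def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

def classify(transcript, article_pool, training_corpus):
    """Expand the transcript with retrieved articles, then pick the closest category."""
    keywords = extract_keywords(transcript)
    # Fall back to the raw transcript if nothing is retrieved.
    expanded = retrieve_articles(keywords, article_pool) or [transcript]
    query = tf_vector(expanded)
    centroids = {label: tf_vector(docs) for label, docs in training_corpus.items()}
    return max(centroids, key=lambda label: cosine(query, centroids[label]))

# Hypothetical topic-wise training corpus and article pool.
TRAINING_CORPUS = {
    "automobiles": [
        "The new sedan engine delivers better fuel economy",
        "Car dealers report strong truck and sedan sales",
    ],
    "healthcare": [
        "The vitamin supplement supports immune health",
        "Doctors recommend the pain relief tablet",
    ],
}
ARTICLE_POOL = [
    "Review: the hybrid sedan has a quiet engine and good fuel economy",
    "Study finds vitamin tablet improves health in adults",
]

if __name__ == "__main__":
    # "xqz" mimics a recognition error in the transcript.
    print(classify("drive sedan sedan fuel xqz", ARTICLE_POOL, TRAINING_CORPUS))
```

The point of the sketch is the expansion step: the categorizer never sees the short, noisy transcript directly, only the longer retrieved articles that stand in for it. A real system would substitute an actual search backend and a trained categorizer for the two stand-ins.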
