Image classification by visual bag-of-words refinement and reduction

Lu, Zhiwu; Wang, Liwei; Wen, Ji-Rong

Authors: Zhiwu Lu, Liwei Wang, Ji-Rong Wen
Publication date: 18 January 2015
Publisher: Elsevier BV

Abstract

This paper presents a new framework for visual bag-of-words (BOW) refinement and reduction to overcome the drawbacks of the visual BOW model, which has been widely used for image classification. Although very influential in the literature, the traditional visual BOW model has two distinct drawbacks. First, for efficiency, the visual vocabulary is commonly constructed by directly clustering the low-level visual feature vectors extracted from local keypoints, without considering the high-level semantics of images. That is, the visual BOW model still suffers from the semantic gap, which may lead to significant performance degradation in more challenging tasks (e.g. social image classification). Second, thousands of visual words are typically generated to obtain better performance on a relatively large image dataset. With such a large vocabulary, the subsequent image classification may take a considerable amount of time. To overcome the first drawback, we develop a graph-based method for visual BOW refinement that exploits the tags (easy to access, although noisy) of social images. More notably, for efficient image classification, we further reduce the refined visual BOW model to a much smaller size through semantic spectral clustering. Extensive experimental results show the promising performance of the proposed framework for visual BOW refinement and reduction.
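
For readers unfamiliar with the baseline the abstract criticises, the following is a minimal Python sketch of a traditional visual BOW pipeline: local descriptors are clustered into a vocabulary, and each image becomes a histogram over visual words. It is an illustration only, not the authors' method. The function names (kmeans, bow_histogram, reduce_histogram), the toy 2-D descriptors, and the hand-written word_to_group mapping are all assumptions made here; in the paper the grouping of words is obtained by semantic spectral clustering, and the graph-based refinement step is not shown at all.

```python
import math
import random


def nearest(p, centroids):
    """Index of the centroid closest to descriptor p (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; the k centroids play the role of visual words."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assign every descriptor to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        for j, g in enumerate(groups):
            if g:
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def bow_histogram(descriptors, centroids):
    """Normalised count of how often each visual word occurs in one image."""
    hist = [0] * len(centroids)
    for d in descriptors:
        hist[nearest(d, centroids)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * len(centroids)


def reduce_histogram(hist, word_to_group, n_groups):
    """Merge visual words into fewer bins using a given word -> group mapping."""
    reduced = [0.0] * n_groups
    for word, value in enumerate(hist):
        reduced[word_to_group[word]] += value
    return reduced


# Toy low-level descriptors (hypothetical; real ones would be e.g. 128-D keypoint features).
all_descriptors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
                   (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
vocabulary = kmeans(all_descriptors, k=2)

# One image represented by three of its local descriptors.
image_hist = bow_histogram([(0.05, 0.0), (0.0, 0.05), (5.0, 5.05)], vocabulary)

# Hypothetical mapping that merges both words into a single group.
reduced_hist = reduce_histogram(image_hist, word_to_group=[0, 0], n_groups=1)
```

The cost the abstract mentions comes from the length of image_hist: with thousands of visual words, every downstream classifier operates on vectors of that size, which is why a reduction step such as reduce_histogram (with a learned rather than hand-written mapping) makes classification cheaper.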

Available Versions

Institutional Repository of Peking University

oai:localhost:20.500.11897/438...

Last time updated on 20/04/2018