
    Ensembles of Novel Visual Keywords Descriptors for Image Categorization

    Object recognition systems need effective image descriptors to obtain good performance levels. Currently, the most widely used image descriptor is the SIFT descriptor, which computes histograms of gradient orientations around points in an image. A possible problem with this approach is that the number of features becomes very large when a dense grid is used, because the histograms are computed and combined for many different points. The currently dominant solution to this problem is to use a clustering method to create a visual codebook, which an appearance-based descriptor then exploits to create a histogram of the visual keywords present in an image. In this paper we introduce several novel bag of visual keywords methods and compare them with the currently dominant hard bag-of-features (HBOF) approach, which uses a hard assignment scheme to compute cluster frequencies. Furthermore, we combine all descriptors with a spatial pyramid and two ensemble classifiers. Experimental results on 10 and 101 classes of the Caltech-101 object database show that our novel methods significantly outperform the traditional HBOF approach and that our ensemble methods obtain state-of-the-art performance levels.
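The hard assignment scheme mentioned in this abstract can be illustrated with a minimal sketch. It assumes a codebook that has already been produced by some clustering step; the function names, the toy 2-D descriptors, and the choice to normalize the counts are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hard_bof_histogram(descriptors, codebook):
    """Assign each descriptor to its single nearest codeword (hard
    assignment) and return the normalized codeword frequencies."""
    if not descriptors:
        return [0.0] * len(codebook)
    counts = Counter(
        min(range(len(codebook)), key=lambda k: squared_distance(d, codebook[k]))
        for d in descriptors
    )
    total = len(descriptors)
    return [counts[k] / total for k in range(len(codebook))]


# Toy data (hypothetical): three codewords and four local descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (1.1, 0.8), (4.7, 5.3)]
histogram = hard_bof_histogram(descriptors, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

Each descriptor contributes to exactly one bin, which is what distinguishes hard assignment from schemes that spread a descriptor's weight over several codewords.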

    Image Classification Using SVM Classifiers Trained with the AdaBoost Method

    Master's thesis, Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2015. 유석인. This thesis presents an algorithm that categorizes images by the objects they contain. Images are encoded with the bag-of-features (BoF) model, which represents an image as an unordered collection of features extracted from local patches. To classify multiple object categories, the multi-class classifier is implemented with the one-versus-all method: one object classifier is built per object category, and each classifier decides whether an image belongs to its category. Each object classifier is developed with the AdaBoost method and is given by the weighted sum of 200 support vector machine (SVM) component classifiers. Among the object classifiers, the one with the highest output function value determines the category of the image. The classification efficiency of the presented algorithm is illustrated on images from the Caltech-101 dataset.
    Contents: Abstract; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Related Work (2.1 Image classification approaches, 2.2 Boosting methods, 2.3 Background, 2.3.1 Support vector machine); Chapter 3 Proposed Algorithm (3.1 SIFT feature extraction, 3.2 Codebook construction, 3.3 Bag-of-features representation, 3.4 Classifier design); Chapter 4 Experiments (4.1 Dataset, 4.2 Bag-of-features representation, 4.3 Classifiers, 4.4 Classification results); Chapter 5 Conclusion; Bibliography; Abstract in Korean.
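The one-versus-all decision rule described in this abstract (one weighted sum of component classifiers per category, highest score wins) can be sketched as follows. This is a toy illustration only: the component classifiers here are hand-written threshold rules standing in for trained SVMs, the weights are made up rather than learned by AdaBoost, and all names are hypothetical.

```python
def boosted_score(x, components):
    """Weighted sum of component classifier outputs, each output in {-1, +1}.
    `components` is a list of (weight, classifier) pairs."""
    return sum(alpha * h(x) for alpha, h in components)


def classify(x, classifiers):
    """One-versus-all decision: return the category whose boosted
    classifier gives the highest output value for x."""
    return max(classifiers, key=lambda label: boosted_score(x, classifiers[label]))


# Hypothetical stand-ins for trained SVM components (the thesis uses 200
# per category; two per category are enough to show the mechanics).
classifiers = {
    "cat": [
        (0.7, lambda x: 1 if x[0] > 0.5 else -1),
        (0.3, lambda x: 1 if x[1] < 0.2 else -1),
    ],
    "dog": [
        (0.6, lambda x: 1 if x[0] <= 0.5 else -1),
        (0.4, lambda x: 1 if x[1] >= 0.2 else -1),
    ],
}

print(classify((0.8, 0.1), classifiers))  # cat
print(classify((0.2, 0.9), classifiers))  # dog
```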

    Insights from Classifying Visual Concepts with Multiple Kernel Learning

    Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques make it possible to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unfortunately, so-called 1-norm MKL variants are often observed to be outperformed by an unweighted sum kernel. The contribution of this paper is twofold: We apply a recently developed non-sparse MKL variant to state-of-the-art concept recognition tasks within computer vision. We provide insights on the benefits and limits of non-sparse MKL and compare it against its direct competitors, the sum kernel SVM and the sparse MKL. We report empirical results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo Annotation challenge data sets. About to be submitted to PLoS ONE. Comment: 18 pages, 8 tables, 4 figures; format deviates from the PLoS ONE submission format requirements for aesthetic reasons.
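The difference between the unweighted sum kernel and a weighted linear combination of kernels, both mentioned in this abstract, can be shown with a small sketch. The weights below are fixed by hand purely for illustration; an MKL method would learn them, and that learning step is not shown. The function name and the toy 2x2 kernel matrices are assumptions.

```python
def combine_kernels(kernels, weights=None):
    """Linear combination of same-sized kernel (similarity) matrices.
    With weights=None every kernel gets weight 1, which gives the
    unweighted sum kernel."""
    if weights is None:
        weights = [1.0] * len(kernels)
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical kernel matrices computed from two different image features.
K_color = [[1.0, 0.5], [0.5, 1.0]]
K_shape = [[1.0, 0.0], [0.0, 1.0]]

sum_kernel = combine_kernels([K_color, K_shape])
weighted_kernel = combine_kernels([K_color, K_shape], weights=[0.25, 0.75])
print(sum_kernel)       # [[2.0, 0.5], [0.5, 2.0]]
print(weighted_kernel)  # [[1.0, 0.125], [0.125, 1.0]]
```

A sparse mixture would drive some weights to exactly zero, discarding those kernels, whereas a non-sparse mixture keeps every kernel with some nonzero weight.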

    Image classification by visual bag-of-words refinement and reduction

    This paper presents a new framework for visual bag-of-words (BOW) refinement and reduction to overcome the drawbacks of the visual BOW model, which has been widely used for image classification. Although very influential in the literature, the traditional visual BOW model has two distinct drawbacks. Firstly, for efficiency purposes, the visual vocabulary is commonly constructed by directly clustering the low-level visual feature vectors extracted from local keypoints, without considering the high-level semantics of images. That is, the visual BOW model still suffers from the semantic gap, and thus may lead to significant performance degradation in more challenging tasks (e.g. social image classification). Secondly, typically thousands of visual words are generated to obtain better performance on a relatively large image dataset. Due to such a large vocabulary size, the subsequent image classification may take a considerable amount of time. To overcome the first drawback, we develop a graph-based method for visual BOW refinement by exploiting the tags (easy to access although noisy) of social images. More notably, for efficient image classification, we further reduce the refined visual BOW model to a much smaller size through semantic spectral clustering. Extensive experimental results show the promising performance of the proposed framework for visual BOW refinement and reduction.
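The effect of vocabulary reduction on an image's representation can be sketched as merging histogram bins. This sketch assumes a word-to-cluster mapping is already available; it does not implement the paper's semantic spectral clustering, and the function name and toy numbers are hypothetical.

```python
def reduce_histogram(histogram, word_to_cluster, n_clusters):
    """Merge visual-word counts into a smaller histogram by summing the
    counts of all words assigned to the same cluster."""
    reduced = [0] * n_clusters
    for word, count in enumerate(histogram):
        reduced[word_to_cluster[word]] += count
    return reduced


# Toy example: five visual words grouped into three clusters.
bow_histogram = [3, 1, 0, 2, 4]
word_to_cluster = [0, 0, 1, 1, 2]  # assumed output of some clustering step
reduced = reduce_histogram(bow_histogram, word_to_cluster, 3)
print(reduced)  # [4, 2, 4]
```

A classifier trained on the reduced histograms works with far fewer dimensions, which is the source of the efficiency gain the abstract refers to.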