140 research outputs found
Learning Dictionary of Discriminative Part Detectors for Image Categorization and Cosegmentation
International audienceThis paper proposes a novel approach to learning mid-level image models for image categorization and cosegmentation. We represent each image class by a dictionary of discriminative part detectors that best discriminate that class from the background. We learn category-specific part detectors in a weakly supervised setting in which the training images are only labeled with category labels without part / object location labels. We use a latent SVM model regularized by l1,2 group sparsity to learn the discriminative part detectors. Starting from a large set of initial parts, the group sparsity regularizer forces the model to jointly select and optimize a set of discriminative part detectors in a max-margin framework. We propose a stochastic version of a proximal algorithm to solve the corresponding optimization problem. We apply the learned part detectors to image classification and cosegmentation, and quantitative experiments with standard benchmarks show that our approach matches or improves upon the state of the art
Learning Discriminative Part Detectors for Image Classification and Cosegmentation
International audienceIn this paper, we address the problem of learning discriminative part detectors from image sets with category labels. We propose a novel latent SVM model regularized by group sparsity to learn these part detectors. Starting from a large set of initial parts, the group sparsity regularizer forces the model to jointly select and optimize a set of discriminative part detectors in a max-margin framework. We propose a stochastic version of a proximal algorithm to solve the corresponding optimization problem. We apply the proposed method to image classification and cosegmentation, and quantitative experiments with standard benchmarks show that it matches or improves upon the state of the art
Unsupervised Object Discovery and Localization in the Wild: Part-based Matching with Bottom-up Region Proposals
This paper addresses unsupervised discovery and localization of dominant
objects from a noisy image collection with multiple object classes. The setting
of this problem is fully unsupervised, without even image-level annotations or
any assumption of a single dominant class. This is far more general than
typical colocalization, cosegmentation, or weakly-supervised localization
tasks. We tackle the discovery and localization problem using a part-based
region matching approach: We use off-the-shelf region proposals to form a set
of candidate bounding boxes for objects and object parts. These regions are
efficiently matched across images using a probabilistic Hough transform that
evaluates the confidence for each candidate correspondence considering both
appearance and spatial consistency. Dominant objects are discovered and
localized by comparing the scores of candidate regions and selecting those that
stand out over other regions containing them. Extensive experimental
evaluations on standard benchmarks demonstrate that the proposed approach
significantly outperforms the current state of the art in colocalization, and
achieves robust object discovery in challenging mixed-class datasets.Comment: CVPR 201
Mid-level Deep Pattern Mining
Mid-level visual element discovery aims to find clusters of image patches
that are both representative and discriminative. In this work, we study this
problem from the prospective of pattern mining while relying on the recently
popularized Convolutional Neural Networks (CNNs). Specifically, we find that
for an image patch, activations extracted from the first fully-connected layer
of CNNs have two appealing properties which enable its seamless integration
with pattern mining. Patterns are then discovered from a large number of CNN
activations of image patches through the well-known association rule mining.
When we retrieve and visualize image patches with the same pattern,
surprisingly, they are not only visually similar but also semantically
consistent. We apply our approach to scene and object classification tasks, and
demonstrate that our approach outperforms all previous works on mid-level
visual element discovery by a sizeable margin with far fewer elements being
used. Our approach also outperforms or matches recent works using CNN for these
tasks. Source code of the complete system is available online.Comment: Published in Proc. IEEE Conf. Computer Vision and Pattern Recognition
201
- …