Deep Descriptor Transforming for Image Co-Localization
Reusable model design becomes desirable with the rapid expansion of machine
learning applications. In this paper, we focus on the reusability of
pre-trained deep convolutional models. Specifically, different from treating
pre-trained models as feature extractors, we reveal more treasures beneath
convolutional layers, i.e., the convolutional activations could act as a
detector for the common object in the image co-localization problem. We propose
a simple but effective method, named Deep Descriptor Transforming (DDT), for
evaluating the correlations of descriptors and then obtaining the
category-consistent regions, which can accurately locate the common object in a
set of images. Empirical studies validate the effectiveness of the proposed DDT
method. On benchmark image co-localization datasets, DDT consistently
outperforms existing state-of-the-art methods by a large margin. Moreover, DDT
also demonstrates good generalization ability to unseen categories and
robustness to noisy data.
Comment: Accepted by IJCAI 201
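A minimal sketch of the transforming step, assuming a torchvision VGG-16 backbone and a hypothetical `images` list of preprocessed tensors for one category; the layer choice and the zero threshold follow the general idea rather than the paper's exact configuration:

```python
# Minimal sketch of the descriptor-transforming step. Assumptions:
# torchvision's VGG-16 as the pre-trained backbone and a hypothetical
# `images` list of preprocessed (3, H, W) tensors for one category.
import torch
import torchvision.models as models

backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

with torch.no_grad():
    # One C-dimensional descriptor per spatial position of the last conv block.
    feats = [backbone(img.unsqueeze(0)).squeeze(0) for img in images]  # (C, h, w)

descriptors = torch.cat([f.flatten(1).T for f in feats])  # (sum of h*w, C)
mean = descriptors.mean(dim=0)
centered = descriptors - mean

# First principal component of all descriptors pooled over the image set.
cov = centered.T @ centered / centered.shape[0]
_, eigvecs = torch.linalg.eigh(cov)  # eigenvalues in ascending order
pc1 = eigvecs[:, -1]

for f in feats:
    # Project each image's descriptors onto pc1; positive responses mark
    # the category-consistent region (eigenvector sign may need flipping).
    proj = ((f.flatten(1).T - mean) @ pc1).reshape(f.shape[1:])
    mask = proj > 0  # box around the largest positive component localizes the object
```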
Object Discovery From a Single Unlabeled Image by Mining Frequent Itemset With Multi-scale Features
The goal of our work is to discover dominant objects in a very general
setting where only a single unlabeled image is given. This is far more
challenging than typical co-localization or weakly-supervised localization tasks.
To tackle this problem, we propose a simple but effective pattern mining-based
method, called Object Location Mining (OLM), which exploits the advantages of
data mining and feature representation of pre-trained convolutional neural
networks (CNNs). Specifically, we first convert the feature maps from a
pre-trained CNN model into a set of transactions, and then discover frequent
patterns from the transaction database through pattern mining techniques. We
observe that those discovered patterns, i.e., co-occurrence highlighted
regions, typically hold appearance and spatial consistency. Motivated by this
observation, we can easily discover and localize possible objects by merging
relevant meaningful patterns. Extensive experiments on a variety of benchmarks
demonstrate that OLM achieves competitive localization performance compared
with state-of-the-art methods. We also evaluate our approach against
unsupervised saliency detection methods and achieve competitive results on
seven benchmark datasets. Moreover, we conduct experiments on fine-grained
classification to show that our proposed method can locate the entire object
and its parts accurately, which significantly improves the classification
results.
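A minimal sketch of the transaction-building and mining steps, assuming `feat` is a (C, H, W) activation array from a pre-trained CNN and using mlxtend's Apriori miner; the thresholds and the merging rule are illustrative stand-ins for the paper's settings:

```python
# Minimal sketch of transaction building and pattern mining for one image.
# Assumptions: `feat` is a (C, H, W) activation array from a pre-trained CNN,
# and mlxtend provides the Apriori miner; thresholds are illustrative.
import numpy as np
import pandas as pd
from mlxtend.frequent_patterns import apriori

def mine_object_mask(feat, act_thresh=0.5, min_support=0.2):
    C, H, W = feat.shape
    norm = feat / (feat.max(axis=(1, 2), keepdims=True) + 1e-8)
    # One transaction per spatial position: the set of channels that
    # fire strongly there.
    transactions = pd.DataFrame(
        norm.reshape(C, H * W).T > act_thresh,
        columns=[f"ch{i}" for i in range(C)],
    )
    # Frequent itemsets = channels that repeatedly co-occur across positions.
    itemsets = apriori(transactions, min_support=min_support, use_colnames=True)
    # Merge the support regions of the mined patterns into one object mask.
    mask = np.zeros((H, W), dtype=bool)
    for items in itemsets["itemsets"]:
        chans = [int(name[2:]) for name in items]
        mask |= (norm[chans] > act_thresh).all(axis=0)
    return mask  # connected components suggest candidate object locations
```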
Unsupervised Understanding of Location and Illumination Changes in Egocentric Videos
Wearable cameras stand out as one of the most promising devices for the
upcoming years, and as a consequence, the demand for computer algorithms that
automatically understand the videos recorded with them is growing quickly.
Automatic understanding of these videos is not an easy task: their mobile
nature poses important challenges, such as changing light conditions and the
unrestricted locations recorded. This paper proposes an
unsupervised strategy based on global features and manifold learning to endow
wearable cameras with contextual information regarding the light conditions and
the location captured. Results show that non-linear manifold methods can
capture contextual patterns from global features without requiring large
computational resources. The proposed strategy is used, as an application case,
as a switching mechanism to improve hand detection in egocentric videos.
Comment: Submitted for publication
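A minimal sketch of such a pipeline, assuming OpenCV for frame access; the global color histogram, scikit-learn's Isomap, and k-means are illustrative stand-ins for the paper's feature, manifold method, and context grouping:

```python
# Minimal sketch of the unsupervised context pipeline: global per-frame
# features, a non-linear manifold embedding, then clustering into
# illumination/location contexts usable as a switching signal.
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import Isomap

def frame_histogram(frame, bins=8):
    # Global color histogram as a cheap illumination/location cue.
    hist = cv2.calcHist([frame], [0, 1, 2], None, [bins] * 3, [0, 256] * 3)
    return cv2.normalize(hist, hist).flatten()

def context_labels(video_path, n_contexts=4):
    cap = cv2.VideoCapture(video_path)
    feats = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        feats.append(frame_histogram(frame))
    cap.release()
    X = np.stack(feats)
    # Non-linear manifold embedding captures gradual light/location changes
    # without heavy computation.
    emb = Isomap(n_components=2).fit_transform(X)
    # One cluster per context; a per-context hand detector can then be
    # switched in per frame (the "switching mechanism").
    return KMeans(n_clusters=n_contexts, n_init=10).fit_predict(emb)
```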