Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN
Discriminative localization is essential for fine-grained image classification, which aims to recognize hundreds of subcategories within the same basic-level category. The key differences among subcategories are subtle and local, and are reflected in the discriminative regions of objects. Existing methods generally adopt a two-stage learning framework: the first stage localizes the discriminative regions of objects, and the second encodes the discriminative features to train classifiers. However, these methods generally have two limitations: (1) the separation into two learning stages is time-consuming, and (2) the dependence on object and part annotations for learning discriminative localization requires heavy labeling effort. Addressing both limitations simultaneously is highly challenging, and existing methods focus on only one of them. Therefore, this paper proposes a discriminative localization approach via saliency-guided Faster R-CNN that addresses both limitations at the same time. Our main novelties and advantages are: (1) an end-to-end network based on Faster R-CNN is designed to simultaneously localize discriminative regions and encode discriminative features, which speeds up classification; (2) saliency-guided localization learning is proposed to localize discriminative regions automatically, avoiding labor-consuming labeling. Jointly, the two accelerate classification and eliminate the dependence on object and part annotations. Compared with state-of-the-art methods on the widely used CUB-200-2011 dataset, our approach achieves both the best classification accuracy and the best efficiency.
Comment: 9 pages, to appear in ACM MM 201
Learning to Incorporate Texture Saliency Adaptive Attention to Image Cartoonization
Image cartoonization has recently been dominated by generative adversarial networks (GANs) from the perspective of unsupervised image-to-image translation, in which an inherent challenge is to precisely capture and sufficiently transfer characteristic cartoon styles (e.g., clear edges, smooth color shading, and abstract fine structures). Existing advanced models try to enhance the cartoonization effect by learning to promote edges adversarially, introducing a style transfer loss, or learning to align styles across multiple representation spaces. This paper demonstrates that a more distinct and vivid cartoonization effect can be achieved with only a basic adversarial loss. Observing that cartoon style is more evident in cartoon-texture-salient local image regions, we build a region-level adversarial learning branch in parallel with the normal image-level one, which constrains adversarial learning to cartoon-texture-salient local patches for better perceiving and transferring cartoon texture features. To this end, a novel cartoon-texture-saliency-sampler (CTSS) module is proposed to dynamically sample cartoon-texture-salient patches from the training data. With extensive experiments, we demonstrate that texture-saliency-adaptive attention in adversarial learning, a missing ingredient in related image cartoonization methods, is of significant importance in facilitating and enhancing image cartoon stylization, especially for high-resolution inputs.
Comment: Proceedings of the 39th International Conference on Machine Learning,
PMLR 162:7183-7207, 202
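As a rough illustration of the region-level branch (not the released CTSS module; the gradient-energy proxy for texture saliency, the patch size, and the top-k selection below are assumptions), one could sample the most texture-salient patches like this and feed them to a second, patch-level discriminator alongside the usual image-level one:

import torch
import torch.nn.functional as F

def sample_texture_salient_patches(images, patch=64, k=4):
    """Crop the k patches per image whose local gradient energy is highest,
    a simple stand-in for a learned texture-saliency sampler."""
    # images: (B, 3, H, W) in [0, 1]
    gray = images.mean(dim=1, keepdim=True)
    gx = gray[..., :, 1:] - gray[..., :, :-1]                # horizontal gradients
    gy = gray[..., 1:, :] - gray[..., :-1, :]                # vertical gradients
    energy = F.avg_pool2d(gx.abs()[..., :-1, :] + gy.abs()[..., :, :-1],
                          kernel_size=patch, stride=patch)   # (B, 1, H//p, W//p)
    B, _, gh, gw = energy.shape
    idx = energy.flatten(1).topk(k, dim=1).indices            # top-k grid cells per image
    patches = []
    for b in range(B):
        for i in idx[b]:
            r, c = divmod(i.item(), gw)
            patches.append(images[b:b + 1, :,
                                  r * patch:(r + 1) * patch,
                                  c * patch:(c + 1) * patch])
    return torch.cat(patches, dim=0)                          # (B*k, 3, patch, patch)

During GAN training the cropped patches would pass through a separate patch-level discriminator, so the generator receives both an image-level and a region-level adversarial signal.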
Behaviourally meaningful representations from normalisation and context-guided denoising
Many existing independent component analysis algorithms include a preprocessing stage in which the inputs are sphered. This amounts to normalising the data such that all correlations between the variables are removed. In this work, I show that sphering allows very weak contextual modulation to steer the development of meaningful features. Context-biased competition has been proposed as a model of covert attention, and I propose that sphering-like normalisation also allows a weaker top-down bias to guide attention.
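For reference, the sphering step the abstract relies on can be sketched in a few lines of NumPy (a standard ZCA whitening; the function name and epsilon are illustrative). After the transform, the sample covariance is approximately the identity, i.e. all correlations between variables are removed:

import numpy as np

def sphere(X, eps=1e-5):
    """ZCA-whiten the rows of X so the sample covariance becomes ~identity."""
    # X: (n_samples, n_features)
    Xc = X - X.mean(axis=0)                          # centre each variable
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)              # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)           # symmetric eigendecomposition
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return Xc @ W                                    # sphered (decorrelated, unit-variance) data

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5)) @ rng.normal(size=(5, 5))   # correlated toy data
Z = sphere(X)
print(np.round(np.cov(Z, rowvar=False), 2))                 # approximately the identity matrix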