Normalized Feature Distillation for Semantic Segmentation
As a promising approach in model compression, knowledge distillation improves
the performance of a compact model by transferring the knowledge from a
cumbersome one. The kind of knowledge used to guide the training of the student
is important. Previous distillation methods in semantic segmentation strive to
extract various forms of knowledge from the features, which involve elaborate
manual design relying on prior information and have limited performance gains.
In this paper, we propose a simple yet effective feature distillation method
called normalized feature distillation (NFD), aiming to enable effective
distillation with the original features without the need to manually design new
forms of knowledge. The key idea is to prevent the student from focusing on
imitating the magnitude of the teacher's feature response by normalization. Our
method achieves state-of-the-art distillation results for semantic segmentation
on Cityscapes, VOC 2012, and ADE20K datasets. Code will be available
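The key idea above, matching normalized features rather than raw magnitudes, can be illustrated with a minimal sketch. The abstract does not say how the normalization is applied, so this example assumes per-channel standardization (zero mean, unit variance over spatial positions) followed by a mean squared error. The function names `normalize` and `normalized_feature_loss` are hypothetical and are not taken from the authors' code.

```python
import math

def normalize(values, eps=1e-5):
    """Standardize a flat list of activations to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + eps)
    return [(v - mean) / std for v in values]

def normalized_feature_loss(student, teacher):
    """Mean squared error between per-channel normalized feature maps.

    `student` and `teacher` are lists of channels; each channel is a flat
    list of spatial activations. Per-channel standardization is an
    assumption of this sketch, not a detail stated in the abstract.
    """
    total, count = 0.0, 0
    for s_ch, t_ch in zip(student, teacher):
        s_n, t_n = normalize(s_ch), normalize(t_ch)
        total += sum((a - b) ** 2 for a, b in zip(s_n, t_n))
        count += len(s_ch)
    return total / count

# Toy teacher features: 2 channels, 4 spatial positions each.
teacher = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.1, 0.9, 0.3]]

# A student whose responses are 10x larger but have the same spatial pattern.
student_scaled = [[10.0 * v for v in ch] for ch in teacher]

# A student whose first channel has the reversed spatial pattern.
student_reversed = [[4.0, 3.0, 2.0, 1.0], [0.5, 0.1, 0.9, 0.3]]

loss_scaled = normalized_feature_loss(student_scaled, teacher)
loss_reversed = normalized_feature_loss(student_reversed, teacher)
```

In this toy setup the rescaled student incurs almost no loss, while the student with a different spatial pattern is penalized. That is the sense in which normalization keeps the student from merely imitating the magnitude of the teacher's response.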
CIAGAN: Conditional Identity Anonymization Generative Adversarial Networks
The unprecedented increase in the usage of computer vision technology in
society goes hand in hand with increased concern about data privacy. In many
real-world scenarios like people tracking or action recognition, it is
important to be able to process the data while taking care to protect
people's identity. We propose and develop CIAGAN, a model for image
and video anonymization based on conditional generative adversarial networks.
Our model is able to remove the identifying characteristics of faces and bodies
while producing high-quality images and videos that can be used for any
computer vision task, such as detection or tracking. Unlike previous methods,
we have full control over the de-identification (anonymization) procedure,
ensuring both anonymization and diversity. We compare our method to
several baselines and achieve state-of-the-art results.
Comment: CVPR 202
Text-Guided Neural Image Inpainting
The image inpainting task requires filling a corrupted image with content
coherent with the context. This research field has achieved promising progress
by using neural image inpainting methods. Nevertheless, there is still a
critical challenge in guessing the missing content from only the context pixels.
The goal of this paper is to fill in the semantic information in corrupted images
according to the provided descriptive text. Unlike existing text-guided
image generation works, the inpainting models are required to compare the
semantic content of the given text and the remaining part of the image, then
find the semantic content that should be filled in for the missing part. To
accomplish such a task, we propose a novel inpainting model named Text-Guided Dual
Attention Inpainting Network (TDANet). Firstly, a dual multimodal attention
mechanism is designed to extract the explicit semantic information about the
corrupted regions, which is done by comparing the descriptive text and
complementary image areas through reciprocal attention. Secondly, an image-text
matching loss is applied to maximize the semantic similarity of the generated
image and the text. Experiments are conducted on two open datasets. Results
show that the proposed TDANet model reaches a new state of the art on both
quantitative and qualitative measures. Result analysis suggests that the
generated images are consistent with the guidance text, enabling the generation
of various results by providing different descriptions. Code is available at
https://github.com/idealwhite/TDANet
Comment: ACM MM'2020 (Oral). 9 pages, 4 tables, 7 figures
Deep Constrained Dominant Sets for Person Re-Identification
In this work, we propose an end-to-end constrained clustering scheme to tackle the person re-identification (re-id) problem. Deep neural networks (DNNs) have recently proven effective on the person re-identification task. In particular, rather than relying solely on probe-gallery similarity, diffusing the similarities among the gallery images in an end-to-end manner has proven effective in yielding a robust probe-gallery affinity. However, existing methods do not use the probe image as a constraint and are prone to noise propagation during the similarity diffusion process. To overcome this, we propose an intriguing scheme, called deep constrained dominant sets (DCDS), which treats the person-image retrieval problem as a constrained clustering optimization problem. Given a probe and gallery images, we reformulate the person re-id problem as finding a constrained cluster, where the probe image is taken as a constraint (seed) and each cluster corresponds to a set of images of the same person. By optimizing the constrained clustering in an end-to-end manner, we naturally leverage the contextual knowledge of a set of images corresponding to the given person-images. We further enhance performance by integrating an auxiliary net alongside DCDS, which employs a multi-scale ResNet. To validate the effectiveness of our method, we present experiments on several benchmark datasets and show that the proposed method can outperform state-of-the-art methods.
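The abstract does not spell out the optimization, so the following is only a sketch of the classical, non-deep constrained dominant set formulation solved with replicator dynamics. It is not the end-to-end network the authors describe. It assumes a symmetric, non-negative affinity matrix with a zero diagonal and at least one unconstrained node. The function name `constrained_dominant_set`, the toy affinities, and the Gershgorin-style choice of `alpha` are all assumptions made for illustration.

```python
def constrained_dominant_set(affinity, constraint, iters=2000):
    """Replicator dynamics on a constraint-regularized payoff matrix.

    affinity   : symmetric, non-negative matrix (list of lists), zero diagonal
    constraint : set of node indices that must take part in the cluster
                 (here, the probe image)
    Returns a weight vector on the simplex; nodes with non-negligible
    weight form the extracted cluster.
    """
    n = len(affinity)
    free = [i for i in range(n) if i not in constraint]
    # Assumption: alpha must exceed the largest eigenvalue of the affinity
    # submatrix over the free nodes. The maximum row sum of that submatrix
    # is a simple upper bound on that eigenvalue.
    alpha = max(sum(affinity[i][j] for j in free if j != i) for i in free) + 1e-3
    # Penalize the diagonal of free nodes by alpha, then add alpha to every
    # entry so the payoff matrix stays non-negative. On the simplex this
    # shift adds a constant to the objective and keeps the maximizers.
    payoff = [
        [
            affinity[i][j] + alpha if i != j
            else (alpha if i in constraint else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    x = [1.0 / n] * n  # start from the barycenter of the simplex
    for _ in range(iters):
        px = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
        avg = sum(x[i] * px[i] for i in range(n))
        x = [x[i] * px[i] / avg for i in range(n)]
    return x

# Toy example: node 0 is the probe, nodes 1-4 are gallery images.
# Nodes 1 and 2 resemble the probe; nodes 3 and 4 resemble each other only.
affinity = [
    [0.00, 0.80, 0.70, 0.10, 0.10],
    [0.80, 0.00, 0.75, 0.10, 0.10],
    [0.70, 0.75, 0.00, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.00, 0.95],
    [0.10, 0.10, 0.10, 0.95, 0.00],
]
weights = constrained_dominant_set(affinity, constraint={0})
cluster = [i for i, w in enumerate(weights) if w > 1e-3]
```

In this toy example the tightest pair in the graph is nodes 3 and 4, but the constraint steers the dynamics toward the cluster that contains the probe. This mirrors the abstract's description of the probe acting as a seed.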
- …