Heterogeneous Generative Knowledge Distillation with Masked Image Modeling
Small CNN-based models usually require transferring knowledge from a large
model before they are deployed in computationally resource-limited edge
devices. Masked image modeling (MIM) methods achieve great success in various
visual tasks but remain largely unexplored in knowledge distillation for
heterogeneous deep models, mainly because of the significant architectural
discrepancy between Transformer-based large models and CNN-based small
networks. In this paper, we develop the first Heterogeneous Generative Knowledge
Distillation (H-GKD) based on MIM, which can efficiently transfer knowledge
from large Transformer models to small CNN-based models in a generative
self-supervised fashion. Our method builds a bridge between Transformer-based
models and CNNs by training a UNet-style student with sparse convolution, which
can effectively mimic the visual representation the teacher infers from
masked inputs. Our method is a simple yet effective learning paradigm to
learn the visual representation and distribution of data from heterogeneous
teacher models, which can be pre-trained using advanced generative methods.
Extensive experiments show that it adapts well to various models and sizes,
consistently achieving state-of-the-art performance in image classification,
object detection, and semantic segmentation tasks. For example, on the
ImageNet-1K dataset, H-GKD improves the accuracy of ResNet-50 (sparse) from
76.98% to 80.01%.
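
To make the recipe concrete, here is a minimal PyTorch sketch of this kind of masked-modeling distillation, not the authors' implementation: a small CNN student sees a patch-masked image and regresses a frozen teacher's patch features on the masked regions. The patch size, the TinyStudent module, the feature widths, and the use of dense convolution in place of the paper's sparse convolution are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

PATCH = 16  # side length of a masked patch (assumed)

def random_patch_mask(b, h, w, ratio=0.6, device="cpu"):
    """Boolean mask over the patch grid; True = patch is hidden."""
    gh, gw = h // PATCH, w // PATCH
    k = int(ratio * gh * gw)
    idx = torch.rand(b, gh * gw, device=device).argsort(dim=1)[:, :k]
    mask = torch.zeros(b, gh * gw, dtype=torch.bool, device=device)
    mask.scatter_(1, idx, True)
    return mask.view(b, gh, gw)

class TinyStudent(nn.Module):
    """Stand-in CNN student; dense conv replaces the paper's sparse conv."""
    def __init__(self, dim=256):
        super().__init__()
        self.enc = nn.Sequential(                        # total stride 16,
            nn.Conv2d(3, 64, 7, stride=4, padding=3), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, dim, 3, stride=2, padding=1),
        )                                                # one vector per patch
        self.proj = nn.Conv2d(dim, dim, 1)  # align to the teacher's width

    def forward(self, x):
        return self.proj(self.enc(x))

def distill_loss(student, teacher_feats, images, mask):
    """Feed the student a masked image; regress the frozen teacher's
    patch features, averaging the error over masked patches only."""
    pix = mask.repeat_interleave(PATCH, 1).repeat_interleave(PATCH, 2)
    pred = student(images * (~pix).unsqueeze(1))         # (B, C, gh, gw)
    m = mask.unsqueeze(1).float()                        # (B, 1, gh, gw)
    err = F.mse_loss(pred, teacher_feats, reduction="none")
    return (err * m).sum() / (m.sum() * pred.size(1)).clamp(min=1)

# Usage with dummy tensors standing in for a real frozen ViT teacher:
imgs = torch.randn(2, 3, 224, 224)
mask = random_patch_mask(2, 224, 224)
teacher_feats = torch.randn(2, 256, 14, 14)
loss = distill_loss(TinyStudent(), teacher_feats, imgs, mask)
loss.backward()
```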
Convolutional Color Constancy
Color constancy is the problem of inferring the color of the light that
illuminated a scene, usually so that the illumination color can be removed.
Because this problem is underconstrained, it is often solved by modeling the
statistical regularities of the colors of natural objects and illumination. In
contrast, in this paper we reformulate the problem of color constancy as a 2D
spatial localization task in a log-chrominance space, thereby allowing us to
apply techniques from object detection and structured prediction to the color
constancy problem. By directly learning how to discriminate between correctly
white-balanced images and poorly white-balanced images, our model is able to
improve performance on standard benchmarks by nearly 40%.
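
The reformulation is straightforward to sketch: tinting an image by an illuminant shifts every pixel's log-chrominance by a constant, so it translates the image's 2D chroma histogram, and estimating the illuminant reduces to localizing that translation with a (learned) filter. Below is a hedged NumPy/SciPy sketch of that pipeline; the bin count and range are assumptions, and a random filter stands in for a trained one.

```python
import numpy as np
from scipy.signal import convolve2d

BINS, LO, HI = 64, -2.0, 2.0  # histogram resolution and range (assumed)

def log_chroma_histogram(rgb):
    """Histogram of per-pixel log-chroma (u, v) = (log g/r, log g/b)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    valid = (r > 0) & (g > 0) & (b > 0)         # keep log well-defined
    u = np.log(g[valid] / r[valid])
    v = np.log(g[valid] / b[valid])
    hist, _, _ = np.histogram2d(u, v, bins=BINS, range=[[LO, HI], [LO, HI]])
    return hist / max(hist.sum(), 1)

def estimate_illuminant(rgb, filt):
    """Convolve the chroma histogram with a filter and take the argmax
    bin as the illuminant's log-chroma; since a tint translates the
    histogram, 2D localization here is illuminant estimation."""
    heat = convolve2d(log_chroma_histogram(rgb), filt, mode="same")
    i, j = np.unravel_index(heat.argmax(), heat.shape)
    centers = LO + (np.arange(BINS) + 0.5) * (HI - LO) / BINS
    u, v = centers[i], centers[j]
    # recover an rgb illuminant (up to scale) from its log-chroma
    ill = np.array([np.exp(-u), 1.0, np.exp(-v)])
    return ill / np.linalg.norm(ill)

# Usage with a random filter standing in for the learned one:
img = np.random.rand(64, 64, 3)
print(estimate_illuminant(img, np.random.randn(9, 9)))
```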
Deep Generative Modeling of LiDAR Data
Building models capable of generating structured output is a key challenge
for AI and robotics. While generative models have been explored on many types
of data, little work has been done on synthesizing lidar scans, which play a
key role in robot mapping and localization. In this work, we show that one can
adapt deep generative models for this task by unravelling lidar scans into a 2D
point map. Our approach can generate high quality samples, while simultaneously
learning a meaningful latent representation of the data. We demonstrate
significant improvements against state-of-the-art point cloud generation
methods. Furthermore, we propose a novel data representation that augments the
2D signal with absolute positional information. We show that this helps
robustness to noisy and imputed input; the learned model can recover the
underlying lidar scan from seemingly uninformative data.
Comment: Presented at IROS 2019
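
A hedged NumPy sketch of the "unravelling" step: each return is indexed by its elevation ring and azimuth so the scan becomes an image-like grid that off-the-shelf 2D generative models can consume, with normalized absolute-position channels appended in the spirit of the abstract's augmented representation. The grid size and channel layout are illustrative assumptions.

```python
import numpy as np

H, W = 64, 256  # e.g. 64 beams x 256 azimuth bins (assumed)

def unravel_scan(points):
    """points: (N, 3) array of x, y, z returns -> (H, W, 4) map holding
    (range, height) per cell plus two absolute-position channels."""
    x, y, z = points.T
    rng = np.sqrt(x**2 + y**2 + z**2)
    azim = np.arctan2(y, x)                          # [-pi, pi)
    elev = np.arctan2(z, np.sqrt(x**2 + y**2))
    col = ((azim + np.pi) / (2 * np.pi) * W).astype(int).clip(0, W - 1)
    lo, hi = elev.min(), elev.max()
    row = ((elev - lo) / max(hi - lo, 1e-6) * (H - 1)).round().astype(int)
    grid = np.zeros((H, W, 2), dtype=np.float32)
    grid[row, col, 0] = rng      # later returns overwrite colliding cells
    grid[row, col, 1] = z
    # normalized absolute positional channels for each grid cell
    pos = np.stack(np.meshgrid(np.linspace(0, 1, W), np.linspace(0, 1, H)), -1)
    return np.concatenate([grid, pos.astype(np.float32)], axis=-1)

# Usage on a random point set:
pts = np.random.randn(1000, 3) * 10
print(unravel_scan(pts).shape)  # (64, 256, 4)
```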