60 research outputs found
Feature-Rich Audio Model Inversion for Data-Free Knowledge Distillation Towards General Sound Classification
Data-Free Knowledge Distillation (DFKD) has recently attracted growing
attention in the academic community, especially with major breakthroughs in
computer vision. Despite promising results, the technique has not been well
applied to audio and signal processing. Because audio signals vary in
duration, audio modeling requires its own distinct approach. In this work, we propose
feature-rich audio model inversion (FRAMI), a data-free knowledge distillation
framework for general sound classification tasks. It first generates
high-quality and feature-rich Mel-spectrograms through a feature-invariant
contrastive loss. Then, the hidden states before and after the statistics
pooling layer are reused when knowledge distillation is performed on these
feature-rich samples. Experimental results on the Urbansound8k, ESC-50, and
audioMNIST datasets demonstrate that FRAMI can generate feature-rich samples.
Meanwhile, reusing the hidden states further improves the accuracy of the
student model, which significantly outperforms the baseline method.

Comment: Accepted at the International Conference on Acoustics, Speech and
Signal Processing (ICASSP 2023).
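The inversion step can be illustrated with a toy sketch (a loose stand-in for FRAMI's Mel-spectrogram synthesis, not the paper's implementation): a frozen linear "teacher" is inverted by gradient descent so that a noise input becomes a confident example of a chosen class. All names and dimensions here are hypothetical.

```python
import numpy as np

# Hypothetical toy setup: a frozen linear "teacher" classifier over
# flattened 8x8 Mel-spectrogram patches, with 3 sound classes.
rng = np.random.default_rng(0)
n_feat, n_cls = 64, 3
W = rng.normal(size=(n_cls, n_feat))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def invert(target, steps=300, lr=0.02):
    """Gradient-descent model inversion: synthesize an input the teacher
    confidently assigns to `target`, without any real training data."""
    x = rng.normal(scale=0.01, size=n_feat)       # start from noise
    for _ in range(steps):
        p = softmax(W @ x)
        grad = W.T @ (p - np.eye(n_cls)[target])  # d(cross-entropy)/dx
        x -= lr * grad                            # lower CE = raise class prob
    return x

sample = invert(target=1)
pred = int(np.argmax(W @ sample))
```

In the actual framework such synthesized samples would additionally be shaped by the feature-invariant contrastive loss before being used for distillation.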
Data-Free Backbone Fine-Tuning for Pruned Neural Networks
Model compression techniques reduce the computational load and memory
consumption of deep neural networks. After the compression operation, e.g.
parameter pruning, the model is normally fine-tuned on the original training
dataset to recover from the performance drop caused by compression. However,
the training data is not always available due to privacy issues or other
factors. In this work, we present a data-free fine-tuning approach for pruned
deep neural network backbones. In particular, the pruned backbone is trained
on synthetically generated images with our proposed intermediate supervision,
which encourages it to mimic the unpruned backbone's output feature map.
Afterwards, the pruned backbone can be combined with the original network head
to make predictions. We generate synthetic images by back-propagating gradients
to noise images while relying on L1-pruning for the backbone pruning. In our
experiments, we show that our approach is task-independent due to pruning only
the backbone. By evaluating our approach on 2D human pose estimation, object
detection, and image classification, we demonstrate promising performance
compared to the unpruned model. Our code is available at
https://github.com/holzbock/dfbf.

Comment: Accepted for presentation at the 31st European Signal Processing
Conference (EUSIPCO) 2023, September 4-8, 2023, Helsinki, Finland.
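The combination of magnitude pruning and feature-map supervision can be sketched in miniature (a hypothetical linear stand-in, not the paper's pipeline): prune the smallest-magnitude weights, then fine-tune the surviving weights on synthetic inputs so the pruned backbone mimics the unpruned backbone's feature map.

```python
import numpy as np

# Hypothetical toy sketch: a linear "backbone" W mapping 16-d inputs to
# 8-d feature maps. We L1-prune half the weights, then fine-tune the
# survivors on synthetic (random) inputs so the pruned backbone mimics
# the unpruned feature map -- no original training data involved.
rng = np.random.default_rng(1)
W = rng.normal(size=(8, 16))           # frozen, unpruned backbone

# Magnitude pruning: zero the smallest-|w| half of the weights.
thresh = np.median(np.abs(W))
mask = (np.abs(W) >= thresh).astype(float)
Wp = W * mask                          # pruned backbone to be fine-tuned

def mse(a, b):
    return float(np.mean((a - b) ** 2))

X = rng.normal(size=(16, 256))         # synthetic stand-in "images"
before = mse(Wp @ X, W @ X)

# Intermediate supervision: minimize ||Wp x - W x||^2 over surviving weights.
lr = 1e-3
for _ in range(500):
    err = Wp @ X - W @ X               # (8, 256) feature-map residual
    grad = err @ X.T / X.shape[1]      # d(MSE)/dWp (up to a constant)
    Wp -= lr * grad * mask             # update only the unpruned weights
after = mse(Wp @ X, W @ X)
```

In the real method the synthetic images are themselves optimized by back-propagating gradients to noise images, rather than sampled at random as above.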
Data-Free Neural Architecture Search via Recursive Label Calibration
This paper aims to explore the feasibility of neural architecture search
(NAS) given only a pre-trained model without using any original training data.
This is an important circumstance for privacy protection, bias avoidance, etc.,
in real-world scenarios. To achieve this, we start by synthesizing usable data
through recovering the knowledge from a pre-trained deep neural network. Then
we use the synthesized data and their predicted soft-labels to guide neural
architecture search. We find that NAS requires synthesized data (we target
the image domain here) with sufficient semantics, diversity, and a minimal
domain gap from natural images. For semantics, we propose recursive
label calibration to produce more informative outputs. For diversity, we
propose a regional update strategy to generate more diverse and
semantically-enriched synthetic data. For minimal domain gap, we use input and
feature-level regularization to mimic the original data distribution in latent
space. We instantiate our proposed framework with three popular NAS algorithms:
DARTS, ProxylessNAS, and SPOS. Surprisingly, the architectures discovered by
searching with our synthetic data achieve accuracy comparable to, or even
higher than, those discovered by searching with the original data. This
shows, for the first time, that NAS can be performed effectively without
access to the original natural data, provided the synthesis method is well
designed.

Comment: ECCV 2022.
Perception Imitation: Towards Synthesis-free Simulator for Autonomous Vehicles
We propose a perception imitation method to simulate results of a certain
perception model, and discuss a new heuristic route of autonomous driving
simulator without data synthesis. The motivation is that original sensor data
is not always necessary for tasks such as planning and control once semantic
perception results are available, so simulating perception directly is more
economical and efficient. In this work, a series of evaluation methods, such
as a matching metric and downstream-task performance, is used to assess
simulation quality. Experiments show that our method effectively models the
behavior of a learning-based perception model and can be smoothly applied in
the proposed simulation route.