KernelWarehouse: Towards Parameter-Efficient Dynamic Convolution
Dynamic convolution learns a linear mixture of n static kernels weighted
with their sample-dependent attentions, demonstrating superior performance
compared to normal convolution. However, existing designs are
parameter-inefficient: they increase the number of convolutional parameters by
n times. This and the optimization difficulty have prevented research progress
in dynamic convolution from using a significantly larger value of n (e.g.,
n > 100 instead of the typical setting n < 10) to push forward the
performance boundary. In this paper, we propose KernelWarehouse, a more
general form of dynamic convolution, which can strike a favorable trade-off
between parameter efficiency and representation power. Its key idea is to
redefine the basic concepts of "kernels" and "assembling kernels" in
dynamic convolution from the perspective of reducing kernel dimension and
increasing kernel number significantly. In principle, KernelWarehouse enhances
convolutional parameter dependencies within the same layer and across
successive layers via tactful kernel partition and warehouse sharing, yielding
a high degree of freedom to fit a desired parameter budget. We validate our
method on ImageNet and MS-COCO datasets with different ConvNet architectures,
and show that it attains state-of-the-art results. For instance, the
ResNet18|ResNet50|MobileNetV2|ConvNeXt-Tiny model trained with KernelWarehouse
on ImageNet reaches 76.05%|81.05%|75.52%|82.51% top-1 accuracy. Thanks to its
flexible design, KernelWarehouse can even reduce the model size of a ConvNet
while improving the accuracy, e.g., our ResNet18 model with a 36.45%|65.10%
parameter reduction relative to the baseline shows a 2.89%|2.29% absolute
improvement in top-1 accuracy.
Comment: This research work was completed and submitted in early May 2023.
Code and pre-trained models are available at
https://github.com/OSVAI/KernelWarehouse
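For concreteness, the following minimal PyTorch sketch shows the vanilla dynamic-convolution formulation that KernelWarehouse generalizes: n static kernels are mixed per sample by attention weights predicted from the input. Module and variable names here are illustrative assumptions, not the authors' implementation, which additionally applies kernel partition and warehouse sharing to escape the n-fold parameter cost.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv2d(nn.Module):
    """Vanilla dynamic convolution: a per-sample linear mixture of n static kernels."""
    def __init__(self, in_ch, out_ch, k=3, n_kernels=4):
        super().__init__()
        # n static kernels; note the n-fold parameter cost the paper targets
        self.weight = nn.Parameter(torch.randn(n_kernels, out_ch, in_ch, k, k) * 0.01)
        # lightweight sample-dependent attention branch
        self.attn = nn.Linear(in_ch, n_kernels)
        self.k = k

    def forward(self, x):
        b, c, h, w = x.shape
        # attention over the n kernels, from globally pooled input features
        a = F.softmax(self.attn(x.mean(dim=(2, 3))), dim=1)      # (b, n)
        # linear mixture of the static kernels: one mixed kernel per sample
        w_mix = torch.einsum('bn,noihw->boihw', a, self.weight)  # (b, out, in, k, k)
        out_ch = w_mix.shape[1]
        # grouped-conv trick: fold the batch into groups so each sample
        # is convolved with its own mixed kernel
        y = F.conv2d(x.reshape(1, b * c, h, w),
                     w_mix.reshape(b * out_ch, c, self.k, self.k),
                     padding=self.k // 2, groups=b)
        return y.reshape(b, out_ch, h, w)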
NORM: Knowledge Distillation via N-to-One Representation Matching
Existing feature distillation methods commonly adopt the One-to-one
Representation Matching between any pre-selected teacher-student layer pair. In
this paper, we present N-to-One Representation Matching (NORM), a new two-stage
knowledge distillation method, which relies on a simple Feature Transform (FT)
module consisting of two linear layers. To preserve the intact information
learnt by the teacher network, our FT module is inserted only after the last
convolutional layer of the student network during training. The
first linear layer projects the student representation to a feature space
having N times as many feature channels as the teacher representation from the last
convolutional layer, and the second linear layer contracts the expanded output
back to the original feature space. The expanded student representation is then
sequentially split into N non-overlapping feature segments, each having the
same number of feature channels as the teacher's, and all N segments can be
readily forced to approximate the intact teacher representation
simultaneously, formulating a
novel many-to-one representation matching mechanism conditioned on a single
teacher-student layer pair. After training, such an FT module will be naturally
merged into the subsequent fully connected layer thanks to its linear property,
introducing no extra parameters or architectural modifications to the student
network at inference. Extensive experiments on different visual recognition
benchmarks demonstrate the leading performance of our method. For instance, the
ResNet18|MobileNet|ResNet50-1/4 model trained by NORM reaches
72.14%|74.26%|68.03% top-1 accuracy on the ImageNet dataset when using a
pre-trained ResNet34|ResNet50|ResNet50 model as the teacher, achieving an
absolute improvement of 2.01%|4.63%|3.03% against the individually trained
counterpart. Code is available at https://github.com/OSVAI/NORM
Comment: The paper of NORM was published at ICLR 2023. Code and models are
available at https://github.com/OSVAI/NORM
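As a rough illustration of the N-to-one matching described above, here is a minimal PyTorch sketch in which two 1x1 convolutions stand in for the two linear layers and an MSE matching loss is assumed; all names are hypothetical, not the released NORM code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureTransform(nn.Module):
    """Two linear layers (as 1x1 convs) realizing N-to-one representation matching."""
    def __init__(self, s_ch, t_ch, n=4):
        super().__init__()
        self.n = n
        self.expand = nn.Conv2d(s_ch, n * t_ch, kernel_size=1)    # first linear layer
        self.contract = nn.Conv2d(n * t_ch, s_ch, kernel_size=1)  # second linear layer

    def forward(self, f_student, f_teacher):
        # assumes student and teacher feature maps share the same spatial size
        expanded = self.expand(f_student)                 # (b, n*t_ch, h, w)
        # N non-overlapping segments, each with the teacher's channel count
        segments = expanded.chunk(self.n, dim=1)
        # every segment approximates the intact teacher representation
        kd_loss = sum(F.mse_loss(s, f_teacher) for s in segments) / self.n
        # contracted output continues through the student's own head
        return self.contract(expanded), kd_loss

Because both layers are linear, the FT module can be folded into the student's subsequent fully connected layer after training, consistent with the zero inference overhead the abstract describes.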
Omni-Dimensional Dynamic Convolution
Learning a single static convolutional kernel in each convolutional layer is
the common training paradigm of modern Convolutional Neural Networks (CNNs).
Instead, recent research in dynamic convolution shows that learning a linear
combination of convolutional kernels weighted with their input-dependent
attentions can significantly improve the accuracy of light-weight CNNs, while
maintaining efficient inference. However, we observe that existing works endow
convolutional kernels with the dynamic property through one dimension
(regarding the convolutional kernel number) of the kernel space, but the other
three dimensions (regarding the spatial size, the input channel number and the
output channel number for each convolutional kernel) are overlooked. Inspired
by this, we present Omni-dimensional Dynamic Convolution (ODConv), a more
generalized yet elegant dynamic convolution design, to advance this line of
research. ODConv leverages a novel multi-dimensional attention mechanism with a
parallel strategy to learn complementary attentions for convolutional kernels
along all four dimensions of the kernel space at any convolutional layer. As a
drop-in replacement of regular convolutions, ODConv can be plugged into many
CNN architectures. Extensive experiments on the ImageNet and MS-COCO datasets
show that ODConv brings solid accuracy boosts for various prevailing CNN
backbones including both light-weight and large ones, e.g.,
3.77%~5.71%|1.86%~3.72% absolute top-1 improvements for the MobileNetV2|ResNet
family on the ImageNet dataset. Intriguingly, thanks to its improved feature
learning ability, ODConv with even one single kernel can compete with or
outperform existing dynamic convolution counterparts with multiple kernels,
substantially reducing extra parameters. Furthermore, ODConv is also superior
to other attention modules for modulating the output features or the
convolutional weights.
Comment: Spotlight paper at ICLR 2022. Code and models are available at
https://github.com/OSVAI/ODConv
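The sketch below gives a rough PyTorch rendering of the four-dimensional attention idea: four parallel heads computed from globally pooled features modulate the kernel-number, output-channel, input-channel, and spatial dimensions of the kernels before they are mixed and applied. Head designs and names are simplified assumptions, not the official ODConv code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ODConvSketch(nn.Module):
    """Attention along all four dimensions of the convolutional kernel space."""
    def __init__(self, in_ch, out_ch, k=3, n_kernels=4, reduction=4):
        super().__init__()
        self.k, self.n = k, n_kernels
        self.weight = nn.Parameter(torch.randn(n_kernels, out_ch, in_ch, k, k) * 0.01)
        hidden = max(in_ch // reduction, 4)
        self.squeeze = nn.Sequential(nn.Linear(in_ch, hidden), nn.ReLU())
        # four parallel heads: kernel number, output channel, input channel, spatial
        self.a_n = nn.Linear(hidden, n_kernels)
        self.a_f = nn.Linear(hidden, out_ch)
        self.a_c = nn.Linear(hidden, in_ch)
        self.a_s = nn.Linear(hidden, k * k)

    def forward(self, x):
        b, c, h, w = x.shape
        z = self.squeeze(x.mean(dim=(2, 3)))                 # globally pooled features
        a_n = F.softmax(self.a_n(z), dim=1)                  # (b, n)
        a_f = torch.sigmoid(self.a_f(z))                     # (b, out)
        a_c = torch.sigmoid(self.a_c(z))                     # (b, in)
        a_s = torch.sigmoid(self.a_s(z)).view(b, 1, 1, self.k, self.k)
        # modulate every kernel dimension, then mix the n kernels per sample
        wgt = self.weight.unsqueeze(0)                       # (1, n, out, in, k, k)
        wgt = wgt * a_n.view(b, self.n, 1, 1, 1, 1)
        wgt = wgt * a_f.view(b, 1, -1, 1, 1, 1)
        wgt = wgt * a_c.view(b, 1, 1, -1, 1, 1)
        wgt = wgt * a_s.unsqueeze(1)
        wgt = wgt.sum(dim=1)                                 # (b, out, in, k, k)
        out_ch = wgt.shape[1]
        # grouped conv applies each sample's modulated kernel to that sample
        y = F.conv2d(x.reshape(1, b * c, h, w),
                     wgt.reshape(b * out_ch, c, self.k, self.k),
                     padding=self.k // 2, groups=b)
        return y.reshape(b, out_ch, h, w)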
Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation
In this paper, we propose an alternative method to estimate room layouts of
cluttered indoor scenes. This method enjoys the benefits of two novel
techniques. The first one is semantic transfer (ST), which is: (1) a
formulation to integrate the relationship between scene clutter and room layout
into convolutional neural networks; (2) an architecture that can be end-to-end
trained; (3) a practical strategy to initialize weights for very deep networks
under unbalanced training data distribution. ST allows us to extract highly
robust features under various circumstances, and in order to address the
computational redundancy hidden in these features, we develop a principled and
efficient inference scheme named physics inspired optimization (PIO). PIO's
basic idea is to formulate some phenomena observed in ST features into
mechanics concepts. Evaluations on public datasets LSUN and Hedau show that the
proposed method is more accurate than state-of-the-art methods.
Comment: To appear in CVPR 2017. Project Page:
https://sites.google.com/view/st-pio
A Computational Study of Negative Surface Discharges: Characteristics of Surface Streamers and Surface Charges
We investigate the dynamics of negative surface discharges in air through
numerical simulations with a 2D fluid model. A geometry consisting of a flat
dielectric embedded between parallel-plate electrodes is used. Compared to
negative streamers in bulk gas, negative surface streamers are observed to have
a higher electron density, a higher electric field, and a higher propagation
velocity. On the other hand, their maximum electric field and velocity are
lower than for positive surface streamers. In our simulations, negative surface
streamers are slower for larger relative permittivity. Negative charge
accumulates on a dielectric surface when a negative streamer propagates along
it, which can lead to a high electric field inside the dielectric. If we
initially put negative surface charge on the dielectric, the growth of negative
surface discharges is delayed or inhibited. Positive surface charge has the
opposite effect.
Comment: 8 pages
A computational study of positive streamers interacting with dielectrics
We use numerical simulations to study the dynamics of surface discharges,
which are common in high-voltage engineering. We simulate positive streamer
discharges that propagate towards a dielectric surface, attach to it, and then
propagate over the surface. The simulations are performed in air with a
two-dimensional plasma fluid model, in which a flat dielectric is placed
between two plate electrodes. Electrostatic attraction is the main mechanism
that causes streamers to grow towards the dielectric. Due to the net charge in
the streamer head, the dielectric gets polarized, and the electric field
between the streamer and the dielectric is increased. Compared to streamers in
bulk gas, surface streamers have a smaller radius, a higher electric field, a
higher electron density, and a higher propagation velocity. A higher applied
voltage leads to faster inception and faster propagation of the surface
discharge. A higher dielectric permittivity leads to more rapid attachment of
the streamer to the surface and a thinner surface streamer. Secondary emission
coefficients are shown to play a modest role, which is due to relatively strong
photoionization in air. In the simulations, a high electric field is present
between the positive streamers and the dielectric surface. We show that the
magnitude and decay of this field are affected by the positive ion mobility.
Comment: 13 pages, 18 figures, 47 references