132 research outputs found

    KernelWarehouse: Towards Parameter-Efficient Dynamic Convolution

    Dynamic convolution learns a linear mixture of n static kernels weighted by their sample-dependent attentions, demonstrating superior performance compared to normal convolution. However, existing designs are parameter-inefficient: they increase the number of convolutional parameters by n times. This, together with the optimization difficulty, has prevented dynamic convolution research from using a significantly large value of n (e.g., n > 100 instead of the typical setting n < 10) to push forward the performance boundary. In this paper, we propose KernelWarehouse, a more general form of dynamic convolution, which can strike a favorable trade-off between parameter efficiency and representation power. Its key idea is to redefine the basic concepts of "kernels" and "assembling kernels" in dynamic convolution from the perspective of reducing kernel dimension and significantly increasing kernel number. In principle, KernelWarehouse enhances convolutional parameter dependencies within the same layer and across successive layers via tactful kernel partition and warehouse sharing, yielding a high degree of freedom to fit a desired parameter budget. We validate our method on the ImageNet and MS-COCO datasets with different ConvNet architectures, and show that it attains state-of-the-art results. For instance, the ResNet18|ResNet50|MobileNetV2|ConvNeXt-Tiny model trained with KernelWarehouse on ImageNet reaches 76.05%|81.05%|75.52%|82.51% top-1 accuracy. Thanks to its flexible design, KernelWarehouse can even reduce the model size of a ConvNet while improving accuracy; e.g., our ResNet18 models with 36.45%|65.10% parameter reduction relative to the baseline show 2.89%|2.29% absolute improvements in top-1 accuracy. Comment: This research work was completed and submitted in early May 2023. Code and pre-trained models are available at https://github.com/OSVAI/KernelWarehouse
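    For orientation, below is a minimal PyTorch-style sketch of the vanilla dynamic convolution that KernelWarehouse generalizes: n static kernels are mixed into one kernel per sample using softmax attentions, which is exactly the n-fold parameter overhead the paper targets. The module name, hyperparameters, and the attention branch are illustrative assumptions, not the paper's implementation (KernelWarehouse instead partitions kernels into smaller cells drawn from a shared warehouse).

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DynamicConv2d(nn.Module):
        """Vanilla dynamic convolution: y = conv(x, sum_i a_i(x) * W_i).

        Illustrative sketch only. Note the n-fold parameter cost in
        self.weight, which KernelWarehouse avoids by partitioning kernels
        and sharing a warehouse within and across layers.
        """
        def __init__(self, in_ch, out_ch, k=3, n_kernels=4):
            super().__init__()
            self.k, self.n_kernels = k, n_kernels
            # n static kernels -> n times the parameters of a normal conv
            self.weight = nn.Parameter(0.02 * torch.randn(n_kernels, out_ch, in_ch, k, k))
            # sample-dependent attention over the n kernels
            self.attn = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(in_ch, n_kernels))

        def forward(self, x):
            b, c, h, w = x.shape
            a = F.softmax(self.attn(x), dim=1)                        # (b, n)
            # assemble one kernel per sample as a linear mixture of the n kernels
            mixed = torch.einsum('bn,noipq->boipq', a, self.weight)   # (b, out, in, k, k)
            out_ch = mixed.shape[1]
            # grouped convolution applies each sample's assembled kernel to that sample
            y = F.conv2d(x.reshape(1, b * c, h, w),
                         mixed.reshape(b * out_ch, c, self.k, self.k),
                         padding=self.k // 2, groups=b)
            return y.reshape(b, out_ch, h, w)

    # e.g. DynamicConv2d(16, 32, k=3, n_kernels=4)(torch.randn(2, 16, 8, 8)) -> (2, 32, 8, 8)
    ```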

    NORM: Knowledge Distillation via N-to-One Representation Matching

    Existing feature distillation methods commonly adopt one-to-one representation matching between any pre-selected teacher-student layer pair. In this paper, we present N-to-One Representation Matching (NORM), a new two-stage knowledge distillation method, which relies on a simple Feature Transform (FT) module consisting of two linear layers. To preserve the intact information learnt by the teacher network, during training our FT module is merely inserted after the last convolutional layer of the student network. The first linear layer projects the student representation to a feature space having N times as many feature channels as the teacher representation from its last convolutional layer, and the second linear layer contracts the expanded output back to the original feature space. By sequentially splitting the expanded student representation into N non-overlapping feature segments, each having the same number of feature channels as the teacher's, they can readily be forced to approximate the intact teacher representation simultaneously, formulating a novel many-to-one representation matching mechanism conditioned on a single teacher-student layer pair. After training, such an FT module is naturally merged into the subsequent fully connected layer thanks to its linear property, introducing no extra parameters or architectural modifications to the student network at inference. Extensive experiments on different visual recognition benchmarks demonstrate the leading performance of our method. For instance, the ResNet18|MobileNet|ResNet50-1/4 model trained by NORM reaches 72.14%|74.26%|68.03% top-1 accuracy on the ImageNet dataset when using a pre-trained ResNet34|ResNet50|ResNet50 model as the teacher, achieving an absolute improvement of 2.01%|4.63%|3.03% over the individually trained counterpart. Code is available at https://github.com/OSVAI/NORM Comment: The NORM paper was published at ICLR 2023. Code and models are available at https://github.com/OSVAI/NORM
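    A rough sketch of the mechanism described above, under explicit assumptions: the two linear layers are written here as 1x1 convolutions acting on the spatial student feature (equivalent to per-location linear layers), and a simple mean-squared error is used for the many-to-one matching. The names FeatureTransform and norm_style_loss, the default N, and the choice of loss are illustrative, not the official NORM code.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FeatureTransform(nn.Module):
        """Two-layer linear FT module as described in the abstract (sketch).

        expand: student feature -> N times the teacher's channel count.
        contract: back to the student's original channel count, so the module
        can later be folded into the following fully connected layer.
        """
        def __init__(self, student_ch, teacher_ch, N=8):
            super().__init__()
            self.N, self.teacher_ch = N, teacher_ch
            self.expand = nn.Conv2d(student_ch, N * teacher_ch, 1, bias=False)    # 1x1 conv = per-location linear layer
            self.contract = nn.Conv2d(N * teacher_ch, student_ch, 1, bias=False)

        def forward(self, feat_s):
            expanded = self.expand(feat_s)          # (B, N*teacher_ch, H, W)
            out = self.contract(expanded)           # passed on to the student head
            return out, expanded

    def norm_style_loss(expanded, feat_t, N):
        """N-to-one matching: every one of the N segments approximates the teacher.

        Assumes the teacher and student features share the same spatial size.
        """
        segments = torch.chunk(expanded, N, dim=1)  # N tensors of shape (B, teacher_ch, H, W)
        return sum(F.mse_loss(seg, feat_t) for seg in segments) / N
    ```

    Because both layers are linear, their composition with the student's subsequent fully connected layer is itself linear, which is what allows the FT module to be merged away at inference, as the abstract notes.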

    Omni-Dimensional Dynamic Convolution

    Learning a single static convolutional kernel in each convolutional layer is the common training paradigm of modern Convolutional Neural Networks (CNNs). Instead, recent research on dynamic convolution shows that learning a linear combination of n convolutional kernels weighted by their input-dependent attentions can significantly improve the accuracy of light-weight CNNs, while maintaining efficient inference. However, we observe that existing works endow convolutional kernels with the dynamic property through one dimension of the kernel space (the convolutional kernel number), while the other three dimensions (the spatial size, the input channel number and the output channel number of each convolutional kernel) are overlooked. Inspired by this, we present Omni-dimensional Dynamic Convolution (ODConv), a more generalized yet elegant dynamic convolution design, to advance this line of research. ODConv leverages a novel multi-dimensional attention mechanism with a parallel strategy to learn complementary attentions for convolutional kernels along all four dimensions of the kernel space at any convolutional layer. As a drop-in replacement for regular convolutions, ODConv can be plugged into many CNN architectures. Extensive experiments on the ImageNet and MS-COCO datasets show that ODConv brings solid accuracy boosts for various prevailing CNN backbones, including both light-weight and large ones, e.g., 3.77%~5.71%|1.86%~3.72% absolute top-1 improvements for the MobileNetV2|ResNet family on the ImageNet dataset. Intriguingly, thanks to its improved feature learning ability, ODConv with even a single kernel can compete with or outperform existing dynamic convolution counterparts with multiple kernels, substantially reducing extra parameters. Furthermore, ODConv is also superior to other attention modules for modulating the output features or the convolutional weights. Comment: Spotlight paper at ICLR 2022. Code and models are available at https://github.com/OSVAI/ODConv
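    The four-dimensional attention described above can be pictured with the following sketch: a pooled input drives four parallel attention branches, one per dimension of the kernel space, and the resulting attentions rescale and mix the candidate kernels. Branch structure, hidden sizes, and the sigmoid/softmax gating choices are assumptions for illustration, not the released ODConv code.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class OmniAttention(nn.Module):
        """Four parallel attention branches over the kernel space (sketch)."""
        def __init__(self, in_ch, out_ch, k, n_kernels, reduction=4):
            super().__init__()
            hidden = max(in_ch // reduction, 4)
            self.squeeze = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                         nn.Linear(in_ch, hidden), nn.ReLU(inplace=True))
            self.a_spatial = nn.Linear(hidden, k * k)      # per spatial position of the kernel
            self.a_in = nn.Linear(hidden, in_ch)           # per input channel
            self.a_out = nn.Linear(hidden, out_ch)         # per output channel
            self.a_kernel = nn.Linear(hidden, n_kernels)   # per candidate kernel

        def forward(self, x):
            z = self.squeeze(x)
            return (torch.sigmoid(self.a_spatial(z)), torch.sigmoid(self.a_in(z)),
                    torch.sigmoid(self.a_out(z)), F.softmax(self.a_kernel(z), dim=1))

    def assemble_kernels(weights, a_sp, a_in, a_out, a_k):
        """Scale candidate kernels along all four dimensions, then mix them.

        weights: (n, out, in, k, k); attentions are per-sample outputs of OmniAttention.
        Returns one assembled kernel per sample, shape (B, out, in, k, k).
        """
        n, o, i, k, _ = weights.shape
        w = weights.unsqueeze(0)                           # (1, n, out, in, k, k)
        w = w * a_sp.view(-1, 1, 1, 1, k, k)               # spatial-size attention
        w = w * a_in.view(-1, 1, 1, i, 1, 1)               # input-channel attention
        w = w * a_out.view(-1, 1, o, 1, 1, 1)              # output-channel attention
        return (w * a_k.view(-1, n, 1, 1, 1, 1)).sum(1)    # kernel-number attention + mixture
    ```

    With n_kernels=1 the kernel-number branch becomes trivial while the other three attentions remain active, which matches the single-kernel setting the abstract highlights.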

    Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation

    In this paper, we propose an alternative method to estimate the room layouts of cluttered indoor scenes. This method enjoys the benefits of two novel techniques. The first is semantic transfer (ST), which is: (1) a formulation to integrate the relationship between scene clutter and room layout into convolutional neural networks; (2) an architecture that can be trained end-to-end; (3) a practical strategy to initialize weights for very deep networks under an unbalanced training data distribution. ST allows us to extract highly robust features under various circumstances, and to address the computational redundancy hidden in these features we develop a principled and efficient inference scheme named physics inspired optimization (PIO). The basic idea of PIO is to formulate phenomena observed in ST features in terms of mechanics concepts. Evaluations on the public datasets LSUN and Hedau show that the proposed method is more accurate than state-of-the-art methods. Comment: To appear in CVPR 2017. Project Page: https://sites.google.com/view/st-pio

    A Computational Study of Negative Surface Discharges: Characteristics of Surface Streamers and Surface Charges

    We investigate the dynamics of negative surface discharges in air through numerical simulations with a 2D fluid model. A geometry consisting of a flat dielectric embedded between parallel-plate electrodes is used. Compared to negative streamers in bulk gas, negative surface streamers are observed to have a higher electron density, a higher electric field and a higher propagation velocity. On the other hand, their maximum electric field and velocity are lower than those of positive surface streamers. In our simulations, negative surface streamers are slower for a larger relative permittivity. Negative charge accumulates on a dielectric surface when a negative streamer propagates along it, which can lead to a high electric field inside the dielectric. If we initially put negative surface charge on the dielectric, the growth of negative surface discharges is delayed or inhibited. Positive surface charge has the opposite effect. Comment: 8 pages
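    The abstract does not spell out the model equations. For orientation only, a classical drift-diffusion fluid model of the kind commonly used for such streamer simulations couples an electron continuity equation to Poisson's equation; the paper's exact source terms, transport coefficients, photoionization treatment and the interface condition collecting surface charge on the dielectric may differ.

    ```latex
    % Illustrative drift-diffusion fluid model (not copied from the paper):
    \begin{align}
      \partial_t n_e &= \nabla\cdot\bigl(\mu_e \mathbf{E}\, n_e + D_e \nabla n_e\bigr)
                       + (\alpha - \eta)\,\mu_e \lvert\mathbf{E}\rvert\, n_e + S_{\mathrm{ph}}, \\
      \nabla\cdot\bigl(\varepsilon_r \nabla \phi\bigr) &= -\frac{\rho}{\varepsilon_0},
      \qquad \mathbf{E} = -\nabla\phi .
    \end{align}
    ```

    Here n_e is the electron density, mu_e and D_e the electron mobility and diffusion coefficient, alpha and eta the ionization and attachment coefficients, S_ph a photoionization source, rho the space charge density (ions are evolved with analogous continuity equations), and epsilon_r the relative permittivity; charge deposited on the dielectric enters through the boundary condition at the gas-dielectric interface.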

    A computational study of positive streamers interacting with dielectrics

    We use numerical simulations to study the dynamics of surface discharges, which are common in high-voltage engineering. We simulate positive streamer discharges that propagate towards a dielectric surface, attach to it, and then propagate over the surface. The simulations are performed in air with a two-dimensional plasma fluid model, in which a flat dielectric is placed between two plate electrodes. Electrostatic attraction is the main mechanism that causes streamers to grow towards the dielectric. Due to the net charge in the streamer head, the dielectric gets polarized, and the electric field between the streamer and the dielectric is increased. Compared to streamers in bulk gas, surface streamers have a smaller radius, a higher electric field, a higher electron density, and a higher propagation velocity. A higher applied voltage leads to faster inception and faster propagation of the surface discharge. A higher dielectric permittivity leads to more rapid attachment of the streamer to the surface and a thinner surface streamer. Secondary emission coefficients are shown to play a modest role, which is due to the relatively strong photoionization in air. In the simulations, a high electric field is present between the positive streamers and the dielectric surface. We show that the magnitude and decay of this field are affected by the positive ion mobility. Comment: 13 pages, 18 figures, 47 references