MAT: A Multi-strength Adversarial Training Method to Mitigate Adversarial Attacks
Recent works have revealed that deep neural networks (DNNs) are vulnerable to
so-called adversarial attacks, where input examples are intentionally perturbed
to fool DNNs. In this work, we revisit the DNN training process that
incorporates adversarial examples into the training dataset to improve the
resilience of DNNs to adversarial attacks, namely, adversarial training. Our
experiments show that different adversarial strengths, i.e., perturbation
levels of the adversarial examples, are effective in different working zones
against an attack. Based on this observation, we propose a multi-strength
adversarial training method (MAT) that combines adversarial training examples
of different strengths to defend against adversarial attacks. Two training
structures - mixed MAT and parallel MAT - are developed to trade off training
time against memory occupation. Our results show that MAT substantially
reduces the accuracy degradation of deep learning systems under adversarial
attacks on MNIST, CIFAR-10, CIFAR-100, and SVHN.
Comment: 6 pages, 4 figures, 2 tables
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
The complicated architecture and high training cost of vision transformers
motivate the exploration of post-training quantization. However, the heavy-tailed
distribution of vision transformer activations hinders the effectiveness of
previous post-training quantization methods, even with advanced quantizer
designs. Instead of tuning the quantizer to better fit the complicated
activation distribution, this paper proposes NoisyQuant, a quantizer-agnostic
enhancement for the post-training activation quantization performance of vision
transformers. We make a surprising theoretical discovery that for a given
quantizer, adding a fixed Uniform noisy bias to the values being quantized can
significantly reduce the quantization error under provable conditions. Building
on this theoretical insight, NoisyQuant achieves the first success in actively
altering the heavy-tailed activation distribution with an additive noisy bias to
fit a given quantizer. Extensive experiments show that NoisyQuant largely improves
the post-training quantization performance of vision transformers with minimal
computation overhead. For instance, on linear uniform 6-bit activation
quantization, NoisyQuant improves SOTA top-1 accuracy on ImageNet by up to
1.7%, 1.1%, and 0.5% for ViT, DeiT, and Swin Transformer, respectively,
achieving performance on par with or even higher than previous nonlinear,
mixed-precision quantization methods.
Comment: Accepted to CVPR202
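The mechanism the abstract describes, adding a fixed uniform noisy bias to activations before quantization and removing it after dequantization, can be sketched with a plain linear uniform quantizer. This is a minimal sketch, not the paper's method: the bias range (half a quantization step) and the Student-t "activations" are illustrative assumptions, and the paper's provable conditions for error reduction are not reproduced here.

```python
import numpy as np

def uniform_quantize(x, n_bits, lo, hi):
    # plain linear uniform quantizer over [lo, hi]
    levels = 2 ** n_bits - 1
    scale = (hi - lo) / levels
    q = np.clip(np.round((x - lo) / scale), 0, levels)
    return q * scale + lo

def noisy_quantize(x, noise, n_bits, lo, hi):
    # NoisyQuant-style enhancement (sketch): add a fixed uniform noisy bias
    # before quantizing, then subtract the same bias after dequantization
    return uniform_quantize(x + noise, n_bits, lo, hi) - noise

rng = np.random.default_rng(0)
acts = rng.standard_t(df=3, size=4096)     # heavy-tailed toy "activations"
lo, hi = acts.min(), acts.max()
step = (hi - lo) / (2 ** 6 - 1)
# fixed noisy bias, sampled once and reused for every input (quantizer-agnostic)
noise = rng.uniform(-step / 2, step / 2, size=acts.shape)

mse_plain = np.mean((uniform_quantize(acts, 6, lo, hi) - acts) ** 2)
mse_noisy = np.mean((noisy_quantize(acts, noise, 6, lo, hi) - acts) ** 2)
print(mse_plain, mse_noisy)
```

Because the bias is fixed, it can be folded into the following layer at inference, which is why the overhead the abstract mentions stays minimal.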
Intuition-aware Mixture-of-Rank-1-Experts for Parameter Efficient Finetuning
Large Language Models (LLMs) have demonstrated significant potential in
performing multiple tasks in multimedia applications, ranging from content
generation to interactive entertainment and artistic creation. However, the
diversity of downstream tasks in multitask scenarios presents substantial
adaptation challenges for LLMs. While traditional methods often suffer from
knowledge confusion in their monolithic dense models, Mixture-of-Experts (MoE)
has emerged as a promising solution, with its sparse architecture enabling
effective task decoupling. Inspired by the principles of human cognitive
neuroscience, we design a novel framework \texttt{Intuition-MoR1E} that
leverages the inherent semantic clustering of instances to mimic how the human
brain handles multiple tasks, offering implicit guidance to the router for
optimized feature allocation. Moreover, we introduce a cutting-edge Rank-1
Experts formulation designed to manage a spectrum of intuitions, demonstrating
enhanced parameter efficiency and effectiveness in multitask LLM finetuning.
Extensive experiments demonstrate that Intuition-MoR1E achieves superior
efficiency and a 2.15\% overall accuracy improvement across 14 public datasets
over other state-of-the-art baselines.
Comment: 13 pages, 5 figures
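The rank-1-experts idea the abstract describes can be sketched as a layer where each expert contributes a rank-1 weight delta on top of a frozen base projection, mixed by a learned router. This is a minimal numpy sketch under assumed shapes and a simple dense softmax router, not the paper's architecture; the class and parameter names are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class Rank1MoE:
    # Sketch of a rank-1-experts layer: expert i contributes a rank-1 weight
    # delta u_i v_i^T on top of a frozen base projection W, so each expert
    # adds only d_in + d_out parameters instead of a full d_in x d_out matrix.
    def __init__(self, d_in, d_out, n_experts, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_in, d_out)) * 0.02   # frozen base weight
        self.u = rng.standard_normal((n_experts, d_in)) * 0.02
        self.v = rng.standard_normal((n_experts, d_out)) * 0.02
        self.router = rng.standard_normal((d_in, n_experts)) * 0.02

    def forward(self, x):
        gate = softmax(x @ self.router)   # (batch, n_experts) routing weights
        base = x @ self.W                 # frozen-path output
        proj = x @ self.u.T               # (batch, n_experts): x . u_i per expert
        delta = (gate * proj) @ self.v    # gated sum of rank-1 expert outputs
        return base + delta

layer = Rank1MoE(d_in=16, d_out=8, n_experts=4)
x = np.random.default_rng(1).standard_normal((5, 16))
out = layer.forward(x)
print(out.shape)  # (5, 8)
```

The parameter efficiency comes from the rank-1 factorization: with 4 experts here, the expert parameters total 4 x (16 + 8) = 96 values versus 4 x 16 x 8 = 512 for full-rank experts.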