Improving the generalization ability of modern deep neural networks (DNNs) is
a fundamental challenge in machine learning. Two branches of methods have been
proposed to seek flat minima and improve generalization: one led by
sharpness-aware minimization (SAM) minimizes the worst-case neighborhood loss
through adversarial weight perturbation (AWP), and the other minimizes the
expected Bayes objective with random weight perturbation (RWP). While RWP
offers advantages in computation and is closely linked to AWP on a mathematical
basis, its empirical performance has consistently lagged behind that of AWP. In
this paper, we revisit the use of RWP for improving generalization and propose
improvements from two perspectives: i) the trade-off between generalization and
convergence, and ii) the generation of the random perturbations. Through extensive
experimental evaluations, we demonstrate that our enhanced RWP methods improve
generalization more efficiently, particularly on large-scale problems, while
offering performance comparable or even superior to SAM.
The code is released at https://github.com/nblt/mARWP.
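
To make the contrast between the two perturbation schemes concrete, the sketch below outlines one training step of SAM-style adversarial weight perturbation (AWP) and one step of vanilla RWP in PyTorch. This is an illustrative sketch under assumed placeholders (a `model`, a loss `criterion`, a base optimizer `opt`, perturbation radius `rho`, and noise scale `sigma`), not the released mARWP implementation, and the paper's proposed improvements are not shown.

```python
import torch

def sam_step(model, criterion, opt, x, y, rho=0.05):
    """One SAM-style step: ascend to an approximate worst-case neighbor, then descend."""
    opt.zero_grad()
    criterion(model(x), y).backward()                      # gradient at the current weights w
    grad_norm = torch.norm(torch.stack(
        [p.grad.norm() for p in model.parameters() if p.grad is not None]))
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            e = rho * p.grad / (grad_norm + 1e-12) if p.grad is not None else None
            if e is not None:
                p.add_(e)                                  # w -> w + eps (adversarial perturbation)
            eps.append(e)
    opt.zero_grad()
    criterion(model(x), y).backward()                      # gradient at the perturbed weights
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)                                  # restore w
    opt.step()                                             # update w using the perturbed gradient

def rwp_step(model, criterion, opt, x, y, sigma=0.01):
    """One RWP step: a one-sample estimate of the expected loss under Gaussian weight noise."""
    opt.zero_grad()
    eps = [sigma * torch.randn_like(p) for p in model.parameters()]
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.add_(e)                                      # w -> w + random noise
    criterion(model(x), y).backward()                      # gradient at the noisy weights
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.sub_(e)                                      # restore w
    opt.step()                                             # update w using the noisy gradient
```

Note that the RWP step requires only one forward-backward pass per iteration, whereas the SAM step requires two; this is the computational advantage of RWP referred to above.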