Convergence of Unregularized Online Learning Algorithms
In this paper we study the convergence of online gradient descent algorithms
in reproducing kernel Hilbert spaces (RKHSs) without regularization. We
establish a sufficient condition and a necessary condition for the convergence
of excess generalization errors in expectation. A sufficient condition for the
almost sure convergence is also given. With high probability, we provide
explicit convergence rates of the excess generalization errors for both
averaged iterates and the last iterate, which in turn also imply convergence
rates with probability one. To the best of our knowledge, this is the first
high-probability convergence rate for the last iterate of online gradient
descent algorithms without strong convexity. Without any boundedness
assumptions on the iterates, our results are derived through a novel use of two
measures of the algorithm's one-step progress, one based on generalization
errors and the other on distances in RKHSs, where the variances of the involved
martingales are cancelled out by the descent property of the algorithm.
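As a concrete illustration of the algorithm the abstract analyzes (a minimal sketch, not the paper's exact setting), the following implements unregularized online gradient descent for the least-squares loss in an RKHS; the Gaussian kernel, the step-size schedule eta_t = t^(-1/2), the averaging convention, and all names are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(z)) ** 2))

def online_kernel_gd(stream, step=lambda t: t ** -0.5, gamma=1.0):
    """Unregularized online gradient descent for least squares in an RKHS.

    The iterate f_t is represented by its kernel expansion
    f_t = sum_i c_i K(x_i, .); the update on sample (x_t, y_t) is
    f_{t+1} = f_t - eta_t * (f_t(x_t) - y_t) * K(x_t, .),
    with no regularization term shrinking the iterate.
    """
    xs, coeffs = [], []
    for t, (x_t, y_t) in enumerate(stream, start=1):
        # Evaluate the current iterate at the incoming point: f_t(x_t).
        pred = sum(c * rbf_kernel(x_i, x_t, gamma) for c, x_i in zip(coeffs, xs))
        # One gradient step appends a new expansion point whose
        # coefficient is -eta_t * (f_t(x_t) - y_t).
        xs.append(x_t)
        coeffs.append(-step(t) * (pred - y_t))
    # Averaged iterate (1/T) * sum_{t=1}^{T} f_t (one common convention,
    # with f_1 = 0): the coefficient added at step t appears in
    # f_{t+1}, ..., f_T, hence the (T - t) / T weight.
    T = len(xs)
    avg_coeffs = [c * (T - t) / T for t, c in enumerate(coeffs, start=1)]
    return (xs, coeffs), (xs, avg_coeffs)
```

Running this on a synthetic stream, e.g. `stream = [(np.random.randn(2), float(np.random.randn())) for _ in range(100)]`, produces the last and averaged iterates whose convergence rates the abstract refers to.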
DomainDrop: Suppressing Domain-Sensitive Channels for Domain Generalization
Deep Neural Networks have exhibited considerable success in various visual
tasks. However, when applied to unseen test datasets, state-of-the-art models
often suffer performance degradation due to domain shifts. In this paper, we
introduce a novel approach to domain generalization that enhances the
robustness of channels in feature maps to domain shifts. We
observe that models trained on source domains contain a substantial number of
channels that exhibit unstable activations across different domains, which are
inclined to capture domain-specific features and behave abnormally when exposed
to unseen target domains. To address this issue, we propose a DomainDrop
framework that continuously enhances channel robustness to domain shifts: a
domain discriminator identifies and drops unstable channels in the feature maps
of each network layer during forward propagation. We theoretically prove that
our framework effectively lowers the generalization bound.
Extensive experiments on several benchmarks indicate that our framework
achieves state-of-the-art performance compared to other competing methods. Our
code is available at https://github.com/lingeringlight/DomainDrop.
Comment: Accepted by ICCV 2023.
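To make the mechanism concrete, here is a minimal PyTorch-style sketch of the idea (a hypothetical reconstruction, not the authors' released code): a per-layer domain discriminator scores channels by how strongly the domain-classification loss depends on them, and the most domain-sensitive channels are zeroed during the forward pass. The discriminator architecture, the gradient-based scoring rule, and the drop ratio are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ChannelDomainDiscriminator(nn.Module):
    """Predicts the source domain from channel-wise pooled features.

    A minimal stand-in for the per-layer discriminator described in the
    abstract; the architecture here is an illustrative guess.
    """
    def __init__(self, num_channels, num_domains):
        super().__init__()
        self.fc = nn.Linear(num_channels, num_domains)

    def forward(self, feat):                      # feat: (B, C, H, W)
        pooled = feat.mean(dim=(2, 3))            # (B, C) channel descriptors
        return self.fc(pooled)

def drop_domain_sensitive_channels(feat, disc, domain_labels, drop_ratio=0.33):
    """Mask the channels the discriminator relies on most.

    Channel sensitivity is scored by the gradient of the domain loss
    w.r.t. each channel descriptor (one plausible criterion); the top
    `drop_ratio` fraction of channels is zeroed and the rest rescaled,
    dropout-style, to preserve the expected activation magnitude.
    """
    pooled = feat.mean(dim=(2, 3)).detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(disc.fc(pooled), domain_labels)
    grads, = torch.autograd.grad(loss, pooled)
    scores = grads.abs().mean(dim=0)              # (C,) per-channel sensitivity
    k = int(drop_ratio * scores.numel())
    mask = torch.ones_like(scores)
    mask[scores.topk(k).indices] = 0.0
    return feat * mask.view(1, -1, 1, 1) / (1.0 - drop_ratio)
```

Presumably, as with standard dropout, such masking would be active only during training and disabled at inference; the abstract does not spell out the test-time behavior.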
ALOFT: A Lightweight MLP-like Architecture with Dynamic Low-frequency Transform for Domain Generalization
Domain generalization (DG) aims to learn a model that generalizes well to
unseen target domains utilizing multiple source domains without re-training.
Most existing DG works are based on convolutional neural networks (CNNs).
However, the local operation of the convolution kernel makes the model focus
too much on local representations (e.g., texture), which leaves it more prone
to overfitting the source domains and hampers its generalization ability.
Recently, several MLP-based methods have achieved
promising results in supervised learning tasks by learning global interactions
among different patches of the image. Inspired by this, in this paper, we first
analyze the difference between CNN and MLP methods in DG and find that MLP
methods exhibit a better generalization ability because they can better capture
the global representations (e.g., structure) than CNN methods. Then, based on a
recent lightweight MLP method, we obtain a strong baseline that outperforms
most state-of-the-art CNN-based methods. The baseline learns global structure
representations using a filter that suppresses structure-irrelevant
information in the frequency space. Moreover, we propose a dynAmic
LOw-Frequency spectrum Transform (ALOFT) that can perturb local texture
features while preserving global structure features, thus enabling the filter
to remove structure-irrelevant information sufficiently. Extensive experiments
on four benchmarks demonstrate that our method achieves substantial
performance improvements over state-of-the-art CNN-based DG methods while
using a small number of parameters. Our code is available at
https://github.com/lingeringlight/ALOFT/.
Comment: Accepted by CVPR 2023.
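The core operation can be sketched as follows (a hypothetical reconstruction, not the released implementation): the amplitudes of low-frequency components, which mainly carry texture and style, are perturbed with random noise, while the phase, which mainly carries global structure, is left untouched. The Gaussian perturbation model, the mask radius, and all parameter names are assumptions for illustration.

```python
import torch

def low_freq_amplitude_perturb(x, radius_ratio=0.1, sigma=0.3):
    """Perturb low-frequency amplitudes while keeping phase intact.

    x: (B, C, H, W) feature maps or images. Phase is preserved
    (global structure); only amplitudes inside a low-frequency
    window are rescaled by multiplicative Gaussian noise (texture).
    """
    B, C, H, W = x.shape
    spec = torch.fft.fft2(x)
    amp, phase = spec.abs(), spec.angle()

    # Build a centered low-frequency mask, then shift it back to FFT layout.
    mask = torch.zeros(H, W, device=x.device)
    ch, cw = H // 2, W // 2
    rh, rw = max(1, int(radius_ratio * H)), max(1, int(radius_ratio * W))
    mask[ch - rh:ch + rh, cw - rw:cw + rw] = 1.0
    mask = torch.fft.ifftshift(mask)

    # Multiplicative Gaussian noise on the masked (low-frequency) amplitudes.
    noise = 1.0 + sigma * torch.randn(B, C, H, W, device=x.device)
    amp = amp * (1 - mask) + amp * noise * mask

    # Recombine amplitude and phase and return to the spatial domain.
    return torch.fft.ifft2(torch.polar(amp, phase)).real
```

Applied during training, such a transform exposes the subsequent filter to varied low-frequency (texture) statistics, which is consistent with the abstract's goal of removing structure-irrelevant information.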