Generalization Error in Deep Learning
Deep learning models have lately shown great performance in various fields
such as computer vision, speech recognition, speech translation, and natural
language processing. However, alongside their state-of-the-art performance, it
is still generally unclear what the source of their generalization ability is.
Thus, an important question is what makes deep neural networks able to
generalize well from the training set to new data. In this article, we provide
an overview of the existing theory and bounds for the characterization of the
generalization error of deep neural networks, combining both classical and more
recent theoretical and empirical results.
Continual Invariant Risk Minimization
Empirical risk minimization can lead to poor generalization behavior on
unseen environments if the learned model does not capture invariant feature
representations. Invariant risk minimization (IRM) is a recent proposal for
discovering environment-invariant representations. IRM was introduced by
Arjovsky et al. (2019) and extended by Ahuja et al. (2020). IRM assumes that
all environments are available to the learning system at the same time. With
this work, we generalize the concept of IRM to scenarios where environments are
observed sequentially. We show that existing approaches, including those
designed for continual learning, fail to identify the invariant features and
models across sequentially presented environments. We extend IRM under a
variational Bayesian and bilevel framework, creating a general approach to
continual invariant risk minimization. We also describe a strategy to solve the
optimization problems using a variant of the alternating direction method of
multipliers (ADMM). We show empirically, using multiple datasets and
multiple sequential environments, that the proposed methods outperform or are
competitive with prior approaches.
Comment: A shorter version of this paper was presented at the RobustML workshop of
ICLR 202
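
For orientation, the batch IRM setting that this abstract starts from can be sketched in a few lines. The snippet below is a minimal illustration of the IRMv1 objective from Arjovsky et al. (2019) for a linear predictor with squared loss: each environment contributes its risk plus a penalty on the squared gradient of that risk with respect to a dummy scalar classifier fixed at 1.0. It is not the continual, variational Bayesian, or ADMM-based method described above. The function names, the linear model `phi`, and the weight `lam` are hypothetical choices made for this example.

```python
import numpy as np

def irmv1_penalty(preds, targets):
    """Squared gradient of the squared-error risk w.r.t. a dummy scale w at w = 1.

    For R(w) = mean((w * preds - targets)^2), dR/dw at w = 1 equals
    mean(2 * (preds - targets) * preds).
    """
    grad = np.mean(2.0 * (preds - targets) * preds)
    return grad ** 2

def irmv1_objective(envs, phi, lam):
    """Sum over environments of risk + lam * penalty (all environments at once).

    envs: list of (x, y) pairs, x of shape (n, d), y of shape (n,)
    phi:  weight vector of shape (d,), an assumed linear representation/predictor
    lam:  penalty weight (hypothetical hyperparameter name)
    """
    total = 0.0
    for x, y in envs:
        preds = x @ phi
        risk = np.mean((preds - y) ** 2)
        total += risk + lam * irmv1_penalty(preds, y)
    return total
```

Note that the objective loops over every environment in a single pass, which is exactly the simultaneous-availability assumption that the abstract proposes to relax.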