6,596 research outputs found
Input and Weight Space Smoothing for Semi-supervised Learning
We propose regularizing the empirical loss for semi-supervised learning by
acting on both the input (data) space, and the weight (parameter) space. We
show that the two are not equivalent, and in fact are complementary, one
affecting the minimality of the resulting representation, the other
insensitivity to nuisance variability. We propose a method to perform such
smoothing, which combines known input-space smoothing with a novel weight-space
smoothing, based on a min-max (adversarial) optimization. The resulting
Adversarial Block Coordinate Descent (ABCD) algorithm performs gradient ascent
with a small learning rate for a random subset of the weights, and standard
gradient descent on the remaining weights in the same mini-batch. It achieves
comparable performance to the state-of-the-art without resorting to heavy data
augmentation, using a relatively simple architecture
Generalization Error in Deep Learning
Deep learning models have lately shown great performance in various fields
such as computer vision, speech recognition, speech translation, and natural
language processing. However, alongside their state-of-the-art performance, it
is still generally unclear what is the source of their generalization ability.
Thus, an important question is what makes deep neural networks able to
generalize well from the training set to new data. In this article, we provide
an overview of the existing theory and bounds for the characterization of the
generalization error of deep neural networks, combining both classical and more
recent theoretical and empirical results
- …