Search CORE

6,596 research outputs found

Input and Weight Space Smoothing for Semi-supervised Learning

Author: Cicek Safa
Soatto Stefano
Publication venue
Publication date: 23/05/2018
Field of study

We propose regularizing the empirical loss for semi-supervised learning by acting on both the input (data) space, and the weight (parameter) space. We show that the two are not equivalent, and in fact are complementary, one affecting the minimality of the resulting representation, the other insensitivity to nuisance variability. We propose a method to perform such smoothing, which combines known input-space smoothing with a novel weight-space smoothing, based on a min-max (adversarial) optimization. The resulting Adversarial Block Coordinate Descent (ABCD) algorithm performs gradient ascent with a small learning rate for a random subset of the weights, and standard gradient descent on the remaining weights in the same mini-batch. It achieves comparable performance to the state-of-the-art without resorting to heavy data augmentation, using a relatively simple architecture

arXiv.org e-Print Archive

Crossref

Generalization Error in Deep Learning

Author: D McAllester
D Vainsencher
DA McAllester
Daniel Jakubovitz
Huan Xu
J Bruna
J Sokolic
K Schnass
M Anthony
N Akhtar
PL Bartlett
PL Bartlett
R Gribonval
R Gribonval
S Shalev-Shwartz
SJ Pan
TM Cover
V Papyan
Publication venue
Publication date: 06/04/2019
Field of study

Deep learning models have lately shown great performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, alongside their state-of-the-art performance, it is still generally unclear what is the source of their generalization ability. Thus, an important question is what makes deep neural networks able to generalize well from the training set to new data. In this article, we provide an overview of the existing theory and bounds for the characterization of the generalization error of deep neural networks, combining both classical and more recent theoretical and empirical results

arXiv.org e-Print Archive

Crossref

UCL Discovery