Beyond Dropout: Feature Map Distortion to Regularize Deep Neural Networks
Deep neural networks often contain a great number of trainable parameters
for extracting powerful features from given datasets. On the one hand, these
massive trainable parameters significantly enhance the performance of deep
networks. On the other hand, they bring the problem of over-fitting. To
address this, dropout-based methods disable some elements of the output
feature maps during the training phase in order to reduce the co-adaptation of
neurons. Although these approaches can enhance the generalization ability of
the resulting models, conventional binary dropout is not the optimal solution.
Therefore, we investigate the empirical Rademacher complexity of the
intermediate layers of deep neural networks and propose a feature distortion
method (Disout) to address the aforementioned problem. During training,
randomly selected elements in the feature maps are replaced with specific
values derived by exploiting the generalization error bound. The superiority
of the proposed feature map distortion for producing deep neural networks with
higher test performance is analyzed and demonstrated on several benchmark
image datasets.
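
As a rough illustration of the mechanism this abstract describes (a minimal
sketch, not the authors' implementation): conventional binary dropout zeroes
randomly chosen feature-map elements, whereas a distortion-style method
replaces the selected elements with other values. The perturbation scale
alpha below is a hypothetical placeholder for whatever value the
generalization error bound would prescribe in Disout.

import torch

def binary_dropout(x: torch.Tensor, p: float = 0.5) -> torch.Tensor:
    # Conventional inverted dropout: zero elements with probability p,
    # rescale the survivors so the expected activation is unchanged.
    mask = (torch.rand_like(x) >= p).float()
    return x * mask / (1.0 - p)

def feature_distortion(x: torch.Tensor, p: float = 0.5,
                       alpha: float = 0.1) -> torch.Tensor:
    # Distortion-style alternative: perturb the selected elements instead
    # of zeroing them. Here the replacement values are generic Gaussian
    # perturbations; alpha is an assumed stand-in for the magnitude Disout
    # derives from its generalization error bound.
    mask = (torch.rand_like(x) < p).float()
    noise = alpha * torch.randn_like(x)
    return x + noise * mask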
Universum Prescription: Regularization using Unlabeled Data
This paper shows that simply prescribing "none of the above" labels to
unlabeled data has a beneficial regularization effect on supervised learning.
We call this universum prescription, because the prescribed labels cannot be
one of the supervised labels. In spite of its simplicity, universum
prescription obtained competitive results in training deep convolutional
networks on the CIFAR-10, CIFAR-100, STL-10, and ImageNet datasets. A
qualitative justification of these approaches using Rademacher complexity is
presented. The effect of a regularization parameter -- the probability of
sampling from unlabeled data -- is also studied empirically.
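
A minimal sketch of how such a scheme might look in practice (my assumed
setup, not the paper's code): the classifier gets num_classes + 1 outputs,
with the extra index reserved for "none of the above", and each training step
draws from the unlabeled pool with probability p_unlabeled, the regularization
parameter the abstract mentions. The loader and function names here are
hypothetical.

import random
import torch
import torch.nn.functional as F

def universum_step(model, labeled_loader, unlabeled_loader,
                   num_classes: int, p_unlabeled: float = 0.2):
    # With probability p_unlabeled, train on an unlabeled batch whose
    # prescribed label is the extra "none of the above" class.
    if random.random() < p_unlabeled:
        x = next(iter(unlabeled_loader))  # unlabeled images only
        y = torch.full((x.size(0),), num_classes, dtype=torch.long)
    else:
        x, y = next(iter(labeled_loader))  # ordinary supervised batch
    logits = model(x)  # shape: (batch, num_classes + 1)
    return F.cross_entropy(logits, y)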
Generalization Error in Deep Learning
Deep learning models have lately shown great performance in various fields
such as computer vision, speech recognition, speech translation, and natural
language processing. However, alongside their state-of-the-art performance, it
is still generally unclear what the source of their generalization ability is.
Thus, an important question is what makes deep neural networks able to
generalize well from the training set to new data. In this article, we provide
an overview of the existing theory and bounds for characterizing the
generalization error of deep neural networks, combining both classical and
more recent theoretical and empirical results.
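
For context, a classical bound of the kind such surveys cover (the standard
Rademacher-complexity generalization bound, stated here as background rather
than quoted from this article): for a loss bounded in $[0,1]$ and a sample of
size $m$, with probability at least $1-\delta$, every hypothesis $h$ in the
class $\mathcal{H}$ satisfies
\[
  L(h) \;\le\; \hat{L}(h) \;+\; 2\,\mathfrak{R}_m(\mathcal{G})
  \;+\; \sqrt{\frac{\ln(1/\delta)}{2m}},
\]
where $L(h)$ is the expected risk, $\hat{L}(h)$ the empirical risk on the $m$
training points, and $\mathfrak{R}_m(\mathcal{G})$ the Rademacher complexity
of the associated loss class $\mathcal{G}$.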