Dropout Training as Adaptive Regularization
Dropout and other feature noising schemes control overfitting by artificially
corrupting the training data. For generalized linear models, dropout performs a
form of adaptive regularization. Using this viewpoint, we show that the dropout
regularizer is first-order equivalent to an L2 regularizer applied after
scaling the features by an estimate of the inverse diagonal Fisher information
matrix. We also establish a connection to AdaGrad, an online learning
algorithm, and find that a close relative of AdaGrad operates by repeatedly
solving linear dropout-regularized problems. By casting dropout as
regularization, we develop a natural semi-supervised algorithm that uses
unlabeled data to create a better adaptive regularizer. We apply this idea to
document classification tasks, and show that it consistently boosts the
performance of dropout training, improving on state-of-the-art results on the
IMDB reviews dataset.
Comment: 11 pages. Advances in Neural Information Processing Systems (NIPS), 2013
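As a rough illustration of the first-order equivalence claimed in this abstract, the sketch below (assuming binary logistic regression and NumPy; the function and variable names are ours, not the paper's) computes the quadratic dropout penalty: an L2 penalty in which each coordinate is weighted by an estimate of the diagonal Fisher information.

```python
import numpy as np

def dropout_quadratic_penalty(X, beta, delta=0.5):
    """Approximate dropout regularizer for logistic regression (a sketch).

    First-order view: an L2 penalty on beta in which coordinate j is
    weighted by an estimate of the diagonal Fisher information,
    sum_i p_i * (1 - p_i) * X[i, j]**2, scaled by delta / (1 - delta).
    """
    p = 1.0 / (1.0 + np.exp(-X @ beta))        # predicted probabilities
    fisher_diag = (p * (1.0 - p)) @ (X ** 2)   # per-feature curvature estimate
    return 0.5 * delta / (1.0 - delta) * np.sum(fisher_diag * beta ** 2)

# Usage on toy data: the penalty adapts to the data through fisher_diag,
# unlike a plain L2 penalty 0.5 * lam * np.sum(beta ** 2).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
beta = rng.normal(size=20)
print(dropout_quadratic_penalty(X, beta, delta=0.5))
```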
Lyapunov-Based Dropout Deep Neural Network (Lb-DDNN) Controller
Deep neural network (DNN)-based adaptive controllers can be used to
compensate for unstructured uncertainties in nonlinear dynamic systems.
However, DNNs are also very susceptible to overfitting and co-adaptation.
Dropout regularization is an approach where nodes are randomly dropped during
training to alleviate issues such as overfitting and co-adaptation. In this
paper, a dropout DNN-based adaptive controller is developed. The developed
dropout technique allows the deactivation of weights that are stochastically
selected for each individual layer within the DNN. Simultaneously, a
Lyapunov-based real-time weight adaptation law is introduced to update the
weights of all layers of the DNN for online unsupervised learning. A non-smooth
Lyapunov-based stability analysis is performed to ensure asymptotic convergence
of the tracking error. Simulation results of the developed dropout DNN-based
adaptive controller indicate a 38.32% improvement in the tracking error, a
53.67% improvement in the function approximation error, and 50.44% lower
control effort when compared to a baseline adaptive DNN-based controller
without dropout regularization.
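A minimal sketch of the layer-wise stochastic weight deactivation described above (the network shape, activation, retention rate, and function names are illustrative assumptions, not the controller or adaptation law from the paper):

```python
import numpy as np

def dropout_forward(x, weights, rng, keep_prob=0.8):
    """Forward pass through a feedforward DNN in which a Bernoulli mask,
    drawn independently for each layer, deactivates a random subset of
    that layer's weights for the current update step."""
    a = x
    for W in weights:
        mask = rng.binomial(1, keep_prob, size=W.shape)  # per-layer weight mask
        a = np.tanh((W * mask) @ a)                      # only active weights contribute
    return a

# Usage on a toy 3-layer network: each call draws fresh masks,
# so different weight subsets are active at each update step.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 4)), rng.normal(size=(8, 8)), rng.normal(size=(2, 8))]
x = rng.normal(size=4)
print(dropout_forward(x, weights, rng))
```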
Curriculum Dropout
Dropout is a very effective way of regularizing neural networks.
Stochastically "dropping out" units with a certain probability discourages
over-specific co-adaptations of feature detectors, preventing overfitting and
improving network generalization. In addition, Dropout can be interpreted as an
approximate model aggregation technique, where an exponential number of smaller
networks are averaged in order to get a more powerful ensemble. In this paper,
we show that using a fixed dropout probability during training is a suboptimal
choice. We thus propose a time scheduling for the probability of retaining
neurons in the network. This induces an adaptive regularization scheme that
smoothly increases the difficulty of the optimization problem. This idea of
"starting easy" and adaptively increasing the difficulty of the learning
problem has its roots in curriculum learning and allows one to train better
models. Indeed, we prove that our optimization strategy implements a very
general curriculum scheme, by gradually adding noise to both the input and
intermediate feature representations within the network architecture.
Experiments on seven image classification datasets and different network
architectures show that our method, named Curriculum Dropout, frequently yields
better generalization and, at worst, performs just as well as the standard
Dropout method.
Comment: Accepted at ICCV (International Conference on Computer Vision) 2017
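A sketch of the kind of retention-probability schedule this abstract describes: one plausible form (an assumption on our part, not a quote of the paper's exact schedule) is an exponential decay from keeping every unit toward a target retention rate, so noise is introduced gradually.

```python
import numpy as np

def retain_probability(t, theta_bar=0.5, gamma=1e-3):
    """Scheduled probability of retaining a unit at training step t:
    starts at 1 (no dropout, the 'easy' problem) and decays smoothly
    toward the target retention rate theta_bar, gradually increasing
    the amount of noise injected into the network."""
    return (1.0 - theta_bar) * np.exp(-gamma * t) + theta_bar

# Usage: retention early vs. late in training.
print(retain_probability(0))       # 1.0 -> all units kept at the start
print(retain_probability(10_000))  # close to theta_bar once training matures
```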