Data Dropout in Arbitrary Basis for Deep Network Regularization
An important problem in training deep networks with high capacity is ensuring
that the trained network performs well when presented with new inputs outside
the training dataset. Dropout is an effective regularization technique for
boosting network generalization, in which a random subset of the elements of
the given data and the extracted features is set to zero during training. In
this paper, we propose a new randomized regularization technique in which a
random part of the data is withheld without necessarily turning off the
neurons/data-elements. In the proposed method, of which the
conventional dropout is shown to be a special case, random data dropout is
performed in an arbitrary basis, hence the designation Generalized Dropout. We
also present a framework whereby the proposed technique can be applied
efficiently to convolutional neural networks. The presented numerical
experiments demonstrate that the proposed technique yields notable performance
gain. Generalized Dropout provides new insight into the idea of dropout, shows
that different basis matrices yield different performance gains, and opens up a
new research question as to how to choose optimal basis matrices that achieve
the maximal performance gain.
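As a rough illustration of the idea, the sketch below applies dropout to the coefficients of a feature vector in an arbitrary orthonormal basis; choosing the identity matrix as the basis recovers conventional element-wise dropout. The basis matrix, drop rate, and inverted-dropout rescaling are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def generalized_dropout(x, basis, drop_prob=0.5, rng=None):
    # x     : (n,) data or feature vector
    # basis : (n, n) orthonormal basis matrix; np.eye(n) recovers
    #         conventional (element-wise) dropout
    rng = np.random.default_rng() if rng is None else rng
    coeffs = basis.T @ x                          # project onto the basis
    keep = rng.random(coeffs.shape) >= drop_prob  # random coefficient mask
    coeffs = coeffs * keep / (1.0 - drop_prob)    # drop and rescale (inverted dropout)
    return basis @ coeffs                         # map back to the original domain

# usage: dropout in a random orthonormal basis vs. the identity basis
rng = np.random.default_rng(0)
x = rng.normal(size=8)
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))      # an arbitrary orthonormal basis
x_arbitrary = generalized_dropout(x, Q, rng=rng)
x_standard = generalized_dropout(x, np.eye(8), rng=rng)
```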
Convolutional Neural Networks for Sentence Classification
We report on a series of experiments with convolutional neural networks (CNN)
trained on top of pre-trained word vectors for sentence-level classification
tasks. We show that a simple CNN with little hyperparameter tuning and static
vectors achieves excellent results on multiple benchmarks. Learning
task-specific vectors through fine-tuning offers further gains in performance.
We additionally propose a simple modification to the architecture to allow for
the use of both task-specific and static vectors. The CNN models discussed
herein improve upon the state of the art on 4 out of 7 tasks, which include
sentiment analysis and question classification. Comment: To appear in EMNLP 2014.
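For context, a minimal sketch of such a sentence-classification CNN is given below: pre-trained word vectors feed 1-D convolutions with several filter widths, followed by max-over-time pooling, dropout, and a linear classifier. The specific filter widths, feature-map counts, and dropout rate are illustrative assumptions, not values stated in the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceCNN(nn.Module):
    def __init__(self, embed_weights, num_classes,
                 filter_widths=(3, 4, 5), maps_per_width=100,
                 freeze_embeddings=True, dropout=0.5):
        super().__init__()
        embed_dim = embed_weights.shape[1]
        # pre-trained word vectors; freeze=True keeps them "static"
        self.embed = nn.Embedding.from_pretrained(embed_weights,
                                                  freeze=freeze_embeddings)
        # one 1-D convolution per filter width
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, maps_per_width, w) for w in filter_widths)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(maps_per_width * len(filter_widths), num_classes)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # convolution + ReLU + max-over-time pooling for each filter width
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = self.dropout(torch.cat(pooled, dim=1))
        return self.fc(features)                   # class logits
```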
Excitation Dropout: Encouraging Plasticity in Deep Neural Networks
We propose a guided dropout regularizer for deep networks based on the
evidence of a network prediction defined as the firing of neurons in specific
paths. In this work, we utilize the evidence at each neuron to determine the
probability of dropout, rather than dropping out neurons uniformly at random as
in standard dropout. In essence, we drop out with higher probability those
neurons which contribute more to decision making at training time. This
approach penalizes high-saliency neurons that are most relevant for model
prediction, i.e., those having stronger evidence. By dropping such high-saliency
neurons, the network is forced to learn alternative paths in order to keep
minimizing the loss, resulting in plasticity-like behavior, a characteristic
also observed in the human brain. We demonstrate better generalization ability,
increased utilization of network neurons, and higher resilience to network
compression, as measured by several metrics over four image/video recognition
benchmarks.
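A minimal sketch of this guided-dropout idea is given below: per-neuron evidence scores are mapped to per-neuron drop probabilities so that higher-evidence neurons are dropped more often, with uniform evidence reducing to standard dropout. The evidence-to-probability mapping and the function name are illustrative assumptions, not the exact formulation from the paper.

```python
import torch

def excitation_dropout_mask(evidence, base_drop=0.5, eps=1e-8):
    # evidence : (batch, n) non-negative per-neuron relevance scores
    #            (e.g. obtained from a saliency/evidence backprop pass)
    # Returns a 0/1 keep-mask in which high-evidence neurons are dropped
    # with higher probability.
    p = evidence / (evidence.sum(dim=1, keepdim=True) + eps)  # evidence distribution
    n = evidence.shape[1]
    # scale around the base rate: uniform evidence (p = 1/n) gives
    # drop_prob = base_drop for every neuron, i.e. standard dropout
    drop_prob = (base_drop * n * p).clamp(0.0, 1.0)
    return torch.bernoulli(1.0 - drop_prob)

# usage during training: mask a hidden layer's activations
# h = h * excitation_dropout_mask(evidence_for_h)
```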