138 research outputs found
Efficient Elastic Net Regularization for Sparse Linear Models
This paper presents an algorithm for efficient training of sparse linear
models with elastic net regularization. Extending previous work on delayed
updates, the new algorithm applies stochastic gradient updates to non-zero
features only, bringing weights current as needed with closed-form updates.
Closed-form delayed updates for the ℓ1, ℓ∞, and rarely used ℓ2
regularizers have been described previously. This paper provides
closed-form updates for the popular squared norm ℓ2² and elastic net
regularizers.
We provide dynamic programming algorithms that perform each delayed update in
constant time. The new ℓ2² and elastic net methods handle both fixed and
varying learning rates, and both standard stochastic gradient descent (SGD)
and forward backward splitting (FoBoS). Experimental results show that on a
bag-of-words dataset with a large number of features, but only a small number
of nonzero features on average per training example, the dynamic programming
method trains a logistic regression classifier with elastic net regularization
many times faster than it otherwise would.
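The delayed-update idea for the squared ℓ2 norm can be sketched as follows. This is an illustrative reimplementation, not the paper's code, and it assumes a fixed learning rate; all names and parameters are ours. The key observation is that a weight whose feature is zero at a step only shrinks by a constant factor, so k skipped steps collapse into one closed-form multiplication by that factor raised to the k-th power.

```python
import math

def train_lazy_l2(data, n_features, lr=0.1, lam=0.01, epochs=1):
    # SGD for logistic regression with a squared-L2 penalty, touching only
    # the weights of nonzero features and bringing each weight current with
    # the closed-form shrink (1 - lr*lam)^k when its feature is next seen.
    w = [0.0] * n_features
    last = [0] * n_features        # step through which w[j] is up to date
    decay = 1.0 - lr * lam         # per-step multiplicative shrink
    t = 0
    for _ in range(epochs):
        for x, y in data:          # x: {feature_index: value}, y in {0, 1}
            t += 1
            for j in x:            # closed-form catch-up through step t-1
                w[j] *= decay ** (t - 1 - last[j])
                last[j] = t - 1
            margin = sum(w[j] * v for j, v in x.items())
            p = 1.0 / (1.0 + math.exp(-margin))
            g = p - y              # gradient of the log loss w.r.t. the margin
            for j, v in x.items():
                w[j] = decay * w[j] - lr * g * v   # shrink plus gradient step
                last[j] = t
    for j in range(n_features):    # flush pending shrinks before returning
        w[j] *= decay ** (t - last[j])
    return w
```

Per step, the work is proportional to the number of nonzero features in the example rather than the total number of features, which is where the speedup on sparse data comes from.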
Input and Weight Space Smoothing for Semi-supervised Learning
We propose regularizing the empirical loss for semi-supervised learning by
acting on both the input (data) space, and the weight (parameter) space. We
show that the two are not equivalent, and in fact complementary: one
affects the minimality of the resulting representation, the other its
insensitivity to nuisance variability. We propose a method to perform such
smoothing, which combines known input-space smoothing with a novel weight-space
smoothing, based on a min-max (adversarial) optimization. The resulting
Adversarial Block Coordinate Descent (ABCD) algorithm performs gradient ascent
with a small learning rate for a random subset of the weights, and standard
gradient descent on the remaining weights in the same mini-batch. It achieves
performance comparable to the state of the art without resorting to heavy data
augmentation, using a relatively simple architecture.
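The ABCD update described above can be sketched on a toy objective. This is our own illustrative sketch, not the authors' implementation; the subset fraction and the two learning rates are hypothetical parameter choices.

```python
import numpy as np

def abcd_step(w, grad, rng, ascent_frac=0.25, lr_descent=0.1, lr_ascent=0.01):
    # One step of the min-max (adversarial) smoothing idea: gradient ascent
    # with a small rate on a random subset of the weights, and standard
    # gradient descent on the remaining weights, for the same mini-batch.
    ascend = rng.random(w.shape) < ascent_frac
    return w + np.where(ascend, lr_ascent * grad, -lr_descent * grad)
```

Because the ascent rate is small and only a subset of weights ascends, the iterates still make progress on the loss in expectation while the adversarial perturbation smooths the objective in weight space.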
…