Accelerated Dual Learning by Homotopic Initialization
Gradient descent and coordinate descent are well understood in terms of their
asymptotic behavior, but much less so in the transient regime that is often
relied on for approximate solutions in machine learning. We investigate how
proper initialization
can have a profound effect on finding near-optimal solutions quickly. We show
that a certain property of a data set, namely the boundedness of the
correlations between eigenfeatures and the response variable, can lead to
faster initial progress than commonplace analyses predict. Convex
optimization problems benefit from this tacitly, but the benefit does not
carry over automatically to their dual formulations. We analyze this
phenomenon and devise
provably good initialization strategies for dual optimization as well as
heuristics for the non-convex case, relevant to deep learning. We find our
predictions and methods to be well supported experimentally.
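To make the dual-initialization idea concrete, the sketch below illustrates it
on a kernel ridge regression dual. Everything here is assumed for illustration:
the synthetic data set with a decaying spectrum, the dual gradient-ascent
solver, and the homotopy parameter lam_easy defining a heavily regularized,
easy problem whose solution warm-starts the harder one. It is a minimal
stand-in for the flavor of the paper's strategies, not the authors' algorithm.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic regression data with a decaying feature spectrum; the response
    # is aligned with the top eigenfeatures, loosely mimicking the
    # bounded-correlation setting of the abstract (an assumption of this sketch).
    n, d = 200, 50
    X = rng.standard_normal((n, d)) * np.linspace(3.0, 0.1, d)
    w_true = np.zeros(d)
    w_true[:5] = 1.0                        # signal lives in the top directions
    y = X @ w_true + 0.1 * rng.standard_normal(n)

    lam = 1.0
    K = X @ X.T                             # Gram matrix of the linear kernel
    A = K + lam * np.eye(n)

    def dual_objective(alpha):
        # Kernel ridge dual (maximized): D(a) = y^T a - 0.5 a^T (K + lam I) a,
        # with maximizer alpha* = (K + lam I)^{-1} y.
        return y @ alpha - 0.5 * alpha @ (A @ alpha)

    def dual_gradient_ascent(alpha0, steps=100):
        lr = 1.0 / np.linalg.eigvalsh(A)[-1]    # safe step: 1 / largest eigenvalue
        alpha = alpha0.copy()
        vals = []
        for _ in range(steps):
            alpha = alpha + lr * (y - A @ alpha)    # gradient of D
            vals.append(dual_objective(alpha))
        return vals

    # Cold start: the usual all-zeros dual initialization.
    cold = dual_gradient_ascent(np.zeros(n))

    # Hypothetical homotopic warm start: solve a heavily regularized
    # (well-conditioned, hence easy) version of the problem and use its
    # solution as the initial point for the target problem.
    lam_easy = 100.0 * lam
    alpha_init = np.linalg.solve(K + lam_easy * np.eye(n), y)
    warm = dual_gradient_ascent(alpha_init)

    print(f"dual objective after  10 steps: cold={cold[9]:.2f}  warm={warm[9]:.2f}")
    print(f"dual objective after 100 steps: cold={cold[-1]:.2f}  warm={warm[-1]:.2f}")

Under these assumptions the warm start tends to reach a higher dual objective
in the first iterations, since the easy problem already places mass on the
well-aligned eigendirections; how this relates to the paper's provably good
strategies would need to be checked against the full text.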