Subspace optimization for machine learning

Abstract

Many machine learning (ML) problems are solved by iteratively optimizing an objective along a single descent direction. By restricting the search for the next iterate to one direction, per iteration cost is kept low. In many important problem classes, however, it is possible to perform subspace optimization (SO) and search along additional directions nearly "for free". This can lead to finding a better next iterate and result in fewer iterations to reach a solution. In theory and in practice, however, SO is rarely used and existing theory usually assumes an exact SO. In this thesis, we show experimentally that SO could lead to more accurate solutions in less time. We also explore the effect of different SO parameter settings. The experiments are performed through a software package we developed called minFuncLinear which is available for download. On the theory side, we propose a more abstract problem class, multilinear map problems, where SO could be performed efficiently. This encompasses linear composition models, known to allow efficient SO. We give further examples of problems that fall under this framework, such as log-determinant problems. We also provide preliminary convergence analysis of gradient descent (GD) combined with SO, discuss multi-dimensional Wolfe conditions and extend initialization and interpolation methods from linesearches to subspace searches.Science, Faculty ofComputer Science, Department ofGraduat

    Similar works