Algorithms for Nonconvex Optimization Problems in Machine Learning and Statistics

Abstract

The purpose of this thesis is to design algorithms for computing optimal solutions to nonconvex data approximation problems. In Part I of this thesis, we consider a general class of large-scale nonconvex data approximation problems and devise an algorithm that efficiently computes locally optimal solutions to these problems. As a trust-region Newton-CG method, the algorithm can exploit directions of negative curvature to escape saddle points, which might otherwise slow down the optimization process on nonconvex problems. We present results of numerical experiments on convex and nonconvex problems that support our claim that the algorithm has significant advantages over methods such as stochastic gradient descent and its variance-reduced variants. In Part II, we consider the univariate least-squares spline approximation problem with free knots, which is known to possess a large number of local minima far from the globally optimal solution. Since in typical applications neither the dimension of the decision variable nor the number of data points is particularly large, the specific problem structure can be exploited to devise algorithmic approaches that approximate the globally optimal solution for problem instances of relevant size. We propose to approximate the original continuous problem by a combinatorial optimization problem and investigate two algorithmic approaches for computing the optimal solution of the latter.
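
The trust-region Newton-CG method of Part I is not reproduced here, but the following minimal Python sketch illustrates the mechanism the abstract refers to: a Steihaug-Toint-style conjugate-gradient loop for the trust-region subproblem that, upon detecting a direction of negative curvature, follows it to the trust-region boundary and thereby steps away from saddle points. All function and variable names are illustrative assumptions, not the implementation developed in the thesis.

    import numpy as np

    def newton_cg_step(grad, hess_vec, radius, tol=1e-8, max_iter=50):
        """Approximately minimize m(s) = g^T s + 0.5 s^T H s subject to ||s|| <= radius."""
        s = np.zeros_like(grad)
        r = grad.copy()          # residual of the Newton system H s = -g (at s = 0)
        p = -r                   # current CG search direction
        for _ in range(max_iter):
            Hp = hess_vec(p)
            curv = p @ Hp
            if curv <= 0:
                # Negative (or zero) curvature detected: follow p to the
                # trust-region boundary, which allows escaping saddle points.
                return s + _to_boundary(s, p, radius) * p
            alpha = (r @ r) / curv
            if np.linalg.norm(s + alpha * p) >= radius:
                # Step would leave the trust region: stop on the boundary.
                return s + _to_boundary(s, p, radius) * p
            s = s + alpha * p
            r_new = r + alpha * Hp
            if np.linalg.norm(r_new) < tol:
                return s
            beta = (r_new @ r_new) / (r @ r)
            p = -r_new + beta * p
            r = r_new
        return s

    def _to_boundary(s, p, radius):
        # Largest tau >= 0 with ||s + tau p|| = radius (positive root of a quadratic).
        a, b, c = p @ p, 2 * (s @ p), s @ s - radius ** 2
        return (-b + np.sqrt(b ** 2 - 4 * a * c)) / (2 * a)

    # Example: one step on a quadratic model with an indefinite Hessian.
    H = np.array([[1.0, 0.0], [0.0, -2.0]])
    g = np.array([0.1, 0.0])
    step = newton_cg_step(g, lambda v: H @ v, radius=1.0)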
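
For Part II, the combinatorial approximation can be pictured as restricting the free knots to a finite candidate set and selecting the best subset. The sketch below brute-forces this selection on a small instance using scipy's LSQUnivariateSpline; the choice of candidate set (interior data sites), the exhaustive search, and all names are illustrative assumptions and are not the two algorithmic approaches investigated in the thesis.

    from itertools import combinations
    import numpy as np
    from scipy.interpolate import LSQUnivariateSpline

    def best_free_knots(x, y, num_knots, degree=3):
        """Brute-force the knot subset minimizing the least-squares residual."""
        candidates = x[degree + 1 : -(degree + 1)]   # interior data sites as candidate knots
        best_rss, best_knots = np.inf, None
        for knots in combinations(candidates, num_knots):
            try:
                spline = LSQUnivariateSpline(x, y, knots, k=degree)
            except ValueError:
                # Knot choice violates the Schoenberg-Whitney conditions; skip it.
                continue
            rss = spline.get_residual()
            if rss < best_rss:
                best_rss, best_knots = rss, knots
        return best_rss, best_knots

    # Example usage on a toy target with a kink, where knot placement matters.
    x = np.linspace(0.0, 1.0, 60)
    y = np.abs(x - 0.4) + 0.01 * np.sin(40 * x)
    rss, knots = best_free_knots(x, y, num_knots=2)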