Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters
Many key problems in machine learning and data science are routinely modeled
as optimization problems and solved via optimization algorithms. With the
increase of the volume of data and the size and complexity of the statistical
models used to formulate these often ill-conditioned optimization tasks, there
is a need for new efficient algorithms able to cope with these challenges. In
this thesis, we deal with each of these sources of difficulty in a different
way. To efficiently address the big data issue, we develop new methods that in
each iteration examine only a small random subset of the training data. To
handle the big model issue, we develop methods that in each iteration update
only a random subset of the model parameters. Finally, to deal with
ill-conditioned problems, we devise methods that incorporate either
higher-order information or Nesterov's acceleration/momentum. In all cases,
randomness is viewed as a powerful algorithmic tool that we tune, both in
theory and in experiments, to achieve the best results. Our algorithms have
their primary application in training supervised machine learning models via
regularized empirical risk minimization, which is the dominant paradigm for
training such models. However, due to their generality, our methods can be
applied in many other fields, including but not limited to data science,
engineering, scientific computing, and statistics.
Comment: PhD thesis, 425 pages, 75 figures
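
The two sampling ideas in the abstract (a random subset of the training data per iteration, and a random subset of the model parameters per iteration) can be illustrated on a small regularized empirical risk minimization problem. The sketch below is not taken from the thesis: it uses textbook minibatch SGD and randomized coordinate descent on ridge regression, and the function names, step size, batch size, block size, and synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def ridge_objective(A, b, x, lam):
    """Regularized empirical risk: (1/2n)||Ax - b||^2 + (lam/2)||x||^2."""
    n = A.shape[0]
    r = A @ x - b
    return 0.5 * (r @ r) / n + 0.5 * lam * (x @ x)


def minibatch_sgd(A, b, lam, batch_size=8, step=0.01, iters=2000, seed=0):
    """Each iteration looks at a small random subset of the data rows only."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        idx = rng.choice(n, size=batch_size, replace=False)
        # Unbiased estimate of the full gradient from the sampled rows.
        grad = A[idx].T @ (A[idx] @ x - b[idx]) / batch_size + lam * x
        x -= step * grad
    return x


def random_coordinate_descent(A, b, lam, block_size=2, iters=2000, seed=0):
    """Each iteration updates a random subset of the parameters only."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    r = A @ x - b                         # residual, maintained incrementally
    L = (A ** 2).sum(axis=0) / n + lam    # per-coordinate curvature
    for _ in range(iters):
        for j in rng.choice(d, size=block_size, replace=False):
            g = A[:, j] @ r / n + lam * x[j]   # partial derivative w.r.t. x[j]
            delta = -g / L[j]                  # exact minimizer along x[j]
            x[j] += delta
            r += delta * A[:, j]
    return x


# Hypothetical synthetic problem, used only to exercise the two sketches.
data_rng = np.random.default_rng(42)
A = data_rng.standard_normal((200, 5))
x_true = data_rng.standard_normal(5)
b = A @ x_true + 0.1 * data_rng.standard_normal(200)
lam = 0.1

# Closed-form ridge solution, for reference.
x_star = np.linalg.solve(A.T @ A / 200 + lam * np.eye(5), A.T @ b / 200)
x_sgd = minibatch_sgd(A, b, lam)
x_cd = random_coordinate_descent(A, b, lam)
```

With a constant step size, minibatch SGD of this kind is only expected to reach a neighborhood of the minimizer, whereas the coordinate descent variant takes exact steps along each sampled coordinate of this quadratic objective. The batch size and block size are the kind of sampling parameters that the abstract describes as being tuned.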