PDFO: A Cross-Platform Package for Powell's Derivative-Free Optimization Solvers
The late Professor M. J. D. Powell devised five trust-region derivative-free
optimization methods, namely COBYLA, UOBYQA, NEWUOA, BOBYQA, and LINCOA. He
also carefully implemented them into publicly available solvers, which are
renowned for their robustness and efficiency. However, the solvers were
implemented in Fortran 77 and hence may not be easily accessible to some users.
We introduce the PDFO package, which provides user-friendly Python and MATLAB
interfaces to Powell's code. With PDFO, users of these languages can call
Powell's Fortran solvers easily without dealing with the Fortran code.
Moreover, PDFO includes bug fixes and improvements, which are particularly
important for handling problems that suffer from ill-conditioning or failures
of function evaluations. In addition to the PDFO package, we provide an
overview of Powell's methods, sketching them from a uniform perspective,
summarizing their main features, and highlighting the similarities and
interconnections among them. We also present experiments on PDFO to demonstrate
its stability under noise, tolerance of failures in function evaluations, and
potential in solving certain hyperparameter optimization problems.
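Powell's COBYLA, the first of the five solvers named above, is also shipped with SciPy, which makes it easy to illustrate the derivative-free setting without installing PDFO itself. A minimal sketch (the objective and constraints here are an illustrative test problem, not one from the paper):

```python
import numpy as np
from scipy.optimize import minimize

# Objective evaluated as a black box: no gradient is supplied anywhere.
def objective(x):
    return (x[0] - 1.0) ** 2 + (x[1] - 2.5) ** 2

# COBYLA handles inequality constraints of the form g(x) >= 0.
constraints = [
    {"type": "ineq", "fun": lambda x: x[0] - 2 * x[1] + 2},
    {"type": "ineq", "fun": lambda x: -x[0] - 2 * x[1] + 6},
]

res = minimize(objective, x0=np.array([2.0, 0.0]),
               method="COBYLA", constraints=constraints)
```

The unconstrained minimizer (1, 2.5) violates the first constraint, so the solver lands on the boundary point (1.4, 1.7), found purely from function values.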
Optimization Algorithms for Machine Learning Designed for Parallel and Distributed Environments
This thesis proposes several optimization methods that exploit parallel algorithms for large-scale machine learning problems. The overall theme is network-based machine learning; in particular, we consider two model classes: graphical models and neural networks. Graphical models are unsupervised methods that aim to recover conditional dependencies among random variables from observed samples of a multivariate distribution. Neural networks, on the other hand, learn an implicit approximation to an underlying nonlinear function from sample data and use that approximation to generalize to validation data. Training either class of model reduces to solving an optimization problem. For graphical models, we improve on current solvers through parallelization and through a new update and a new step-size selection rule in coordinate descent algorithms designed for large-scale problems. For training deep neural networks, we consider second-order optimization algorithms within trust-region-like frameworks. Deep networks have very large weight vectors and are trained on very large datasets, so obtaining second-order information is expensive. In this thesis, we undertake an extensive exploration of algorithms that use a small number of curvature evaluations and are hence faster than other existing methods.
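The "small number of curvature evaluations" idea typically rests on Hessian-vector products, which cost roughly two gradient evaluations and avoid ever forming the full Hessian. A minimal NumPy sketch (the quadratic test function and all names are illustrative, not from the thesis):

```python
import numpy as np

def hessian_vector_product(grad, w, v, eps=1e-6):
    """Approximate H(w) @ v with two gradient calls (finite differences)."""
    return (grad(w + eps * v) - grad(w)) / eps

# Quadratic test function f(w) = 0.5 w^T A w - b^T w, whose Hessian is A,
# so the approximation can be checked against the exact product A @ v.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)   # symmetric positive definite Hessian
b = rng.standard_normal(5)

def grad(w):
    return A @ w - b

w = rng.standard_normal(5)
v = rng.standard_normal(5)
hv = hessian_vector_product(grad, w, v)   # close to A @ v
```

Trust-region subproblem solvers such as conjugate gradient need only products like `hv`, which is why a handful of such curvature evaluations per iteration can suffice.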
Practical Inexact Proximal Quasi-Newton Method with Global Complexity Analysis
Several methods have recently been proposed for sparse optimization that make
careful use of second-order information [10, 28, 16, 3] to improve local
convergence rates. These methods construct a composite quadratic approximation
using Hessian information, optimize this approximation with a first-order
method such as coordinate descent, and employ a line search to ensure
sufficient descent. Here we propose a general framework that includes
slightly modified versions of existing algorithms as well as a new algorithm
based on limited-memory BFGS Hessian approximations, and we provide a novel
global convergence rate analysis that covers methods solving the subproblems
via coordinate descent.
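The composite problems these methods target pair a smooth loss with an ℓ1 term; the simplest baseline for that structure is proximal gradient descent with soft-thresholding, which the quasi-Newton variants above refine with curvature information. A hedged sketch on synthetic lasso data (problem sizes and names are illustrative):

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Lasso: min_x 0.5 ||A x - y||^2 + lam ||x||_1 on noiseless synthetic data.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 10))
x_true = np.zeros(10)
x_true[:3] = [2.0, -1.5, 1.0]
y = A @ x_true
lam = 0.5

x = np.zeros(10)
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, L = Lipschitz const. of grad
for _ in range(500):
    grad = A.T @ (A @ x - y)             # gradient of the smooth part
    x = soft_threshold(x - step * grad, step * lam)
```

Replacing the fixed step `step * grad` with a quadratic model built from a limited-memory BFGS matrix, solved approximately by coordinate descent, yields the kind of inexact proximal quasi-Newton iteration the abstract describes.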