241 research outputs found

    PDFO: A Cross-Platform Package for Powell's Derivative-Free Optimization Solvers

    Full text link
    The late Professor M. J. D. Powell devised five trust-region derivative-free optimization methods, namely COBYLA, UOBYQA, NEWUOA, BOBYQA, and LINCOA. He also carefully implemented them into publicly available solvers, which are renowned for their robustness and efficiency. However, the solvers were implemented in Fortran 77 and hence may not be easily accessible to some users. We introduce the PDFO package, which provides user-friendly Python and MATLAB interfaces to Powell's code. With PDFO, users of such languages can call Powell's Fortran solvers easily without dealing with the Fortran code. Moreover, PDFO includes bug fixes and improvements, which are particularly important for handling problems that suffer from ill-conditioning or failures of function evaluations. In addition to the PDFO package, we provide an overview of Powell's methods, sketching them from a uniform perspective, summarizing their main features, and highlighting the similarities and interconnections among them. We also present experiments on PDFO to demonstrate its stability under noise, tolerance of failures in function evaluations, and potential in solving certain hyperparameter optimization problems

    Optimization Algorithms for Machine Learning Designed for Parallel and Distributed Environments

    Get PDF
    This thesis proposes several optimization methods that utilize parallel algorithms for large-scale machine learning problems. The overall theme is network-based machine learning algorithms; in particular, we consider two machine learning models: graphical models and neural networks. Graphical models are methods categorized under unsupervised machine learning, aiming at recovering conditional dependencies among random variables from observed samples of a multivariable distribution. Neural networks, on the other hand, are methods that learn an implicit approximation to underlying true nonlinear functions based on sample data and utilize that information to generalize to validation data. The goal of finding the best methods relies on an optimization problem tasked with training such models. Improvements in current methods of solving the optimization problem for graphical models are obtained by parallelization and the use of a new update and a new step-size selection rule in the coordinate descent algorithms designed for large-scale problems. For training deep neural networks, we consider the second-order optimization algorithms within trust-region-like optimization frameworks. Deep networks are represented using large-scale vectors of weights and are trained based on very large datasets. Hence, obtaining second-order information is very expensive for these networks. In this thesis, we undertake an extensive exploration of algorithms that use a small number of curvature evaluations and are hence faster than other existing methods

    Practical Inexact Proximal Quasi-Newton Method with Global Complexity Analysis

    Full text link
    Recently several methods were proposed for sparse optimization which make careful use of second-order information [10, 28, 16, 3] to improve local convergence rates. These methods construct a composite quadratic approximation using Hessian information, optimize this approximation using a first-order method, such as coordinate descent and employ a line search to ensure sufficient descent. Here we propose a general framework, which includes slightly modified versions of existing algorithms and also a new algorithm, which uses limited memory BFGS Hessian approximations, and provide a novel global convergence rate analysis, which covers methods that solve subproblems via coordinate descent
    • …
    corecore