341 research outputs found

    Some notes on applying computational divided differencing in optimization

    We consider the problem of accurate computation of the finite difference $f({\bf x}+{\bf s})-f({\bf x})$ when $\Vert{\bf s}\Vert$ is very small. Direct evaluation of this difference in floating point arithmetic succumbs to cancellation error and yields 0 when ${\bf s}$ is sufficiently small. Nonetheless, accurate computation of this finite difference is required by many optimization algorithms for a "sufficient decrease" test. Reps and Rall proposed a programmatic transformation called "computational divided differencing", reminiscent of automatic differentiation, to compute these differences with high accuracy. The running time to compute the difference is a small constant multiple of the running time to compute $f$. Unlike automatic differentiation, however, the technique is not fully general because of a difficulty with branching code (i.e., `if' statements). We make several remarks about the application of computational divided differencing to optimization. One point is that the technique can be used effectively as a stagnation test.
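
    The cancellation described above is easy to reproduce. Below is a minimal Python sketch, assuming the toy function f(x) = x^2 purely for illustration: the naive difference loses all accuracy for tiny s, while an algebraically rearranged form of the same quantity remains accurate. The Reps-Rall transformation automates this kind of rearrangement for whole programs; that machinery is not attempted here.

def f(x):
    return x * x                 # hypothetical toy objective

def naive_diff(x, s):
    return f(x + s) - f(x)       # suffers catastrophic cancellation for tiny s

def rearranged_diff(x, s):
    # Same quantity, rewritten so that no nearly equal values are subtracted:
    # (x + s)^2 - x^2 = (2x + s) * s
    return (2.0 * x + s) * s

x = 1.0
for s in [1e-4, 1e-8, 1e-12, 1e-16]:
    print(f"s={s:.0e}  naive={naive_diff(x, s):.6e}  rearranged={rearranged_diff(x, s):.6e}")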

    On the complexity of nonnegative matrix factorization

    Nonnegative matrix factorization (NMF) has become a prominent technique for the analysis of image databases, text databases and other information retrieval and clustering applications. In this report, we define an exact version of NMF. Then we establish several results about exact NMF: (1) that it is equivalent to a problem in polyhedral combinatorics; (2) that it is NP-hard; and (3) that a polynomial-time local search heuristic exists. Comment: Version 2 corrects small typos; adds ref to Cohen & Rothblum; adds ref to Gillis; clarifies reduction of NMF to intermediate simplex.
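
    For concreteness, the following sketch runs a standard local heuristic for approximate (not exact) NMF, the Lee-Seung multiplicative updates, on a random nonnegative matrix. The exact-NMF formulation, the polyhedral reduction, and the NP-hardness result from the report are not reproduced here, and the matrix sizes and rank are arbitrary choices.

import numpy as np

def nmf_multiplicative(A, k, iters=500, eps=1e-10, seed=0):
    """Approximate A (nonnegative, m x n) by W @ H with W, H >= 0 via Lee-Seung updates."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)   # update H with W held fixed
        W *= (A @ H.T) / (W @ H @ H.T + eps)   # update W with H held fixed
    return W, H

A = np.random.default_rng(1).random((20, 15))   # random nonnegative test matrix
W, H = nmf_multiplicative(A, k=4)
print("relative residual:", np.linalg.norm(A - W @ H) / np.linalg.norm(A))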

    A conjecture that the roots of a univariate polynomial lie in a union of annuli

    We conjecture that the roots of a degree-n univariate complex polynomial are located in a union of n-1 annuli, each of which is centered at a root of the derivative and whose radii depend on higher derivatives. We prove the conjecture for the cases of degrees 2 and 3, and we report on tests with randomly generated polynomials of higher degree. We state two other closely related conjectures concerning Newton's method. If true, these conjectures imply the existence of a simple, rapidly convergent algorithm for finding all roots of a polynomial. Comment: Conjecture 1 in the original version has been resolved. This interim version adds a note to that effect.
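
    The abstract does not give the annulus radii (they depend on higher derivatives), so the sketch below only sets up the kind of random experiment mentioned: it draws a random polynomial, computes its roots and the roots of its derivative, and reports the distance from each root to the nearest critical point. The degree and random seed are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
n = 8
coeffs = rng.standard_normal(n + 1) + 1j * rng.standard_normal(n + 1)   # random degree-n polynomial

roots = np.roots(coeffs)                  # the n roots
crit = np.roots(np.polyder(coeffs))       # the n-1 roots of the derivative

for r in roots:
    d = np.min(np.abs(r - crit))          # distance to the nearest critical point
    print(f"root {complex(r):.3f}   nearest critical point at distance {float(d):.3f}")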

    Semidefinite Programming Based Preconditioning for More Robust Near-Separable Nonnegative Matrix Factorization

    Nonnegative matrix factorization (NMF) under the separability assumption can provably be solved efficiently, even in the presence of noise, and has been shown to be a powerful technique in document classification and hyperspectral unmixing. This problem is referred to as near-separable NMF and requires that there exist a cone, spanned by a small subset of the columns of the input nonnegative matrix, that approximately contains all columns. In this paper, we propose a preconditioning based on semidefinite programming that makes the input matrix well-conditioned. This in turn can significantly improve the performance of near-separable NMF algorithms, which is illustrated on the popular successive projection algorithm (SPA). The new preconditioned SPA is provably more robust to noise, and outperforms SPA on several synthetic data sets. We also show how an active-set method allows us to apply the preconditioning to large-scale real-world hyperspectral images. Comment: 25 pages, 6 figures, 4 tables. New numerical experiments, additional remarks and comments.
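
    As a point of reference, here is a bare-bones version of the successive projection algorithm (SPA) mentioned in the abstract, run on a small synthetic near-separable instance. The semidefinite-programming preconditioning that is the paper's contribution is not included, and the problem dimensions and noise level are arbitrary.

import numpy as np

def spa(M, r):
    """Pick r column indices by successive projection: take the column of largest
    norm, then project all columns onto its orthogonal complement, and repeat."""
    R = M.astype(float).copy()
    picked = []
    for _ in range(r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))
        picked.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(u, u @ R)
    return picked

# Synthetic near-separable instance: the first 5 columns of M are (noisy) pure columns.
rng = np.random.default_rng(0)
W = rng.random((30, 5))
H = np.hstack([np.eye(5), rng.dirichlet(np.ones(5), size=40).T])
M = W @ H + 1e-3 * rng.standard_normal((30, 45))
print("columns selected by SPA:", sorted(spa(M, 5)))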

    A Fully Sparse Implementation of a Primal-Dual Interior-Point Potential Reduction Method for Semidefinite Programming

    In this paper, we show a way to exploit sparsity in the problem data in a primal-dual potential reduction method for solving a class of semidefinite programs. When the problem data is sparse, the dual variable is also sparse, but the primal one is not. To avoid working with the dense primal variable, we apply Fukuda et al.'s theory of partial matrix completion and work with partial matrices instead. The other place in the algorithm where sparsity should be exploited is in the computation of the search direction, where the gradient and the Hessian-matrix product of the primal and dual barrier functions must be computed in every iteration. By using an idea from automatic differentiation in backward mode, both the gradient and the Hessian-matrix product can be computed in time proportional to the time needed to compute the barrier functions of the sparse variables themselves. Moreover, the high space complexity that is normally associated with the use of automatic differentiation in backward mode can be avoided in this case. In addition, we suggest a technique to efficiently compute the determinant of the positive definite matrix completion that is required to compute primal search directions. We also propose a method of obtaining one of the primal search directions that minimizes the number of evaluations of the determinant of the positive definite completion. We then implement the algorithm and test it on the problem of finding the maximum cut of a graph.
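
    One ingredient that can be illustrated compactly is the barrier derivative computation. Assuming the standard log-det barrier f(S) = -log det(S), the gradient is -S^{-1} and the Hessian applied to a direction D is S^{-1} D S^{-1}; the sketch below checks these closed-form expressions against finite differences on a small dense matrix. The reverse-mode automatic differentiation and partial matrix completion machinery described in the abstract are not reproduced.

import numpy as np

def barrier(S):
    return -np.linalg.slogdet(S)[1]        # -log det(S), assuming S is positive definite

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
S = B @ B.T + 6 * np.eye(6)                # small positive definite test matrix
D = rng.standard_normal((6, 6))
D = (D + D.T) / 2                          # symmetric direction

Sinv = np.linalg.inv(S)
grad = -Sinv                               # gradient of -log det at S
hess_D = Sinv @ D @ Sinv                   # Hessian of -log det at S applied to D

t = 1e-6                                   # finite-difference checks of both formulas
fd_dir = (barrier(S + t * D) - barrier(S - t * D)) / (2 * t)
print("directional derivative:", np.sum(grad * D), " vs finite difference:", fd_dir)
fd_hess = (-np.linalg.inv(S + t * D) + np.linalg.inv(S - t * D)) / (2 * t)
print("Hessian-direction product error:", np.linalg.norm(hess_D - fd_hess))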

    Properties of polynomial bases used in a line-surface intersection algorithm

    In [5], Srijuntongsiri and Vavasis propose the "Kantorovich-Test Subdivision algorithm", or KTS, which is an algorithm for finding all zeros of a polynomial system in a bounded region of the plane. This algorithm can be used to find the intersections between a line and a surface. The main features of KTS are that it can operate on polynomials represented in any basis that satisfies certain conditions and that its efficiency has an upper bound that depends only on the conditioning of the problem and the choice of the basis representing the polynomial system. This article explores in detail the dependence of the efficiency of the KTS algorithm on the choice of basis. Three bases are considered: the power, the Bernstein, and the Chebyshev bases. These three bases satisfy the basis properties required by KTS. Theoretically, the Chebyshev case has the smallest upper bound on its running time. The computational results, however, do not show that the Chebyshev case performs better than the other two.
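
    The sketch below evaluates one and the same polynomial in the three bases compared in the article: the power basis, the Chebyshev basis, and the Bernstein basis on [0, 1] (via de Casteljau). The coefficients are arbitrary, and the KTS algorithm itself is not reproduced.

import numpy as np
from math import comb
from numpy.polynomial import chebyshev as C

def power_to_bernstein(a):
    """Convert ascending power coefficients to Bernstein coefficients on [0, 1]."""
    n = len(a) - 1
    return [sum(comb(j, i) / comb(n, i) * a[i] for i in range(j + 1)) for j in range(n + 1)]

def de_casteljau(b, t):
    """Evaluate a polynomial in Bernstein form by repeated convex combination."""
    b = np.array(b, dtype=float)
    for _ in range(len(b) - 1):
        b = (1 - t) * b[:-1] + t * b[1:]
    return b[0]

a = [1.0, -2.0, 0.5, 3.0]                       # p(x) = 1 - 2x + 0.5x^2 + 3x^3, ascending order
x = 0.3
print("power     :", np.polyval(a[::-1], x))    # np.polyval expects descending order
print("chebyshev :", C.chebval(x, C.poly2cheb(a)))
print("bernstein :", de_casteljau(power_to_bernstein(a), x))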

    Potential-based analyses of first-order methods for constrained and composite optimization

    We propose potential-based analyses for first-order algorithms applied to constrained and composite minimization problems. We first propose "idealized" frameworks for algorithms in the strongly and non-strongly convex cases and argue, based on a potential, that methods following the framework achieve the best possible rate. Then we show that the geometric descent (GD) algorithm by Bubeck et al. as extended to the constrained and composite setting by Chen et al. achieves this rate using the potential-based analysis for the strongly convex case. Next, we extend the GD algorithm to the case of non-strongly convex problems. We show using a related potential-based argument that our extension achieves the best possible rate in this case as well. The new GD algorithm also achieves the best possible rate in the nonconvex case. We also analyze accelerated gradient using the new potentials. We then turn to the special case of a quadratic function with a single ball constraint, the famous trust-region subproblem. For this case, the first-order trust-region Lanczos method by Gould et al. finds the optimal point in an increasing sequence of Krylov spaces. Our results for the general case immediately imply convergence rates for their method in both the strongly convex and non-strongly convex cases. We also establish the same convergence rates for their method using arguments based on Chebyshev polynomial approximation. To the best of our knowledge, no convergence rate has previously been established for the trust-region Lanczos method.
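
    For the trust-region subproblem mentioned at the end of the abstract, a simple first-order baseline is projected gradient descent on a convex quadratic over a ball. The sketch below uses that baseline, not the Gould et al. trust-region Lanczos method analyzed in the paper, and the problem data are random.

import numpy as np

rng = np.random.default_rng(0)
n, radius = 50, 1.0
B = rng.standard_normal((n, n))
A = B @ B.T + np.eye(n)                    # positive definite Hessian (strongly convex case)
b = rng.standard_normal(n)
L = np.linalg.eigvalsh(A).max()            # Lipschitz constant of the gradient

def project(x, r):
    """Euclidean projection onto the ball of radius r."""
    nx = np.linalg.norm(x)
    return x if nx <= r else x * (r / nx)

x = np.zeros(n)
for k in range(300):
    x = project(x - (A @ x + b) / L, radius)   # gradient step followed by projection

print("objective:", 0.5 * x @ A @ x + b @ x, "  ||x|| =", np.linalg.norm(x))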

    A single potential governing convergence of conjugate gradient, accelerated gradient and geometric descent

    Nesterov's accelerated gradient (AG) method for minimizing a smooth strongly convex function $f$ is known to reduce $f({\bf x}_k)-f({\bf x}^*)$ by a factor of $\epsilon\in(0,1)$ after $k=O(\sqrt{L/\ell}\log(1/\epsilon))$ iterations, where $\ell,L$ are the two parameters of smooth strong convexity. Furthermore, it is known that this is the best possible complexity in the function-gradient oracle model of computation. Modulo a line search, the geometric descent (GD) method of Bubeck, Lee and Singh has the same bound for this class of functions. The method of linear conjugate gradients (CG) also satisfies the same complexity bound in the special case of strongly convex quadratic functions, but in this special case it can be faster than the AG and GD methods. Despite similarities in the algorithms and their asymptotic convergence rates, the conventional analysis of the running time of CG is mostly disjoint from that of AG and GD. The analyses of the AG and GD methods are also rather distinct. Our main result is analyses of the three methods that share several common threads: all three analyses show a relationship to a certain "idealized algorithm", all three establish the convergence rate through the use of the Bubeck-Lee-Singh geometric lemma, and all three have the same potential that is computable at run-time and exhibits decrease by a factor of $1-\sqrt{\ell/L}$ or better per iteration. One application of these analyses is that they open the possibility of hybrid or intermediate algorithms. One such algorithm is proposed herein and is shown to perform well in computational tests.
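
    A small experiment along these lines is easy to set up: run linear conjugate gradients on a random strongly convex quadratic and compare the observed per-iteration decrease of f(x_k) - f(x^*) with the factor 1 - sqrt(l/L). The run-time-computable potential from the paper is not reproduced in this sketch, and the problem size is arbitrary.

import numpy as np

rng = np.random.default_rng(0)
n = 200
Q = rng.standard_normal((n, n))
A = Q @ Q.T + np.eye(n)                    # f(x) = 0.5 x'Ax - b'x, smooth and strongly convex
b = rng.standard_normal(n)
xstar = np.linalg.solve(A, b)
fstar = 0.5 * xstar @ A @ xstar - b @ xstar
ell, L = np.linalg.eigvalsh(A)[[0, -1]]    # strong convexity and smoothness parameters
rate = 1 - np.sqrt(ell / L)

def f(x):
    return 0.5 * x @ A @ x - b @ x

x = np.zeros(n)
r = b - A @ x                              # residual = negative gradient
p = r.copy()
gap_prev = f(x) - fstar
for k in range(30):                        # standard linear conjugate gradients
    Ap = A @ p
    alpha = (r @ r) / (p @ Ap)
    x = x + alpha * p
    r_new = r - alpha * Ap
    p = r_new + ((r_new @ r_new) / (r @ r)) * p
    r = r_new
    gap = f(x) - fstar
    print(f"iter {k:2d}  gap {gap:.3e}  observed factor {gap / gap_prev:.3f}  1-sqrt(l/L) = {rate:.3f}")
    gap_prev = gap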

    Detecting and correcting the loss of independence in nonlinear conjugate gradient

    It is well known that search directions in nonlinear conjugate gradient (CG) can sometimes become nearly dependent, causing a dramatic slow-down in the convergence rate. We provide a theoretical analysis of this loss of independence. The analysis applies to the case of a strictly convex objective function and is motivated by older work of Nemirovsky and Yudin. Loss of independence can affect several of the well-known variants of nonlinear CG including Fletcher-Reeves, Polak-Ribière (nonnegative variant), and Hager-Zhang. Based on our analysis, we propose a relatively inexpensive computational test for detecting loss of independence. We also propose a method for correcting it when it is detected, which we call "subspace optimization." Although the correction method is somewhat expensive, our experiments show that in some cases, usually the most ill-conditioned ones, it yields a method much faster than any of these three variants. Even though our theory covers only strongly convex objective functions, we provide computational results to indicate that the detection and correction mechanisms may also hold promise for nonconvex optimization.
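
    The sketch below runs Fletcher-Reeves nonlinear CG with a backtracking line search on an ill-conditioned convex problem and prints the cosine of the angle between consecutive search directions as a crude stand-in for a dependence diagnostic. The paper's actual detection test and "subspace optimization" correction are not reproduced, and the objective, conditioning, and restart safeguard are illustrative choices.

import numpy as np

rng = np.random.default_rng(0)
n = 100
lam = np.logspace(0, 5, n)                 # curvatures spanning five orders of magnitude
b = rng.standard_normal(n)

def f(x):                                  # convex objective: ill-conditioned quadratic plus quartic
    return 0.5 * lam @ (x * x) - b @ x + 0.25 * np.sum(x ** 4)

def grad(x):
    return lam * x - b + x ** 3

def armijo(x, d, g, t=1.0, c=1e-4, shrink=0.5):
    while f(x + t * d) > f(x) + c * t * (g @ d):   # simple backtracking line search
        t *= shrink
    return t

x = np.zeros(n)
g = grad(x)
d = -g
for k in range(200):
    t = armijo(x, d, g)
    x = x + t * d
    g_new = grad(x)
    beta = (g_new @ g_new) / (g @ g)       # Fletcher-Reeves formula
    d_new = -g_new + beta * d
    cos = abs(d_new @ d) / (np.linalg.norm(d_new) * np.linalg.norm(d))
    if g_new @ d_new >= 0:                 # safeguard: restart if not a descent direction
        d_new = -g_new
    if k % 25 == 0:
        print(f"iter {k:3d}   ||grad|| = {np.linalg.norm(g_new):.2e}   |cos(d_k, d_k+1)| = {cos:.4f}")
    g, d = g_new, d_new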

    IMRO: a proximal quasi-Newton method for solving $l_1$-regularized least squares problems

    We present a proximal quasi-Newton method in which the approximation of the Hessian has the special format of "identity minus rank one" (IMRO) in each iteration. The proposed structure enables us to effectively recover the proximal point. The algorithm is applied to the $l_1$-regularized least squares problem arising in many applications, including sparse recovery in compressive sensing, machine learning and statistics. Our numerical experiments suggest that the proposed technique competes favourably with other state-of-the-art solvers for this class of problems. We also provide a complexity analysis for variants of IMRO, showing that it matches known best bounds.
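
    For the problem class IMRO targets, minimize 0.5*||Ax - b||^2 + mu*||x||_1, the simplest proximal method is ISTA (proximal gradient with soft-thresholding). The sketch below shows that baseline on a random sparse-recovery instance; IMRO's identity-minus-rank-one Hessian approximation and its specialized proximal step are not reproduced, and the dimensions and regularization parameter mu are arbitrary.

import numpy as np

def soft_threshold(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista(A, b, mu, iters=500):
    """Proximal gradient for 0.5*||Ax - b||^2 + mu*||x||_1 with constant step 1/L."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - b)                # gradient of the least-squares term
        x = soft_threshold(x - g / L, mu / L)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((80, 200))
x_true = np.zeros(200)
x_true[rng.choice(200, 10, replace=False)] = rng.standard_normal(10)
b = A @ x_true + 0.01 * rng.standard_normal(80)
mu = 0.1

x = ista(A, b, mu)
print("nonzeros recovered:", int(np.sum(np.abs(x) > 1e-3)),
      " objective:", 0.5 * np.linalg.norm(A @ x - b) ** 2 + mu * np.sum(np.abs(x)))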