Some notes on applying computational divided differencing in optimization
We consider the problem of accurate computation of the finite difference
$f(x+s)-f(x)$ when $\|s\|$ is very small. Direct evaluation of this
difference in floating point arithmetic succumbs to cancellation error and
yields 0 when $s$ is sufficiently small. Nonetheless, accurate computation of
this finite difference is required by many optimization algorithms for a
"sufficient decrease" test. Reps and Rall proposed a programmatic
transformation called "computational divided differencing" reminiscent of
automatic differentiation to compute these differences with high accuracy. The
running time to compute the difference is a small constant multiple of the
running time to compute $f$ itself. Unlike automatic differentiation, however, the
technique is not fully general because of a difficulty with branching code
(i.e., `if' statements). We make several remarks about the application of
computational divided differencing to optimization. One point is that the
technique can be used effectively as a stagnation test.
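To make the cancellation issue concrete, here is a minimal Python sketch (the function f and the evaluation point are hypothetical choices): the naive difference degrades and eventually returns 0 as s shrinks, while a hand-derived divided form of f(x+s)-f(x) stays accurate. The rearrangement is done by hand for this one specific f; it only illustrates the effect that Reps and Rall's computational divided differencing automates for general programs.

```python
import numpy as np

def f(x):
    # simple smooth objective, f(x) = x^3 - 2x
    return x**3 - 2.0*x

def naive_difference(x, s):
    # direct evaluation: suffers catastrophic cancellation for tiny s
    return f(x + s) - f(x)

def rearranged_difference(x, s):
    # hand-derived divided form of f(x+s) - f(x) for this particular f:
    # (x+s)^3 - x^3 - 2s = s*(3x^2 + 3xs + s^2) - 2s = s*(3x^2 + 3xs + s^2 - 2)
    return s * (3.0*x**2 + 3.0*x*s + s**2 - 2.0)

x = 1.7
for s in [1e-4, 1e-8, 1e-12, 1e-16]:
    print(s, naive_difference(x, s), rearranged_difference(x, s))
```

At s = 1e-16 the naive difference rounds to exactly 0, while the rearranged form still returns a value close to f'(x) s.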
On the complexity of nonnegative matrix factorization
Nonnegative matrix factorization (NMF) has become a prominent technique for
the analysis of image databases, text databases and other information retrieval
and clustering applications. In this report, we define an exact version of NMF.
Then we establish several results about exact NMF: (1) that it is equivalent to
a problem in polyhedral combinatorics; (2) that it is NP-hard; and (3) that a
polynomial-time local search heuristic exists.
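As a point of reference, here is a minimal sketch of what exact NMF asks for: nonnegative factors whose product reproduces the matrix exactly. The matrix and factors below are hand-built illustrations, not outputs of the reductions discussed in the report.

```python
import numpy as np

def is_exact_nmf(A, W, H, tol=1e-12):
    """Check whether (W, H) is an exact nonnegative factorization of A."""
    nonneg = (W >= -tol).all() and (H >= -tol).all()
    exact = np.allclose(W @ H, A, atol=tol)
    return nonneg and exact

# hand-built example: A has nonnegative rank 2
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
H = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 1.0]])
A = W @ H
print(is_exact_nmf(A, W, H))   # True
```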
A conjecture that the roots of a univariate polynomial lie in a union of annuli
We conjecture that the roots of a degree-n univariate complex polynomial are
located in a union of n-1 annuli, each of which is centered at a root of the
derivative and whose radii depend on higher derivatives. We prove the
conjecture for the cases of degrees 2 and 3, and we report on tests with
randomly generated polynomials of higher degree.
We state two other closely related conjectures concerning Newton's method. If
true, these conjectures imply the existence of a simple, rapidly convergent
algorithm for finding all roots of a polynomial.
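A small numerical sketch of the (proved) degree-2 case: the only critical point is the midpoint of the two roots, so both roots lie on a circle, a zero-width annulus, centred there. The radii used for higher degrees depend on higher derivatives and are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# For degree 2, the single critical point is the midpoint of the two roots,
# so both roots lie on a circle (a zero-width annulus) centred there.
for _ in range(5):
    r1, r2 = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    p = np.poly([r1, r2])                  # coefficients of (x - r1)(x - r2)
    c = np.roots(np.polyder(p))[0]         # unique root of p'
    print(abs(abs(r1 - c) - abs(r2 - c)))  # ~ 0: the two distances agree
```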
Semidefinite Programming Based Preconditioning for More Robust Near-Separable Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) under the separability assumption can
provably be solved efficiently, even in the presence of noise, and has been
shown to be a powerful technique in document classification and hyperspectral
unmixing. This problem is referred to as near-separable NMF and requires that
the cone spanned by a small subset of the columns of the input nonnegative
matrix approximately contains all columns. In this paper, we propose a
preconditioning based on semidefinite programming that makes the input matrix
well-conditioned. This in turn can significantly improve the performance of
near-separable NMF algorithms, which we illustrate on the popular successive
projection algorithm (SPA). The new preconditioned SPA is provably more robust
to noise, and outperforms SPA on several synthetic data sets. We also show how
an active-set method allows us to apply the preconditioning to large-scale
real-world hyperspectral images.
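For reference, here is a minimal sketch of the plain (unpreconditioned) successive projection algorithm on a synthetic near-separable instance; the SDP-based preconditioning and the active-set technique proposed in the paper are not included, and the dimensions and noise level below are illustrative assumptions.

```python
import numpy as np

def spa(M, r):
    """Successive projection algorithm: return indices of r columns of M
    whose cone approximately contains all other columns (separable NMF)."""
    R = M.astype(float).copy()
    selected = []
    for _ in range(r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))  # column of largest norm
        selected.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(u, u @ R)                     # project that column out
    return selected

# synthetic near-separable instance: columns of W appear among columns of M
rng = np.random.default_rng(1)
W = rng.random((30, 4))
H = np.hstack([np.eye(4), rng.dirichlet(np.ones(4), size=20).T])
M = W @ H + 1e-3 * rng.standard_normal((30, 24))
print(sorted(spa(M, 4)))   # ideally recovers indices 0..3
```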
A Fully Sparse Implementation of a Primal-Dual Interior-Point Potential Reduction Method for Semidefinite Programming
In this paper, we show a way to exploit sparsity in the problem data in a
primal-dual potential reduction method for solving a class of semidefinite
programs. When the problem data is sparse, the dual variable is also sparse,
but the primal one is not. To avoid working with the dense primal variable, we
apply Fukuda et al.'s theory of partial matrix completion and work with partial
matrices instead. The other place in the algorithm where sparsity should be
exploited is in the computation of the search direction, where the gradient and
the Hessian-matrix product of the primal and dual barrier functions must be
computed in every iteration. By using an idea from automatic differentiation in
backward mode, both the gradient and the Hessian-matrix product can be computed
in time proportional to the time needed to evaluate the barrier functions of
the sparse variables themselves. Moreover, the high space complexity that is normally
associated with the use of automatic differentiation in backward mode can be
avoided in this case. In addition, we suggest a technique to efficiently
compute the determinant of the positive definite matrix completion that is
required to compute primal search directions. We also propose a method for
obtaining one of the primal search directions that minimizes the number of
evaluations of the determinant of the positive definite completion. We then
implement the algorithm and test it on the problem of finding the maximum cut
of a graph.
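As background for the quantities involved, here is a dense-matrix sketch of the log-det barrier, its gradient, and its Hessian-matrix product, checked against a finite difference. The paper's point is to obtain these for sparse and partial matrices without forming dense inverses, which this small sketch does not attempt.

```python
import numpy as np

def barrier(X):
    # standard log-det barrier for the positive definite cone
    sign, logdet = np.linalg.slogdet(X)
    assert sign > 0
    return -logdet

def barrier_grad(X):
    # gradient of -log det X is -inv(X)
    return -np.linalg.inv(X)

def barrier_hess_prod(X, V):
    # Hessian of -log det X applied to a symmetric direction V: X^{-1} V X^{-1}
    Xi = np.linalg.inv(X)
    return Xi @ V @ Xi

# small dense check against a finite difference of the gradient
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
X = A @ A.T + 5 * np.eye(5)
V = rng.standard_normal((5, 5)); V = (V + V.T) / 2
t = 1e-6
fd = (barrier_grad(X + t * V) - barrier_grad(X)) / t
print(np.max(np.abs(fd - barrier_hess_prod(X, V))))   # small
```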
Properties of polynomial bases used in a line-surface intersection algorithm
In [5], Srijuntongsiri and Vavasis propose the "Kantorovich-Test Subdivision
algorithm", or KTS, which is an algorithm for finding all zeros of a polynomial
system in a bounded region of the plane. This algorithm can be used to find the
intersections between a line and a surface. The main features of KTS are that
it can operate on polynomials represented in any basis that satisfies certain
conditions and that its efficiency has an upper bound that depends only on the
conditioning of the problem and the choice of the basis representing the
polynomial system.
This article explores in detail the dependence of the efficiency of the KTS
algorithm on the choice of basis. Three bases are considered: the power, the
Bernstein, and the Chebyshev bases. These three bases satisfy the basis
properties required by KTS. Theoretically, the Chebyshev case has the smallest
upper bound on its running time. The computational results, however, do not
show that the Chebyshev case performs better than the other two.
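For concreteness, here is a short sketch evaluating the same cubic p(x) = 1 - 2x + x^3 in the three bases considered: Horner's rule for the power basis, de Casteljau's recurrence for Bernstein coefficients on [0, 1], and Clenshaw's recurrence for Chebyshev coefficients on [-1, 1]. This only illustrates the bases themselves, not the KTS algorithm or its Kantorovich test.

```python
import numpy as np

def eval_power(c, x):
    # Horner's rule for c[0] + c[1] x + ... + c[n] x^n
    y = 0.0
    for ck in reversed(c):
        y = y * x + ck
    return y

def eval_bernstein(b, x):
    # de Casteljau recurrence for Bernstein coefficients on [0, 1]
    b = list(b)
    n = len(b) - 1
    for r in range(1, n + 1):
        for i in range(n - r + 1):
            b[i] = (1.0 - x) * b[i] + x * b[i + 1]
    return b[0]

def eval_chebyshev(a, x):
    # Clenshaw recurrence for a[0] T_0(x) + a[1] T_1(x) + ...
    b1 = b2 = 0.0
    for ak in reversed(a[1:]):
        b1, b2 = 2.0 * x * b1 - b2 + ak, b1
    return x * b1 - b2 + a[0]

# the same cubic 1 - 2x + x^3 expressed in each basis; all print 0.427
print(eval_power([1.0, -2.0, 0.0, 1.0], 0.3))
print(eval_bernstein([1.0, 1/3, -1/3, 0.0], 0.3))
print(eval_chebyshev([1.0, -1.25, 0.0, 0.25], 0.3))
```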
Potential-based analyses of first-order methods for constrained and composite optimization
We propose potential-based analyses for first-order algorithms applied to
constrained and composite minimization problems. We first propose "idealized"
frameworks for algorithms in the strongly and non-strongly convex cases and
argue based on a potential that methods following the framework achieve the
best possible rate. Then we show that the geometric descent (GD) algorithm by
Bubeck et al. as extended to the constrained and composite setting by Chen et
al. achieves this rate using the potential-based analysis for the strongly
convex case. Next, we extend the GD algorithm to the case of non-strongly
convex problems. We show using a related potential-based argument that our
extension achieves the best possible rate in this case as well. The new GD
algorithm achieves the best possible rate in the nonconvex case also. We also
analyze accelerated gradient using the new potentials.
We then turn to the special case of a quadratic function with a single ball
constraint, the famous trust-region subproblem. For this case, the first-order
trust-region Lanczos method by Gould et al. finds the optimal point in an
increasing sequence of Krylov spaces. Our results for the general case
immediately imply convergence rates for their method in both the strongly
convex and non-strongly convex cases. We also establish the same convergence
rates for their method using arguments based on Chebyshev polynomial
approximation. To the best of our knowledge, no convergence rate has previously
been established for the trust-region Lanczos method.
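As a simple illustration of a first-order method on the ball-constrained quadratic (not the trust-region Lanczos method of Gould et al., and restricted to the convex case), here is a projected-gradient sketch with hypothetical problem data.

```python
import numpy as np

def projected_gradient_trs(A, b, delta, iters=500):
    """Plain projected gradient for min 0.5 x'Ax - b'x s.t. ||x|| <= delta,
    assuming A is symmetric positive semidefinite (convex case)."""
    L = np.linalg.eigvalsh(A)[-1]          # Lipschitz constant of the gradient
    x = np.zeros_like(b)
    for _ in range(iters):
        x = x - (A @ x - b) / L            # gradient step
        nx = np.linalg.norm(x)
        if nx > delta:                     # project back onto the ball
            x = x * (delta / nx)
    return x

rng = np.random.default_rng(3)
Q = rng.standard_normal((20, 20))
A = Q @ Q.T + np.eye(20)
b = rng.standard_normal(20)
x = projected_gradient_trs(A, b, delta=0.5)
print(np.linalg.norm(x), 0.5 * x @ A @ x - b @ x)
```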
A single potential governing convergence of conjugate gradient, accelerated gradient and geometric descent
Nesterov's accelerated gradient (AG) method for minimizing a smooth strongly
convex function $f$ is known to reduce $f(x_k)-f(x^*)$ by a factor of
$\epsilon\in(0,1)$ after $k=O(\sqrt{L/\ell}\,\log(1/\epsilon))$ iterations,
where $\ell, L$ are the two parameters of smooth strong convexity. Furthermore,
it is known that this is the best possible complexity in the function-gradient
oracle model of computation. Modulo a line search, the geometric descent (GD)
method of Bubeck, Lee and Singh has the same bound for this class of functions.
The method of linear conjugate gradients (CG) also satisfies the same
complexity bound in the special case of strongly convex quadratic functions,
but in this special case it can be faster than the AG and GD methods.
Despite similarities in the algorithms and their asymptotic convergence
rates, the conventional analysis of the running time of CG is mostly disjoint
from that of AG and GD. The analyses of the AG and GD methods are also rather
distinct.
Our main result is analyses of the three methods that share several common
threads: all three analyses show a relationship to a certain "idealized
algorithm", all three establish the convergence rate through the use of the
Bubeck-Lee-Singh geometric lemma, and all three have the same potential that is
computable at run-time and exhibits decrease by a factor of $1-\sqrt{\ell/L}$
or better per iteration.
One application of these analyses is that they open the possibility of hybrid
or intermediate algorithms. One such algorithm is proposed herein and is shown
to perform well in computational tests.
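Below is a small sketch comparing two of the methods on a strongly convex quadratic: a constant-momentum form of Nesterov's AG and textbook linear CG. The geometric descent method and the hybrid algorithm proposed in the paper are not reproduced, and the problem instance is an illustrative assumption.

```python
import numpy as np

def grad(A, b, x):
    return A @ x - b

def accelerated_gradient(A, b, iters):
    """Nesterov AG (constant momentum) for f(x) = 0.5 x'Ax - b'x, A spd."""
    eigs = np.linalg.eigvalsh(A)
    mu, L = eigs[0], eigs[-1]
    beta = (np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))
    x = x_prev = np.zeros_like(b)
    for _ in range(iters):
        y = x + beta * (x - x_prev)
        x_prev, x = x, y - grad(A, b, y) / L
    return x

def conjugate_gradient(A, b, iters):
    """Standard linear CG for A x = b (minimizer of the same quadratic)."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(4)
Q = rng.standard_normal((50, 50))
A = Q @ Q.T + 0.1 * np.eye(50)
b = rng.standard_normal(50)
xs = np.linalg.solve(A, b)
for k in (10, 50):
    print(k, np.linalg.norm(accelerated_gradient(A, b, k) - xs),
             np.linalg.norm(conjugate_gradient(A, b, k) - xs))
```

On this quadratic, CG reaches the solution much sooner than AG, consistent with the remark above that CG can be faster in the quadratic special case.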
Detecting and correcting the loss of independence in nonlinear conjugate gradient
It is well known that search directions in nonlinear conjugate gradient (CG)
can sometimes become nearly dependent, causing a dramatic slow-down in the
convergence rate. We provide a theoretical analysis of this loss of
independence. The analysis applies to the case of a strictly convex objective
function and is motivated by older work of Nemirovsky and Yudin. Loss of
independence can affect several of the well-known variants of nonlinear CG
including Fletcher-Reeves, Polak-Ribière (nonnegative variant), and
Hager-Zhang.
Based on our analysis, we propose a relatively inexpensive computational test
for detecting loss of independence. We also propose a method for correcting it
when it is detected, which we call "subspace optimization." Although the
correction method is somewhat expensive, our experiments show that in some
cases, usually the most ill-conditioned ones, it yields a method much faster
than any of these three variants. Even though our theory covers only strongly
convex objective functions, we provide computational results to indicate that
the detection and correction mechanisms may also hold promise for nonconvex
optimization.
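For illustration only, here is a nonlinear CG sketch with the nonnegative Polak-Ribière beta and a crude angle-based restart when successive directions become nearly parallel. The restart is a generic safeguard, not the detection test or the subspace-optimization correction proposed here, and the test problem is a hypothetical ill-conditioned quadratic.

```python
import numpy as np

def nonlinear_cg_pr(f, grad, x0, iters=200, restart_cos=0.999):
    """Nonlinear CG with the nonnegative Polak-Ribiere beta and a crude
    near-dependence check: restart with steepest descent when the new
    direction is almost parallel to the previous one."""
    x = x0.copy()
    g = grad(x)
    d = -g
    for _ in range(iters):
        if g @ d >= 0:                 # safeguard: ensure a descent direction
            d = -g
        t, fx = 1.0, f(x)              # backtracking line search along d
        while f(x + t * d) > fx + 1e-4 * t * (g @ d) and t > 1e-20:
            t *= 0.5
        x_new = x + t * d
        g_new = grad(x_new)
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))   # PR+ formula
        d_new = -g_new + beta * d
        cos = abs(d_new @ d) / (np.linalg.norm(d_new) * np.linalg.norm(d) + 1e-300)
        if cos > restart_cos:          # directions nearly dependent: restart
            d_new = -g_new
        x, g, d = x_new, g_new, d_new
    return x

# ill-conditioned convex quadratic test problem
n = 100
A = np.diag(np.logspace(0, 6, n))
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
print(f(nonlinear_cg_pr(f, grad, np.ones(n))))
```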
IMRO: a proximal quasi-Newton method for solving $\ell_1$-regularized least squares problems
We present a proximal quasi-Newton method in which the approximation of the
Hessian has the special format of "identity minus rank one" (IMRO) in each
iteration. The proposed structure enables us to effectively recover the
proximal point. The algorithm is applied to the $\ell_1$-regularized least squares
problem arising in many applications including sparse recovery in compressive
sensing, machine learning and statistics. Our numerical experiments suggest
that the proposed technique competes favourably with other state-of-the-art
solvers for this class of problems. We also provide a complexity analysis for
variants of IMRO, showing that it matches the best known bounds.
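As a baseline for the problem class, here is a sketch of plain proximal gradient (ISTA) with soft-thresholding for the $\ell_1$-regularized least squares problem; the IMRO "identity minus rank one" Hessian model itself is not implemented, and the instance below is a hypothetical compressive-sensing style example.

```python
import numpy as np

def soft_threshold(v, tau):
    # proximal operator of tau * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, iters=500):
    """Plain proximal gradient (ISTA) for min 0.5||Ax - b||^2 + lam ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - b)              # gradient of the least-squares term
        x = soft_threshold(x - g / L, lam / L)
    return x

# small compressive-sensing style example with a sparse ground truth
rng = np.random.default_rng(5)
A = rng.standard_normal((80, 200))
x_true = np.zeros(200); x_true[:5] = rng.standard_normal(5)
b = A @ x_true + 0.01 * rng.standard_normal(80)
x_hat = ista(A, b, lam=0.1)
print(np.count_nonzero(np.abs(x_hat) > 1e-3))   # number of recovered nonzeros
```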