A short survey on Kantorovich-like theorems for Newton's method
We survey influential quantitative results on the convergence of the Newton iteration towards simple roots of continuously differentiable maps defined over Banach spaces. We present a general statement of Kantorovich's theorem, with a concise self-contained proof, intended for a wide audience. From it, we quickly recover known results and gather historical notes together with pointers to recent articles.
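As a concrete reference point, the iteration studied in the survey can be sketched in a few lines. This is a minimal numpy version for maps on R^n, not code from the survey itself:

```python
import numpy as np

def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton iteration x_{k+1} = x_k - Df(x_k)^{-1} f(x_k)
    for a continuously differentiable map f: R^n -> R^n."""
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        # Solve the linearized system Df(x) * step = f(x)
        step = np.linalg.solve(np.atleast_2d(df(x)), f(x))
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

# Simple root of f(x) = x^2 - 2, starting near it
root = newton(lambda x: x**2 - 2, lambda x: np.diag(2 * x), [1.5])
```

Kantorovich-type theorems give checkable conditions on the starting point, on the norm of the inverse derivative there, and on a Lipschitz constant of the derivative, under which this iteration is guaranteed to converge to a root.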
An inexact -order regularized proximal Newton method for nonconvex composite optimization
This paper concerns the composite problem of minimizing the sum of a twice
continuously differentiable function and a nonsmooth convex function. For
this class of nonconvex and nonsmooth problems, by leveraging a practical
inexactness criterion and a novel selection strategy for the iterates, we propose
an inexact -order regularized proximal Newton method, which becomes
an inexact cubic regularization (CR) method for . We show that its
iterate sequence converges to a stationary point for KL objective functions,
and that if the objective function has the KL property of exponent
, the convergence has a local -superlinear rate
of order . In particular, under a locally Hölderian
error bound of order on a second-order stationary
point set, the iterate sequence converges to a second-order stationary point
with a local -superlinear rate of order , which specializes
to a -quadratic rate for and . This is the first
practical inexact CR method with a -quadratic convergence rate for nonconvex
composite optimization. We validate the efficiency of the proposed method, with
ZeroFPR as the solver of the subproblems, by applying it to convex and nonconvex
composite problems with a highly nonlinear
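For orientation, the composite model minimized here, a smooth f plus a nonsmooth convex g, is the same one behind the classical proximal gradient method, which can be read as the crudest member of the proximal Newton family: the Hessian is replaced by L times the identity. A minimal sketch for g = lam*||x||_1 follows; this is illustrative only, not the paper's higher-order regularized method or its ZeroFPR subproblem solver:

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding: the proximal map of t*||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_grad(A, b, lam, iters=5000):
    """Proximal gradient (ISTA) for 0.5*||Ax - b||^2 + lam*||x||_1:
    the simplest member of the proximal Newton family, with the
    Hessian A'A replaced by L*I, L the Lipschitz constant of the
    gradient of the smooth part."""
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - b)          # gradient of the smooth part
        x = soft(x - g / L, lam / L)   # exact prox step on the model
    return x
```

Proximal Newton-type methods replace L*I with a (regularized) Hessian model, which makes the subproblem harder but yields the superlinear rates discussed in the abstract.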
Iterative Linear Algebra for Parameter Estimation
The principal goal of this thesis is the development and analysis of efficient numerical
methods for large-scale nonlinear parameter estimation problems. Such problems are
highly relevant in all sciences that predict the future from big data sets of the past by
fitting and then extrapolating a mathematical model. This thesis is concerned with the
fitting part. The challenges lie in the treatment of the nonlinearities and in the sheer size of
the data and the unknowns. The state of the art for the numerical solution of parameter
estimation problems is the Gauss-Newton method, which solves a sequence of linearized
subproblems.
One contribution of this thesis is a thorough analysis of the problem class on
the basis of covariant and contravariant k-theory. Based on this analysis, it is possible
to devise a new stopping criterion for the iterative solution of the inner linearized subproblems.
The analysis reveals that the inner subproblems can be solved to only low
accuracy without dramatically impeding the speed of convergence of the outer iteration.
In addition, I prove that this new stopping criterion quantifies how accurately
the subproblems need to be solved in order to produce inexact Gauss-Newton
sequences that converge to a statistically stable estimate, provided at least
one exists. The resulting local inexact Gauss-Newton method requires far fewer
inner iterations for computing the inexact Gauss-Newton step than
the classical exact Gauss-Newton method, which computes the step with a factorization
algorithm and thus effectively performs all inner iterations; this is
computationally prohibitive when the number of parameters to be estimated
is large. Furthermore, we generalize the ideas of this local inexact Gauss-Newton
approach and introduce a damped inexact Gauss-Newton method based on the Backward
Step Control theory for global Newton-type methods of Potschka.
We evaluate the efficiency of our new approach on two examples. The first
is a parameter identification problem for a nonlinear elliptic partial differential equation, and
the second is a real-world parameter estimation on a large-scale bundle adjustment
problem. Both examples are ill-conditioned, so a suitable regularization
is applied in each case. Our experimental results show that this new inexact Gauss-
Newton approach requires less than 3% of the inner iterations for computing the inexact
Gauss-Newton step in order to converge to a statistically stable estimate.
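The core idea, solving each Gauss-Newton normal-equation subproblem only approximately with an iterative linear solver that is stopped early, can be sketched as follows. This is a generic numpy illustration with hypothetical function names, not the thesis's actual stopping criterion:

```python
import numpy as np

def cg(H, g, tol, max_iter):
    """Conjugate gradients on H s = -g, stopped at a loose relative
    residual: this early stop is what makes the outer method 'inexact'."""
    s = np.zeros_like(g)
    if not np.linalg.norm(g) > 0:
        return s
    r = -g.copy()
    p = r.copy()
    for _ in range(max_iter):
        Hp = H @ p
        alpha = (r @ r) / (p @ Hp)
        s = s + alpha * p
        r_new = r - alpha * Hp
        if np.linalg.norm(r_new) < tol * np.linalg.norm(g):
            break
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return s

def inexact_gauss_newton(res, jac, x0, outer=30, inner_tol=0.1):
    """Gauss-Newton iteration in which each linearized normal-equation
    subproblem J'J s = -J'r is solved only to loose relative accuracy."""
    x = np.array(x0, dtype=float)
    for _ in range(outer):
        J, r = jac(x), res(x)
        x = x + cg(J.T @ J, J.T @ r, inner_tol, len(x))
    return x

# Hypothetical example: fit y = a*exp(b*t) to noiseless data with a=2, b=-1
t = np.linspace(0.0, 2.0, 20)
y = 2.0 * np.exp(-t)
r_fun = lambda x: x[0] * np.exp(x[1] * t) - y
J_fun = lambda x: np.column_stack([np.exp(x[1] * t), x[0] * t * np.exp(x[1] * t)])
x_hat = inexact_gauss_newton(r_fun, J_fun, [1.5, -0.8])
```

Even with a crude relative tolerance of 0.1 on the inner solve, the outer iteration still converges; this is the effect the thesis quantifies and exploits at scale.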
A trust region-type normal map-based semismooth Newton method for nonsmooth nonconvex composite optimization
We propose a novel trust region method for solving a class of nonsmooth and
nonconvex composite-type optimization problems. The approach embeds inexact
semismooth Newton steps for finding zeros of a normal map-based stationarity
measure for the problem in a trust region framework. Based on a new merit
function and acceptance mechanism, global convergence and transition to fast
local q-superlinear convergence are established under standard conditions. In
addition, we verify that the proposed trust region globalization is compatible
with the Kurdyka-Łojasiewicz (KL) inequality, yielding finer convergence
results. We further derive new normal map-based representations of the
associated second-order optimality conditions that have direct connections to
the local assumptions required for fast convergence. Finally, we study the
behavior of our algorithm when the Hessian matrix of the smooth part of the
objective function is approximated by BFGS updates. We successfully link the KL
theory, properties of the BFGS approximations, and a Dennis-Moré-type
condition to show superlinear convergence of the quasi-Newton version of our
method. Numerical experiments on sparse logistic regression and image
compression illustrate the efficiency of the proposed algorithm.
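The BFGS mechanism referenced above can be sketched in isolation. This is the generic update formula with a standard curvature-condition skip, not the paper's globalized quasi-Newton method:

```python
import numpy as np

def bfgs_update(B, s, y, eps=1e-10):
    """One BFGS update of a Hessian approximation B from the step
    s = x_new - x and gradient difference y = grad(x_new) - grad(x).
    The update is skipped when the curvature condition s'y > 0 fails,
    which keeps B symmetric positive definite."""
    sy = s @ y
    if sy <= eps * np.linalg.norm(s) * np.linalg.norm(y):
        return B                      # skip: curvature condition fails
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / sy
```

By construction the updated matrix satisfies the secant equation B_new s = y; Dennis-Moré-type conditions then characterize when a sequence of such approximations is good enough along the step directions to give superlinear convergence.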
Gradient descent-type methods: Background and simple unified convergence analysis
In this book chapter, we briefly describe the main components that constitute the
gradient descent method and its accelerated and stochastic variants. We aim to explain
these components from a mathematical point of view, covering theoretical and practical
aspects, but at an elementary level. We focus on basic variants of the gradient descent
method and then extend our view to recent variants, especially variance-reduced stochastic
gradient descent (SGD) schemes. Our approach relies on revealing the structures present inside
the problem and the assumptions imposed on the objective function. Our convergence
analysis unifies several known results and relies on a general but elementary recursive
expression. We illustrate this analysis on several common schemes.
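One of the variance-reduced schemes such a unified analysis covers is the SVRG-style gradient correction, sketched below for a finite-sum objective. The function names and the tiny test problem are illustrative, not taken from the chapter:

```python
import numpy as np

def svrg(grad_i, n, x0, step, epochs=10, inner=50, seed=0):
    """Variance-reduced stochastic gradient (SVRG-style) for
    f(x) = (1/n) * sum_i f_i(x), where grad_i(x, i) = grad f_i(x).
    Each epoch stores a snapshot point and its full gradient, then
    corrects every stochastic gradient with them; the correction's
    variance vanishes as the iterates approach a minimizer."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(epochs):
        snap = x.copy()
        full = np.mean([grad_i(snap, i) for i in range(n)], axis=0)
        for _ in range(inner):
            i = rng.integers(n)
            x = x - step * (grad_i(x, i) - grad_i(snap, i) + full)
    return x
```

Plain SGD with a constant step stalls at a noise floor proportional to the gradient variance; the snapshot correction removes that floor, which is why variance-reduced schemes recover linear convergence on strongly convex finite sums.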
Efficient and Globally Convergent Minimization Algorithms for Small- and Finite-Strain Plasticity Problems
We present efficient and globally convergent solvers for several classes of plasticity models. The models in this work are formulated in the primal form as energetic rate-independent systems with an elastic energy potential and a plastic dissipation component. Different hardening rules are considered, as well as different flow rules. The time discretization leads to a sequence of nonsmooth minimization problems. For small strains, the unknowns live in vector spaces, while for finite strains we have to deal with manifold-valued quantities. For the latter, a reformulation in tangent space is performed so as to end up with the same dissipation functional as in the small-strain case. We present the Newton-type TNNMG solver for convex and nonsmooth minimization problems and a newly developed Proximal Newton (PN) method that can also handle nonconvex problems. The PN method generates a sequence of penalized convex, coercive but nonsmooth subproblems. These subproblems take the form of block-separable small-strain plasticity problems, to which TNNMG can be applied. Global convergence theorems are available for both methods. In several numerical experiments, both the efficiency and the flexibility of the methods are tested for small-strain and finite-strain models.
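The "penalized convex subproblem" idea behind such a Proximal Newton method can be illustrated in one dimension: shift the local curvature so each model is convex, then solve the nonsmooth model exactly. This is a schematic sketch under simplifying assumptions, not the TNNMG/PN implementation:

```python
import numpy as np

def prox_newton_1d(fp, fpp, lam, x0, mu=1.0, iters=100):
    """Proximal Newton sketch for a scalar, possibly nonconvex
    f(x) + lam*|x|: each step minimizes the convexified model
        fp(x)*s + 0.5*H*s^2 + lam*|x + s|,   H = max(fpp(x), mu) > 0,
    whose exact minimizer is a soft-thresholding step."""
    x = float(x0)
    for _ in range(iters):
        H = max(fpp(x), mu)                  # penalize to convexify
        z = x - fp(x) / H                    # scaled gradient step
        x = np.sign(z) * max(abs(z) - lam / H, 0.0)
    return x
```

Forcing the model curvature to be at least mu makes every subproblem convex and coercive, mirroring how the PN method hands well-posed convex subproblems to an inner solver even when the outer problem is nonconvex.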
A review of nonlinear FFT-based computational homogenization methods
Since their inception, computational homogenization methods based on the fast Fourier transform (FFT) have grown in popularity, establishing themselves as a powerful tool applicable to complex, digitized microstructures. At the same time, the understanding of the underlying principles has grown, in terms of both discretization schemes and solution methods, leading to improvements of the original approach and extending its applications. This article provides a condensed overview of results scattered throughout the literature and guides the reader to the current state of the art in nonlinear computational homogenization methods using the fast Fourier transform.
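As a minimal concrete instance, the original Moulinec-Suquet fixed-point ("basic") scheme can be written down for 1D periodic conduction, where the FFT applies the Green operator of a homogeneous reference medium. This is an illustrative sketch; in 1D the effective coefficient of a laminate is known exactly (the harmonic mean), which makes the example checkable:

```python
import numpy as np

# Two-phase 1D periodic laminate: conductivities 1 and 4, equal volume fractions
N = 64
sigma = np.where(np.arange(N) < N // 2, 1.0, 4.0)
E = 1.0                                      # prescribed mean gradient
sigma0 = 0.5 * (sigma.min() + sigma.max())   # reference medium

e = np.full(N, E)                            # local gradient field
for _ in range(200):                         # basic-scheme fixed point
    tau_hat = np.fft.fft((sigma - sigma0) * e)  # polarization field
    e_hat = -tau_hat / sigma0                # Green operator (1D conduction)
    e_hat[0] = N * E                         # enforce the mean gradient
    e = np.fft.ifft(e_hat).real

sigma_eff = np.mean(sigma * e) / E           # effective conductivity
```

At the fixed point the flux sigma(x)*e(x) is spatially constant, so the scheme recovers the laminate's exact effective conductivity, here 1/(0.5/1 + 0.5/4) = 1.6; the iteration count needed grows with the material contrast, which is what the accelerated solvers surveyed in the article address.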