Sub-Sampled Newton Methods I: Globally Convergent Algorithms
Large scale optimization problems are ubiquitous in machine learning and data
analysis and there is a plethora of algorithms for solving such problems. Many
of these algorithms employ sub-sampling as a way to speed up the
computations and/or to implicitly implement a form of statistical
regularization. In this paper, we consider second-order iterative optimization
algorithms and we provide bounds on the convergence of the variants of Newton's
method that incorporate uniform sub-sampling as a means to estimate the
gradient and/or Hessian. Our bounds are non-asymptotic and quantitative. Our
algorithms are global and are guaranteed to converge from any initial iterate.
Using random matrix concentration inequalities, one can sub-sample the
Hessian to preserve the curvature information. Our first algorithm incorporates
Hessian sub-sampling while using the full gradient. We also give additional
convergence results for the case where the sub-sampled Hessian is regularized by
modifying its spectrum or by ridge-type regularization. Next, in addition to
Hessian sub-sampling, we also consider sub-sampling the gradient as a way to
further reduce the computational complexity per iteration. We use approximate
matrix multiplication results from randomized numerical linear algebra to
obtain the proper sampling strategy. In all these algorithms, computing the
update boils down to solving a large scale linear system, which can be
computationally expensive. As a remedy, for all of our algorithms, we also give
global convergence results for the case of inexact updates, in which the linear
system is solved only approximately.
This paper has a more advanced companion paper, [42], in which we demonstrate
that, by doing a finer-grained analysis, we can get problem-independent bounds
for local convergence of these algorithms and explore trade-offs to improve
upon the basic results of the present paper.
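To make the Hessian sub-sampling idea concrete, here is a minimal Python sketch of one such globally convergent variant: full gradient, uniformly sub-sampled Hessian, an inexact solve of the Newton system by conjugate gradients, and an Armijo backtracking line search as the globalization device. The function names, the fixed sample size, and the line-search constants are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np
from scipy.sparse.linalg import cg

def subsampled_newton(f, grad_f, hess_fi, x0, n, sample_size=100,
                      max_iter=50, tol=1e-8, seed=0):
    """Hessian-sub-sampled Newton with full gradient (illustrative sketch).

    f, grad_f    : objective F(x) = (1/n) sum_i f_i(x) and its full gradient
    hess_fi(i,x) : Hessian of the i-th component f_i at x (assumed PSD)
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)                                   # full gradient
        if np.linalg.norm(g) < tol:
            break
        # Uniformly sub-sample component Hessians to estimate the curvature.
        idx = rng.choice(n, size=sample_size, replace=False)
        H = sum(hess_fi(i, x) for i in idx) / sample_size
        # Inexact update: capped CG iterations solve the Newton system
        # only approximately.
        p, _ = cg(H, -g, maxiter=50)
        # Armijo backtracking line search provides the global guarantee.
        t, c, fx = 1.0, 1e-4, f(x)
        while f(x + t * p) > fx + c * t * (g @ p) and t > 1e-10:
            t *= 0.5
        x = x + t * p
    return x
```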
$L_{1/2}$ Regularization: Convergence of Iterative Half Thresholding Algorithm
In recent studies on sparse modeling, the nonconvex regularization approaches
(particularly, $L_{q}$ regularization with $0<q<1$) have been demonstrated
to possess the capability of gaining much benefit in sparsity-inducing and
efficiency. As compared with the convex regularization approaches (say, $L_{1}$
regularization), however, the convergence issue of the corresponding algorithms
is more difficult to tackle. In this paper, we deal with this difficult issue
for a specific but typical nonconvex regularization scheme, the $L_{1/2}$
regularization, which has been successfully applied in many applications. More
specifically, we study the convergence of the iterative \textit{half}
thresholding algorithm (the \textit{half} algorithm for short), one of the most
efficient and important algorithms for solving the $L_{1/2}$
regularization problem. As the main result, we show that under certain conditions the
\textit{half} algorithm converges to a local minimizer of the $L_{1/2}$
regularization problem, with an eventually linear convergence rate. The established
result provides a theoretical guarantee for a wide range of applications of the
\textit{half} algorithm. We also provide a set of simulations to support the
correctness of the theoretical assertions and compare the time efficiency of the
\textit{half} algorithm with other known typical algorithms for $L_{1/2}$
regularization, such as the iteratively reweighted least squares (IRLS) algorithm
and the iteratively reweighted $L_1$ minimization (IRL1) algorithm.
Comment: 12 pages, 5 figures
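For concreteness, the following Python sketch shows the generic shape of such an iterative thresholding scheme: a gradient step on the least-squares term followed by a componentwise thresholding step. The closed-form half-thresholding operator below follows the cosine-form expression popularized by Xu et al. for $L_{1/2}$ regularization; the constants and the step-size choice are assumptions made for illustration, not the exact scheme analyzed in the paper.

```python
import numpy as np

def half_threshold(z, lam):
    """Componentwise half-thresholding operator for L_{1/2} regularization.

    Follows the closed-form expression of Xu et al. (assumed here): entries
    below the threshold are set to zero; the rest are shrunk via a cosine
    formula coming from the cubic optimality condition.
    """
    thresh = (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    out = np.zeros_like(z)
    big = np.abs(z) > thresh
    phi = np.arccos((lam / 8.0) * (np.abs(z[big]) / 3.0) ** (-1.5))
    out[big] = (2.0 / 3.0) * z[big] * (
        1.0 + np.cos(2.0 * np.pi / 3.0 - 2.0 * phi / 3.0))
    return out

def iterative_half_thresholding(A, b, lam, mu=None, n_iter=500):
    """Sketch: minimize ||Ax - b||^2 + lam * sum_i |x_i|^{1/2}."""
    if mu is None:
        mu = 0.99 / np.linalg.norm(A, 2) ** 2   # step size below 1/||A||^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - mu * A.T @ (A @ x - b)          # gradient step on the LS term
        x = half_threshold(z, lam * mu)         # half-thresholding step
    return x
```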
Sequential subspace optimization for nonlinear inverse problems
In this work we discuss a method to adapt sequential subspace optimization
(SESOP), which has so far been developed for linear inverse problems in Hilbert
and Banach spaces, to the case of nonlinear inverse problems. We start by
revisiting the well-known technique for Hilbert spaces. In the next step, we
introduce a method using multiple search directions that are especially
designed to fit the nonlinearity of the forward operator. To this end, we
iteratively project the initial value onto stripes whose shape is determined by
the search direction, the nonlinearity of the operator and the noise level. We
additionally propose a fast algorithm that uses two search directions. Finally,
we show convergence and regularization properties for the presented method.
Comment: 22 pages, no figures
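As a rough illustration of the Hilbert-space version of SESOP for a linear problem $Ax = y$, the sketch below keeps a small set of gradient-type search directions and sequentially projects the iterate onto the associated stripes, whose half-width grows with the noise level. The choice of directions and the stripe widths are simplified assumptions, not the authors' nonlinear construction.

```python
import numpy as np

def sesop_linear(A, y, n_iter=200, n_dirs=2, noise_level=0.0):
    """Sketch of sequential subspace optimization for a linear problem Ax = y.

    Keeps the last n_dirs gradient-type directions u = A.T @ w (w a residual)
    and sequentially projects onto the stripes
        H(u, alpha, xi) = {x : |<u, x> - alpha| <= xi},
    with alpha = <w, y> and xi proportional to the noise level.
    """
    x = np.zeros(A.shape[1])
    dirs = []                               # list of (u, alpha, xi) triples
    for _ in range(n_iter):
        w = A @ x - y                       # current residual
        u = A.T @ w                         # gradient-type search direction
        alpha = w @ y
        xi = noise_level * np.linalg.norm(w)
        dirs = (dirs + [(u, alpha, xi)])[-n_dirs:]
        for u_i, a_i, xi_i in dirs:         # sequential stripe projections
            s = u_i @ x - a_i
            if abs(s) > xi_i and u_i @ u_i > 0:
                # Metric projection onto the boundary of the stripe.
                x = x - (s - np.sign(s) * xi_i) / (u_i @ u_i) * u_i
    return x
```

With `noise_level=0` the stripes collapse to hyperplanes and each projection reduces to a steepest-descent-type step, which recovers the classical Hilbert-space picture the abstract refers to.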
A Gauss-Seidel Iterative Thresholding Algorithm for lq Regularized Least Squares Regression
In recent studies on sparse modeling, $l_q$ ($0<q<1$) regularized least
squares regression ($l_q$LS) has received considerable attention due to its
superiorities on sparsity-inducing and bias-reduction over the convex
counterparts. In this paper, we propose a Gauss-Seidel iterative thresholding
algorithm (called GAITA) for solving this problem. Different from the
classical iterative thresholding algorithms using the Jacobi updating rule,
GAITA takes advantage of the Gauss-Seidel rule to update the coordinate
coefficients. Under a mild condition, we can justify that the support set and
sign of an arbitrary sequence generated by GAITA will converge within finitely many
iterations. This convergence property, together with the Kurdyka-{\L}ojasiewicz
property of ($l_q$LS), naturally yields the strong convergence of GAITA under
the same condition as above, which is generally weaker than the condition for
the convergence of the classical iterative thresholding algorithms.
Furthermore, we demonstrate that GAITA converges to a local minimizer under
certain additional conditions. A set of numerical experiments is provided to
show the effectiveness, and particularly the much faster convergence, of GAITA as
compared with the classical iterative thresholding algorithms.
Comment: 35 pages, 11 figures
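The Jacobi/Gauss-Seidel distinction in the abstract is easy to see in code: a Jacobi-style scheme thresholds all coordinates from the same old iterate, while a Gauss-Seidel scheme immediately reuses each freshly updated coordinate. The sketch below illustrates this with a generic componentwise thresholding operator `thresh`; the operator, step sizes, and stopping rule are placeholders, not the paper's exact specification.

```python
import numpy as np

def gauss_seidel_thresholding(A, b, lam, thresh, n_sweeps=100):
    """Gauss-Seidel iterative thresholding sketch for regularized least squares.

    thresh(z, t) -- componentwise thresholding operator for the chosen penalty.
    Each coordinate is updated from the *current* residual, so coordinates
    later in a sweep already see the earlier updates (Gauss-Seidel), unlike
    Jacobi schemes that threshold all coordinates simultaneously.
    """
    m, n = A.shape
    x = np.zeros(n)
    r = b - A @ x                             # running residual b - Ax
    col_norm2 = (A ** 2).sum(axis=0)          # squared column norms
    for _ in range(n_sweeps):
        for j in range(n):
            aj = A[:, j]
            # Exact coordinate-wise gradient step using the fresh residual.
            z = x[j] + aj @ r / col_norm2[j]
            xj_new = thresh(z, lam / col_norm2[j])
            r += aj * (x[j] - xj_new)         # keep the residual consistent
            x[j] = xj_new
    return x
```

With `thresh` set to the soft-thresholding operator this reduces to coordinate descent for the LASSO; plugging in a nonconvex operator (such as a half-thresholding function) gives an $l_q$-flavored variant in the spirit of the abstract.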
Superiorization of EM Algorithm and Its Application in Single-Photon Emission Computed Tomography (SPECT)
In this paper, we present an efficient algorithm to implement regularized
reconstruction of SPECT. Image reconstruction with a priori assumptions is
usually modeled as a constrained optimization problem. However, there is no
efficient algorithm to solve it due to the large scale of the problem. Here,
we use the superiorization of the expectation maximization (EM) iteration to
implement regularized reconstruction of SPECT. We first investigate the
convergence conditions of the EM iteration in the presence of perturbations.
Secondly, we design the superiorized EM algorithm based on these convergence
conditions, and then propose a modified version of it. Furthermore, we give
two methods to generate the desired perturbations for two special objective
functions. Numerical experiments for SPECT reconstruction were conducted to
validate the performance of the proposed algorithms. The experiments show
that the superiorized EM algorithms are more stable and robust with respect
to noisy projection data and initial images than the classic EM algorithm,
and outperform the classic EM algorithm in terms of mean square error and
visual quality of the reconstructed images.
Comment: some typos corrected; explanations for the phenomena observed in the experiments are given
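Superiorization interleaves the basic EM (MLEM) iterations with small, summable perturbations that steer the iterates toward lower values of a secondary objective (for instance, a smoothed total variation) without destroying the convergence of the underlying EM scheme. The sketch below is a minimal rendering of that idea; the geometric perturbation schedule and the negative normalized gradient as the nonascending direction are standard choices assumed here, not the paper's specific constructions.

```python
import numpy as np

def superiorized_em(A, y, phi_grad, x0, n_iter=100, beta0=1.0, gamma=0.99):
    """Superiorized EM (MLEM) sketch: EM steps with summable perturbations.

    A        -- nonnegative system matrix, y -- measured counts
    phi_grad -- gradient of the secondary objective phi (e.g. smoothed TV)
    """
    x = x0.copy()
    sens = A.T @ np.ones(A.shape[0])        # sensitivity image A^T 1
    beta = beta0
    for _ in range(n_iter):
        # Perturbation step: move along a nonascending direction of phi;
        # beta_k = beta0 * gamma**k is summable, so convergence survives.
        g = phi_grad(x)
        norm = np.linalg.norm(g)
        if norm > 0:
            x = np.maximum(x - beta * g / norm, 0.0)
        beta *= gamma
        # Basic EM (MLEM) multiplicative update step.
        x = x / sens * (A.T @ (y / np.maximum(A @ x, 1e-12)))
    return x
```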
On Accelerating the Regularized Alternating Least Square Algorithm for Tensors
In this paper, we discuss the acceleration of the regularized alternating
least squares (RALS) algorithm for tensor approximation. We propose a fast
iterative method that uses Aitken-Steffensen-like updates for the regularized
algorithm. Through numerical experiments, the accelerated version demonstrates
a faster convergence rate in comparison to both the standard and regularized
alternating least squares algorithms. In addition, we analyze the global
convergence based on the Kurdyka-{\L}ojasiewicz inequality, and we show that
the RALS algorithm has a linear local convergence rate.
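Aitken-Steffensen-type acceleration can be layered on top of any fixed-point iteration $x_{k+1} = G(x_k)$, such as one full sweep of (regularized) ALS. The sketch below uses the componentwise Delta-squared formula; treating one RALS sweep as the map `G` and the guard against tiny denominators are assumptions made for illustration.

```python
import numpy as np

def steffensen_accelerate(G, x0, n_iter=50, eps=1e-12):
    """Aitken/Steffensen acceleration of a fixed-point iteration x = G(x).

    G -- one sweep of the underlying iteration (e.g. a full RALS update).
    Applies the componentwise Aitken Delta-squared extrapolation
        x_acc = x - (Delta x)^2 / (Delta^2 x),
    falling back to the plain iterate where the denominator is unstable.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x1 = G(x)
        x2 = G(x1)
        d1 = x1 - x                   # first difference
        d2 = x2 - 2.0 * x1 + x        # second difference
        safe = np.abs(d2) > eps
        x_new = x2.copy()
        x_new[safe] = x[safe] - d1[safe] ** 2 / d2[safe]
        x = x_new
    return x
```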
On Convergent Finite Difference Schemes for Variational-PDE Based Image Processing
We study an adaptive anisotropic Huber functional based image restoration
scheme. By using a combination of L2-L1 regularization functions, an adaptive
Huber functional based energy minimization model provides denoising with edge
preservation in noisy digital images. We study a convergent finite difference
scheme based on continuous piecewise linear functions and use a variable
splitting scheme, namely the Split Bregman, to obtain the discrete minimizer.
Experimental results are given for image denoising, and comparisons with additive
operator splitting, dual fixed point, and projected gradient schemes illustrate
that the best convergence rates are obtained for our algorithm.
Comment: 23 pages, 12 figures, 2 tables
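The "combination of L2-L1 regularization functions" is precisely what a Huber-type penalty encodes: quadratic (L2-like) near zero, linear (L1-like) in the tails, so flat regions are smoothed while edges are preserved. Below is a minimal sketch of the resulting discrete denoising energy, with a fixed threshold alpha in place of the paper's adaptive, anisotropic weighting.

```python
import numpy as np

def huber(t, alpha):
    """Huber penalty: quadratic for |t| <= alpha, linear beyond.

    Behaves like L2 regularization near zero (smoothing flat regions)
    and like L1 regularization in the tails (preserving edges).
    """
    t = np.asarray(t, dtype=float)
    quad = np.abs(t) <= alpha
    return np.where(quad, t ** 2 / (2.0 * alpha), np.abs(t) - alpha / 2.0)

def huber_tv_energy(u, f, lam, alpha):
    """Discrete energy 0.5*||u - f||^2 + lam * sum huber(|grad u|, alpha)."""
    gx = np.diff(u, axis=0, append=u[-1:, :])   # forward differences, rows
    gy = np.diff(u, axis=1, append=u[:, -1:])   # forward differences, cols
    grad_mag = np.sqrt(gx ** 2 + gy ** 2)
    return 0.5 * np.sum((u - f) ** 2) + lam * np.sum(huber(grad_mag, alpha))
```

A splitting scheme such as Split Bregman, as used in the paper, then alternates between a quadratic subproblem in u and a closed-form shrinkage on the auxiliary gradient variable.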
Sub-Sampled Newton Methods II: Local Convergence Rates
Many data-fitting applications require the solution of an optimization
problem involving a sum of a large number of functions of a high-dimensional
parameter. Here, we consider the problem of minimizing a sum of $n$ functions
over a convex constraint set $\mathcal{X} \subseteq \mathbb{R}^{p}$, where both $n$
and $p$ are large. In such problems, sub-sampling as a way to reduce $n$
can offer a great amount of computational efficiency.
Within the context of second order methods, we first give quantitative local
convergence results for variants of Newton's method where the Hessian is
uniformly sub-sampled. Using random matrix concentration inequalities, one can
sub-sample in a way that the curvature information is preserved. Using such a
sub-sampling strategy, we establish locally Q-linear and Q-superlinear
convergence rates. We also give additional convergence results for when the
sub-sampled Hessian is regularized by modifying its spectrum or Levenberg-type
regularization.
Finally, in addition to Hessian sub-sampling, we consider sub-sampling the
gradient as a way to further reduce the computational complexity per iteration.
We use approximate matrix multiplication results from randomized numerical
linear algebra (RandNLA) to obtain the proper sampling strategy and we
establish locally R-linear convergence rates. In such a setting, we also show
that a very aggressive sample size increase results in an R-superlinearly
convergent algorithm.
While the sample size depends on the condition number of the problem, our
convergence rates are problem-independent, i.e., they do not depend on
quantities related to the problem. Hence, our analysis here can be used to
complement the results of our basic framework from the companion paper, [38],
by exploring algorithmic trade-offs that are important in practice.
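To complement the full-gradient sketch given after the companion paper's abstract above, here is a variant in which the gradient is sub-sampled as well, with the gradient sample size grown geometrically across iterations. The growth factor and the plain unconstrained, unit-step update are illustrative assumptions about how such a locally convergent scheme can be pushed toward R-superlinear behavior.

```python
import numpy as np

def subsampled_newton_sg(grad_fi, hess_fi, x0, n, s_grad0=100, s_hess=200,
                         growth=2.0, max_iter=20, seed=0):
    """Newton iteration with both gradient and Hessian uniformly sub-sampled.

    grad_fi(i, x), hess_fi(i, x) -- gradient/Hessian of the i-th component.
    The gradient sample size grows geometrically, mimicking the aggressive
    sample-size increase associated with R-superlinear local convergence.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    s_grad = s_grad0
    for _ in range(max_iter):
        ig = rng.choice(n, size=min(int(s_grad), n))
        g = sum(grad_fi(i, x) for i in ig) / len(ig)   # sub-sampled gradient
        ih = rng.choice(n, size=min(s_hess, n))
        H = sum(hess_fi(i, x) for i in ih) / len(ih)   # sub-sampled Hessian
        x = x - np.linalg.solve(H, g)                  # local Newton step
        s_grad *= growth                               # aggressive growth
    return x
```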
Fast Statistical Iterative Reconstruction for MVCT in TomoTherapy
Statistical iterative reconstruction is expected to improve the image quality
of megavoltage computed tomography (MVCT). However, one of the challenges of
iterative reconstruction is its large computational cost. The purpose of this
work is to develop a fast iterative reconstruction algorithm by combining
several iterative techniques and by optimizing reconstruction parameters.
Megavoltage projection data were acquired from a TomoTherapy system and
reconstructed using our statistical iterative reconstruction. Total variation
was used as the regularization term, and its weight was determined by
evaluating the signal-to-noise ratio (SNR), the contrast-to-noise
ratio (CNR), and visual assessment of spatial resolution using Gammex and
Cheese phantoms. Gradient descent with an adaptive convergence parameter,
ordered subset expectation maximization (OSEM), and CPU/GPU parallelization
were applied in order to accelerate the present reconstruction algorithm. The
SNR and CNR of the iterative reconstruction were several times better than those
of filtered back projection (FBP). The GPU parallelization code combined with
the OSEM algorithm reconstructed an image several hundred times faster than a
CPU calculation. With 500 iterations, which provided good convergence, our
method produced a 512×512 pixel image within a few seconds. The image
quality of the present algorithm was much better than that of FBP for patient
data. An image from the iterative reconstruction in TomoTherapy can be obtained
within a few seconds by fine-tuning the parameters. The iterative reconstruction
with GPU was fast enough for clinical use, and largely improves the MVCT images.
Comment: 11 pages, 4 figures
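Ordered-subsets EM accelerates MLEM by updating the image after each angular subset of projections rather than after a full pass over the data, which typically speeds convergence by roughly the number of subsets. A minimal sketch, without the TV regularization, adaptive-step gradient descent, or GPU parallelization the paper combines it with:

```python
import numpy as np

def osem(A, y, n_subsets=8, n_iter=10, eps=1e-12):
    """Ordered Subset Expectation Maximization (OSEM) sketch.

    A -- nonnegative system matrix (rows = projection bins), y -- measured data.
    One MLEM-style multiplicative update is applied per subset, so each full
    iteration sweeps the data n_subsets times faster than plain MLEM.
    """
    m, n = A.shape
    x = np.ones(n)                                    # positive initial image
    subsets = np.array_split(np.arange(m), n_subsets)
    for _ in range(n_iter):
        for rows in subsets:
            As = A[rows]
            sens = As.T @ np.ones(len(rows))           # subset sensitivity
            ratio = y[rows] / np.maximum(As @ x, eps)  # data / forward proj.
            x = x / np.maximum(sens, eps) * (As.T @ ratio)
    return x
```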
A Regularized Semi-Smooth Newton Method With Projection Steps for Composite Convex Programs
The goal of this paper is to study approaches to bridge the gap between
first-order and second-order type methods for composite convex programs. Our
key observations are: i) Many well-known operator splitting methods, such as
forward-backward splitting (FBS) and Douglas-Rachford splitting (DRS), actually
define a fixed-point mapping; ii) The optimal solutions of the composite convex
program and the solutions of a system of nonlinear equations derived from the
fixed-point mapping are equivalent. Solving this kind of system of nonlinear
equations enables us to develop second-order type methods. Although these
nonlinear equations may be non-differentiable, they are often semi-smooth and
their generalized Jacobian matrix is positive semidefinite due to monotonicity.
By combining a regularization approach with a known hyperplane projection
technique, we propose an adaptive semi-smooth Newton method and establish its
convergence to global optimality. Preliminary numerical results on
$\ell_1$-minimization problems demonstrate that our second-order type
algorithms are able to achieve superlinear or quadratic convergence.
Comment: 25 pages, 4 figures
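For the $\ell_1$ case, the forward-backward splitting (FBS) fixed-point residual and its generalized Jacobian have simple closed forms, which makes the regularized semi-smooth Newton step easy to sketch. Below, F(x) = x - shrink(x - t*A^T(Ax - b), t*lam), and the generalized Jacobian of the shrink operator is a 0/1 diagonal matrix; the fixed regularization mu and the omission of the hyperplane-projection globalization are simplifications of the paper's adaptive scheme.

```python
import numpy as np

def semismooth_newton_l1(A, b, lam, t=None, mu=1e-3, n_iter=30):
    """Regularized semi-smooth Newton sketch for min 0.5||Ax-b||^2 + lam||x||_1.

    Solves F(x) = 0 where F(x) = x - shrink(x - t*grad(x), t*lam) is the
    FBS fixed-point residual; its zeros coincide with the minimizers.
    """
    m, n = A.shape
    if t is None:
        t = 1.0 / np.linalg.norm(A, 2) ** 2          # step size 1/L
    shrink = lambda z, k: np.sign(z) * np.maximum(np.abs(z) - k, 0.0)
    x = np.zeros(n)
    for _ in range(n_iter):
        z = x - t * (A.T @ (A @ x - b))
        F = x - shrink(z, t * lam)                   # FBS residual
        # Generalized Jacobian: J = I - D (I - t*A^T A), D diagonal with
        # D_ii = 1 exactly where the shrink operator is locally the identity.
        d = (np.abs(z) > t * lam).astype(float)
        J = np.eye(n) - d[:, None] * (np.eye(n) - t * (A.T @ A))
        # Regularized semi-smooth Newton step (mu*I guards a singular J).
        x = x + np.linalg.solve(J + mu * np.eye(n), -F)
    return x
```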