A Subsampling Line-Search Method with Second-Order Results
In many contemporary optimization problems such as those arising in machine
learning, it can be computationally challenging or even infeasible to evaluate
an entire function or its derivatives. This motivates the use of stochastic
algorithms that sample problem data, which can jeopardize the guarantees
obtained through classical globalization techniques in optimization such as a
trust region or a line search. Using subsampled function values is particularly
challenging for the latter strategy, which relies upon multiple evaluations. On
top of all that, there has been increasing interest in nonconvex
formulations of data-related problems, such as training deep learning models.
For such instances, one aims at developing methods that converge to
second-order stationary points quickly, i.e., escape saddle points efficiently.
This is particularly delicate to ensure when one only accesses subsampled
approximations of the objective and its derivatives.
In this paper, we describe a stochastic algorithm based on negative curvature
and Newton-type directions that are computed for a subsampling model of the
objective. A line-search technique is used to enforce suitable decrease for
this model, and for a sufficiently large sample, a similar amount of reduction
holds for the true objective. By using probabilistic reasoning, we can then
obtain worst-case complexity guarantees for our framework, leading us to
discuss appropriate notions of stationarity in a subsampling context. Our
analysis encompasses the deterministic regime, and allows us to identify
sampling requirements for second-order line-search paradigms. As we illustrate
through real data experiments, these worst-case estimates need not be satisfied
for our method to be competitive with first-order strategies in practice.
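The step described in this abstract (a negative curvature or Newton-type direction, followed by a line search on a subsampled model) can be sketched as below. This is only an illustrative sketch, not the authors' algorithm: the function names, the tolerance `eps`, the regularization `2 * eps`, and the cubic decrease test with constants `eta` and `theta` are all assumptions made for the example. The callables `model`, `grad`, and `hess` are assumed to evaluate the objective, gradient, and Hessian on a fixed subsample.

```python
import numpy as np

def second_order_direction(g, H, eps=1e-6):
    """Pick a negative-curvature direction if one exists, else a Newton-type one."""
    eigvals, eigvecs = np.linalg.eigh(H)      # eigenvalues in ascending order
    lam_min, v = eigvals[0], eigvecs[:, 0]
    if lam_min < -eps:
        # Orient the eigenvector so that it is not an ascent direction.
        if g @ v > 0:
            v = -v
        return abs(lam_min) * v
    # Otherwise H + 2*eps*I is positive definite, so this is a descent direction.
    return -np.linalg.solve(H + 2.0 * eps * np.eye(len(g)), g)

def backtracking_step(x, model, grad, hess, eta=0.1, theta=0.5, max_backtracks=40):
    """One backtracking step enforcing a cubic decrease on the subsampled model."""
    d = second_order_direction(grad(x), hess(x))
    m0 = model(x)
    dnorm3 = np.linalg.norm(d) ** 3
    alpha = 1.0
    for _ in range(max_backtracks):
        if model(x + alpha * d) <= m0 - (eta / 6.0) * alpha**3 * dnorm3:
            return x + alpha * d, alpha
        alpha *= theta
    return x, 0.0                              # no acceptable step found
```

At a saddle point of the model the gradient vanishes, so a first-order step would stall there, while the negative curvature branch still produces a step that decreases the model.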
Approximating Pareto frontier using a hybrid line search approach
This is the post-print version of the final paper published in Information Sciences. Copyright © 2010 Elsevier B.V.
The aggregation of objectives in multiple criteria programming is one of the simplest and most widely used approaches. However, it is well known that this technique sometimes fails in several respects when determining the Pareto frontier. This paper proposes a new approach for multicriteria optimization, which aggregates the objective functions and uses a line search method in order to locate an approximate efficient point. Once the first Pareto solution is obtained, a simplified version of that line search is used in the context of Pareto dominance to obtain a set of efficient points, which assures a thorough distribution of solutions on the Pareto frontier. In its current form, the proposed technique is well suited to problems having multiple objectives (it is not limited to bi-objective problems) and requires the functions to be continuous and twice differentiable. In order to assess the effectiveness of this approach, some experiments were performed and compared with two recent, well-known population-based metaheuristics, namely ParEGO and NSGA II. When compared to ParEGO and NSGA II, the proposed approach not only assures better convergence to the Pareto frontier but also shows a good distribution of solutions. From a computational point of view, both stages of the line search converge within a short time (on average about 150 ms for the first stage and about 20 ms for the second stage). Apart from this, the proposed technique is very simple and easy to implement and use for solving multiobjective problems.
CNCSIS IDEI 2412, Romani
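The notion of Pareto dominance that this abstract relies on can be illustrated with a short sketch. It assumes all objectives are minimized. The function names are hypothetical, and the code shows only the dominance test and a filter for non-dominated points, not the paper's line search.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better

def pareto_filter(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, among the bi-objective values (1, 4), (2, 2), (3, 3), and (4, 1), only (3, 3) is dominated (by (2, 2)), so the other three form the approximate frontier.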
Variable metric inexact line-search based methods for nonsmooth optimization
We develop a new proximal-gradient method for minimizing the sum of a
differentiable, possibly nonconvex, function plus a convex, possibly
nondifferentiable, function. The key features of the proposed method are the
definition of a suitable descent direction, based on the proximal operator
associated with the convex part of the objective function, and an Armijo-like
rule to determine the step size along this direction ensuring sufficient
decrease of the objective function. In this framework, we especially address the
possibility of adopting a metric which may change at each iteration and an
inexact computation of the proximal point defining the descent direction. For
the more general nonconvex case, we prove that all limit points of the iterates
sequence are stationary, while for convex objective functions we prove the
convergence of the whole sequence to a minimizer, under the assumption that a
minimizer exists. In the latter case, assuming also that the gradient of the
smooth part of the objective function is Lipschitz, we also give a convergence
rate estimate, showing the O(1/k) complexity with respect to the function
values. We also discuss verifiable sufficient conditions for the inexact
proximal point and we present the results of a numerical experiment on a convex
total-variation-based image restoration problem, showing that the proposed
approach is competitive with another state-of-the-art method.
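A minimal sketch of a proximal-gradient iteration with an Armijo-like backtracking rule is given below for the l1-regularized least-squares problem. It makes simplifying assumptions that the abstract does not: a fixed identity metric, an exact proximal point (soft-thresholding), and illustrative parameter names and default values (`alpha`, `beta`, `delta`).

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_grad_armijo(A, b, mu, x0, alpha=1.0, beta=1e-4, delta=0.5, iters=200):
    """Minimize 0.5*||Ax - b||^2 + mu*||x||_1 with an Armijo-like line search."""
    f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)   # smooth part
    g = lambda x: mu * np.sum(np.abs(x))           # convex, nonsmooth part
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        y = soft_threshold(x - alpha * grad, alpha * mu)   # proximal point
        d = y - x                                          # descent direction
        # Predicted decrease; nonpositive, and zero only at a stationary point.
        pred = grad @ d + g(y) - g(x)
        if pred > -1e-14:
            break
        lam, Fx = 1.0, f(x) + g(x)
        for _ in range(50):                                # bounded backtracking
            if f(x + lam * d) + g(x + lam * d) <= Fx + beta * lam * pred:
                break
            lam *= delta
        x = x + lam * d
    return x
```

With `A` equal to the identity, the minimizer is the soft-thresholded data, which gives a simple sanity check: `prox_grad_armijo(np.eye(3), np.array([3.0, 0.5, -2.0]), 1.0, np.zeros(3))` should return approximately `[2, 0, -1]`.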
New modification of the Hestenes-Stiefel with strong Wolfe line search
The nonlinear conjugate gradient method is widely used for solving large-scale unconstrained optimization problems, since it has been proven to solve such problems without requiring large memory storage. In this paper, we propose a new modification of the Hestenes-Stiefel conjugate gradient parameter that fulfils the sufficient descent condition under a strong Wolfe-Powell line search. In addition, the conjugate gradient method with the proposed parameter requires fewer iterations and less CPU time in comparison with other classical conjugate gradient parameters. Numerical results show that the conjugate gradient method with the proposed parameter performs better than the conjugate gradient method with other classical conjugate gradient parameters.
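For reference, the classical Hestenes-Stiefel parameter is beta = g_new^T y / (d_old^T y) with y = g_new - g_old. The sketch below uses that classical formula, not the modification proposed in the paper, which the abstract does not specify. To stay self-contained it also replaces the strong Wolfe search with an exact line search on a strictly convex quadratic; an exact minimizer along the direction gives zero slope there, so it also satisfies the strong Wolfe conditions for the usual parameter ranges. All function names are hypothetical.

```python
import numpy as np

def hs_beta(g_new, g_old, d_old):
    """Classical Hestenes-Stiefel conjugate gradient parameter."""
    y = g_new - g_old
    denom = d_old @ y
    return (g_new @ y) / denom if abs(denom) > 1e-16 else 0.0

def cg_hs_quadratic(Q, c, x0, tol=1e-10, max_iter=100):
    """Minimize 0.5*x^T Q x - c^T x (Q symmetric positive definite)."""
    x = np.asarray(x0, dtype=float).copy()
    g = Q @ x - c                 # gradient of the quadratic
    d = -g                        # first direction: steepest descent
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:
            return x, k
        alpha = -(g @ d) / (d @ Q @ d)    # exact minimizer along d
        x = x + alpha * d
        g_new = Q @ x - c
        d = -g_new + hs_beta(g_new, g, d) * d
        g = g_new
    return x, max_iter
```

On a general nonlinear objective, the exact step would be replaced by a line search routine that enforces the strong Wolfe conditions.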
Successive Convex Approximation Algorithms for Sparse Signal Estimation with Nonconvex Regularizations
In this paper, we propose a successive convex approximation framework for
sparse optimization where the nonsmooth regularization function in the
objective function is nonconvex and it can be written as the difference of two
convex functions. The proposed framework is based on a nontrivial combination
of the majorization-minimization framework and the successive convex
approximation framework proposed in the literature for a convex regularization
function. The proposed framework has several attractive features, namely, i)
flexibility, as different choices of the approximate function lead to different
types of algorithms; ii) fast convergence, as the problem structure can be
better exploited by a proper choice of the approximate function and the
stepsize is calculated by the line search; iii) low complexity, as the
approximate function is convex and the line search scheme is carried out over a
differentiable function; iv) guaranteed convergence to a stationary point. We
demonstrate these features by two example applications in subspace learning,
namely, the network anomaly detection problem and the sparse subspace
clustering problem. Customizing the proposed framework by adopting the
best-response type approximation, we obtain soft-thresholding with exact line
search algorithms for which all elements of the unknown parameter are updated
in parallel according to closed-form expressions. The attractive features of
the proposed algorithms are illustrated numerically.
Comment: submitted to IEEE Journal of Selected Topics in Signal Processing,
special issue in Robust Subspace Learnin
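The idea of soft-thresholding with an exact line search, where all coordinates are updated in parallel by closed-form expressions, can be sketched for the simpler convex l1-regularized least-squares problem. This is an assumption-laden illustration, not the nonconvex framework of the abstract: it omits the difference-of-convex regularizer, and the function name and arguments are hypothetical.

```python
import numpy as np

def soft_threshold_exact_ls(A, b, mu, x0, iters=100, tol=1e-12):
    """Parallel best-response updates with a closed-form step size for
    0.5*||Ax - b||^2 + mu*||x||_1 (columns of A assumed nonzero)."""
    x = np.asarray(x0, dtype=float).copy()
    diag = np.sum(A * A, axis=0)              # squared column norms of A
    for _ in range(iters):
        r = A @ x - b
        grad = A.T @ r
        # Best response: every coordinate minimizes its own subproblem.
        z = diag * x - grad
        bx = np.sign(z) * np.maximum(np.abs(z) - mu, 0.0) / diag
        step = bx - x
        a_step = A @ step
        denom = a_step @ a_step
        if denom <= tol:
            break
        # Exact minimizer over [0, 1] of a differentiable upper bound:
        # the l1 term is replaced by its linear interpolation between x and bx.
        l1_change = mu * (np.sum(np.abs(bx)) - np.sum(np.abs(x)))
        gamma = np.clip(-(r @ a_step + l1_change) / denom, 0.0, 1.0)
        x = x + gamma * step
    return x
```

Because the step size has a closed form, no repeated objective evaluations are needed, which is the low-complexity property the abstract points to.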
