29,804 research outputs found

    A Subsampling Line-Search Method with Second-Order Results

    Full text link
    In many contemporary optimization problems such as those arising in machine learning, it can be computationally challenging or even infeasible to evaluate an entire function or its derivatives. This motivates the use of stochastic algorithms that sample problem data, which can jeopardize the guarantees obtained through classical globalization techniques in optimization such as a trust region or a line search. Using subsampled function values is particularly challenging for the latter strategy, which relies upon multiple evaluations. On top of that all, there has been an increasing interest for nonconvex formulations of data-related problems, such as training deep learning models. For such instances, one aims at developing methods that converge to second-order stationary points quickly, i.e., escape saddle points efficiently. This is particularly delicate to ensure when one only accesses subsampled approximations of the objective and its derivatives. In this paper, we describe a stochastic algorithm based on negative curvature and Newton-type directions that are computed for a subsampling model of the objective. A line-search technique is used to enforce suitable decrease for this model, and for a sufficiently large sample, a similar amount of reduction holds for the true objective. By using probabilistic reasoning, we can then obtain worst-case complexity guarantees for our framework, leading us to discuss appropriate notions of stationarity in a subsampling context. Our analysis encompasses the deterministic regime, and allows us to identify sampling requirements for second-order line-search paradigms. As we illustrate through real data experiments, these worst-case estimates need not be satisfied for our method to be competitive with first-order strategies in practice

    Approximating Pareto frontier using a hybrid line search approach

    Get PDF
    This is the post-print version of the final paper published in Information Sciences. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright @ 2010 Elsevier B.V.The aggregation of objectives in multiple criteria programming is one of the simplest and widely used approach. But it is well known that this technique sometimes fail in different aspects for determining the Pareto frontier. This paper proposes a new approach for multicriteria optimization, which aggregates the objective functions and uses a line search method in order to locate an approximate efficient point. Once the first Pareto solution is obtained, a simplified version of the former one is used in the context of Pareto dominance to obtain a set of efficient points, which will assure a thorough distribution of solutions on the Pareto frontier. In the current form, the proposed technique is well suitable for problems having multiple objectives (it is not limited to bi-objective problems) and require the functions to be continuous twice differentiable. In order to assess the effectiveness of this approach, some experiments were performed and compared with two recent well known population-based metaheuristics namely ParEGO and NSGA II. When compared to ParEGO and NSGA II, the proposed approach not only assures a better convergence to the Pareto frontier but also illustrates a good distribution of solutions. From a computational point of view, both stages of the line search converge within a short time (average about 150 ms for the first stage and about 20 ms for the second stage). Apart from this, the proposed technique is very simple, easy to implement and use to solve multiobjective problems.CNCSIS IDEI 2412, Romani

    Variable metric inexact line-search based methods for nonsmooth optimization

    Get PDF
    We develop a new proximal-gradient method for minimizing the sum of a differentiable, possibly nonconvex, function plus a convex, possibly non differentiable, function. The key features of the proposed method are the definition of a suitable descent direction, based on the proximal operator associated to the convex part of the objective function, and an Armijo-like rule to determine the step size along this direction ensuring the sufficient decrease of the objective function. In this frame, we especially address the possibility of adopting a metric which may change at each iteration and an inexact computation of the proximal point defining the descent direction. For the more general nonconvex case, we prove that all limit points of the iterates sequence are stationary, while for convex objective functions we prove the convergence of the whole sequence to a minimizer, under the assumption that a minimizer exists. In the latter case, assuming also that the gradient of the smooth part of the objective function is Lipschitz, we also give a convergence rate estimate, showing the O(1/k) complexity with respect to the function values. We also discuss verifiable sufficient conditions for the inexact proximal point and we present the results of a numerical experience on a convex total variation based image restoration problem, showing that the proposed approach is competitive with another state-of-the-art method

    New modification of the hestenes-stiefel with strong wolfe line search

    Get PDF
    . The method of the nonlinear conjugate gradient is widely used in solving large-scale unconstrained optimization since been proven in solving optimization problems without using large memory storage. In this paper, we proposed a new modification of the Hestenes-Stiefel conjugate gradient parameter that fulfils the condition of sufficient descent using a strong Wolfe-Powell line search. Besides, the conjugate gradient method with the proposed conjugate gradient also guarantees low computation of iteration and CPU time by comparing with other classical conjugate gradient parameters. Numerical results have shown that the conjugate gradient method with the proposed conjugate gradient parameter performed better than the conjugate gradient method with other classical conjugate gradient parameters

    Successive Convex Approximation Algorithms for Sparse Signal Estimation with Nonconvex Regularizations

    Full text link
    In this paper, we propose a successive convex approximation framework for sparse optimization where the nonsmooth regularization function in the objective function is nonconvex and it can be written as the difference of two convex functions. The proposed framework is based on a nontrivial combination of the majorization-minimization framework and the successive convex approximation framework proposed in literature for a convex regularization function. The proposed framework has several attractive features, namely, i) flexibility, as different choices of the approximate function lead to different type of algorithms; ii) fast convergence, as the problem structure can be better exploited by a proper choice of the approximate function and the stepsize is calculated by the line search; iii) low complexity, as the approximate function is convex and the line search scheme is carried out over a differentiable function; iv) guaranteed convergence to a stationary point. We demonstrate these features by two example applications in subspace learning, namely, the network anomaly detection problem and the sparse subspace clustering problem. Customizing the proposed framework by adopting the best-response type approximation, we obtain soft-thresholding with exact line search algorithms for which all elements of the unknown parameter are updated in parallel according to closed-form expressions. The attractive features of the proposed algorithms are illustrated numerically.Comment: submitted to IEEE Journal of Selected Topics in Signal Processing, special issue in Robust Subspace Learnin
    corecore