48 research outputs found

    Nonmonotone Barzilai-Borwein Gradient Algorithm for ℓ1-Regularized Nonsmooth Minimization in Compressive Sensing

    This paper is devoted to minimizing the sum of a smooth function and a nonsmooth ℓ1-regularized term. This problem includes, as special cases, the ℓ1-regularized convex minimization problems arising in signal processing, compressive sensing, machine learning, data mining, etc. However, the non-differentiability of the ℓ1-norm makes these problems more challenging, especially the large-scale instances encountered in many practical applications. This paper proposes, analyzes, and tests a Barzilai-Borwein gradient algorithm. At each iteration, the generated search direction enjoys the descent property and can be easily derived by minimizing a local approximate quadratic model while exploiting the favorable structure of the ℓ1-norm. Moreover, a nonmonotone line search technique is incorporated to find a suitable stepsize along this direction. The algorithm is easy to implement, requiring only the objective function value and the gradient of the smooth term at each iteration. Under some conditions, the proposed algorithm is shown to be globally convergent. Limited experiments with nonconvex unconstrained problems from the CUTEr library with additive ℓ1-regularization illustrate that the proposed algorithm performs quite well. Extensive experiments on ℓ1-regularized least squares problems in compressive sensing verify that our algorithm compares favorably with several state-of-the-art algorithms designed in recent years.
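    The combination described above (a Barzilai-Borwein stepsize, soft-thresholding against the ℓ1 term, and a nonmonotone line search) can be sketched as follows. This is an illustrative reconstruction under standard choices (BB1 stepsize, max-type nonmonotone backtracking), not the paper's exact algorithm:

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal map of t*||.||_1: shrink each entry toward zero by t.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def bb_prox_grad(A, b, lam, iters=300, mem=5):
    # Minimize 0.5*||Ax - b||^2 + lam*||x||_1 with a Barzilai-Borwein
    # stepsize, safeguarded by a max-type nonmonotone backtracking search.
    f = lambda v: 0.5 * np.sum((A @ v - b) ** 2) + lam * np.abs(v).sum()
    x = np.zeros(A.shape[1])
    g = A.T @ (A @ x - b)                     # gradient of the smooth term
    alpha, hist = 1.0, [f(x)]
    for _ in range(iters):
        t, fref = alpha, max(hist[-mem:])     # nonmonotone reference value
        while True:
            x_new = soft_threshold(x - t * g, t * lam)
            if f(x_new) <= fref - 1e-4 * np.sum((x_new - x) ** 2) / t or t < 1e-12:
                break
            t *= 0.5                          # backtrack until acceptable
        g_new = A.T @ (A @ x_new - b)
        s, y = x_new - x, g_new - g
        alpha = float(np.clip(s @ s / (s @ y), 1e-8, 1e8)) if s @ y > 0 else 1.0
        x, g = x_new, g_new
        hist.append(f(x))
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))            # underdetermined sensing matrix
x_true = np.zeros(100)
x_true[:5] = 1.0                              # sparse ground truth
b = A @ x_true
x_hat = bb_prox_grad(A, b, lam=0.1)
```

    The nonmonotone reference (the maximum of the last few objective values) lets the BB stepsize be accepted most of the time, which is what gives these methods their practical speed.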

    SPECTRAL PROJECTED GRADIENT METHOD WITH INEXACT RESTORATION FOR MINIMIZATION WITH NONCONVEX CONSTRAINTS

    This work takes advantage of the spectral projected gradient direction within the inexact restoration framework to address nonlinear optimization problems with nonconvex constraints. The proposed strategy includes a convenient handling of the constraints, together with nonmonotone features to speed up convergence. The numerical performance is assessed by experiments with hard-spheres problems, pointing out that the inexact restoration framework provides an adequate environment for the extension of the spectral projected gradient method to general nonlinearly constrained optimization. Supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) and Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP).
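    The classical spectral projected gradient iteration that this work extends can be sketched on a convex set with an easy projection (the paper's handling of nonconvex constraints via inexact restoration is more involved; this shows only the underlying SPG building block, with a max-type nonmonotone line search):

```python
import numpy as np

def spg(f, grad, proj, x0, iters=100, mem=10):
    # Spectral projected gradient on a set with an easy projection `proj`,
    # with a max-type nonmonotone Armijo search over the last `mem` values.
    x = proj(np.asarray(x0, dtype=float))
    g = grad(x)
    alpha, hist = 1.0, [f(x)]
    for _ in range(iters):
        d = proj(x - alpha * g) - x           # spectral projected direction
        step, fref = 1.0, max(hist[-mem:])
        while f(x + step * d) > fref + 1e-4 * step * (g @ d) and step > 1e-10:
            step *= 0.5                       # backtrack along d
        x_new = x + step * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        alpha = (s @ s) / (s @ y) if s @ y > 1e-12 else 1.0  # spectral stepsize
        x, g = x_new, g_new
        hist.append(f(x))
    return x

# Example: minimize ||x - c||^2 over the box [0, 1]^3.
c = np.array([1.5, -0.3, 0.4])
x_star = spg(lambda x: np.sum((x - c) ** 2),
             lambda x: 2.0 * (x - c),
             lambda z: np.clip(z, 0.0, 1.0),
             np.zeros(3))
```

    For this convex example the solution is the projection of c onto the box; the point of the method is that only cheap projections and function/gradient evaluations are needed.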

    A Feasible Method for Optimization with Orthogonality Constraints

    Minimization with orthogonality constraints (e.g., X'X = I) and/or spherical constraints (e.g., ||x||_2 = 1) has wide applications in polynomial optimization, combinatorial optimization, eigenvalue problems, sparse PCA, p-harmonic flows, 1-bit compressive sensing, matrix rank minimization, etc. These problems are difficult because the constraints are not only non-convex but also numerically expensive to preserve during iterations. To deal with these difficulties, we propose a Crank-Nicolson-like update scheme to preserve the constraints and, based on it, develop curvilinear search algorithms with lower per-iteration cost than those based on projections and geodesics. The efficiency of the proposed algorithms is demonstrated on a variety of test problems. In particular, for the maxcut problem, they exactly solve a decomposition formulation for the SDP relaxation. For polynomial optimization, nearest correlation matrix estimation, and extreme eigenvalue problems, the proposed algorithms run very fast and return solutions no worse than those from state-of-the-art algorithms. For the quadratic assignment problem, a gap of 0.842% to the best known solution on the largest problem "256c" in QAPLIB can be reached in 5 minutes on a typical laptop.
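    The Crank-Nicolson-like update mentioned above is, in its common formulation, a Cayley transform of a skew-symmetric matrix built from the gradient; a minimal sketch, assuming the usual choice A = GXᵀ - XGᵀ for that skew-symmetric term:

```python
import numpy as np

def cayley_step(X, G, tau):
    # Feasibility-preserving curvilinear step for min f(X) s.t. X'X = I.
    # With the skew-symmetric matrix A = G X' - X G', the Cayley update
    #   X(tau) = inv(I + tau/2 A) (I - tau/2 A) X
    # satisfies X(tau)' X(tau) = I for every tau, because a Cayley
    # transform of a skew-symmetric matrix is orthogonal.
    n = X.shape[0]
    A = G @ X.T - X @ G.T
    I = np.eye(n)
    return np.linalg.solve(I + (tau / 2.0) * A, (I - (tau / 2.0) * A) @ X)

rng = np.random.default_rng(1)
X, _ = np.linalg.qr(rng.standard_normal((5, 3)))   # feasible start: X'X = I
G = rng.standard_normal((5, 3))                    # stand-in for a gradient
Y = cayley_step(X, G, tau=0.1)
```

    A curvilinear search then varies tau along this curve instead of projecting back onto the constraint set, which is what keeps the per-iteration cost low.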

    New bundle methods and U-Lagrangian for generic nonsmooth optimization

    Nonsmooth optimization consists of minimizing a continuous function by systematically choosing iterative points from the feasible set via the computation of function values and generalized gradients (called subgradients). Broadly speaking, this thesis contains two research themes: nonsmooth optimization algorithms and theories about the substructure of special nonsmooth functions. In terms of algorithms, we develop new bundle methods and bundle trust region methods for generic nonsmooth optimization. On the theoretical side, we generalize the notion of the U-Lagrangian and investigate its connections with some subsmooth structures. This PhD project develops trust region methods for generic nonsmooth optimization. It assumes the functions are Lipschitz continuous and the optimization problem is not necessarily convex. Currently the project also assumes the objective function is prox-regular but that no structural information is given. Trust region methods create a local model of the problem in a neighborhood of the iteration point (called the `trust region'). They minimize the model over the trust region and take the minimizer as a trial point for the next iteration. If the model is an appropriate approximation of the objective function, then the trial point is expected to produce a reduction in function value. The model problem is usually easy to solve, so by comparing the reduction in the model's value with that of the real problem, trust region methods adjust the radius of the trust region and continue to obtain reduction by solving model problems. At the end of this project, it is clear that (1) it is possible to develop a pure bundle method with linear subproblems and without a trust region update for convex optimization problems; such a method converges to minimizers if it generates an infinite sequence of serious steps; otherwise, it can be shown that the method generates a sequence of minor updates and the last serious step is a minimizer.
First, this PhD project develops a bundle trust region algorithm with a linear model and a linear subproblem for minimizing a prox-regular and Lipschitz function. It adopts a convexification technique from the redistributed bundle method. Global convergence of the algorithm is established in the sense that the sequence of iterates converges to the fixed point of the proximal-point mapping, provided that convexification is successful. Preliminary numerical tests on standard academic nonsmooth problems show that the algorithm is comparable to bundle methods with quadratic subproblems. Second, following the philosophy behind bundle methods of making full use of the previous information of the iteration process and obtaining a flexible understanding of the function structure, the project revises the algorithm developed in the first part by applying the nonmonotone trust region method. We study the performance of the numerical implementation and successively refine the algorithm in an effort to improve its practical performance. These revisions include allowing the convexification parameter to decrease and the algorithm to restart after a finite process determined by various heuristics. The second theme of this project concerns the theory of nonsmooth analysis, focusing on the U-Lagrangian. When restricted to a subspace, a nonsmooth function can be differentiable within this subspace. It is known that for a nonsmooth convex function, at a point, the Euclidean space can be decomposed into two subspaces: U, over which a special Lagrangian (called the U-Lagrangian) can be defined and has nice smoothness properties, and V, the orthogonal complement of U. In this thesis we generalize the definition of the UV-decomposition and the U-Lagrangian to the context of nonconvex functions, specifically that of a prox-regular function. Similar work in the literature includes a quadratic sub-Lagrangian. It is our interest to study the feasibility of a linear localized U-Lagrangian.
We also study the connections between the new U-Lagrangian and other subsmooth structures, including fast tracks and partly smooth functions. This part of the project tries to answer the following questions: (1) based on a generalized UV-decomposition, can we develop a linear U-Lagrangian of a prox-regular function that maintains prox-regularity? (2) through the new U-Lagrangian, can we show that partial smoothness and fast tracks are equivalent under prox-regularity? At the end of this project, it is clear that for a function f that is properly prox-regular at a point x*, a new linear localized U-Lagrangian can be defined and its value at 0 coincides with f(x*); under some conditions, it can be proved that the U-Lagrangian is also prox-regular at 0; moreover, partial smoothness and fast tracks are equivalent under prox-regularity and other mild conditions.
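The bundle machinery discussed above builds a piecewise-linear model of the objective from accumulated subgradients. As a self-contained illustration of that idea (Kelley's classical cutting-plane method in one dimension, not the thesis's trust region algorithm):

```python
def cutting_plane(f, subgrad, a, b, iters=30):
    # Kelley's cutting-plane method for a convex f on [a, b]: keep a bundle
    # of affine lower models l_i(x) = f(x_i) + g_i*(x - x_i) and repeatedly
    # minimize their pointwise maximum over [a, b].
    cuts = []                      # (slope g, intercept c): l(x) = g*x + c
    x = (a + b) / 2.0
    for _ in range(iters):
        fx, g = f(x), subgrad(x)
        cuts.append((g, fx - g * x))
        # The piecewise-linear model attains its minimum at an interval end
        # or at an intersection of two cuts; enumerate those candidates.
        cand = [a, b]
        for i in range(len(cuts)):
            for j in range(i + 1, len(cuts)):
                g1, c1 = cuts[i]
                g2, c2 = cuts[j]
                if g1 != g2:
                    z = (c2 - c1) / (g1 - g2)
                    if a <= z <= b:
                        cand.append(z)
        x = min(cand, key=lambda z: max(g_ * z + c_ for g_, c_ in cuts))
    return x

# f(x) = |x - 0.3| is convex and nonsmooth exactly at its minimizer.
x_star = cutting_plane(lambda x: abs(x - 0.3),
                       lambda x: 1.0 if x >= 0.3 else -1.0,
                       -1.0, 2.0)
```

Practical bundle methods add a proximal or trust region term to this model precisely because pure cutting planes can oscillate badly; the thesis's algorithms replace the quadratic stabilizer with linear subproblems.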

    Modifications of Newton-type methods for solving semi-smooth stochastic optimization problems

    In numerous optimization problems originating from real-world and scientific applications, we often face nonsmoothness. A large number of problems belong to this class, from models of natural phenomena that exhibit sudden changes and shape optimization, to hinge loss functions in machine learning and deep neural networks. In practice, solving a nonsmooth convex problem tends to be more challenging and costly than solving a smooth one. The aim of this thesis is the formulation and theoretical analysis of Newton-type algorithms for solving nonsmooth convex stochastic optimization problems. Optimization problems whose objective function is given in the form of a mathematical expectation, without a differentiability assumption on the function, are considered. The Sample Average Approximation (SAA) is used to estimate the objective function. As the accuracy of the SAA objective function and its derivatives is naturally proportional to the computational cost – higher precision in general implies larger cost – it is important to design an efficient balance between accuracy and cost. Therefore, the main focus of this thesis is the development of adaptive sample size control algorithms in a nonsmooth environment, with particular attention given to the control of accuracy and the selection of search directions. Several options are investigated for the search direction, while the accuracy control uses cheaper objective function approximations (with looser accuracy) during the initial stages of the process, reserving high-accuracy approximations for the final stages of the optimization. A detailed description of the proposed methods is presented in Chapters 5 and 6.
The theoretical properties of the numerical procedures are also analyzed: their convergence is proved and the complexity of the developed methods is studied. In addition to the theoretical framework, a successful practical implementation of the given algorithms is presented, and the proposed methods are shown to be more efficient in practical application than existing methods from the literature. Chapter 1 of this thesis provides the background information needed for the subsequent chapters. Chapter 2 covers the fundamentals of nonlinear optimization, with particular emphasis on line search techniques. Chapter 3 shifts the focus to the nonsmooth framework and reviews the existing knowledge and established results in the field. The remaining chapters, starting from Chapter 4, where the central problem of the thesis (the minimization of an expected-value function) is introduced, represent the author's original contribution.
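The adaptive sample-size idea described in this abstract (cheap, low-accuracy SAA approximations early, accurate ones only late) can be sketched on a toy problem. The doubling schedule and the subgradient-descent inner solver below are illustrative assumptions, not the thesis's methods:

```python
import numpy as np

def adaptive_saa(draw_sample, loss_subgrad, x0, stages=6, n0=8, steps=40, lr=0.1):
    # Minimize E[loss(x, xi)] by running subgradient descent on SAA models
    # whose sample size doubles each stage: cheap, low-accuracy estimates
    # early, accurate ones only in the final stages.
    x, n = float(x0), n0
    for _ in range(stages):
        sample = draw_sample(n)               # SAA sample fixed for this stage
        for _ in range(steps):
            g = np.mean([loss_subgrad(x, xi) for xi in sample])
            x -= lr * g
        n *= 2                                # refine the approximation
    return x

# Toy problem: minimize E[|x - xi|] with xi ~ N(1, 1); the minimizer is the
# median of xi, i.e. 1.0.  A subgradient of |x - xi| in x is sign(x - xi).
rng = np.random.default_rng(2)
x_hat = adaptive_saa(lambda n: rng.standard_normal(n) + 1.0,
                     lambda x, xi: np.sign(x - xi),
                     x0=0.0)
```

Early stages use only a handful of samples per subgradient evaluation, so most of the total sampling cost is spent near the solution, which is the efficiency argument made in the abstract.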