
    A successive difference-of-convex approximation method for a class of nonconvex nonsmooth optimization problems

    We consider a class of nonconvex nonsmooth optimization problems whose objective is the sum of a smooth function and a finite number of nonnegative proper closed possibly nonsmooth functions (whose proximal mappings are easy to compute), some of which are further composed with linear maps. Problems of this kind arise naturally in various applications when different regularizers are introduced to induce simultaneous structures in the solutions. Solving these problems, however, can be challenging because of the coupled nonsmooth functions: the corresponding proximal mapping can be hard to compute, so standard first-order methods such as the proximal gradient algorithm cannot be applied efficiently. In this paper, we propose a successive difference-of-convex approximation method for solving problems of this kind. In this algorithm, we approximate the nonsmooth functions by their Moreau envelopes in each iteration. Making use of the simple observation that Moreau envelopes of nonnegative proper closed functions are continuous difference-of-convex functions, we can then approximately minimize the approximation function by first-order methods with suitable majorization techniques. These first-order methods can be implemented efficiently thanks to the fact that the proximal mapping of each nonsmooth function is easy to compute. Under suitable assumptions, we prove that the sequence generated by our method is bounded and any accumulation point is a stationary point of the objective. We also discuss how our method can be applied to concrete applications such as nonconvex fused regularized optimization problems and simultaneously structured matrix optimization problems, and illustrate the performance numerically for these two specific applications.
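
The difference-of-convex observation mentioned in the abstract above can be written out in one line. The notation here ($e_\lambda f$ for the Moreau envelope, $\lambda > 0$ for its parameter) is an assumption for illustration and may differ from the paper's:

```latex
\begin{aligned}
e_\lambda f(x)
  &= \inf_{y}\Big\{ f(y) + \tfrac{1}{2\lambda}\|x-y\|^2 \Big\} \\
  &= \underbrace{\tfrac{1}{2\lambda}\|x\|^2}_{\text{convex}}
     \;-\;
     \underbrace{\sup_{y}\Big\{ \tfrac{1}{\lambda}\langle x, y\rangle
       - \tfrac{1}{2\lambda}\|y\|^2 - f(y) \Big\}}_{\text{convex: supremum of affine functions of } x}
\end{aligned}
```

Because $f$ is nonnegative and proper, the infimum is finite for every $x$, so both terms are finite-valued convex functions and their difference is well defined everywhere. No convexity of $f$ itself is needed.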

    Convergence Properties of Monotone and Nonmonotone Proximal Gradient Methods Revisited

    Composite optimization problems, where the sum of a smooth and a merely lower semicontinuous function has to be minimized, are often tackled numerically by means of proximal gradient methods whenever the lower semicontinuous part of the objective function has a simple enough structure. The available convergence theory associated with these methods (mostly) requires the derivative of the smooth part of the objective function to be (globally) Lipschitz continuous, and this might be a restrictive assumption in some practically relevant scenarios. In this paper, we readdress this classical topic and provide convergence results for the classical (monotone) proximal gradient method and one of its nonmonotone extensions which are applicable in the absence of (strong) Lipschitz assumptions. This is possible since, at the price of forgoing convergence rates, we omit the use of descent-type lemmas in our analysis. Comment: 23 pages
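
For readers unfamiliar with the method this abstract revisits, the following is a minimal textbook sketch of a monotone proximal gradient iteration with backtracking. It is not the paper's algorithm: the choice of an l1 term as the nonsmooth part, the acceptance test, and all names (`soft_threshold`, `proximal_gradient`, `lam`, `t0`, `shrink`) are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal mapping of tau * ||.||_1 (illustrative choice of nonsmooth term)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_gradient(f, grad_f, x0, lam, t0=1.0, shrink=0.5, tol=1e-8, max_iter=1000):
    """Sketch: minimize f(x) + lam * ||x||_1 with a backtracked stepsize.

    The stepsize is found by trial, so no global Lipschitz constant
    of grad_f has to be supplied by the caller.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        t = t0
        while True:
            z = soft_threshold(x - t * g, t * lam)   # proximal gradient trial point
            d = z - x
            # Standard quadratic upper-bound test on the smooth part.
            if f(z) <= f(x) + g @ d + (d @ d) / (2.0 * t):
                break
            t *= shrink                              # trial rejected: shorten the step
        if np.linalg.norm(d) <= tol:
            return z
        x = z
    return x
```

For example, with `f(x) = 0.5 * ||x - b||^2` the minimizer is the soft-thresholded vector `b`, which gives a quick sanity check of the sketch.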

    A two-phase gradient method for quadratic programming problems with a single linear constraint and bounds on the variables

    We propose a gradient-based method for quadratic programming problems with a single linear constraint and bounds on the variables. Inspired by the GPCG algorithm for bound-constrained convex quadratic programming [J.J. Moré and G. Toraldo, SIAM J. Optim. 1, 1991], our approach alternates between two phases until convergence: an identification phase, which performs gradient projection iterations until either a candidate active set is identified or no reasonable progress is made, and an unconstrained minimization phase, which reduces the objective function in a suitable space defined by the identification phase, by applying either the conjugate gradient method or a recently proposed spectral gradient method. However, the algorithm differs from GPCG not only because it deals with a more general class of problems, but mainly because of the way it stops the minimization phase. The stopping rule is based on a comparison between a measure of optimality in the reduced space and a measure of bindingness of the variables that are on the bounds, defined by extending the concept of proportioning, which was proposed by some authors for box-constrained problems. If the objective function is bounded, the algorithm converges to a stationary point thanks to a suitable application of the gradient projection method in the identification phase. For strictly convex problems, the algorithm converges to the optimal solution in a finite number of steps even in case of degeneracy. Extensive numerical experiments show the effectiveness of the proposed approach. Comment: 30 pages, 17 figures

    Negativna selekcija - Apsolutna mera algoritamskog izvršenja proizvoljnog naloga (Negative Selection: An Absolute Measure of the Algorithmic Execution of an Arbitrary Order)

    Algorithmic trading is an automated process of order execution on electronic stock markets. It can be applied to a broad range of financial instruments, and it is characterized by the investor's significant control over the execution of his/her orders, with the principal goal of finding the right balance between the cost and the risk of not (fully) executing an order. Since measuring execution performance indicates whether best execution has been achieved, a significant number of different benchmarks are used in practice. The most frequently used are price benchmarks: some are determined before trading (pre-trade benchmarks), some during the trading day (intraday benchmarks), and some after the trade (post-trade benchmarks). The two most dominant are VWAP and Arrival Price, which, along with other pre-trade price benchmarks, is known as the Implementation Shortfall (IS). We introduce Negative Selection as an a posteriori measure of execution algorithm performance. It is based on the concept of Optimal Placement, which represents the ideal order that could be executed in a given time window, where "ideal" means an order with the best execution price given the market conditions during the time window. Negative Selection is defined as the difference between the vectors of optimal and executed orders, where each vector gives the quantity of shares at specified price positions in the order book. It is equal to zero when the order is optimally executed, negative if the order is not (completely) filled, and positive if the order is executed but at an unfavorable price. Negative Selection is intended as a new, alternative performance measure that enables us to find the optimal trajectories and construct the optimal execution of an order. The first chapter of the thesis includes a list of notation and an overview of definitions and theorems used later in the thesis. Chapters 2 and 3 follow with a theoretical overview of concepts related to market microstructure, basic information regarding benchmarks, and the theoretical background of algorithmic trading. Original results are presented in Chapters 4 and 5. Chapter 4 includes the construction of optimal placement and the definition and properties of Negative Selection. The theoretical and practical results regarding the properties of Negative Selection are given in [35]. Chapter 5 contains the theoretical background for stochastic optimization, a model of optimal execution formulated as a stochastic optimization problem with regard to Negative Selection, as well as original work on a nonmonotone line search method [31], while numerical results are in the last, sixth chapter.

    Existence and solution methods for equilibria

    Equilibrium problems provide a mathematical framework which includes optimization, variational inequalities, fixed-point and saddle point problems, and noncooperative games as particular cases. This general format has received increasing interest in the last decade, mainly because many theoretical and algorithmic results developed for one of these models can often be extended to the others through the unifying language provided by this common format. This survey paper aims at covering the main results concerning the existence of equilibria and the solution methods for finding them.

    Limited Memory Steepest Descent Methods for Nonlinear Optimization

    This dissertation concerns the development of limited memory steepest descent (LMSD) methods for solving unconstrained nonlinear optimization problems. In particular, we focus on the class of LMSD methods recently proposed by Fletcher, which he has shown to be competitive with well-known quasi-Newton methods such as L-BFGS. However, in the design of such methods, much work remains to be done. First of all, Fletcher only showed a convergence result for LMSD methods when minimizing strongly convex quadratics, but no convergence rate result. In addition, his method mainly focused on minimizing strongly convex quadratics and general convex objectives, while for nonconvex objectives, open questions remain about how to deal effectively with nonpositive curvature. Furthermore, Fletcher's method relies on having access to exact gradients, which can be a limitation when computing exact gradients is too expensive. The focus of this dissertation is the design and analysis of algorithms intended to solve these issues. In the first part of the new results in this dissertation, a convergence rate result for an LMSD method is proved. For context, we note that a basic LMSD method is an extension of the Barzilai-Borwein "two-point stepsize" strategy for steepest descent methods for solving unconstrained optimization problems. It is known that the Barzilai-Borwein strategy yields a method with an R-linear rate of convergence when it is employed to minimize a strongly convex quadratic. Our contribution is to extend this analysis to LMSD, also for strongly convex quadratics. In particular, it is shown that, under reasonable assumptions, the method is R-linearly convergent for any choice of the history length parameter. The results of numerical experiments are also provided to illustrate behaviors of the method that are revealed through the theoretical analysis. The second part proposes an LMSD method for solving unconstrained nonconvex optimization problems. As a steepest descent method, the step computation in each iteration only requires the evaluation of a gradient of the objective function and the calculation of a scalar stepsize. When employed to solve certain convex problems, our method reduces to a variant of the LMSD method proposed by Fletcher, which means that, when the history length parameter is set to one, it reduces to a steepest descent method inspired by that proposed by Barzilai and Borwein. However, our method is novel in that we propose new algorithmic features for cases when nonpositive curvature is encountered. That is, our method is particularly suited for solving nonconvex problems. With a nonmonotone line search, we ensure global convergence for a variant of our method. We also illustrate with numerical experiments that our approach often yields superior performance when employed to solve nonconvex problems. In the third part, we propose a limited memory stochastic gradient (LMSG) method for solving optimization problems arising in machine learning. As a start, we focus on problems that are strongly convex. When the dataset is so large that the computation of full gradients is too expensive, our method computes stepsizes and iterates based on (mini-batch) stochastic gradients. Although a best-tuned fixed stepsize or a diminishing stepsize is most widely used in stochastic gradient (SG) methods, it can be inefficient in practice. Our method adopts a cubic model and always guarantees a positive meaningful stepsize, even when nonpositive curvature is encountered (which can happen when using stochastic gradients, even when the problem is convex). Our approach is based on the LMSD method with cubic regularization proposed in the second part of this dissertation. With a projection of stepsizes, we ensure convergence to a neighborhood of the optimal solution when the interval is fixed and convergence to the optimal solution when the interval is diminishing. We also illustrate with numerical experiments that our approach can outperform an SG method with a fixed stepsize.
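
As background for the abstract above, here is a minimal sketch of the Barzilai-Borwein "two-point stepsize" iteration that LMSD generalizes, applied to a strongly convex quadratic. It is not the dissertation's LMSD or LMSG method; the function name `bb_gradient`, the initial stepsize `alpha0`, and the use of the first BB formula are assumptions for illustration.

```python
import numpy as np

def bb_gradient(A, b, x0, alpha0=1.0, tol=1e-10, max_iter=500):
    """Sketch: minimize 0.5 * x^T A x - b^T x for symmetric positive definite A.

    Each iteration needs one gradient and one scalar stepsize, the
    latter computed from the two most recent iterates and gradients.
    """
    x = np.asarray(x0, dtype=float)
    g = A @ x - b                      # gradient of the quadratic
    alpha = alpha0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        x_new = x - alpha * g          # steepest descent step with the current stepsize
        g_new = A @ x_new - b
        s = x_new - x                  # change in iterate
        y = g_new - g                  # change in gradient (equals A @ s here)
        # First BB stepsize; s @ y = s^T A s > 0 because A is positive definite.
        alpha = (s @ s) / (s @ y)
        x, g = x_new, g_new
    return x
```

The iteration is nonmonotone: the objective may increase on individual steps even though the iterates converge. On nonconvex problems `s @ y` can be nonpositive, which is the nonpositive-curvature case the abstract refers to and which this sketch does not handle.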

    A distributionally robust index tracking model with the CVaR penalty: tractable reformulation

    We propose a distributionally robust index tracking model with the conditional value-at-risk (CVaR) penalty. The model combines the idea of distributionally robust optimization for data uncertainty with the CVaR penalty to avoid large tracking errors. The probability ambiguity is described through a confidence region based on the first-order and second-order moments of the random vector involved. We reformulate the model, which has the form of a min-max-min optimization problem, into an equivalent nonsmooth minimization problem. We further give an approximate discretization scheme for the possibly continuous random vector of the nonsmooth minimization problem, whose objective function involves the maximum of a large but finite number of nonsmooth functions. The convergence of the discretization scheme to the equivalent nonsmooth reformulation is shown under mild conditions. A smoothing projected gradient (SPG) method is employed to solve the discretization scheme. Any accumulation point is shown to be a global minimizer of the discretization scheme. Numerical results on the NASDAQ index dataset from January 2008 to July 2023 demonstrate the effectiveness of our proposed model and the efficiency of the SPG method, compared with several state-of-the-art models and corresponding methods for solving them.
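
To make the CVaR term concrete, the sketch below evaluates the empirical CVaR of a finite sample of losses through the standard Rockafellar-Uryasev minimization formula, CVaR_beta(L) = min over t of t + E[(L - t)_+] / (1 - beta). This only illustrates the risk measure, not the paper's distributionally robust model or its SPG solver; the name `empirical_cvar` and the equal-weight sample setting are assumptions.

```python
import numpy as np

def empirical_cvar(losses, beta):
    """Sketch: CVaR at level beta in (0, 1) of equally weighted sample losses.

    The objective t + mean(max(loss - t, 0)) / (1 - beta) is convex and
    piecewise linear in t with kinks at the sample points, so it is
    enough to evaluate it at each sample value and take the smallest.
    """
    losses = np.asarray(losses, dtype=float)
    # excess[i, j] = max(losses[j] - losses[i], 0): shortfall beyond candidate t = losses[i]
    excess = np.maximum(losses[None, :] - losses[:, None], 0.0)
    values = losses + excess.mean(axis=1) / (1.0 - beta)
    return values.min()
```

For the sample `[1, 2, 3, 4]` and `beta = 0.5` this is the average of the worst half of the losses, 3.5. The max-type term `(L - t)_+` is what makes the resulting optimization problem nonsmooth.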

    Metodi linijskog pretrazivanja sa promenljivom velicinom uzorka (Line Search Methods with Variable Sample Size)

    The problem under consideration is an unconstrained optimization problem with the objective function in the form of a mathematical expectation. The expectation is with respect to the random variable that represents the uncertainty. Therefore, the objective function is in fact deterministic. However, finding the analytical form of that objective function can be very difficult or even impossible. This is the reason why the sample average approximation is often used. In order to obtain a reasonably good approximation of the objective function, we have to use a relatively large sample size. We assume that the sample is generated at the beginning of the optimization process, and therefore we can consider this sample average objective function as a deterministic one. However, applying a deterministic method to that sample average function from the start can be very costly. The number of evaluations of the function under expectation is a common way of measuring the cost of an algorithm. Therefore, methods that vary the sample size throughout the optimization process have been developed. Most of them try to determine the optimal dynamics of increasing the sample size. The main goal of this thesis is to develop a class of methods that can decrease the cost of an algorithm by decreasing the number of function evaluations. The idea is to decrease the sample size whenever it seems reasonable: roughly speaking, we do not want to impose high precision, i.e. a large sample size, when we are far away from the solution we are searching for. A detailed description of the new methods is presented in Chapter 4 together with the convergence analysis. It is shown that the approximate solution is of the same quality as the one obtained by working with the full sample from the start. Another important characteristic of the methods proposed here is the line search technique used for obtaining the subsequent iterates. The idea is to find a suitable direction and to search along it until we obtain a sufficient decrease in the function value. The sufficient decrease is determined by the line search rule. In Chapter 4, that rule is assumed to be monotone, i.e. we impose a strict decrease of the function value. In order to decrease the cost of the algorithm even further and to enlarge the set of suitable search directions, we use nonmonotone line search rules in Chapter 5. Within that chapter, these rules are modified to fit the variable sample size framework. Moreover, the conditions for global convergence and the R-linear rate are presented. In Chapter 6, numerical results are presented. The test problems are varied: some of them are academic and some are real-world problems. The academic problems give us more insight into the behavior of the algorithms, while data from real-world problems test the practical applicability of the proposed algorithms. In the first part of that chapter, the focus is on the variable sample size techniques. Different implementations of the proposed algorithm are compared to each other and to other sample schemes as well. The second part is mostly devoted to the comparison of various line search rules combined with different search directions in the variable sample size framework. The overall numerical results show that using the variable sample size can improve the performance of the algorithms significantly, especially when nonmonotone line search rules are used. The first chapter of this thesis provides the background material for the subsequent chapters. In Chapter 2, the basics of nonlinear optimization are presented with the focus on line search, while Chapter 3 deals with the stochastic framework. These chapters provide a review of the relevant known results, while the rest of the thesis represents the original contribution.