67 research outputs found
A successive difference-of-convex approximation method for a class of nonconvex nonsmooth optimization problems
We consider a class of nonconvex nonsmooth optimization problems whose
objective is the sum of a smooth function and a finite number of nonnegative
proper closed possibly nonsmooth functions (whose proximal mappings are easy to
compute), some of which are further composed with linear maps. This kind of
problems arises naturally in various applications when different regularizers
are introduced for inducing simultaneous structures in the solutions. Solving
these problems, however, can be challenging because of the coupled nonsmooth
functions: the corresponding proximal mapping can be hard to compute so that
standard first-order methods such as the proximal gradient algorithm cannot be
applied efficiently. In this paper, we propose a successive
difference-of-convex approximation method for solving this kind of problems. In
this algorithm, we approximate the nonsmooth functions by their Moreau
envelopes in each iteration. Making use of the simple observation that Moreau
envelopes of nonnegative proper closed functions are continuous {\em
difference-of-convex} functions, we can then approximately minimize the
approximation function by first-order methods with suitable majorization
techniques. These first-order methods can be implemented efficiently thanks to
the fact that the proximal mapping of {\em each} nonsmooth function is easy to
compute. Under suitable assumptions, we prove that the sequence generated by
our method is bounded and any accumulation point is a stationary point of the
objective. We also discuss how our method can be applied to concrete
applications such as nonconvex fused regularized optimization problems and
simultaneously structured matrix optimization problems, and illustrate the
performance numerically for these two specific applications
Convergence Properties of Monotone and Nonmonotone Proximal Gradient Methods Revisited
Composite optimization problems, where the sum of a smooth and a merely lower
semicontinuous function has to be minimized, are often tackled numerically by
means of proximal gradient methods as soon as the lower semicontinuous part of
the objective function is of simple enough structure. The available convergence
theory associated with these methods (mostly) requires the derivative of the
smooth part of the objective function to be (globally) Lipschitz continuous,
and this might be a restrictive assumption in some practically relevant
scenarios. In this paper, we readdress this classical topic and provide
convergence results for the classical (monotone) proximal gradient method and
one of its nonmonotone extensions which are applicable in the absence of
(strong) Lipschitz assumptions. This is possible since, for the price of
forgoing convergence rates, we omit the use of descent-type lemmas in our
analysis.Comment: 23 page
A two-phase gradient method for quadratic programming problems with a single linear constraint and bounds on the variables
We propose a gradient-based method for quadratic programming problems with a
single linear constraint and bounds on the variables. Inspired by the GPCG
algorithm for bound-constrained convex quadratic programming [J.J. Mor\'e and
G. Toraldo, SIAM J. Optim. 1, 1991], our approach alternates between two phases
until convergence: an identification phase, which performs gradient projection
iterations until either a candidate active set is identified or no reasonable
progress is made, and an unconstrained minimization phase, which reduces the
objective function in a suitable space defined by the identification phase, by
applying either the conjugate gradient method or a recently proposed spectral
gradient method. However, the algorithm differs from GPCG not only because it
deals with a more general class of problems, but mainly for the way it stops
the minimization phase. This is based on a comparison between a measure of
optimality in the reduced space and a measure of bindingness of the variables
that are on the bounds, defined by extending the concept of proportioning,
which was proposed by some authors for box-constrained problems. If the
objective function is bounded, the algorithm converges to a stationary point
thanks to a suitable application of the gradient projection method in the
identification phase. For strictly convex problems, the algorithm converges to
the optimal solution in a finite number of steps even in case of degeneracy.
Extensive numerical experiments show the effectiveness of the proposed
approach.Comment: 30 pages, 17 figure
Negativna selekcija - Apsolutna mera algoritamskog izvršenja proizvoljnog naloga
Algorithmic trading is an automated process of order execution on electronic stock markets. It can be applied to a broad range of financial instruments, and it is characterized by a signicant investors' control over the execution of his/her orders, with the principal goal of finding the right balance between costs and risk of not (fully) executing an order. As the measurement of execution performance gives information whether best execution is achieved, a signicant number of diffeerent benchmarks is used in practice. The most frequently used are price benchmarks, where some of them are determined before trading (Pre-trade benchmarks), some during the trading day (In-traday benchmarks), and some are determined after the trade (Post-trade benchmarks). The two most dominant are VWAP and Arrival Price, which is along with other pre-trade price benchmarks known as the Implementation Shortfall (IS). We introduce Negative Selection as a posteriori measure of the execution algorithm performance. It is based on the concept of Optimal Placement, which represents the ideal order that could be executed in a given time win-dow, where the notion of ideal means that it is an order with the best execution price considering market conditions during the time window. Negative Selection is dened as a difference between vectors of optimal and executed orders, with vectors dened as a quantity of shares at specied price positionsin the order book. It is equal to zero when the order is optimally executed; negative if the order is not (completely) filled, and positive if the order is executed but at an unfavorable price. Negative Selection is based on the idea to offer a new, alternative performance measure, which will enable us to find the optimal trajectories and construct optimal execution of an order. The first chapter of the thesis includes a list of notation and an overview of denitions and theorems that will be used further in the thesis. Chapters 2 and 3 follow with a theoretical overview of concepts related to market microstructure, basic information regarding benchmarks, and theoretical background of algorithmic trading. Original results are presented in chapters 4 and 5. Chapter 4 includes a construction of optimal placement, definition and properties of Negative Selection. The results regarding the properties of a Negative Selection are given in [35]. Chapter 5 contains the theoretical background for stochastic optimization, a model of the optimal execution formulated as a stochastic optimization problem with regard to Negative Selection, as well as original work on nonmonotone line search method [31], while numerical results are in the last, 6th chapter.Algoritamsko trgovanje je automatizovani proces izvršavanja naloga na elektronskim berzama. Može se primeniti na širok spektar nansijskih instrumenata kojima se trguje na berzi i karakteriše ga značajna kontrola investitora nad izvršavanjem njegovih naloga, pri čemu se teži nalaženju pravog balansa izmedu troška i rizika u vezi sa izvršenjem naloga. S ozirom da se merenjem performasi izvršenja naloga određuje da li je postignuto najbolje izvršenje, u praksi postoji značajan broj različitih pokazatelja. Najčešće su to pokazatelji cena, neki od njih se određuju pre trgovanja (eng. Pre-trade), neki u toku trgovanja (eng. Intraday), a neki nakon trgovanja (eng. Post-trade). Dva najdominantnija pokazatelja cena su VWAP i Arrival Price koji je zajedno sa ostalim "pre-trade" pokazateljima cena poznat kao Implementation shortfall (IS). Pojam negative selekcije se uvodi kao "post-trade" mera performansi algoritama izvršenja, polazeći od pojma optimalnog naloga, koji predstavlja idealni nalog koji se mogao izvrsiti u datom vremenskom intervalu, pri ćemu se pod pojmom "idealni" podrazumeva nalog kojim se postiže najbolja cena u tržišnim uslovima koji su vladali u toku tog vremenskog intervala. Negativna selekcija se definiše kao razlika vektora optimalnog i izvršenog naloga, pri čemu su vektori naloga defisani kao količine akcija na odgovarajućim pozicijama cena knjige naloga. Ona je jednaka nuli kada je nalog optimalno izvršen; negativna, ako nalog nije (u potpunosti) izvršen, a pozitivna ako je nalog izvršen, ali po nepovoljnoj ceni. Uvođenje mere negativne selekcije zasnovano je na ideji da se ponudi nova, alternativna, mera performansi i da se u odnosu na nju nađe optimalna trajektorija i konstruiše optimalno izvršenje naloga. U prvom poglavlju teze dati su lista notacija kao i pregled definicija i teorema neophodnih za izlaganje materije. Poglavlja 2 i 3 bave se teorijskim pregledom pojmova i literature u vezi sa mikrostrukturom tržišta, pokazateljima trgovanja i algoritamskim trgovanjem. Originalni rezultati su predstavljeni u 4. i 5. poglavlju. Poglavlje 4 sadrži konstrukciju optimalnog naloga, definiciju i osobine negativne selekcije. Teorijski i praktični rezultati u vezi sa osobinama negativna selekcije dati su u [35]. Poglavlje 5 sadrži teorijske osnove stohastičke optimizacije, definiciju modela za optimalno izvršenje, kao i originalni rad u vezi sa metodom nemonotonog linijskog pretraživanja [31], dok 6. poglavlje sadrži empirijske rezultate
Existence and solution methods for equilibria
Equilibrium problems provide a mathematical framework which includes optimization, variational inequalities, fixed-point and saddle point problems, and noncooperative games as particular cases. This general format received an increasing interest in the last decade mainly because many theoretical and algorithmic results developed for one of these models can be often extended to the others through the unifying language provided by this common format. This survey paper aims at covering the main results concerning the existence of equilibria and the solution methods for finding them
Limited Memory Steepest Descent Methods for Nonlinear Optimization
This dissertation concerns the development of limited memory steepest descent (LMSD) methods for solving unconstrained nonlinear optimization problems. In particular, we focus on the class of LMSD methods recently proposed by Fletcher, which he has shown to be competitive with well-known quasi-Newton methods such as L-BFGS. However, in the design of such methods, much work remains to be done. First of all, Fletcher only showed a convergence result for LMSD methods when minimizing strongly convex quadratics, but no convergence rate result. In addition, his method mainly focused on minimizing strongly convex quadratics and general convex objectives, while when it comes to nonconvex objectives, open questions remain about how to effectively deal with nonpositive curvature. Furthermore, Fletcher\u27s method relies on having access to exact gradients, which can be a limitation when computing exact gradients is too expensive. The focus of this dissertation is the design and analysis of algorithms intended to solve these issues.In the first part of the new results in this dissertation, a convergence rate result for an LMSD method is proved. For context, we note that a basic LMSD method is an extension of the Barzilai-Borwein ``two-point stepsize\u27\u27 strategy for steepest descent methods for solving unconstrained optimization problems. It is known that the Barzilai-Borwein strategy yields a method with an R-linear rate of convergence when it is employed to minimize a strongly convex quadratic. Our contribution is to extend this analysis for LMSD, also for strongly convex quadratics. In particular, it is shown that, under reasonable assumptions, the method is R-linearly convergent for any choice of the history length parameter. The results of numerical experiments are also provided to illustrate behaviors of the method that are revealed through the theoretical analysis.The second part proposes an LMSD method for solving unconstrained nonconvex optimization problems. As a steepest descent method, the step computation in each iteration only requires the evaluation of a gradient of the objective function and the calculation of a scalar stepsize. When employed to solve certain convex problems, our method reduces to a variant of LMSD method proposed by Fletcher, which means that, when the history length parameter is set to one, it reduces to a steepest descent method inspired by that proposed by Barzilai and Borwein. However, our method is novel in that we propose new algorithmic features for cases when nonpositive curvature is encountered. That is, our method is particularly suited for solving nonconvex problems. With a nonmonotone line search, we ensure global convergence for a variant of our method. We also illustrate with numerical experiments that our approach often yields superior performance when employed to solve nonconvex problems.In the third part, we propose a limited memory stochastic gradient (LMSG) method for solving optimization problems arising in machine learning. As a start, we focus on problems that are strongly convex. When the dataset is too large such that the computation of full gradients is too expensive, our method computes stepsizes and iterates based on (mini-batch) stochastic gradients. Although in stochastic gradient (SG) methods, a best-tuned fixed stepsize or diminishing stepsize is most widely used, it can be inefficient in practice. Our method adopts a cubic model and always guarantees a positive meaningful stepsize, even when nonpositive curvature is encountered (which can happen when using stochastic gradients, even when the problem is convex). Our approach is based on the LMSD method with cubic regularization proposed in the second part of this dissertation. With a projection of stepsizes, we ensure convergence to a neighborhood of the optimal solution when the interval is fixed and convergence to the optimal solution when the interval is diminishing. We also illustrate with numerical experiments that our approach can outperform an SG method with a fixed stepsize
A distributionally robust index tracking model with the CVaR penalty: tractable reformulation
We propose a distributionally robust index tracking model with the
conditional value-at-risk (CVaR) penalty. The model combines the idea of
distributionally robust optimization for data uncertainty and the CVaR penalty
to avoid large tracking errors. The probability ambiguity is described through
a confidence region based on the first-order and second-order moments of the
random vector involved. We reformulate the model in the form of a min-max-min
optimization into an equivalent nonsmooth minimization problem. We further give
an approximate discretization scheme of the possible continuous random vector
of the nonsmooth minimization problem, whose objective function involves the
maximum of numerous but finite nonsmooth functions. The convergence of the
discretization scheme to the equivalent nonsmooth reformulation is shown under
mild conditions. A smoothing projected gradient (SPG) method is employed to
solve the discretization scheme. Any accumulation point is shown to be a global
minimizer of the discretization scheme. Numerical results on the NASDAQ index
dataset from January 2008 to July 2023 demonstrate the effectiveness of our
proposed model and the efficiency of the SPG method, compared with several
state-of-the-art models and corresponding methods for solving them
Metodi linijskog pretrazivanja sa promenljivom velicinom uzorka
The problem under consideration is an unconstrained optimization problem with the objective function in the form of mathematical ex-pectation. The expectation is with respect to the random variable that represents the uncertainty. Therefore, the objective function is in fact deterministic. However, nding the analytical form of that objective function can be very dicult or even impossible. This is the reason why the sample average approximation is often used. In order to obtain reasonable good approximation of the objective function, we have to use relatively large sample size. We assume that the sample is generated at the beginning of the optimization process and therefore we can consider this sample average objective function as the deterministic one. However, applying some deterministic method on that sample average function from the start can be very costly. The number of evaluations of the function under expectation is a common way of measuring the cost of an algorithm. Therefore, methods that vary the sample size throughout the optimization process are developed. Most of them are trying to determine the optimal dynamics of increasing the sample size. The main goal of this thesis is to develop the clas of methods that can decrease the cost of an algorithm by decreasing the number of function evaluations. The idea is to decrease the sample size whenever it seems to be reasonable - roughly speaking, we do not want to impose a large precision, i.e. a large sample size when we are far away from the solution we search for. The detailed description of the new methods is presented in Chapter 4 together with the convergence analysis. It is shown that the approximate solution is of the same quality as the one obtained by dealing with the full sample from the start. Another important characteristic of the methods that are proposed here is the line search technique which is used for obtaining the sub-sequent iterates. The idea is to nd a suitable direction and to search along it until we obtain a sucient decrease in the function value. The sucient decrease is determined throughout the line search rule. In Chapter 4, that rule is supposed to be monotone, i.e. we are imposing strict decrease of the function value. In order to decrease the cost of the algorithm even more and to enlarge the set of suitable search directions, we use nonmonotone line search rules in Chapter 5. Within that chapter, these rules are modied to t the variable sample size framework. Moreover, the conditions for the global convergence and the R-linear rate are presented. In Chapter 6, numerical results are presented. The test problems are various - some of them are academic and some of them are real world problems. The academic problems are here to give us more insight into the behavior of the algorithms. On the other hand, data that comes from the real world problems are here to test the real applicability of the proposed algorithms. In the rst part of that chapter, the focus is on the variable sample size techniques. Different implementations of the proposed algorithm are compared to each other and to the other sample schemes as well. The second part is mostly devoted to the comparison of the various line search rules combined with dierent search directions in the variable sample size framework. The overall numerical results show that using the variable sample size can improve the performance of the algorithms signicantly, especially when the nonmonotone line search rules are used. The rst chapter of this thesis provides the background material for the subsequent chapters. In Chapter 2, basics of the nonlinear optimization are presented and the focus is on the line search, while Chapter 3 deals with the stochastic framework. These chapters are here to provide the review of the relevant known results, while the rest of the thesis represents the original contribution. U okviru ove teze posmatra se problem optimizacije bez ograničenja pri čcemu je funkcija cilja u formi matematičkog očekivanja. Očekivanje se odnosi na slučajnu promenljivu koja predstavlja neizvesnost. Zbog toga je funkcija cilja, u stvari, deterministička veličina. Ipak, odredjivanje analitičkog oblika te funkcije cilja može biti vrlo komplikovano pa čak i nemoguće. Zbog toga se za aproksimaciju često koristi uzoračko očcekivanje. Da bi se postigla dobra aproksimacija, obično je neophodan obiman uzorak. Ako pretpostavimo da se uzorak realizuje pre početka procesa optimizacije, možemo posmatrati uzoračko očekivanje kao determinističku funkciju. Medjutim, primena nekog od determinističkih metoda direktno na tu funkciju moze biti veoma skupa jer evaluacija funkcije pod ocekivanjem često predstavlja veliki trošak i uobičajeno je da se ukupan trošak optimizacije meri po broju izračcunavanja funkcije pod očekivanjem. Zbog toga su razvijeni metodi sa promenljivom veličinom uzorka. Većcina njih je bazirana na odredjivanju optimalne dinamike uvećanja uzorka. Glavni cilj ove teze je razvoj algoritma koji, kroz smanjenje broja izračcunavanja funkcije, smanjuje ukupne trošskove optimizacije. Ideja je da se veličina uzorka smanji kad god je to moguće. Grubo rečeno, izbegava se koriscenje velike preciznosti (velikog uzorka) kada smo daleko od rešsenja. U čcetvrtom poglavlju ove teze opisana je nova klasa metoda i predstavljena je analiza konvergencije. Dokazano je da je aproksimacija rešenja koju dobijamo bar toliko dobra koliko i za metod koji radi sa celim uzorkom sve vreme. Još jedna bitna karakteristika metoda koji su ovde razmatrani je primena linijskog pretražzivanja u cilju odredjivanja naredne iteracije. Osnovna ideja je da se nadje odgovarajući pravac i da se duž njega vršsi pretraga za dužzinom koraka koja će dovoljno smanjiti vrednost funkcije. Dovoljno smanjenje je odredjeno pravilom linijskog pretraživanja. U čcetvrtom poglavlju to pravilo je monotono što znači da zahtevamo striktno smanjenje vrednosti funkcije. U cilju jos većeg smanjenja troškova optimizacije kao i proširenja skupa pogodnih pravaca, u petom poglavlju koristimo nemonotona pravila linijskog pretraživanja koja su modifikovana zbog promenljive velicine uzorka. Takodje, razmatrani su uslovi za globalnu konvergenciju i R-linearnu brzinu konvergencije. Numerički rezultati su predstavljeni u šestom poglavlju. Test problemi su razliciti - neki od njih su akademski, a neki su realni. Akademski problemi su tu da nam daju bolji uvid u ponašanje algoritama. Sa druge strane, podaci koji poticu od stvarnih problema služe kao pravi test za primenljivost pomenutih algoritama. U prvom delu tog poglavlja akcenat je na načinu ažuriranja veličine uzorka. Različite varijante metoda koji su ovde predloženi porede se medjusobno kao i sa drugim šemama za ažuriranje veličine uzorka. Drugi deo poglavlja pretežno je posvećen poredjenju različitih pravila linijskog pretraživanja sa različitim pravcima pretraživanja u okviru promenljive veličine uzorka. Uzimajuci sve postignute rezultate u obzir dolazi se do zaključcka da variranje veličine uzorka može značajno popraviti učinak algoritma, posebno ako se koriste nemonotone metode linijskog pretraživanja. U prvom poglavlju ove teze opisana je motivacija kao i osnovni pojmovi potrebni za praćenje preostalih poglavlja. U drugom poglavlju je iznet pregled osnova nelinearne optimizacije sa akcentom na metode linijskog pretraživanja, dok su u trećem poglavlju predstavljene osnove stohastičke optimizacije. Pomenuta poglavlja su tu radi pregleda dosadašnjih relevantnih rezultata dok je originalni doprinos ove teze predstavljen u poglavljima 4-6
- …