21 research outputs found

    Modifications of Newton-type methods for solving semi-smooth stochastic optimization problems (Modifikacije metoda Njutnovog tipa za rešavanje semi-glatkih problema stohastičke optimizacije)

    Nonsmoothness arises in numerous optimization problems originating from real-world and scientific applications. A large number of problems belong to this class, from models of natural phenomena that exhibit sudden changes and shape optimization to hinge loss functions in machine learning and deep neural networks. In practice, solving a nonsmooth convex problem is usually more difficult and costly than solving a smooth one. The aim of this thesis is the formulation and theoretical analysis of Newton-type algorithms for solving nonsmooth convex stochastic optimization problems. The optimization problems considered have an objective function given in the form of a mathematical expectation, with no differentiability assumption on the function. Since determining the analytical form of the mathematical expectation is very difficult, and sometimes impossible, the objective function is estimated by the Sample Average Approximation (SAA). As the accuracy of the SAA objective function and its derivatives grows with the computational cost (higher precision generally implies larger costs), it is important to balance accuracy and cost efficiently. Therefore, the main focus of this thesis is the development of adaptive sample size control algorithms in a nonsmooth environment, with particular attention given to accuracy control and the selection of search directions. Several options are investigated for the search direction, while the accuracy control uses cheaper, less accurate objective function approximations during the initial stages of the process and reserves high-accuracy approximations for the final stages, which saves computational effort. A detailed description of the proposed methods is presented in Chapters 5 and 6, where the theoretical properties of the numerical procedures are also analyzed: their convergence is proved and the complexity of the developed methods is studied. In addition to the theoretical framework, the successful practical implementation of the algorithms is presented. It is shown that the proposed methods are more efficient in practical application than existing methods from the literature. Chapter 1 of this thesis serves as a foundation for the subsequent chapters by providing the necessary background information. Chapter 2 covers the fundamentals of nonlinear optimization, with a particular emphasis on line search techniques. Chapter 3 shifts the focus to the nonsmooth framework and reviews the established results in the field. The remaining chapters, starting from Chapter 4, which introduces the problem studied in this thesis (the minimization of the expected value function), represent the original contribution of the author.
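    The idea of using cheap, low-accuracy sample averages early and larger samples later can be illustrated with a minimal sketch. It is not the thesis's method: it takes a plain subgradient step instead of a Newton-type direction, and it grows the sample on a fixed geometric schedule instead of an adaptive rule. The function names, the one-dimensional hinge loss model, and all parameter values are assumptions made for illustration.

```python
import random


def hinge_saa(w, sample):
    """Sample average of the hinge loss max(0, 1 - y*w*x) over (x, y) pairs."""
    return sum(max(0.0, 1.0 - y * w * x) for x, y in sample) / len(sample)


def hinge_saa_subgradient(w, sample):
    """One subgradient of the sample-average hinge loss with respect to w."""
    return sum(-y * x for x, y in sample if 1.0 - y * w * x > 0.0) / len(sample)


def growing_sample_subgradient(data, w0=0.0, n0=8, growth=2, every=5,
                               iters=40, step0=1.0, seed=0):
    """Subgradient descent on SAA models whose sample size grows over time.

    Hypothetical schedule: start with n0 points and multiply the sample
    size by `growth` every `every` iterations, capped at len(data).
    Returns the final iterate and the list of sample sizes used.
    """
    rng = random.Random(seed)
    w = w0
    n = min(n0, len(data))
    sizes = []
    for k in range(iters):
        sample = rng.sample(data, n)           # cheap model early, accurate later
        g = hinge_saa_subgradient(w, sample)
        w -= step0 / (k + 1) * g               # diminishing step size
        sizes.append(n)
        if (k + 1) % every == 0:
            n = min(len(data), n * growth)
    return w, sizes
```

    On a data set of 200 points with the defaults above, the sample sizes run 8, 16, 32, 64, 128 and then stay at 200, so only the last iterations pay for the full-accuracy approximation.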

    Time and Location Aware Mobile Data Pricing

    Mobile users' correlated mobility and data consumption patterns often lead to severe cellular network congestion in peak hours and hot spots. This paper presents an optimal design of time and location aware mobile data pricing, which incentivizes users to smooth traffic and reduce network congestion. We derive the optimal pricing scheme through analyzing a two-stage decision process, where the operator determines the time and location aware prices by minimizing his total cost in Stage I, and each mobile user schedules his mobile traffic by maximizing his payoff (i.e., utility minus payment) in Stage II. We formulate the two-stage decision problem as a bilevel optimization problem, and propose a derivative-free algorithm to solve the problem for any increasing concave user utility functions. We further develop low complexity algorithms for the commonly used logarithmic and linear utility functions. The optimal pricing scheme ensures a win-win situation for the operator and users. Simulations show that the operator can reduce the cost by up to 97.52% in the logarithmic utility case and 98.70% in the linear utility case, and users can increase their payoff by up to 79.69% and 106.10% for the two types of utilities, respectively, compared with a time and location independent pricing benchmark. Our study suggests that the operator should provide price discounts at less crowded time slots and locations, and the discounts need to be significant when the operator's cost of provisioning excessive traffic is high or users' willingness to delay traffic is low. Comment: This manuscript serves as the online technical report of the article accepted by IEEE Transactions on Mobile Computing.

    New bundle methods and U-Lagrangian for generic nonsmooth optimization

    Nonsmooth optimization consists of minimizing a continuous function by systematically choosing iterative points from the feasible set via the computation of function values and generalized gradients (called subgradients). Broadly speaking, this thesis contains two research themes: nonsmooth optimization algorithms and theories about the substructure of special nonsmooth functions. Specifically, in terms of algorithms, we develop new bundle methods and bundle trust region methods for generic nonsmooth optimization. For theoretical work, we generalize the notion of U-Lagrangian and investigate its connections with some subsmooth structures. This PhD project develops trust region methods for generic nonsmooth optimization. It assumes the functions are Lipschitz continuous and the optimization problem is not necessarily convex. Currently the project also assumes the objective function is prox-regular but no structural information is given. Trust region methods create a local model of the problem in a neighborhood of the iteration point (called the "trust region"). They minimize the model over the trust region and consider the minimizer as a trial point for the next iteration. If the model is an appropriate approximation of the objective function, then the trial point is expected to reduce the function value. The model problem is usually easy to solve. Therefore, by comparing the reduction of the model's value with that of the real problem, trust region methods adjust the radius of the trust region to continue to obtain reduction by solving model problems. At the end of this project, it is clear that (1) it is possible to develop a pure bundle method with linear subproblems and without trust region update for convex optimization problems; such a method converges to minimizers if it generates an infinite sequence of serious steps; otherwise, it can be shown that the method generates a sequence of minor updates and the last serious step is a minimizer. 
First, this PhD project develops a bundle trust region algorithm with linear model and linear subproblem for minimizing a prox-regular and Lipschitz function. It adopts a convexification technique from the redistributed bundle method. Global convergence of the algorithm is established in the sense that the sequence of iterations converges to the fixed point of the proximal-point mapping given that convexification is successful. Preliminary numerical tests on standard academic nonsmooth problems show that the algorithm is comparable to bundle methods with quadratic subproblem. Second, following the philosophy behind bundle methods of making full use of the previous information of the iteration process and obtaining a flexible understanding of the function structure, the project revises the algorithm developed in the first part by applying the nonmonotone trust region method. We study the performance of the numerical implementation and successively refine the algorithm in an effort to improve its practical performance. Such revisions include allowing the convexification parameter to possibly decrease and the algorithm to restart after a finite process determined by various heuristics. The second theme of this project is about the theories of nonsmooth analysis, focusing on the U-Lagrangian. When restricted to a subspace, a nonsmooth function can be differentiable within this space. It is known that for a nonsmooth convex function, at a point, the Euclidean space can be decomposed into two subspaces: U, over which a special Lagrangian (called the U-Lagrangian) can be defined and has nice smooth properties, and V, the orthogonal complement of U. In this thesis we generalize the definition of UV-decomposition and U-Lagrangian to the context of nonconvex functions, specifically that of a prox-regular function. Similar work in the literature includes a quadratic sub-Lagrangian. It is our interest to study the feasibility of a linear localized U-Lagrangian. 
We also study the connections of the new U-Lagrangian and other subsmooth structures including fast tracks and partial smooth functions. This part of the project tries to provide answers to the following questions: (1) based on a generalized UV-decomposition, can we develop a linear U-Lagrangian of a prox-regular function that maintains prox-regularity? (2) through the new U-Lagrangian can we show that partial smoothness and fast tracks are equivalent under prox-regularity? At the end of this project, it is clear that for a function f that is properly prox-regular at a point x*, a new linear localized U-Lagrangian can be defined and its value at 0 coincides with f(x*); under some conditions, it can be proved that the U-Lagrangian is also prox-regular at 0; moreover, partial smoothness and fast tracks are equivalent under prox-regularity and other mild conditions.
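    The radius adjustment described in this abstract, comparing the reduction of the model with that of the real problem, is commonly written as a ratio test. The sketch below shows that textbook rule only; it is not the bundle trust region algorithm of the thesis. The function name, the thresholds 0.1 and 0.75, and the shrink and expand factors are conventional values assumed for illustration.

```python
def update_trust_region(radius, actual_reduction, predicted_reduction,
                        eta_accept=0.1, eta_good=0.75,
                        shrink=0.5, expand=2.0, max_radius=10.0):
    """Generic trust region ratio test (illustrative, assumed thresholds).

    actual_reduction    = f(x) - f(x_trial)        on the real problem
    predicted_reduction = m(x) - m(x_trial) > 0    on the local model
    Returns (accept_trial_point, new_radius).
    """
    if predicted_reduction <= 0:
        raise ValueError("the model must predict a positive reduction")
    rho = actual_reduction / predicted_reduction
    if rho < eta_accept:
        # Model was a poor predictor: reject the trial point and shrink.
        return False, shrink * radius
    if rho >= eta_good:
        # Model was a good predictor: accept and allow a larger region.
        return True, min(expand * radius, max_radius)
    # Acceptable agreement: accept and keep the radius.
    return True, radius
```

    For example, a trial point that achieves 90% of the predicted reduction is accepted and the radius doubles, while one that achieves only 1% is rejected and the radius is halved.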

    Indefinite Knapsack Separable Quadratic Programming: Methods and Applications

    Quadratic programming (QP) has received significant attention due to an extensive list of applications. Although polynomial time algorithms for the convex case have been developed, the solution of large scale QPs is challenging due to computer memory and speed limitations. Moreover, if the QP is nonconvex or includes integer variables, the problem is NP-hard. Therefore, no known algorithm can solve such QPs efficiently. Alternatively, row-aggregation and diagonalization techniques have been developed to solve QP by a sub-problem, knapsack separable QP (KSQP), which has a separable objective function and is constrained by a single knapsack linear constraint and box constraints. KSQP can therefore be considered a fundamental building block for solving the general QP and is an important class of problems for research. For the convex KSQP, linear time algorithms are available. However, if some quadratic terms, or even only one term, are negative in KSQP, the problem is known to be NP-hard, i.e., it is notoriously difficult to solve. The main objective of this dissertation is to develop efficient algorithms to solve general KSQP. The contributions of this dissertation are five-fold. First, this dissertation includes a comprehensive literature review for convex and nonconvex KSQP, considering their computational efficiencies and theoretical complexities. Second, a new algorithm with quadratic worst-case time complexity is developed to globally solve the nonconvex KSQP with open box constraints. Third, this global solver is utilized to develop a new bounding algorithm for general KSQP. Fourth, another new algorithm is developed to find a bound for general KSQP in linear time. Fifth, a comprehensive list of applications for convex KSQP is introduced, and direct applications of indefinite KSQP are described and tested with our newly developed methods. 
Experiments are conducted to compare the performance of the developed algorithms with that of local, global, and commercial solvers such as IBM CPLEX using randomly generated problems in the context of certain applications. The experimental results show that our proposed methods are superior in speed as well as in the quality of solutions.
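    To make the structure of the convex case concrete, the sketch below solves a convex KSQP by bisection on the multiplier of the knapsack constraint. It is neither one of the linear time algorithms mentioned in the abstract nor any of the dissertation's methods for the indefinite case. The problem form (minimize the sum of 0.5*d_i*x_i^2 - c_i*x_i subject to sum of a_i*x_i = b and lower_i <= x_i <= upper_i, with d_i > 0 and a_i > 0), the function name, and the tolerances are assumptions.

```python
def solve_convex_ksqp(d, c, a, b, lower, upper, tol=1e-10, max_iter=200):
    """Bisection on the knapsack multiplier for a convex separable QP.

    Assumed form: minimize sum(0.5*d[i]*x[i]**2 - c[i]*x[i])
                  subject to sum(a[i]*x[i]) == b, lower[i] <= x[i] <= upper[i],
    with every d[i] > 0 and a[i] > 0.
    """
    if any(di <= 0 for di in d) or any(ai <= 0 for ai in a):
        raise ValueError("this sketch needs d[i] > 0 and a[i] > 0")
    lo_sum = sum(ai * li for ai, li in zip(a, lower))
    hi_sum = sum(ai * ui for ai, ui in zip(a, upper))
    if not lo_sum <= b <= hi_sum:
        raise ValueError("knapsack constraint is infeasible for the box")

    def x_of(lam):
        # Stationarity d*x - c + lam*a = 0, then projection onto the box.
        return [min(max((ci - lam * ai) / di, li), ui)
                for di, ci, ai, li, ui in zip(d, c, a, lower, upper)]

    def residual(lam):
        # Nonincreasing in lam, so bisection applies.
        return sum(ai * xi for ai, xi in zip(a, x_of(lam))) - b

    lam_lo, lam_hi = -1.0, 1.0
    while residual(lam_lo) < 0:      # widen until residual(lam_lo) >= 0
        lam_lo *= 2.0
    while residual(lam_hi) > 0:      # widen until residual(lam_hi) <= 0
        lam_hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lam_lo + lam_hi)
        if residual(mid) > 0:
            lam_lo = mid
        else:
            lam_hi = mid
        if lam_hi - lam_lo < tol:
            break
    return x_of(0.5 * (lam_lo + lam_hi))
```

    Each variable depends on the multiplier only through a clipped linear function, which is what makes the separable knapsack structure cheap to exploit. Once some d_i is negative this monotonicity argument no longer yields a global minimizer, which is consistent with the hardness of the indefinite case noted in the abstract.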

    Classification algorithms on the cell processor

    The rapid advancement in the capacity and reliability of data storage technology has allowed for the retention of a virtually limitless quantity and detail of digital information. Massive information databases are becoming more and more widespread among governmental, educational, scientific, and commercial organizations. By segregating this data into carefully defined input (e.g., images) and output (e.g., classification labels) sets, a classification algorithm can be used to develop an internal expert model of the data by employing a specialized training algorithm. A properly trained classifier is capable of predicting the output for future input data from the same input domain that it was trained on. Two popular classifiers are Neural Networks and Support Vector Machines. Both, as with most accurate classifiers, require massive computational resources to carry out the training step and can take months to complete when dealing with extremely large data sets. In most cases, utilizing larger training sets improves the final accuracy of the trained classifier. However, access to the kinds of computational resources required to do so is expensive and out of reach of private or underfunded institutions. The Cell Broadband Engine (CBE), developed by Sony, Toshiba, and IBM, has recently been introduced into the market. The least expensive current iteration is available in the Sony PlayStation 3® computer entertainment system. The CBE is a novel multi-core architecture which features many hardware enhancements designed to accelerate the processing of massive amounts of data. These characteristics and the cheap and widespread availability of this technology make the Cell a prime candidate for the task of training classifiers. In this work, the feasibility of using the Cell processor to train Neural Networks and Support Vector Machines was explored. 
In the Neural Network family of classifiers, the fully connected Multilayer Perceptron and Convolution Network were implemented. In the Support Vector Machine family, a Working Set technique known as the Gradient Projection-based Decomposition Technique, as well as the Cascade SVM, were implemented.

    AIRO 2016. 46th Annual Conference of the Italian Operational Research Society. Emerging Advances in Logistics Systems Trieste, September 6-9, 2016 - Abstracts Book

    The AIRO 2016 book of abstracts collects the contributions from the conference participants. The AIRO 2016 Conference is a special occasion for the Italian Operations Research community, as the AIRO annual conferences reach their 46th edition in 2016. To reflect this special occasion, the Programme and Organizing Committee, chaired by Walter Ukovich, prepared a high quality Scientific Programme including the first initiative of AIRO Young, the new AIRO poster section that aims to promote the work of students, PhD students, and postdocs with an interest in Operations Research. The Scientific Programme of the Conference offers a broad spectrum of contributions covering the variety of OR topics and research areas with an emphasis on “Emerging Advances in Logistics Systems”. The event aims at stimulating integration of existing methods and systems, fostering communication amongst different research groups, and laying the foundations for OR integrated research projects in the next decade. Distinct thematic sections follow the AIRO 2016 days, starting with an initial presentation of the objectives and features of the Conference. In addition, three internationally known invited speakers, Gianni Di Pillo, Frédéric Semet, and Stefan Nickel, will present Plenary Lectures, gathering AIRO 2016 participants together to offer key presentations on the latest advances and developments in OR research.