
    An Oracle-Structured Bundle Method for Distributed Optimization

    We consider the problem of minimizing a function that is a sum of convex agent functions plus a convex common public function that couples them. The agent functions can only be accessed via a subgradient oracle; the public function is assumed to be structured and expressible in a domain-specific language (DSL) for convex optimization. We focus on the case where evaluating the agent oracles requires significant effort, which justifies solution methods that carry out significant computation in each iteration. We propose a cutting-plane or bundle-type method for this distributed optimization problem, which has a number of advantages over other methods compatible with this access model, such as proximal subgradient methods: it has very few parameters that need to be tuned; it often produces a reasonable approximate solution in just a few tens of iterations; and it tolerates agent failures. This paper is accompanied by an open-source package that implements the proposed method, available at \url{https://github.com/cvxgrp/OSBDO}.
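
    To make the access model concrete, the following is a minimal single-agent proximal cutting-plane sketch in this spirit, using CVXPY for the structured public part. The agent function, public function, and all constants are illustrative assumptions, not the OSBDO API.

        import numpy as np
        import cvxpy as cp

        def agent_oracle(x):
            # Hypothetical expensive agent function: f(x) = ||x||_1,
            # returned together with one subgradient.
            return np.abs(x).sum(), np.sign(x)

        n = 5
        x_k = np.zeros(n)
        cuts = []                      # bundle: (value, subgradient, point)
        for _ in range(20):
            f_k, g_k = agent_oracle(x_k)
            cuts.append((f_k, g_k, x_k.copy()))
            x = cp.Variable(n)
            # Piecewise-linear lower model of the agent function from the bundle.
            model = cp.max(cp.hstack([f + g @ (x - z) for f, g, z in cuts]))
            public = cp.sum_squares(x - 1.0)      # structured public function
            prox = 0.5 * cp.sum_squares(x - x_k)  # proximal term stabilizes steps
            cp.Problem(cp.Minimize(model + public + prox)).solve()
            x_k = x.value
        print(x_k)

    Each iteration calls the oracle once, adds a cut, and solves one structured convex subproblem, which matches the regime where oracle calls dominate per-iteration cost.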

    Bundle methods in nonsmooth DC optimization

    Due to the complexity of many practical applications, we encounter optimization problems with nonsmooth functions, that is, functions which are not continuously differentiable everywhere. Classical gradient-based methods are not applicable to such problems, since they may fail in the nonsmooth setting. It is therefore imperative to develop numerical methods specifically designed for nonsmooth optimization. To date, bundle methods are considered the most efficient and reliable general-purpose solvers for this type of problem. The idea in bundle methods is to approximate the subdifferential of the objective function by a bundle of subgradients collected from previous iterations. This information is then used to build a model of the objective. However, this model is typically convex and may therefore be inaccurate and unable to adequately reflect the behaviour of the objective function in the nonconvex case. These circumstances motivate the design of new bundle methods based on nonconvex models of the objective function.

    In this dissertation, the main focus is on nonsmooth DC optimization, which constitutes an important and broad subclass of nonconvex optimization problems. A DC function can be represented as a difference of two convex functions. Thus, we can obtain a model that explicitly utilizes both the convexity and concavity of the objective by approximating the convex and concave parts separately. In this way we end up with a nonconvex DC model that describes the problem more accurately than a convex one. Based on the new DC model we introduce three different bundle methods. Two of them are designed for unconstrained DC optimization, and the third is also capable of solving multiobjective and constrained DC problems. Finite convergence is proved for each method. The numerical results demonstrate the efficiency of the methods and show the benefits obtained from utilizing the DC decomposition.

    Even though the DC decomposition can improve the performance of bundle methods, it is not always available or possible to construct. We therefore present another bundle method for a general objective function that implicitly collects information about the DC structure. This method is developed for large-scale nonsmooth optimization, and its convergence is proved for semismooth functions. The efficiency of the method is shown with numerical results.

    As an application of the developed methods, we consider clusterwise linear regression (CLR) problems. By applying the support vector machines (SVM) approach, a new model for these problems is proposed. The objective in the new formulation of the CLR problem is expressed as a DC function, and a method based on one of the presented bundle methods is designed to solve it. Numerical results demonstrate the robustness of the new approach to outliers.
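
    As a toy illustration of the DC modelling idea, the sketch below builds separate cutting-plane models of the convex and concave parts of f = f1 - f2 and combines them into a nonconvex DC model. The functions and sample points are assumptions chosen for illustration; this is not the dissertation's method, only the modelling principle.

        import numpy as np

        f1 = lambda x: abs(x)          # convex part, subgradient sign(x)
        f2 = lambda x: 0.5 * x**2      # convex part being subtracted, gradient x

        def cutting_plane_model(f, df, points):
            # Pointwise max of linearizations: a convex underestimate of f.
            return lambda x: max(f(z) + df(z) * (x - z) for z in points)

        pts = [-2.0, -0.5, 1.0]
        m1 = cutting_plane_model(f1, np.sign, pts)
        m2 = cutting_plane_model(f2, lambda z: z, pts)
        dc_model = lambda x: m1(x) - m2(x)  # nonconvex DC model of f = f1 - f2
        print(dc_model(0.3), f1(0.3) - f2(0.3))

    Unlike a single convex cutting-plane model, the difference m1 - m2 retains the concave contribution of -f2, which is what lets the DC model track nonconvex behaviour of the objective.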

    Structure-Aware Methods for Expensive Derivative-Free Nonsmooth Composite Optimization

    We present new methods for solving a broad class of bound-constrained nonsmooth composite minimization problems. These methods are specially designed for objectives that are a known mapping of outputs from a computationally expensive function. We provide accompanying implementations of these methods: in particular, a novel manifold sampling algorithm (MS-P) with subproblems that are in a sense primal versions of the dual problems solved by previous manifold sampling methods, and a method (GOOMBAH) that employs more difficult optimization subproblems. For these two methods, we provide rigorous convergence analysis and guarantees, and we report extensive testing. Open-source implementations of the methods developed in this manuscript can be found at \url{github.com/POptUS/IBCDFO/}.
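
    The composite setting here is minimizing h(F(x)) where F is an expensive blackbox and h is a known, cheap, nonsmooth map. The sketch below is a deliberately naive illustration of that structure, with h = max and a finite-difference surrogate standing in for the interpolation models used by real derivative-free methods; it is not the IBCDFO implementation.

        import numpy as np

        def F(x):  # hypothetical expensive blackbox with two smooth outputs
            return np.array([x[0]**2 + x[1], (x[0] - 1.0)**2 - x[1]])

        h = np.max  # known, cheap, nonsmooth outer map: objective is h(F(x))

        def fd_jacobian(F, x, eps=1e-6):
            # Forward-difference model of F's Jacobian; in practice this is the
            # expensive part, and real methods reuse past evaluations instead.
            Fx = F(x)
            return np.array([(F(x + eps * e) - Fx) / eps
                             for e in np.eye(x.size)]).T

        x = np.array([0.3, -0.2])
        for _ in range(50):
            J = fd_jacobian(F, x)
            i = int(np.argmax(F(x)))   # active piece ("manifold") of h = max
            x = x - 0.05 * J[i]        # naive descent step on the active piece
        print(x, h(F(x)))

    Exploiting the known structure of h, rather than treating h(F(x)) as one blackbox, is the "structure-aware" ingredient the title refers to.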

    An adaptive sampling sequential quadratic programming method for nonsmooth stochastic optimization with upper-$\mathcal{C}^2$ objective

    We propose an optimization algorithm that incorporates adaptive sampling for stochastic nonsmooth nonconvex optimization problems with upper-$\mathcal{C}^2$ objective functions. Upper-$\mathcal{C}^2$ is a weak concavity property that arises naturally in many applications, particularly in certain classes of solutions to parametric optimization problems, e.g., recourse functions in stochastic programming and projections onto closed sets. Our algorithm is a stochastic sequential quadratic programming (SQP) method extended to nonsmooth problems with upper-$\mathcal{C}^2$ objectives, and it is globally convergent in expectation with bounded algorithmic parameters. The capabilities of our algorithm are demonstrated by solving a joint production, pricing and shipment problem, as well as a realistic optimal power flow problem as used in current power grid industry practice.
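
    The adaptive-sampling ingredient can be illustrated in isolation: grow the sample size whenever the estimated error of the sample-average gradient dominates the gradient norm. The oracle, the error test, and the constants below are illustrative assumptions, and a plain stochastic gradient step stands in for the paper's SQP step.

        import numpy as np

        rng = np.random.default_rng(0)

        def sample_grad(x, n):
            # Hypothetical stochastic oracle: per-sample gradients of
            # f(x; xi) = ||x - xi||^2 / 2 with xi ~ N(0, I).
            xi = rng.normal(0.0, 1.0, size=(n, x.size))
            g = x - xi
            return g.mean(axis=0), g.std(axis=0, ddof=1) / np.sqrt(n)

        x, n = np.ones(3), 8
        for _ in range(100):
            g, se = sample_grad(x, n)
            if np.linalg.norm(se) > 0.5 * np.linalg.norm(g):
                n *= 2              # adaptive sampling: tighten the estimate
            x = x - 0.1 * g         # stand-in for the SQP step
        print(x, n)

    Small samples keep early iterations cheap; the sample size is increased only once the noise would otherwise swamp the step direction.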

    On the Complexity of Deterministic Nonsmooth and Nonconvex Optimization

    In this paper, we present several new results on minimizing a nonsmooth and nonconvex function under a Lipschitz condition. Recent work shows that while the classical notion of Clarke stationarity is computationally intractable up to some sufficiently small constant tolerance, randomized first-order algorithms find a $(\delta, \epsilon)$-Goldstein stationary point with a complexity bound of $\tilde{O}(\delta^{-1}\epsilon^{-3})$, independent of the dimension $d \geq 1$ \citep{Zhang-2020-Complexity, Davis-2022-Gradient, Tian-2022-Finite}. However, deterministic algorithms have not been fully explored, leaving open several problems in nonsmooth nonconvex optimization. Our first contribution is to demonstrate that randomization is \textit{necessary} to obtain a dimension-independent guarantee: we prove a lower bound of $\Omega(d)$ for any deterministic algorithm that has access to both $1^{st}$- and $0^{th}$-order oracles. Furthermore, we show that the $0^{th}$-order oracle is \textit{essential} for a finite-time convergence guarantee, by showing that any deterministic algorithm with only the $1^{st}$-order oracle cannot find an approximate Goldstein stationary point within a finite number of iterations, up to a sufficiently small constant parameter and tolerance. Finally, we propose a deterministic smoothing approach under the \textit{arithmetic circuit} model, where the resulting smoothness parameter is exponential in a certain parameter $M > 0$ (e.g., the number of nodes in the representation of the function), and design a new deterministic first-order algorithm that achieves a dimension-independent complexity bound of $\tilde{O}(M\delta^{-1}\epsilon^{-3})$.
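
    For reference, the Goldstein notions used in these bounds are standard and can be stated as follows (this is the usual definition, not anything specific to this paper):

        % The Goldstein delta-subdifferential collects Clarke subgradients
        % from a delta-ball around x; (delta, epsilon)-stationarity asks for
        % a small element of it.
        \partial_\delta f(x) = \mathrm{conv}\Bigl(\,\bigcup_{y \in B_\delta(x)} \partial f(y)\Bigr),
        \qquad
        x \text{ is } (\delta,\epsilon)\text{-Goldstein stationary} \iff
        \min_{g \in \partial_\delta f(x)} \|g\| \le \epsilon.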

    Local convergence of a sequential quadratic programming method for a class of nonsmooth nonconvex objectives

    A sequential quadratic programming (SQP) algorithm is designed for nonsmooth optimization problems with upper-$\mathcal{C}^2$ objective functions. Upper-$\mathcal{C}^2$ functions are locally equivalent to difference-of-convex (DC) functions with smooth convex parts. They arise naturally in many applications, such as certain classes of solutions to parametric optimization problems, e.g., recourse of stochastic programming, and projection onto closed sets. The proposed algorithm conducts a line search and adopts an exact penalty merit function. The potential inconsistency due to the linearization of constraints is addressed through relaxation, similar to that of S$\ell_1$QP. We show that the algorithm is globally convergent under reasonable assumptions. Moreover, we study the local convergence behavior of the algorithm under additional Kurdyka-{\L}ojasiewicz (KL) assumptions, which have been applied to many nonsmooth optimization problems. Due to the nonconvex nature of the problems, a special potential function is used to analyze local convergence. We show that, under acceptable assumptions, bounds on the local convergence rate can be proven. Additionally, we show that for a large class of optimization problems with upper-$\mathcal{C}^2$ objectives, the corresponding potential functions are indeed KL functions. Numerical experiments are performed on a power grid optimization problem that is consistent with the assumptions and analysis in this paper.
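
    To make the relaxation concrete, here is a sketch of a single S$\ell_1$QP-style subproblem with elastic slacks: linearized constraints are allowed to be violated, and the violation is charged to an $\ell_1$ exact-penalty term. All problem data and the penalty parameter are illustrative assumptions.

        import numpy as np
        import cvxpy as cp

        g  = np.array([1.0, -2.0])        # objective gradient at current iterate
        H  = np.eye(2)                    # (approximate) Hessian, assumed PSD
        A  = np.array([[1.0, 1.0]])       # Jacobian of constraints c(x) <= 0
        c  = np.array([0.5])              # constraint values at current iterate
        mu = 10.0                         # exact penalty parameter

        d = cp.Variable(2)                # step
        s = cp.Variable(1, nonneg=True)   # elastic slack restores feasibility
        qp = cp.Problem(
            cp.Minimize(g @ d + 0.5 * cp.quad_form(d, H) + mu * cp.sum(s)),
            [A @ d + c <= s])
        qp.solve()
        print(d.value, s.value)

    Because the slack is always feasible, the subproblem remains solvable even when the plain linearized constraints are inconsistent, which is exactly the failure mode the relaxation guards against.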

    Derivative-free algorithms for nonsmooth and global optimization with application in cluster analysis

    This thesis is devoted to the development of algorithms for solving nonsmooth nonconvex problems. Some of these algorithms are derivative-free methods.