An Oracle-Structured Bundle Method for Distributed Optimization
We consider the problem of minimizing a function that is a sum of convex
agent functions plus a convex common public function that couples them. The
agent functions can only be accessed via a subgradient oracle; the public
function is assumed to be structured and expressible in a domain specific
language (DSL) for convex optimization. We focus on the case when the
evaluation of the agent oracles can require significant effort, which justifies
the use of solution methods that carry out significant computation in each
iteration. We propose a cutting-plane or bundle-type method for the distributed
optimization problem, which has a number of advantages over other methods that
are compatible with the access methods, such as proximal subgradient methods:
it has very few parameters that need to be tuned; it often produces a
reasonable approximate solution in just a few tens of iterations; and it
tolerates agent failures. This paper is accompanied by an open source package
that implements the proposed method, available at
\url{https://github.com/cvxgrp/OSBDO}.
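To make the cutting-plane idea concrete, here is a minimal sketch of one bundle-style master step, assuming a toy piecewise-affine agent objective accessed only through a value/subgradient oracle; the oracle data, box bounds, and query points are all illustrative stand-ins and do not reflect the OSBDO package's actual API.

```python
import numpy as np
from scipy.optimize import linprog

def subgrad_oracle(x):
    """Toy agent objective f(x) = max_i (a_i @ x + b_i); returns (value, subgradient)."""
    A = np.array([[1.0, 2.0], [-1.0, 0.5], [0.0, -1.0]])
    b = np.array([0.0, 1.0, 0.5])
    i = int(np.argmax(A @ x + b))
    return (A @ x + b)[i], A[i]

def bundle_step(points, box=(-5.0, 5.0), n=2):
    """Minimize the cutting-plane model max_k [f(x_k) + g_k @ (x - x_k)] over a box.

    Each oracle call contributes one affine minorant (cut); the master problem
    is an LP in (x, t): minimize t subject to f_k + g_k @ (x - x_k) <= t.
    """
    c = np.zeros(n + 1)
    c[-1] = 1.0  # objective: minimize the epigraph variable t
    A_ub, b_ub = [], []
    for xk in points:
        fk, gk = subgrad_oracle(xk)
        A_ub.append(np.concatenate([gk, [-1.0]]))  # g_k @ x - t <= g_k @ x_k - f_k
        b_ub.append(gk @ xk - fk)
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[box] * n + [(None, None)])
    return res.x[:n], res.x[-1]  # next query point and a lower bound on min f

x_next, lower = bundle_step([np.zeros(2), np.ones(2), -np.ones(2)])
```

Since every cut minorizes the true objective, `lower` can never exceed the objective value at `x_next`; a full bundle method would add the new cut at `x_next`, possibly with a proximal term, and iterate.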
Bundle methods in nonsmooth DC optimization
Due to the complexity of many practical applications, we encounter optimization problems with nonsmooth functions, that is, functions that are not continuously differentiable everywhere. Classical gradient-based methods are not applicable to such problems, since they may fail in the nonsmooth setting. Therefore, it is imperative to develop numerical methods specifically designed for nonsmooth optimization. To date, bundle methods are considered the most efficient and reliable general-purpose solvers for this type of problem.
The idea in bundle methods is to approximate the subdifferential of the objective function by a bundle of subgradients. This information is then used to build a model of the objective. However, this model is typically convex and may therefore be inaccurate and unable to adequately reflect the behaviour of the objective function in the nonconvex case. These circumstances motivate the design of new bundle methods based on nonconvex models of the objective function.
In this dissertation, the main focus is on nonsmooth DC optimization, which constitutes an important and broad subclass of nonconvex optimization problems. A DC function can be represented as a difference of two convex functions. Thus, we can obtain a model that explicitly utilizes both the convexity and concavity of the objective by approximating the convex and concave parts separately. In this way we end up with a nonconvex DC model that describes the problem more accurately than a convex one. Based on the new DC model, we introduce three different bundle methods. Two of them are designed for unconstrained DC optimization, and the third is also capable of solving multiobjective and constrained DC problems. Finite convergence is proved for each method. The numerical results demonstrate the efficiency of the methods and show the benefits obtained from the utilization of the DC decomposition.
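A minimal one-dimensional sketch of this DC modelling idea, under illustrative assumptions (the function f(x) = |x| - x^2/2 and its hand-coded subgradients are stand-ins, not an example from the dissertation): each convex component gets its own cutting-plane approximation, and their difference is a nonconvex model that interpolates f at the bundle points.

```python
# Illustrative DC function f = f1 - f2 with f1(x) = |x|, f2(x) = x^2 / 2 (both convex).
def f1(x): return abs(x)
def g1(x): return 1.0 if x >= 0 else -1.0  # a subgradient of f1
def f2(x): return 0.5 * x * x
def g2(x): return x                        # gradient of f2

def cutting_plane_model(bundle_pts, f, g):
    """Convex piecewise-linear underestimate: max_k [f(y_k) + g(y_k) * (x - y_k)]."""
    cuts = [(f(y), g(y), y) for y in bundle_pts]
    return lambda x: max(fv + gv * (x - y) for fv, gv, y in cuts)

pts = [-1.0, 0.5, 2.0]                   # bundle of past iterates
m1 = cutting_plane_model(pts, f1, g1)    # model of the convex part
m2 = cutting_plane_model(pts, f2, g2)    # model of the subtracted convex part
dc_model = lambda x: m1(x) - m2(x)       # nonconvex DC model of f
```

Each piecewise-linear model underestimates its convex component, and the DC model matches f exactly at every bundle point, which a purely convex model of the nonconvex f cannot do in general.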
Even though the use of a DC decomposition can improve the performance of bundle methods, such a decomposition is not always available or possible to construct. Thus, we present another bundle method, for a general objective function, that implicitly collects information about the DC structure. This method is developed for large-scale nonsmooth optimization, and its convergence is proved for semismooth functions. The efficiency of the method is shown with numerical results.
As an application of the developed methods, we consider clusterwise linear regression (CLR) problems. By applying the support vector machines (SVM) approach, a new model for these problems is proposed. The objective in the new formulation of the CLR problem is expressed as a DC function, and a method based on one of the presented bundle methods is designed to solve it. Numerical results demonstrate the robustness of the new approach to outliers.
Structure-Aware Methods for Expensive Derivative-Free Nonsmooth Composite Optimization
We present new methods for solving a broad class of bound-constrained
nonsmooth composite minimization problems. These methods are specially designed
for objectives that are some known mapping of outputs from a computationally
expensive function. We provide accompanying implementations of these methods:
in particular, a novel manifold sampling algorithm (MS-P) with
subproblems that are, in a sense, primal versions of the dual problems solved by
previous manifold sampling methods, and a method (GOOMBAH) that employs more
difficult optimization subproblems. For these two methods, we provide rigorous
convergence analysis and guarantees. We demonstrate the performance of these
methods through extensive testing. Open-source implementations of the methods
developed in this manuscript can be found at \url{github.com/POptUS/IBCDFO/}.
An adaptive sampling sequential quadratic programming method for nonsmooth stochastic optimization with upper-C^2 objective
We propose an optimization algorithm that incorporates adaptive sampling for
stochastic nonsmooth nonconvex optimization problems with upper-C^2
objective functions. Upper-C^2 is a weakly concave property that
exists naturally in many applications, particularly certain classes of
solutions to parametric optimization problems, e.g., recourse of stochastic
programming and projection into closed sets. Our algorithm is a stochastic
sequential quadratic programming (SQP) method extended to nonsmooth problems
with upper-C^2 objectives and is globally convergent in expectation
with bounded algorithmic parameters. The capabilities of our algorithm are
demonstrated by solving a joint production, pricing and shipment problem, as
well as a realistic optimal power flow problem as used in current power grid
industry practice.
On the Complexity of Deterministic Nonsmooth and Nonconvex Optimization
In this paper, we present several new results on minimizing a nonsmooth and
nonconvex function under a Lipschitz condition. Recent work shows that while
the classical notion of Clarke stationarity is computationally intractable up
to some sufficiently small constant tolerance, randomized first-order
algorithms find a $(\delta,\epsilon)$-Goldstein stationary point with the
complexity bound of $\tilde{O}(\delta^{-1}\epsilon^{-3})$, which is independent
of dimension $d$~\citep{Zhang-2020-Complexity, Davis-2022-Gradient,
Tian-2022-Finite}. However, the deterministic algorithms have not been fully
explored, leaving open several problems in nonsmooth nonconvex optimization.
Our first contribution is to demonstrate that the randomization is
\textit{necessary} to obtain a dimension-independent guarantee, by proving a
lower bound of $\Omega(d)$ for any deterministic algorithm that has access to
both first-order and zeroth-order oracles. Furthermore, we show that the
zeroth-order oracle is \textit{essential} to obtain a finite-time convergence
guarantee, by showing that any deterministic algorithm with only the
first-order oracle is not able to find an approximate Goldstein stationary
point within a finite number of iterations up to sufficiently small constant
parameter and tolerance.
Finally, we propose a deterministic smoothing approach under the
\textit{arithmetic circuit} model where the resulting smoothness parameter is
exponential in a certain parameter $M$ (e.g., the number of nodes in the
representation of the function), and design a new deterministic first-order
algorithm that achieves a dimension-independent complexity bound of
$\tilde{O}(M\delta^{-1}\epsilon^{-3})$.
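For reference, the Goldstein stationarity notion used throughout the abstract above is standard: with $\partial f$ the Clarke subdifferential and $\mathbb{B}_\delta(x)$ the ball of radius $\delta$ around $x$,

```latex
\partial_\delta f(x) = \mathrm{conv}\Bigl( \textstyle\bigcup_{y \in \mathbb{B}_\delta(x)} \partial f(y) \Bigr),
\qquad
x \text{ is } (\delta,\epsilon)\text{-Goldstein stationary}
\;\Longleftrightarrow\;
\min_{g \in \partial_\delta f(x)} \|g\| \le \epsilon.
```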
Local convergence of a sequential quadratic programming method for a class of nonsmooth nonconvex objectives
A sequential quadratic programming (SQP) algorithm is designed for nonsmooth
optimization problems with upper-C^2 objective functions. Upper-C^2 functions
are locally equivalent to difference-of-convex (DC) functions with smooth
convex parts. They arise naturally in many applications such as certain classes
of solutions to parametric optimization problems, e.g., recourse of stochastic
programming, and projection onto closed sets. The proposed algorithm conducts
line search and adopts an exact penalty merit function. The potential
inconsistency due to the linearization of constraints is addressed through
relaxation, similar to that of Sl_1QP. We show that the algorithm is globally
convergent under reasonable assumptions. Moreover, we study the local
convergence behavior of the algorithm under additional assumptions of
Kurdyka-{\L}ojasiewicz (KL) properties, which have been applied to many
nonsmooth optimization problems. Due to the nonconvex nature of the problems, a
special potential function is used to analyze local convergence. We show that
under acceptable assumptions, upper bounds on local convergence can be proven.
Additionally, we show that for a large number of optimization problems with
upper-C^2 objectives, their corresponding potential functions are indeed KL
functions. A numerical experiment is performed with a power grid optimization
problem that is consistent with the assumptions and analysis in this paper.
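To make the "projection onto closed sets" source of upper-C^2 objectives concrete, here is a small numerical check, with a hypothetical finite set S standing in for a general closed set: the squared distance to S is a pointwise minimum of smooth quadratics (hence upper-C^2) and admits the explicit DC split d_S^2(x) = ||x||^2 - max_s (2 s@x - ||s||^2), a smooth convex part minus a convex max-of-affine part.

```python
import numpy as np

# Hypothetical finite set S standing in for a closed set:
#   d_S^2(x) = min_{s in S} ||x - s||^2
# is a minimum of C^2 quadratics, hence upper-C^2, with the DC split
#   d_S^2(x) = ||x||^2 - max_{s in S} (2 s @ x - ||s||^2).
S = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])

def sq_dist(x):
    """Squared distance from x to the finite set S."""
    return min(float(np.sum((x - s) ** 2)) for s in S)

def dc_parts(x):
    """Return the smooth convex part and the convex (max-of-affine) part."""
    smooth = float(x @ x)                             # C^2 convex part
    h = max(float(2 * s @ x - s @ s) for s in S)      # convex part
    return smooth, h

x = np.array([0.7, 0.4])
smooth, h = dc_parts(x)
# The DC identity: smooth - h equals sq_dist(x)
```

Expanding ||x - s||^2 = ||x||^2 - (2 s@x - ||s||^2) and minimizing over s turns the min into a max of affine functions, which is exactly the structure the abstracts above exploit.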
Derivative free algorithms for nonsmooth and global optimization with application in cluster analysis
This thesis is devoted to the development of algorithms for solving nonsmooth nonconvex problems. Some of these algorithms are derivative-free methods.