
    Oracle complexity separation in convex optimization

    Ubiquitous in machine learning, regularized empirical risk minimization problems are often composed of several blocks which can be treated using different types of oracles, e.g., full gradient, stochastic gradient, or coordinate derivative. Optimal oracle complexity is known and achievable separately for the full gradient case, the stochastic gradient case, etc. We propose a generic framework to combine optimal algorithms for different types of oracles in order to achieve separate optimal oracle complexity for each block, i.e., for each block the corresponding oracle is called the optimal number of times for a given accuracy. As a particular example, we demonstrate that for a combination of a full gradient oracle and either a stochastic gradient oracle or a coordinate descent oracle, our approach leads to the optimal number of oracle calls separately for the full gradient part and the stochastic/coordinate descent part.

    Oracle Complexity Separation in Convex Optimization

    Many convex optimization problems have a structured objective function written as a sum of functions with different types of oracles (full gradient, coordinate derivative, stochastic gradient) and different evaluation complexities of these oracles. In the strongly convex case these functions also have different condition numbers, which eventually define the iteration complexity of first-order methods and the number of oracle calls required to achieve a given accuracy. Motivated by the desire to call the more expensive oracle fewer times, in this paper we consider minimization of a sum of two functions and propose a generic algorithmic framework to separate oracle complexities for each component in the sum. As a specific example, for the $\mu$-strongly convex problem $\min_{x\in \mathbb{R}^n} h(x) + g(x)$ with $L_h$-smooth function $h$ and $L_g$-smooth function $g$, a special case of our algorithm requires, up to a logarithmic factor, $O(\sqrt{L_h/\mu})$ first-order oracle calls for $h$ and $O(\sqrt{L_g/\mu})$ first-order oracle calls for $g$. Our general framework also covers the setting of strongly convex objectives, the setting when $g$ is given by a coordinate derivative oracle, and the setting when $g$ has a finite-sum structure and is available through a stochastic gradient oracle. In the latter two cases we obtain, respectively, accelerated random coordinate descent and accelerated variance reduction methods with oracle complexity separation.
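    To make the oracle-separation idea concrete, here is a minimal, unaccelerated sketch in Python: the expensive full-gradient oracle for $h$ is queried once per outer iteration, while $g$ is handled in an inner loop through a cheap stochastic-gradient oracle. The quadratic $h$, the finite-sum $g$, and the step sizes and iteration counts are assumptions made purely for illustration; this is not the paper's accelerated method.

```python
import numpy as np

# Toy oracle-separation sketch for min_x h(x) + g(x) (illustrative only):
#   h(x) = ||x||^2 / 2           -> "expensive" full-gradient oracle
#   g(x) = ||Ax - b||^2 / (2m)   -> "cheap" stochastic-gradient oracle over m terms
rng = np.random.default_rng(0)
n, m = 20, 100
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

def grad_h(x):
    # expensive oracle: full gradient of h
    return x

def stoch_grad_g(x):
    # cheap oracle: gradient of one randomly sampled term of g
    i = rng.integers(m)
    return A[i] * (A[i] @ x - b[i])

x = np.zeros(n)
outer_steps, inner_steps, eta = 50, 200, 1e-3
for _ in range(outer_steps):
    gh = grad_h(x)                   # one expensive call per outer iteration
    for _ in range(inner_steps):     # many cheap calls per outer iteration
        x -= eta * (gh + stoch_grad_g(x))

print("final objective:", 0.5 * x @ x + 0.5 * np.mean((A @ x - b) ** 2))
```

    With the expensive gradient frozen at the outer point, the inner loop pays only for cheap stochastic oracle calls; this separation of call counts per oracle is the access pattern the framework above formalizes and accelerates.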

    Memory-Constrained Algorithms for Convex Optimization via Recursive Cutting-Planes

    We propose a family of recursive cutting-plane algorithms to solve feasibility problems with constrained memory, which can also be used for first-order convex optimization. Precisely, in order to find a point within a ball of radius $\epsilon$ with a separation oracle in dimension $d$ -- or to minimize $1$-Lipschitz convex functions to accuracy $\epsilon$ over the unit ball -- our algorithms use $\mathcal O(\frac{d^2}{p}\ln \frac{1}{\epsilon})$ bits of memory and make $\mathcal O((C\frac{d}{p}\ln \frac{1}{\epsilon})^p)$ oracle calls, for some universal constant $C \geq 1$. The family is parametrized by $p\in[d]$ and provides an oracle-complexity/memory trade-off in the sub-polynomial regime $\ln\frac{1}{\epsilon}\gg\ln d$. While several works gave lower-bound trade-offs (impossibility results) -- we make explicit here their dependence on $\ln\frac{1}{\epsilon}$, showing that these also hold in any sub-polynomial regime -- to the best of our knowledge this is the first class of algorithms that provides a positive trade-off between gradient descent and cutting-plane methods in any regime with $\epsilon\leq 1/\sqrt d$. The algorithms divide the $d$ variables into $p$ blocks and optimize over blocks sequentially, with approximate separation vectors constructed using a variant of Vaidya's method. In the regime $\epsilon \leq d^{-\Omega(d)}$, our algorithm with $p=d$ achieves the information-theoretic optimal memory usage and improves the oracle complexity of gradient descent.
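    As a toy illustration of the ingredients involved (a separation oracle and a cutting-plane update), the sketch below solves a small feasibility problem with the classical central-cut ellipsoid method. It is a single-block simplification: the recursive block decomposition, the memory accounting, and the Vaidya-type construction of approximate separation vectors from the paper are not reproduced, and the hidden target ball is an assumption for the demo.

```python
import numpy as np

# Toy feasibility problem: find a point within distance eps of a hidden target
# inside the unit ball, using only a separation oracle and ellipsoid cuts.
d, eps = 5, 1e-3
rng = np.random.default_rng(1)
target = rng.uniform(-0.3, 0.3, d)      # hidden feasible ball centre (demo assumption)

def separation_oracle(x):
    """Return None if x is eps-close to the target, otherwise a vector g
    such that g.(y - x) <= 0 for every feasible point y."""
    if np.linalg.norm(x - target) <= eps:
        return None
    return x - target

c = np.zeros(d)                         # ellipsoid centre, start at the origin
P = np.eye(d)                           # ellipsoid shape matrix: unit ball contains the target
for it in range(10_000):
    g = separation_oracle(c)
    if g is None:
        print(f"feasible point found after {it} oracle calls")
        break
    gn = g / np.sqrt(g @ P @ g)         # normalise the cut in the ellipsoid metric
    Pg = P @ gn
    c = c - Pg / (d + 1)                # central-cut ellipsoid update
    P = (d**2 / (d**2 - 1)) * (P - (2 / (d + 1)) * np.outer(Pg, Pg))
```

    The single global ellipsoid state here costs a full $d\times d$ matrix of memory; the paper's algorithms instead optimize $p$ blocks of roughly $d/p$ variables sequentially, which is where the memory/oracle-call trade-off in the bounds above comes from.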

    Reducing Revenue to Welfare Maximization: Approximation Algorithms and other Generalizations

    It was recently shown in [http://arxiv.org/abs/1207.5518] that revenue optimization can be computationally efficiently reduced to welfare optimization in all multi-dimensional Bayesian auction problems with arbitrary (possibly combinatorial) feasibility constraints and independent additive bidders with arbitrary (possibly combinatorial) demand constraints. This reduction provides a poly-time solution to the optimal mechanism design problem in all auction settings where welfare optimization can be solved efficiently, but it is fragile to approximation and cannot provide solutions to settings where welfare maximization can only be tractably approximated. In this paper, we extend the reduction to accommodate approximation algorithms, providing an approximation-preserving reduction from (truthful) revenue maximization to (not necessarily truthful) welfare maximization. The mechanisms output by our reduction choose allocations via black-box calls to welfare approximation on randomly selected inputs, thereby also generalizing our earlier structural results on optimal multi-dimensional mechanisms to approximately optimal mechanisms. Unlike [http://arxiv.org/abs/1207.5518], our results here are obtained through novel uses of the Ellipsoid algorithm and other optimization techniques over non-convex regions.

    A new Lenstra-type Algorithm for Quasiconvex Polynomial Integer Minimization with Complexity 2^O(n log n)

    We study the integer minimization of a quasiconvex polynomial with quasiconvex polynomial constraints. We propose a new algorithm that is an improvement upon the best known algorithm due to Heinz (Journal of Complexity, 2005). This improvement is achieved by applying a new modern Lenstra-type algorithm, finding optimal ellipsoid roundings, and considering sparse encodings of polynomials. For the bounded case, our algorithm attains a time complexity of $s (r l M d)^{O(1)} 2^{2n \log_2(n) + O(n)}$, where $M$ is a bound on the number of monomials in each polynomial and $r$ is the binary encoding length of a bound on the feasible region. In the general case, the complexity is $s l^{O(1)} d^{O(n)} 2^{2n \log_2(n) + O(n)}$. In each case we assume $d \geq 2$ is a bound on the total degree of the polynomials and $l$ bounds the maximum binary encoding size of the input.
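    To make the problem being solved concrete, the sketch below minimizes a quasiconvex (in fact convex) polynomial over the integer points of a small box subject to a quasiconvex polynomial constraint, by plain enumeration. This is only a brute-force baseline for the bounded case; it is not the Lenstra-type algorithm, and the example polynomials and bounds are assumptions for the demo.

```python
import itertools

# Brute-force baseline for bounded quasiconvex polynomial integer minimization:
# enumerate all integer points of a box and keep the best feasible one.
# Demo instance (an assumption, not from the paper):
#   minimize   f(x, y) = (x - 1)^2 + (y - 2)^2      (convex, hence quasiconvex)
#   subject to g(x, y) = x^2 + y^2 - 9 <= 0         (quasiconvex constraint)
#   over integer points with -5 <= x, y <= 5.

def f(x, y):
    return (x - 1) ** 2 + (y - 2) ** 2

def g(x, y):
    return x ** 2 + y ** 2 - 9

best_point, best_value = None, float("inf")
for x, y in itertools.product(range(-5, 6), repeat=2):
    if g(x, y) <= 0 and f(x, y) < best_value:
        best_point, best_value = (x, y), f(x, y)

print("optimal integer point:", best_point, "objective:", best_value)
# Expected output: optimal integer point: (1, 2) objective: 0
```

    Enumeration like this is exponential in the box side length and in the number of variables $n$, which is exactly the dependence that Lenstra-type machinery improves to $2^{2n \log_2(n) + O(n)}$ times a factor polynomial in the encoding size.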