Search CORE

21,805 research outputs found

Computation with Polynomial Equations and Inequalities arising in Combinatorial Optimization

Author: A Kehrein
AV Gelder
D Grigoriev
D Henrion
E Gilbert
E. Balas
G Reid
G Stengle
H Sherali
J Gouveia
J Lasserre
J Lasserre
JB Lasserre
L Vandenberghe
L. Lov´asz
M Laurent
NZ Shor
PA Parrilo
T Hogg
W Eberly
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/09/2009
Field of study

The purpose of this note is to survey a methodology to solve systems of polynomial equations and inequalities. The techniques we discuss use the algebra of multivariate polynomials with coefficients over a field to create large-scale linear algebra or semidefinite programming relaxations of many kinds of feasibility or optimization questions. We are particularly interested in problems arising in combinatorial optimization.Comment: 28 pages, survey pape

arXiv.org e-Print Archive

CiteSeerX

Crossref

First-order regret bounds for combinatorial semi-bandits

Author: Neu Gergely
Publication venue
Publication date: 10/06/2015
Field of study

We consider the problem of online combinatorial optimization under semi-bandit feedback, where a learner has to repeatedly pick actions from a combinatorial decision set in order to minimize the total losses associated with its decisions. After making each decision, the learner observes the losses associated with its action, but not other losses. For this problem, there are several learning algorithms that guarantee that the learner's expected regret grows as

\widetilde{O}(\sqrt{T})

with the number of rounds

T

. In this paper, we propose an algorithm that improves this scaling to

\widetilde{O}(\sqrt{{L_T^*}})

, where

L_T^*

is the total loss of the best action. Our algorithm is among the first to achieve such guarantees in a partial-feedback scheme, and the first one to do so in a combinatorial setting.Comment: To appear at COLT 201

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

An efficient algorithm for learning with semi-bandit feedback

Author: A. György
A. Kalai
C. Allenberg
D. Suehiro
E. Takimoto
H.B. McMahan
J. Hannan
J. Poland
J.-Y. Audibert
N. Cesa-Bianchi
N. Cesa-Bianchi
P. Auer
Publication venue
Publication date: 01/01/2013
Field of study

We consider the problem of online combinatorial optimization under semi-bandit feedback. The goal of the learner is to sequentially select its actions from a combinatorial decision set so as to minimize its cumulative loss. We propose a learning algorithm for this problem based on combining the Follow-the-Perturbed-Leader (FPL) prediction method with a novel loss estimation procedure called Geometric Resampling (GR). Contrary to previous solutions, the resulting algorithm can be efficiently implemented for any decision set where efficient offline combinatorial optimization is possible at all. Assuming that the elements of the decision set can be described with d-dimensional binary vectors with at most m non-zero entries, we show that the expected regret of our algorithm after T rounds is O(m sqrt(dT log d)). As a side result, we also improve the best known regret bounds for FPL in the full information setting to O(m^(3/2) sqrt(T log d)), gaining a factor of sqrt(d/m) over previous bounds for this algorithm.Comment: submitted to ALT 201

arXiv.org e-Print Archive

Crossref