16 research outputs found
Online Combinatorial Linear Optimization via a Frank-Wolfe-based Metarounding Algorithm
Metarounding is an approach for converting an approximation algorithm for linear
optimization over a combinatorial class into an online linear optimization
algorithm for the same class. We propose a new metarounding algorithm under the
natural assumption that a relaxation-based approximation algorithm exists for
the combinatorial class. Our algorithm is much more efficient in both
theoretical and practical respects.
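The abstract's Frank-Wolfe-based algorithm is not reproduced here, but the core Frank-Wolfe primitive it builds on — optimizing a smooth convex objective over a polytope using only a linear optimization oracle, the same kind of oracle a relaxation-based approximation algorithm supplies — can be sketched as follows. The simplex domain, the squared-distance objective, and all numbers are illustrative assumptions, not the paper's setting.

```python
# Illustrative Frank-Wolfe sketch: approach a target point y over the
# probability simplex using only a linear minimization oracle (LMO).
# Domain, objective, and constants are toy assumptions.

def lmo_simplex(grad):
    # Linear minimization over the simplex: the optimum is the vertex
    # (basis vector) at the smallest gradient coordinate.
    i = min(range(len(grad)), key=lambda j: grad[j])
    v = [0.0] * len(grad)
    v[i] = 1.0
    return v

def frank_wolfe_project(y, iters=500):
    n = len(y)
    x = [1.0 / n] * n                   # start at the simplex barycenter
    for t in range(iters):
        grad = [2.0 * (x[j] - y[j]) for j in range(n)]  # gradient of ||x - y||^2
        v = lmo_simplex(grad)           # oracle call: best vertex for this gradient
        gamma = 2.0 / (t + 2.0)         # standard diminishing step size
        x = [(1 - gamma) * x[j] + gamma * v[j] for j in range(n)]
    return x

print(frank_wolfe_project([0.7, 0.2, 0.1]))
```

Each iterate is a convex combination of simplex vertices, so the algorithm never needs a projection step — only repeated calls to the linear oracle, which is what makes this template compatible with approximation oracles.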
MaxHedge: Maximising a Maximum Online
We introduce a new online learning framework where, at each trial, the
learner is required to select a subset of actions from a given known action
set. Each action is associated with an energy value, a reward and a cost. The
sum of the energies of the actions selected cannot exceed a given energy
budget. The goal is to maximise the cumulative profit, where the profit
obtained on a single trial is defined as the difference between the maximum
reward among the selected actions and the sum of their costs. Action energy
values and the budget are known and fixed. All rewards and costs associated
with each action change over time and are revealed at each trial only after the
learner's selection of actions. Our framework encompasses several online
learning problems where the environment changes over time; and the solution
trades-off between minimising the costs and maximising the maximum reward of
the selected subset of actions, while being constrained to an action energy
budget. The algorithm that we propose is efficient and general in that it may
be specialised to multiple natural online combinatorial problems. Comment: Published in AISTATS 201
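A single trial of the setting described above can be made concrete with a short sketch; the action values, the budget, and the chosen subset below are illustrative assumptions, not the MaxHedge algorithm itself.

```python
# Toy illustration of one trial in the framework above (not the MaxHedge
# algorithm). Energies and the budget are fixed and known; rewards and
# costs are revealed only after the learner commits to a feasible subset.

def feasible(subset, energy, budget):
    # A subset is feasible if its total energy fits within the budget.
    return sum(energy[a] for a in subset) <= budget

def profit(subset, reward, cost):
    # Profit = (maximum reward among selected actions) - (sum of their costs).
    if not subset:
        return 0.0
    return max(reward[a] for a in subset) - sum(cost[a] for a in subset)

energy = {"a": 2, "b": 1, "c": 3}
budget = 3

# Learner commits before seeing this trial's rewards and costs.
subset = {"a", "b"}
assert feasible(subset, energy, budget)

# The environment then reveals the trial's rewards and costs.
reward = {"a": 5.0, "b": 7.0, "c": 9.0}
cost = {"a": 1.0, "b": 2.0, "c": 4.0}

print(profit(subset, reward, cost))   # max(5, 7) - (1 + 2) = 4.0
```

The max in the profit makes the objective non-additive over actions, which is what separates this setting from standard online linear optimization over subsets.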
Online Improper Learning with an Approximation Oracle
We revisit the question of reducing online learning to approximate
optimization of the offline problem. In this setting, we give two algorithms
with near-optimal performance in the full information setting: they guarantee
optimal regret and require only poly-logarithmically many calls to the
approximation oracle per iteration. Furthermore, these algorithms apply to the
more general improper learning problems. In the bandit setting, our algorithm
also significantly improves the best previously known oracle complexity while
maintaining the same regret.
Online Learning of Facility Locations
In this paper, we provide a rigorous theoretical investigation of an online learning version of the Facility Location problem, motivated by emerging problems in real-world applications. In our formulation, we are given a set of sites and an online sequence of user requests. At each trial, the learner selects a subset of sites and then incurs a cost for each selected site, plus an additional cost which is the price of the user's connection to the nearest site in the selected subset. The problem may be solved by an application of the well-known Hedge algorithm. This would, however, require time and space exponential in the number of given sites, which motivates our design of a novel quasi-linear time algorithm for this problem, with good theoretical guarantees on its performance.
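The per-trial loss in this formulation is easy to state in code; the sites, opening costs, positions, and the one-dimensional distance below are illustrative assumptions (the paper's quasi-linear algorithm is not reproduced).

```python
# Toy illustration of the per-trial loss in the online facility location
# setting above: pay an opening cost for every selected site, plus the
# distance from the user to the nearest selected site. Sites, costs, and
# the 1-D geometry are illustrative assumptions.

def trial_cost(selected, open_cost, site_pos, user_pos):
    opening = sum(open_cost[s] for s in selected)
    connection = min(abs(site_pos[s] - user_pos) for s in selected)
    return opening + connection

open_cost = {"s1": 1.0, "s2": 0.5, "s3": 2.0}
site_pos = {"s1": 0.0, "s2": 4.0, "s3": 9.0}

# Learner selects a subset of sites, then the user request arrives.
selected = {"s1", "s2"}
print(trial_cost(selected, open_cost, site_pos, user_pos=3.0))
# opening 1.0 + 0.5 = 1.5; nearest selected site is s2 at distance 1.0 -> 2.5
```

Running plain Hedge here would keep one weight per subset of sites, i.e. exponentially many, which is exactly the blow-up the paper's algorithm avoids.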
Dagstuhl News January - December 2000
"Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News give a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a small abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic
Application of multiplicative weights update method in algorithmic game theory
In this thesis, we apply the Multiplicative Weights Update Method (MWUM) to the design of approximation algorithms for some optimization problems in game-theoretic settings. Lavi and Swamy [LS05, LS11] introduced a randomized mechanism for combinatorial auctions that takes an approximation algorithm for the underlying optimization problem, so-called social welfare maximization, and converts it into a randomized mechanism that is truthful-in-expectation, which means each player maximizes its expected utility by telling the truth. The mechanism is powerful (e.g., see [LS05, LS11, CEF10, HKV11] for applications), but unlikely to be efficient in practice, because it uses the Ellipsoid method. In Chapter 2, we follow the general scheme suggested by Lavi and Swamy and replace the Ellipsoid method with MWUM. This results in a faster and simpler approximately truthful-in-expectation mechanism. We also weaken their assumption that the LP relaxation of social welfare maximization can be solved exactly: assuming only that an approximation algorithm for the LP exists, we establish a new randomized approximation mechanism. In Chapter 3, we consider the problem of computing an approximate saddle point, or equivalently an equilibrium, of a convex-concave function $F: X \times Y \to \mathbb{R}$, where $X$ and $Y$ are convex sets of arbitrary dimension. Our main contribution is the design of a randomized algorithm for computing an $\varepsilon$-approximate saddle point of $F$. Our algorithm is based on combining a technique developed by Grigoriadis and Khachiyan [GK95], a randomized variant of Brown's fictitious play [B51], with recent results on random sampling from convex sets (see, e.g., [LV06, V05]). The algorithm finds an $\varepsilon$-approximate saddle point in an expected number of $O\left(\frac{\rho^2(n+m)}{\varepsilon^2}\ln\frac{R}{\varepsilon}\right)$ iterations, where in each iteration two points are sampled from log-concave distributions over the strategy sets.
It is assumed that $X$ and $Y$ have inscribed balls of radius $r$ and circumscribed balls of radius $R$. In particular, the algorithm requires $O^*\left(\frac{\rho^2(n+m)^6}{\varepsilon^2}\ln R\right)$ calls to a membership oracle, where $O^*$ suppresses polylogarithmic factors that depend on $n$, $m$, and $\varepsilon$.
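The Multiplicative Weights Update Method at the core of the thesis can be sketched in its generic experts form; the losses, the learning rate, and the round structure below are illustrative assumptions, not the thesis's mechanisms.

```python
# Generic Multiplicative Weights Update (MWU) sketch: keep one weight per
# expert, play the normalized weights as a distribution, then shrink each
# weight multiplicatively according to its observed loss in [0, 1].
# Losses and the learning rate eta are toy assumptions.

def mwu(losses_per_round, n, eta=0.5):
    w = [1.0] * n
    for losses in losses_per_round:
        total = sum(w)
        p = [wi / total for wi in w]            # distribution played this round
        w = [wi * (1.0 - eta * l) for wi, l in zip(w, losses)]
    total = sum(w)
    return [wi / total for wi in w]             # final distribution

# Expert 0 is consistently good; its weight should come to dominate.
rounds = [[0.0, 1.0, 1.0]] * 10
print(mwu(rounds, n=3))
```

The exponential decay of a lagging expert's weight is what yields the method's regret guarantee, and it is this cheap per-round update that stands in for the Ellipsoid method in the thesis's Chapter 2 construction.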