16 research outputs found

    Online Combinatorial Linear Optimization via a Frank-Wolfe-based Metarounding Algorithm

    Full text link
    Metarounding is an approach for converting an approximation algorithm for linear optimization over a combinatorial class into an online linear optimization algorithm for the same class. We propose a new metarounding algorithm under the natural assumption that a relaxation-based approximation algorithm exists for the combinatorial class. Our algorithm is substantially more efficient in both theoretical and practical terms.
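    As a rough illustration of the metarounding idea (a generic Frank-Wolfe decomposition sketch, not the paper's algorithm), one can approximate a fractional point of the relaxation by a convex combination of combinatorial solutions, using a linear-optimization routine over the class in place of the approximation oracle. The toy class of fixed-size subsets and the helper names below are assumptions made purely for illustration.

```python
import numpy as np

def oracle_top_k(scores, k):
    """Linear minimization over indicators of size-k subsets:
    return the 0/1 vector selecting the k smallest entries of `scores`.
    This stands in for the (approximation) oracle over the class."""
    choice = np.zeros_like(scores)
    choice[np.argsort(scores)[:k]] = 1.0
    return choice

def frank_wolfe_decompose(x, k, iters=500):
    """Express a fractional point x (0 <= x <= 1, sum(x) == k) as an
    approximate convex combination of size-k indicator vectors by running
    Frank-Wolfe on f(y) = 0.5 * ||y - x||^2 over their convex hull."""
    y = oracle_top_k(-x, k)                 # start from one vertex
    atoms, weights = [y.copy()], [1.0]
    for _ in range(iters):
        grad = y - x
        c = oracle_top_k(grad, k)           # vertex most aligned with -grad
        d = c - y
        denom = float(d @ d)
        if denom == 0.0:                    # y is already optimal
            break
        gamma = min(1.0, max(0.0, -float(grad @ d) / denom))  # line search
        y = y + gamma * d
        weights = [w * (1.0 - gamma) for w in weights]
        atoms.append(c.copy())
        weights.append(gamma)
    return atoms, weights, y

x = np.array([0.9, 0.8, 0.5, 0.4, 0.3, 0.1])   # fractional point, sum == 3
atoms, weights, mixture = frank_wolfe_decompose(x, k=3)
print("target :", x)
print("mixture:", np.round(mixture, 3))        # close to the target
```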

    MaxHedge: Maximising a Maximum Online

    Get PDF
    We introduce a new online learning framework where, at each trial, the learner is required to select a subset of actions from a given known action set. Each action is associated with an energy value, a reward and a cost. The sum of the energies of the selected actions cannot exceed a given energy budget. The goal is to maximise the cumulative profit, where the profit obtained on a single trial is defined as the difference between the maximum reward among the selected actions and the sum of their costs. Action energy values and the budget are known and fixed. All rewards and costs associated with each action change over time and are revealed at each trial only after the learner's selection of actions. Our framework encompasses several online learning problems where the environment changes over time; the solution trades off minimising the costs against maximising the maximum reward of the selected subset of actions, while respecting the action energy budget. The algorithm that we propose is efficient and general in that it may be specialised to multiple natural online combinatorial problems. Comment: Published in AISTATS 201
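    As a concrete reading of the profit and feasibility definitions in this abstract, a minimal sketch with hypothetical action data might look as follows (the function names and numbers are illustrative assumptions, not from the paper):

```python
def trial_profit(selected, rewards, costs):
    """Profit of one trial, as defined in the abstract: the maximum reward
    among the selected actions minus the sum of their costs."""
    if not selected:
        return 0.0
    return max(rewards[i] for i in selected) - sum(costs[i] for i in selected)

def feasible(selected, energies, budget):
    """A selection is feasible if its total energy stays within the budget."""
    return sum(energies[i] for i in selected) <= budget

# toy trial with 4 actions
energies = [2.0, 1.0, 3.0, 1.5]
budget   = 4.0
rewards  = [5.0, 2.5, 7.0, 1.0]    # revealed only after the selection
costs    = [1.0, 0.5, 4.0, 0.2]

selected = [0, 1]                  # chosen before seeing rewards/costs
assert feasible(selected, energies, budget)
print(trial_profit(selected, rewards, costs))   # 5.0 - 1.5 = 3.5
```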

    Online Improper Learning with an Approximation Oracle

    Full text link
    We revisit the question of reducing online learning to approximate optimization of the offline problem. In this setting, we give two algorithms with near-optimal performance in the full information setting: they guarantee optimal regret and require only poly-logarithmically many calls to the approximation oracle per iteration. Furthermore, these algorithms apply to the more general improper learning problems. In the bandit setting, our algorithm also significantly improves the best previously known oracle complexity while maintaining the same regret.
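    For background, the classical oracle-based reduction this line of work builds on is follow-the-perturbed-leader: at each round, perturb the cumulative (linear) loss vector and hand it to the offline oracle. The sketch below shows that generic baseline over a hypothetical subset-selection class with assumed linear losses; it is not the paper's improved algorithm.

```python
import numpy as np

def ftpl_round(cumulative_loss, oracle, eta, rng):
    """One round of follow-the-perturbed-leader with an (approximation)
    oracle: perturb the cumulative loss vector and ask the oracle for a
    (near-)minimizer over the combinatorial decision set."""
    noise = rng.exponential(scale=1.0 / eta, size=cumulative_loss.shape)
    return oracle(cumulative_loss - noise)

def top_k_oracle(vec, k=2):
    """Hypothetical decision set: size-k subsets of n items; the 'oracle'
    returns the k coordinates with the smallest (perturbed) loss."""
    choice = np.zeros_like(vec)
    choice[np.argsort(vec)[:k]] = 1.0
    return choice

rng = np.random.default_rng(0)
n, T, eta = 5, 100, 0.3
cumulative_loss = np.zeros(n)
total_loss = 0.0
for t in range(T):
    x = ftpl_round(cumulative_loss, top_k_oracle, eta, rng)
    loss_t = rng.uniform(0, 1, size=n)      # adversary's loss vector
    total_loss += float(loss_t @ x)
    cumulative_loss += loss_t
print("average per-round loss:", total_loss / T)
```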

    Online Learning of Facility Locations

    Get PDF
    In this paper, we provide a rigorous theoretical investigation of an online learning version of the Facility Location problem which is motivated by emerging problems in real-world applications. In our formulation, we are given a set of sites and an online sequence of user requests. At each trial, the learner selects a subset of sites and then incurs a cost for each selected site and an additional cost which is the price of the user's connection to the nearest site in the selected subset. The problem may be solved by an application of the well-known Hedge algorithm. This would, however, require time and space exponential in the number of given sites, which motivates our design of a novel quasi-linear time algorithm for this problem, with good theoretical guarantees on its performance.
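    A minimal sketch of the per-trial cost described here (a cost for each selected site plus the connection price to the nearest selected site), using hypothetical site data:

```python
def trial_cost(selected_sites, site_costs, connection_price):
    """Per-trial cost from the abstract: a cost for each selected site
    plus the price of connecting the current user to the nearest
    selected site; connection_price[s] is revealed after the selection."""
    opening = sum(site_costs[s] for s in selected_sites)
    nearest = min(connection_price[s] for s in selected_sites)
    return opening + nearest

# toy trial with 3 sites
site_costs       = {"a": 1.0, "b": 0.5, "c": 2.0}
connection_price = {"a": 3.0, "b": 4.5, "c": 0.5}   # revealed after selection
print(trial_cost({"a", "b"}, site_costs, connection_price))  # 1.5 + 3.0 = 4.5
```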

    Dagstuhl News January - December 2000

    Get PDF
    "Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News give a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a small abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic

    Application of multiplicative weights update method in algorithmic game theory

    Get PDF
    In this thesis, we apply the Multiplicative Weights Update Method (MWUM) to the design of approximation algorithms for some optimization problems in game-theoretic settings. Lavi and Swamy {LS05,LS11} introduced a randomized mechanism for combinatorial auctions that takes an approximation algorithm for the underlying optimization problem, so-called social welfare maximization, and converts it into a randomized mechanism that is truthful-in-expectation, which means each player maximizes its expected utility by telling the truth. The mechanism is powerful (e.g., see {LS05,LS11,CEF10,HKV11} for applications), but unlikely to be efficient in practice because it uses the Ellipsoid method. In Chapter 2, we follow the general scheme suggested by Lavi and Swamy and replace the Ellipsoid method with MWUM. This results in a faster and simpler approximately truthful-in-expectation mechanism. We also relax their assumption regarding the existence of an exact solution for the LP relaxation of social welfare maximization: we assume only that an approximation algorithm for the LP exists, and establish a new randomized approximation mechanism. In Chapter 3, we consider the problem of computing an approximate saddle point, or equivalently an equilibrium, of a convex-concave function $F: X \times Y \to \mathbb{R}$, where $X$ and $Y$ are convex sets of arbitrary dimension. Our main contribution is the design of a randomized algorithm for computing an $\epsilon$-approximate saddle point of $F$. Our algorithm combines a technique developed by Grigoriadis and Khachiyan {GK95}, which is a randomized variant of Brown's fictitious play {B51}, with recent results on random sampling from convex sets (see, e.g., {LV06,V05}). The algorithm finds an $\epsilon$-approximate saddle point in an expected number of $O\left(\frac{\rho^2(n+m)}{\epsilon^{2}}\ln\frac{R}{\epsilon}\right)$ iterations, where in each iteration two points are sampled from log-concave distributions over the strategy sets. It is assumed that $X$ and $Y$ have inscribed balls of radius $1/R$ and circumscribing balls of radius $R$, and $\rho = \max_{x\in X,\, y\in Y} |F(x,y)|$. In particular, the algorithm requires $O^{*}\left(\frac{\rho^2(n+m)^6}{\epsilon^{2}}\ln R\right)$ calls to a membership oracle, where $O^{*}(\cdot)$ suppresses polylogarithmic factors that depend on $n$, $m$, and $\epsilon$.
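    Since the thesis builds on the Multiplicative Weights Update Method, a minimal illustration of MWUM itself is sketched below, on a finite zero-sum matrix game rather than the continuous convex-concave setting of Chapter 3: two MWU learners play against each other, and their time-averaged strategies approach an approximate saddle point. The step size, horizon, and toy matrix are assumptions for the example.

```python
import numpy as np

def mwu_saddle_point(A, T=2000, eta=0.05):
    """Minimal multiplicative-weights sketch for a finite zero-sum game
    with payoff matrix A: the row player minimizes x^T A y, the column
    player maximizes it. The time-averaged strategies approach an
    approximate saddle point."""
    n, m = A.shape
    x, y = np.ones(n) / n, np.ones(m) / m
    avg_x, avg_y = np.zeros(n), np.zeros(m)
    for _ in range(T):
        avg_x += x
        avg_y += y
        loss_x = A @ y            # row player's loss per pure action
        gain_y = A.T @ x          # column player's gain per pure action
        x = x * np.exp(-eta * loss_x); x /= x.sum()
        y = y * np.exp(eta * gain_y);  y /= y.sum()
    return avg_x / T, avg_y / T

# toy zero-sum game (matching pennies)
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = mwu_saddle_point(A)
print(np.round(x, 3), np.round(y, 3))     # both close to (0.5, 0.5)
print("value estimate:", round(float(x @ A @ y), 3))
```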