65 research outputs found

    Improved Second-Order Bounds for Prediction with Expert Advice

    This work studies external regret in sequential prediction games with both positive and negative payoffs. External regret measures the difference between the payoff obtained by the forecasting strategy and the payoff of the best action. In this setting, we derive new and sharper regret bounds for the well-known exponentially weighted average forecaster and for a new forecaster with a different multiplicative update rule. Our analysis has two main advantages: first, no preliminary knowledge about the payoff sequence is needed, not even its range; second, our bounds are expressed in terms of sums of squared payoffs, replacing larger first-order quantities appearing in previous bounds. In addition, our most refined bounds have the natural and desirable property of being stable under rescalings and general translations of the payoff sequence.
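
To make the notion of external regret concrete, the following is a minimal sketch of an exponentially weighted average forecaster with a fixed learning rate. The function name `exp_weights_forecaster` and the parameter `eta` are illustrative assumptions; in particular, the fixed `eta` is a simplification and is not the tuning analysed in the paper, whose bounds require no prior knowledge of the payoff range.

```python
import math


def exp_weights_forecaster(payoffs, eta):
    """Run a fixed-rate exponentially weighted average forecaster.

    payoffs: list of rounds, each a list of per-action payoffs (any sign).
    eta: learning rate, fixed here for simplicity (an assumption of this
         sketch, not the adaptive tuning described in the abstract).
    Returns (forecaster_payoff, best_action_payoff, regret).
    """
    n_actions = len(payoffs[0])
    cumulative = [0.0] * n_actions  # cumulative payoff of each action
    forecaster_payoff = 0.0
    for round_payoffs in payoffs:
        # Weights are proportional to exp(eta * cumulative payoff); the
        # maximum is subtracted before exponentiating for numerical stability.
        top = max(cumulative)
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected payoff of the forecaster's mixed action in this round.
        forecaster_payoff += sum(p * x for p, x in zip(probs, round_payoffs))
        for i, x in enumerate(round_payoffs):
            cumulative[i] += x
    best = max(cumulative)
    # External regret: best single action's payoff minus the forecaster's.
    return forecaster_payoff, best, best - forecaster_payoff
```

With `eta = 0` the forecaster plays uniformly at random in every round; a positive `eta` shifts probability toward actions that have paid off so far, which is what reduces the regret against the best fixed action.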

    Cascading Randomized Weighted Majority: A New Online Ensemble Learning Algorithm

    With the increasing volume of data in the world, the best approach for learning from this data is to exploit an online learning algorithm. Online ensemble methods are online algorithms which take advantage of an ensemble of classifiers to predict labels of data. Prediction with expert advice is a well-studied problem in the online ensemble learning literature. The Weighted Majority algorithm and the randomized weighted majority (RWM) algorithm are the most well-known solutions to this problem, aiming to converge to the best expert. Since the best expert overall does not necessarily have the minimum error in every region of the data space, defining specific regions and converging to the best expert in each of these regions will lead to a better result. In this paper, we aim to resolve this defect of RWM algorithms by proposing a novel online ensemble algorithm for the problem of prediction with expert advice. We propose a cascading version of RWM to achieve not only better experimental results but also a better error bound for sufficiently large datasets. Comment: 15 pages, 3 figures
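
For orientation, here is a minimal sketch of the baseline randomized weighted majority (RWM) algorithm that the abstract builds on. It is not the cascading variant proposed in the paper. The function name, the penalty factor `beta`, and the `seed` argument are illustrative assumptions.

```python
import random


def randomized_weighted_majority(expert_preds, labels, beta=0.5, seed=0):
    """Baseline RWM: follow one expert sampled in proportion to its weight.

    expert_preds: list of rounds, each a list with one prediction per expert.
    labels: the true label for each round.
    beta: multiplicative penalty in (0, 1) applied to experts that err.
    Returns (mistakes_made_by_the_learner, final_weights).
    """
    rng = random.Random(seed)
    n_experts = len(expert_preds[0])
    weights = [1.0] * n_experts
    mistakes = 0
    for preds, label in zip(expert_preds, labels):
        # Sample an expert with probability proportional to its weight.
        chosen = rng.choices(range(n_experts), weights=weights, k=1)[0]
        if preds[chosen] != label:
            mistakes += 1
        # Shrink the weight of every expert that predicted incorrectly.
        weights = [w * beta if p != label else w
                   for w, p in zip(weights, preds)]
    return mistakes, weights
```

Because the weights are global, this baseline converges toward a single best expert over the whole data space, which is the limitation the abstract describes.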

    First-order regret bounds for combinatorial semi-bandits

    We consider the problem of online combinatorial optimization under semi-bandit feedback, where a learner has to repeatedly pick actions from a combinatorial decision set in order to minimize the total losses associated with its decisions. After making each decision, the learner observes the losses associated with its action, but not other losses. For this problem, there are several learning algorithms that guarantee that the learner's expected regret grows as $\widetilde{O}(\sqrt{T})$ with the number of rounds $T$. In this paper, we propose an algorithm that improves this scaling to $\widetilde{O}(\sqrt{L_T^*})$, where $L_T^*$ is the total loss of the best action. Our algorithm is among the first to achieve such guarantees in a partial-feedback scheme, and the first one to do so in a combinatorial setting. Comment: To appear at COLT 201

    A parameter-free hedging algorithm

    We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous algorithms for learning in this framework have a tunable learning rate parameter, and a barrier to using online learning in practical applications is that it is not understood how to set this parameter optimally, particularly when the number of actions is large. In this paper, we offer a clean solution by proposing a novel and completely parameter-free algorithm for DTOL. We introduce a new notion of regret, which is more natural for applications with a large number of actions. We show that our algorithm achieves good performance with respect to this new notion of regret; in addition, it also achieves performance close to that of the best bounds achieved by previous algorithms with optimally-tuned parameters, according to previous notions of regret. Comment: Updated Version

    Online Learning with Low Rank Experts

    We consider the problem of prediction with expert advice when the losses of the experts have low-dimensional structure: they are restricted to an unknown $d$-dimensional subspace. We devise algorithms with regret bounds that are independent of the number of experts and depend only on the rank $d$. For the stochastic model we show a tight bound of $\Theta(\sqrt{dT})$, and extend it to a setting of an approximate $d$ subspace. For the adversarial model we show an upper bound of $O(d\sqrt{T})$ and a lower bound of $\Omega(\sqrt{dT})$.

    Valuation Compressions in VCG-Based Combinatorial Auctions

    The focus of classic mechanism design has been on truthful direct-revelation mechanisms. In the context of combinatorial auctions the truthful direct-revelation mechanism that maximizes social welfare is the VCG mechanism. For many valuation spaces, however, computing the allocation and payments of the VCG mechanism is a computationally hard problem. We thus study the performance of the VCG mechanism when bidders are forced to choose bids from a subspace of the valuation space for which the VCG outcome can be computed efficiently. We prove improved upper bounds on the welfare loss for restrictions to additive bids and upper and lower bounds for restrictions to non-additive bids. These bounds show that the welfare loss increases in expressiveness. All our bounds apply to equilibrium concepts that can be computed in polynomial time as well as to learning outcomes.

    Online Learning in Case of Unbounded Losses Using the Follow Perturbed Leader Algorithm

    In this paper the sequential prediction problem with expert advice is considered for the case where the losses suffered by the experts at each step cannot be bounded in advance. We present a modification of the Kalai and Vempala algorithm of following the perturbed leader, in which the weights depend on the past losses of the experts. New notions of a volume and a scaled fluctuation of a game are introduced. We present a probabilistic algorithm protected from unrestrictedly large one-step losses. This algorithm has optimal performance in the case when the scaled fluctuations of the one-step losses of the experts in the pool tend to zero. Comment: 31 pages, 3 figures
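
As background for the abstract above, this is a minimal sketch of the basic follow-the-perturbed-leader idea with a fixed perturbation scale. It does not implement the paper's modification, in which the weights depend on past losses. The function name, the rate parameter `epsilon`, and the `seed` argument are illustrative assumptions.

```python
import random


def follow_perturbed_leader(losses, epsilon, seed=0):
    """Basic follow-the-perturbed-leader with exponential perturbations.

    losses: list of rounds, each a list with one loss per expert.
    epsilon: rate of the exponential perturbation (mean 1 / epsilon);
             fixed here, which is an assumption of this sketch.
    Returns (learner_total_loss, best_expert_total_loss).
    """
    rng = random.Random(seed)
    n_experts = len(losses[0])
    cumulative = [0.0] * n_experts  # cumulative loss of each expert
    learner_loss = 0.0
    for round_losses in losses:
        # Perturb each expert's past loss with fresh random noise, then
        # follow the expert whose perturbed cumulative loss is smallest.
        perturbed = [c - rng.expovariate(epsilon) for c in cumulative]
        leader = min(range(n_experts), key=perturbed.__getitem__)
        learner_loss += round_losses[leader]
        for i, x in enumerate(round_losses):
            cumulative[i] += x
    return learner_loss, min(cumulative)
```

A fixed perturbation scale like this one implicitly assumes that one-step losses are of known magnitude, which is the assumption the abstract sets out to remove.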