8,066 research outputs found

    A Drifting-Games Analysis for Online Learning and Applications to Boosting

    Full text link
    We provide a general mechanism to design online learning algorithms based on a minimax analysis within a drifting-games framework. Different online learning settings (Hedge, multi-armed bandit problems and online convex optimization) are studied by converting into various kinds of drifting games. The original minimax analysis for drifting games is then used and generalized by applying a series of relaxations, starting from choosing a convex surrogate of the 0-1 loss function. With different choices of surrogates, we not only recover existing algorithms, but also propose new algorithms that are totally parameter-free and enjoy other useful properties. Moreover, our drifting-games framework naturally allows us to study high probability bounds without resorting to any concentration results, and also a generalized notion of regret that measures how good the algorithm is compared to all but the top small fraction of candidates. Finally, we translate our new Hedge algorithm into a new adaptive boosting algorithm that is computationally faster as shown in experiments, since it ignores a large number of examples on each round.Comment: In NIPS201

    Private Learning Implies Online Learning: An Efficient Reduction

    Full text link
    We study the relationship between the notions of differentially private learning and online learning in games. Several recent works have shown that differentially private learning implies online learning, but an open problem of Neel, Roth, and Wu \cite{NeelAaronRoth2018} asks whether this implication is {\it efficient}. Specifically, does an efficient differentially private learner imply an efficient online learner? In this paper we resolve this open question in the context of pure differential privacy. We derive an efficient black-box reduction from differentially private learning to online learning from expert advice

    UCBoost: A Boosting Approach to Tame Complexity and Optimality for Stochastic Bandits

    Full text link
    In this work, we address the open problem of finding low-complexity near-optimal multi-armed bandit algorithms for sequential decision making problems. Existing bandit algorithms are either sub-optimal and computationally simple (e.g., UCB1) or optimal and computationally complex (e.g., kl-UCB). We propose a boosting approach to Upper Confidence Bound based algorithms for stochastic bandits, that we call UCBoost. Specifically, we propose two types of UCBoost algorithms. We show that UCBoost(DD) enjoys O(1)O(1) complexity for each arm per round as well as regret guarantee that is 1/e1/e-close to that of the kl-UCB algorithm. We propose an approximation-based UCBoost algorithm, UCBoost(ϵ\epsilon), that enjoys a regret guarantee ϵ\epsilon-close to that of kl-UCB as well as O(log(1/ϵ))O(\log(1/\epsilon)) complexity for each arm per round. Hence, our algorithms provide practitioners a practical way to trade optimality with computational complexity. Finally, we present numerical results which show that UCBoost(ϵ\epsilon) can achieve the same regret performance as the standard kl-UCB while incurring only 1%1\% of the computational cost of kl-UCB.Comment: Accepted by IJCAI 201

    Optimal and Adaptive Algorithms for Online Boosting

    Full text link
    We study online boosting, the task of converting any weak online learner into a strong online learner. Based on a novel and natural definition of weak online learnability, we develop two online boosting algorithms. The first algorithm is an online version of boost-by-majority. By proving a matching lower bound, we show that this algorithm is essentially optimal in terms of the number of weak learners and the sample complexity needed to achieve a specified accuracy. This optimal algorithm is not adaptive however. Using tools from online loss minimization, we derive an adaptive online boosting algorithm that is also parameter-free, but not optimal. Both algorithms work with base learners that can handle example importance weights directly, as well as by rejection sampling examples with probability defined by the booster. Results are complemented with an extensive experimental study

    Fast rates with high probability in exp-concave statistical learning

    Full text link
    We present an algorithm for the statistical learning setting with a bounded exp-concave loss in dd dimensions that obtains excess risk O(dlog(1/δ)/n)O(d \log(1/\delta)/n) with probability at least 1δ1 - \delta. The core technique is to boost the confidence of recent in-expectation O(d/n)O(d/n) excess risk bounds for empirical risk minimization (ERM), without sacrificing the rate, by leveraging a Bernstein condition which holds due to exp-concavity. We also show that with probability 1δ1 - \delta the standard ERM method obtains excess risk O(d(log(n)+log(1/δ))/n)O(d (\log(n) + \log(1/\delta))/n). We further show that a regret bound for any online learner in this setting translates to a high probability excess risk bound for the corresponding online-to-batch conversion of the online learner. Lastly, we present two high probability bounds for the exp-concave model selection aggregation problem that are quantile-adaptive in a certain sense. The first bound is a purely exponential weights type algorithm, obtains a nearly optimal rate, and has no explicit dependence on the Lipschitz continuity of the loss. The second bound requires Lipschitz continuity but obtains the optimal rate.Comment: added results on model selection aggregation (Section 7

    Learning Reductions that Really Work

    Full text link
    We provide a summary of the mathematical and computational techniques that have enabled learning reductions to effectively address a wide class of problems, and show that this approach to solving machine learning problems can be broadly useful

    Thompson Sampling for the MNL-Bandit

    Full text link
    We consider a sequential subset selection problem under parameter uncertainty, where at each time step, the decision maker selects a subset of cardinality KK from NN possible items (arms), and observes a (bandit) feedback in the form of the index of one of the items in said subset, or none. Each item in the index set is ascribed a certain value (reward), and the feedback is governed by a Multinomial Logit (MNL) choice model whose parameters are a priori unknown. The objective of the decision maker is to maximize the expected cumulative rewards over a finite horizon TT, or alternatively, minimize the regret relative to an oracle that knows the MNL parameters. We refer to this as the MNL-Bandit problem. This problem is representative of a larger family of exploration-exploitation problems that involve a combinatorial objective, and arise in several important application domains. We present an approach to adapt Thompson Sampling to this problem and show that it achieves near-optimal regret as well as attractive numerical performance.Comment: Accepted for presentation at Conference on Learning Theory (COLT) 201

    Not-So-Random Features

    Full text link
    We propose a principled method for kernel learning, which relies on a Fourier-analytic characterization of translation-invariant or rotation-invariant kernels. Our method produces a sequence of feature maps, iteratively refining the SVM margin. We provide rigorous guarantees for optimality and generalization, interpreting our algorithm as online equilibrium-finding dynamics in a certain two-player min-max game. Evaluations on synthetic and real-world datasets demonstrate scalability and consistent improvements over related random features-based methods.Comment: Published as a conference paper at ICLR 201

    Composite Binary Losses

    Full text link
    We study losses for binary classification and class probability estimation and extend the understanding of them from margin losses to general composite losses which are the composition of a proper loss with a link function. We characterise when margin losses can be proper composite losses, explicitly show how to determine a symmetric loss in full from half of one of its partial losses, introduce an intrinsic parametrisation of composite binary losses and give a complete characterisation of the relationship between proper losses and ``classification calibrated'' losses. We also consider the question of the ``best'' surrogate binary loss. We introduce a precise notion of ``best'' and show there exist situations where two convex surrogate losses are incommensurable. We provide a complete explicit characterisation of the convexity of composite binary losses in terms of the link function and the weight function associated with the proper loss which make up the composite loss. This characterisation suggests new ways of ``surrogate tuning''. Finally, in an appendix we present some new algorithm-independent results on the relationship between properness, convexity and robustness to misclassification noise for binary losses and show that all convex proper losses are non-robust to misclassification noise.Comment: 38 pages, 4 figures. Submitted to JML

    Exp-Concavity of Proper Composite Losses

    Full text link
    The goal of online prediction with expert advice is to find a decision strategy which will perform almost as well as the best expert in a given pool of experts, on any sequence of outcomes. This problem has been widely studied and O(T)O(\sqrt{T}) and O(logT)O(\log{T}) regret bounds can be achieved for convex losses (\cite{zinkevich2003online}) and strictly convex losses with bounded first and second derivatives (\cite{hazan2007logarithmic}) respectively. In special cases like the Aggregating Algorithm (\cite{vovk1995game}) with mixable losses and the Weighted Average Algorithm (\cite{kivinen1999averaging}) with exp-concave losses, it is possible to achieve O(1)O(1) regret bounds. \cite{van2012exp} has argued that mixability and exp-concavity are roughly equivalent under certain conditions. Thus by understanding the underlying relationship between these two notions we can gain the best of both algorithms (strong theoretical performance guarantees of the Aggregating Algorithm and the computational efficiency of the Weighted Average Algorithm). In this paper we provide a complete characterization of the exp-concavity of any proper composite loss. Using this characterization and the mixability condition of proper losses (\cite{van2012mixability}), we show that it is possible to transform (re-parameterize) any β\beta-mixable binary proper loss into a β\beta-exp-concave composite loss with the same β\beta. In the multi-class case, we propose an approximation approach for this transformation
    corecore