3 research outputs found

    Efficient Online Convex Optimization with Adaptively Minimax Optimal Dynamic Regret

    We introduce an online convex optimization algorithm using projected sub-gradient descent with ideal adaptive learning rates, where each computation is done efficiently and sequentially. For the first time in the literature, this algorithm provides an adaptively minimax optimal dynamic regret guarantee for a sequence of convex functions without any restrictions -- such as strong convexity, smoothness or even Lipschitz continuity -- against a comparator decision sequence with bounded total successive changes. We show optimality by constructing the worst-case adaptive lower bound on dynamic regret, which consists of the actual sub-gradient norms and matches our guarantees. We discuss the advantages of our algorithm over adaptive projection with sub-gradient self outer products and also derive the extension for independent learning in each decision coordinate separately. Additionally, we demonstrate how to best preserve our guarantees when the bound on total successive changes in the dynamic comparator sequence grows over time, in a truly online manner. Comment: 10 pages, 1 figure, preprint, [v0] 201
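
    For orientation, the general technique named in this abstract (projected sub-gradient descent with an adaptive learning rate) might be sketched as follows. This is a generic illustration, not the paper's algorithm: the Euclidean-ball decision set, the step size `radius / sqrt(sum of squared sub-gradient norms)`, and all function names are assumptions chosen for the example.

```python
import math

def project_l2_ball(x, radius):
    """Euclidean projection of x onto the origin-centered ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]

def adaptive_projected_subgradient(subgradient, dim, radius, steps):
    """Projected sub-gradient descent with a norm-adaptive step size.

    subgradient(t, x) is assumed to return a sub-gradient (a list of floats)
    of the round-t loss at the point x. Returns the list of decisions played.
    """
    x = [0.0] * dim
    sum_sq = 0.0  # running sum of squared sub-gradient norms
    iterates = []
    for t in range(steps):
        iterates.append(list(x))  # decision played in round t
        g = subgradient(t, x)
        sum_sq += sum(v * v for v in g)
        if sum_sq == 0.0:
            continue  # no sub-gradient information yet, so stay put
        # Assumed step size: shrinks as observed sub-gradient norms accumulate.
        eta = radius / math.sqrt(sum_sq)
        x = project_l2_ball([xi - eta * gi for xi, gi in zip(x, g)], radius)
    return iterates

# Example: every round uses the loss |x - 0.5| on the interval [-1, 1].
def abs_loss_subgradient(t, x):
    if x[0] > 0.5:
        return [1.0]
    if x[0] < 0.5:
        return [-1.0]
    return [0.0]

decisions = adaptive_projected_subgradient(abs_loss_subgradient, dim=1, radius=1.0, steps=2000)
```

    The step size here depends only on the sub-gradients actually observed, so no Lipschitz constant has to be supplied in advance.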

    Online Learning for Changing Environments using Coin Betting

    A key challenge in online learning is that classical algorithms can be slow to adapt to changing environments. Recent studies have proposed "meta" algorithms that convert any online learning algorithm to one that is adaptive to changing environments, where the adaptivity is analyzed in a quantity called the strongly-adaptive regret. This paper describes a new meta algorithm that has a strongly-adaptive regret bound that is a factor of $\sqrt{\log(T)}$ better than other algorithms with the same time complexity, where $T$ is the time horizon. We also extend our algorithm to achieve a first-order (i.e., dependent on the observed losses) strongly-adaptive regret bound for the first time, to our knowledge. At its heart is a new parameter-free algorithm for the learning with expert advice (LEA) problem in which experts sometimes do not output advice for consecutive time steps (i.e., \emph{sleeping} experts). This algorithm is derived by a reduction from optimal algorithms for the so-called coin betting problem. Empirical results show that our algorithm outperforms state-of-the-art methods in both learning with expert advice and metric learning scenarios. Comment: submitted to a journal. arXiv admin note: substantial text overlap with arXiv:1610.0457
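
    As background for the coin betting problem mentioned in this abstract, a minimal sketch of the Krichevsky-Trofimov (KT) betting strategy is given below. It shows only the basic betting rule, not the paper's sleeping-experts algorithm or its reduction; the function name and the initial wealth of 1 are assumptions made for the example.

```python
def kt_coin_betting(outcomes, initial_wealth=1.0):
    """Krichevsky-Trofimov betting on a sequence of outcomes in [-1, 1].

    In round t the bettor stakes a signed fraction of its current wealth
    equal to (sum of past outcomes) / t, then gains outcome * stake.
    Returns the list of signed bets and the final wealth.
    """
    wealth = initial_wealth
    outcome_sum = 0.0  # sum of the outcomes seen so far
    bets = []
    for t, c in enumerate(outcomes, start=1):
        fraction = outcome_sum / t   # magnitude is at most (t - 1) / t < 1
        bet = fraction * wealth      # signed amount staked this round
        bets.append(bet)
        wealth += c * bet            # win or lose in proportion to the outcome
        outcome_sum += c
    return bets, wealth

# Example: a coin that always lands on +1 lets the wealth grow quickly.
bets, final_wealth = kt_coin_betting([1.0, 1.0, 1.0, 1.0])
```

    Because the staked fraction always has magnitude below 1, the wealth stays positive, and the rule needs no learning rate, which is the sense in which such methods are called parameter-free.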