
    Fast Rates in Online Convex Optimization by Exploiting the Curvature of Feasible Sets

    In this paper, we explore online convex optimization (OCO) and introduce a new analysis that provides fast rates by exploiting the curvature of feasible sets. In online linear optimization, it is known that if the average gradient of the loss functions is larger than a certain value, the curvature of the feasible set can be exploited by the follow-the-leader (FTL) algorithm to achieve logarithmic regret. This paper reveals that algorithms adaptive to the curvature of loss functions can also leverage the curvature of feasible sets. We first prove that if an optimal decision is on the boundary of the feasible set and the gradient of the underlying loss function is non-zero, then the algorithm achieves a regret upper bound of $O(\rho \log T)$ in stochastic environments. Here, $\rho > 0$ is the radius of the smallest sphere that includes the optimal decision and encloses the feasible set. Unlike existing approaches, ours works directly with convex loss functions, simultaneously exploits the curvature of the loss functions, and achieves logarithmic regret using only a local property of the feasible set. Additionally, it achieves $O(\sqrt{T})$ regret even in adversarial environments, where FTL suffers $\Omega(T)$ regret, and attains an $O(\rho \log T + \sqrt{C \rho \log T})$ regret bound in corrupted stochastic environments with corruption level $C$. Furthermore, by extending our analysis, we establish a regret upper bound of $O\big(T^{\frac{q-2}{2(q-1)}} (\log T)^{\frac{q}{2(q-1)}}\big)$ for $q$-uniformly convex feasible sets, where uniformly convex sets include strongly convex sets and $\ell_p$-balls for $p \in (1, \infty)$. This bound bridges the gap between the $O(\log T)$ bound for strongly convex sets ($q = 2$) and the $O(\sqrt{T})$ bound for non-curved sets ($q \to \infty$).
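    The role of set curvature is easiest to see in the linear special case the abstract mentions: when the cumulative gradient has a non-zero mean, FTL on a Euclidean ball snaps to a boundary point that barely moves between rounds. A minimal sketch of that special case (not the paper's curvature-adaptive algorithm; the dimension, horizon, and noise scale are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, radius = 5, 1000, 1.0
mean_grad = rng.normal(size=d)        # stochastic environment: E[g_t] != 0

cum_grad, alg_loss = np.zeros(d), 0.0
for t in range(T):
    # FTL decision: minimize <cum_grad, x> over the ball of radius `radius`;
    # the minimizer sits on the boundary, at -radius * cum_grad / ||cum_grad||.
    norm = np.linalg.norm(cum_grad)
    x = -radius * cum_grad / norm if norm > 0 else np.zeros(d)
    g = mean_grad + 0.1 * rng.normal(size=d)   # noisy linear loss gradient
    alg_loss += g @ x
    cum_grad += g

# Best fixed decision in hindsight: min over the ball of <cum_grad, x> = -radius * ||cum_grad||.
best_loss = -radius * np.linalg.norm(cum_grad)
print("regret:", alg_loss - best_loss)         # grows roughly logarithmically in T
```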

    Perspectives on Incorporating Expert Feedback into Model Updates

    Machine learning (ML) practitioners are increasingly tasked with developing models that are aligned with non-technical experts' values and goals. However, there has been insufficient consideration of how practitioners should translate domain expertise into ML updates. In this paper, we consider how to capture interactions between practitioners and experts systematically. We devise a taxonomy to match expert feedback types with practitioner updates. A practitioner may receive feedback from an expert at the observation or domain level, and convert this feedback into updates to the dataset, loss function, or parameter space. We review existing work from ML and human-computer interaction to describe this feedback-update taxonomy, and highlight the insufficient consideration given to incorporating feedback from non-technical experts. We end with a set of open questions that naturally arise from our proposed taxonomy and the subsequent survey.
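    As a reading aid, the taxonomy can be pictured as pairs of a feedback level and an update target. The toy encoding below is ours, not the paper's; the names simply mirror the categories stated in the abstract:

```python
from enum import Enum

class FeedbackLevel(Enum):
    OBSERVATION = "observation-level"   # feedback about individual predictions
    DOMAIN = "domain-level"             # feedback about global rules or constraints

class UpdateTarget(Enum):
    DATASET = "dataset"                 # e.g., relabel or augment examples
    LOSS = "loss function"              # e.g., add a penalty term
    PARAMETERS = "parameter space"      # e.g., constrain or edit model weights

# An interaction in the taxonomy is a (feedback level, update target) pair.
example_interaction = (FeedbackLevel.OBSERVATION, UpdateTarget.DATASET)
```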

    Online Caching with no Regret: Optimistic Learning via Recommendations

    The design of effective online caching policies is an increasingly important problem for content distribution networks, online social networks, and edge computing services, among other areas. This paper proposes a new algorithmic toolbox for tackling this problem through the lens of optimistic online learning. We build upon the Follow-the-Regularized-Leader (FTRL) framework, which is developed further here to include predictions for the file requests, and we design online caching algorithms for bipartite networks with fixed-size caches or elastic leased caches subject to time-average budget constraints. The predictions are provided by a content recommendation system that influences the users' viewing activity and hence can naturally reduce the caching network's uncertainty about future requests. We also extend the framework to learn and utilize the best request predictor in cases where many are available. We prove that the proposed optimistic learning caching policies can achieve sub-zero performance loss (regret) for perfect predictions, and maintain the sub-linear regret bound $O(\sqrt{T})$, which is the best achievable bound for policies that do not use predictions, even for arbitrarily bad predictions. The performance of the proposed algorithms is evaluated with detailed trace-driven numerical tests.
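    A minimal single-cache sketch of the optimistic FTRL idea, assuming a Euclidean regularizer and fractional caching (the paper's bipartite and elastic-cache settings are richer, and the prediction below is a stand-in for the recommender's output): with this regularizer, the FTRL step reduces to projecting the scaled cumulative-plus-predicted utility gradient onto the cache polytope.

```python
import numpy as np

def project_capped_simplex(z, C):
    # Euclidean projection onto {y in [0,1]^N : sum(y) = C}, via bisection
    # on the dual threshold (a standard routine, written out for clarity).
    lo, hi = z.min() - 1.0, z.max()
    for _ in range(60):
        tau = 0.5 * (lo + hi)
        if np.clip(z - tau, 0.0, 1.0).sum() > C:
            lo = tau          # cache over capacity: raise the threshold
        else:
            hi = tau
    return np.clip(z - 0.5 * (lo + hi), 0.0, 1.0)

rng = np.random.default_rng(1)
N, C, T, eta = 100, 10, 2000, 0.05
popularity = rng.dirichlet(np.ones(N))      # hidden request distribution
cum, util = np.zeros(N), 0.0
for t in range(T):
    pred = popularity                        # stand-in for the recommender's prediction
    # Optimistic FTRL step: include the prediction of the *next* utility gradient.
    y = project_capped_simplex(eta * (cum + pred), C)
    req = rng.choice(N, p=popularity)
    util += y[req]                           # fractional hit on the requested file
    cum[req] += 1.0
print("average hit rate:", util / T)
```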

    Strong Convexity of Sets in Riemannian Manifolds

    Convex curvature properties are important in designing and analyzing convex optimization algorithms in Hilbertian or Riemannian settings. In the Hilbertian setting, strongly convex sets are well studied. Herein, we propose various definitions of strong convexity for uniquely geodesic sets in a Riemannian manifold. We study their relationships, propose tools to determine the geodesically strongly convex nature of sets, and analyze the convergence of optimization algorithms over those sets. In particular, we demonstrate that the Riemannian Frank-Wolfe algorithm enjoys a global linear convergence rate when the Riemannian scaling inequalities hold.
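    For intuition, here is the Hilbertian special case the abstract builds on: Frank-Wolfe over a strongly convex set (a Euclidean ball), where the linear minimization oracle has a closed form. This is only an analogue sketch, not the paper's Riemannian algorithm, which replaces line segments with geodesics; the standard $2/(t+2)$ step shown gives the generic $O(1/t)$ rate, and the curvature-driven linear rate requires line search and the scaling inequalities.

```python
import numpy as np

rng = np.random.default_rng(2)
d, radius = 10, 1.0
M = rng.normal(size=(d, d))
A = M @ M.T + np.eye(d)                  # strongly convex quadratic objective
b = 3.0 * rng.normal(size=d)             # large b pushes the optimum to the boundary
f = lambda x: 0.5 * x @ A @ x - b @ x

x = np.zeros(d)
for t in range(500):
    g = A @ x - b
    s = -radius * g / np.linalg.norm(g)  # linear minimization oracle over the ball
    gamma = 2.0 / (t + 2.0)              # standard open-loop step size
    x += gamma * (s - x)
print("f(x) =", f(x))
```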

    Online Learning and Bandits with Queried Hints

    We consider the classic online learning and stochastic multi-armed bandit (MAB) problems, when at each step the online policy can probe and find out which of a small number ($k$) of choices has a better reward (or loss) before making its choice. In this model, we derive algorithms whose regret bounds have exponentially better dependence on the time horizon compared to the classic regret bounds. In particular, we show that probing with $k = 2$ suffices to achieve time-independent regret bounds for online linear and convex optimization. The same number of probes improves the regret bound of stochastic MAB with independent arms from $O(\sqrt{nT})$ to $O(n^2 \log T)$, where $n$ is the number of arms and $T$ is the horizon length. For stochastic MAB, we also consider a stronger model where a probe reveals the reward values of the probed arms, and show that in this case $k = 3$ probes suffice to achieve parameter-independent constant regret, $O(n^2)$. Such regret bounds cannot be achieved even with full feedback after the play, showcasing the power of limited "advice" via probing before making the play. We also present extensions to the setting where the hints can be imperfect, and to the case of stochastic MAB where the rewards of the arms can be correlated.
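    The $k = 2$ interaction model is simple to simulate: before committing, the policy compares the realized rewards of two candidate arms and plays the better one. The candidate rule below (the two highest empirical means) is an illustrative stand-in, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)
n, T = 10, 5000
means = rng.uniform(0.2, 0.8, size=n)        # hidden Bernoulli arm means
counts = np.ones(n)
sums = rng.binomial(1, means).astype(float)  # one initial pull per arm

total = 0.0
for t in range(T):
    emp = sums / counts
    i, j = np.argsort(emp)[-2:]              # probe the two most promising arms
    ri = rng.binomial(1, means[i])
    rj = rng.binomial(1, means[j])
    # The probe reveals which arm's realized reward is better; play that arm.
    a, r = (i, ri) if ri >= rj else (j, rj)
    counts[a] += 1.0
    sums[a] += r
    total += r
print("average reward:", total / T, "vs best mean:", means.max())
```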

    Optimistic No-regret Algorithms for Discrete Caching

    We take a systematic look at the problem of storing whole files in a cache with limited capacity in the context of optimistic learning, where the caching policy has access to a prediction oracle (provided by, e.g., a neural network). The successive file requests are assumed to be generated by an adversary, and no assumption is made on the accuracy of the oracle. In this setting, we provide a universal lower bound for prediction-assisted online caching and proceed to design a suite of policies with a range of performance-complexity trade-offs. All proposed policies offer sublinear regret bounds commensurate with the accuracy of the oracle. Our results substantially improve upon all recently proposed online caching policies, which, being unable to exploit the oracle predictions, offer only $O(\sqrt{T})$ regret. In this pursuit, we design, to the best of our knowledge, the first comprehensive optimistic Follow-the-Perturbed-Leader policy, which generalizes beyond the caching problem. We also study the problem of caching files with different sizes and the bipartite network caching problem. Finally, we evaluate the efficacy of the proposed policies through extensive numerical experiments using real-world traces.
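    A minimal sketch of the optimistic FTPL idea for a single cache of unit-size files: rank files by perturbed cumulative request counts plus the oracle's prediction of the next request, and cache the top $C$. The perturbation scale and the prediction source are illustrative assumptions, not the paper's tuned choices.

```python
import numpy as np

rng = np.random.default_rng(4)
N, C, T = 200, 20, 5000
popularity = rng.dirichlet(np.ones(N) * 0.3)    # skewed hidden request process
cum, hits = np.zeros(N), 0
for t in range(1, T + 1):
    noise = rng.normal(size=N) * np.sqrt(t)     # FTPL perturbation, O(sqrt(t)) scale
    pred = popularity                            # stand-in for the prediction oracle
    cache = np.argsort(cum + pred + noise)[-C:]  # discrete decision: cache top-C files
    req = rng.choice(N, p=popularity)
    hits += int(req in cache)
    cum[req] += 1.0
print("hit rate:", hits / T)
```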