Fast Rates in Online Convex Optimization by Exploiting the Curvature of Feasible Sets
In this paper, we explore online convex optimization (OCO) and introduce a
new analysis that provides fast rates by exploiting the curvature of feasible
sets. In online linear optimization, it is known that if the average gradient
of loss functions is larger than a certain value, the curvature of feasible
sets can be exploited by the follow-the-leader (FTL) algorithm to achieve a
logarithmic regret. This paper reveals that algorithms adaptive to the
curvature of loss functions can also leverage the curvature of feasible sets.
We first prove that if an optimal decision is on the boundary of a feasible set
and the gradient of an underlying loss function is non-zero, then the algorithm
achieves a regret upper bound of $O(\rho \log T)$ in stochastic environments.
Here, $\rho > 0$ is the radius of the smallest sphere that includes the optimal
decision and encloses the feasible set. Our approach, unlike existing ones, can
work directly with convex loss functions, exploiting the curvature of loss
functions simultaneously, and can achieve the logarithmic regret only with a
local property of feasible sets. Additionally, it achieves an $O(\sqrt{T})$
regret even in adversarial environments where FTL suffers an $\Omega(T)$
regret, and attains an $O(\rho \log T + \sqrt{C \rho \log T})$ regret bound in
corrupted stochastic environments with corruption level $C$. Furthermore, by
extending our analysis, we establish a regret upper bound of
$O\big(T^{\frac{q-2}{2(q-1)}} (\log T)^{\frac{q}{2(q-1)}}\big)$ for
$q$-uniformly convex feasible sets, where uniformly convex sets include
strongly convex sets and $\ell_p$-balls for $p \in (1, \infty)$. This bound
bridges the gap between the $O(\log T)$ regret bound for strongly convex sets
($q = 2$) and the $\tilde{O}(\sqrt{T})$ regret bound for non-curved sets ($q \to \infty$).
Comment: 17 pages
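The contrast the abstract draws for FTL can be sketched in a few lines. The following is only an illustration, not the paper's algorithm: it assumes linear losses $\langle g_t, x \rangle$ over the unit Euclidean ball (a curved feasible set), and the function name `ftl_unit_ball` is hypothetical. On the ball, the FTL decision has the closed form $x_t = -G_{t-1} / \|G_{t-1}\|$, where $G_{t-1}$ is the sum of past gradients.

```python
import numpy as np

def ftl_unit_ball(gradients):
    """Regret of follow-the-leader on linear losses <g_t, x> over the unit ball.

    Illustrative sketch only: the unit-ball feasible set and this helper name
    are assumptions, not taken from the paper.
    """
    gradients = np.asarray(gradients, dtype=float)
    cumulative = np.zeros(gradients.shape[1])  # sum of gradients seen so far
    total_loss = 0.0
    for g in gradients:
        norm = np.linalg.norm(cumulative)
        # The leader minimizes <cumulative, x> over the ball; play 0 if no data yet.
        x = -cumulative / norm if norm > 0 else np.zeros_like(cumulative)
        total_loss += float(g @ x)
        cumulative += g
    best_fixed = -np.linalg.norm(cumulative)  # loss of the best fixed decision
    return total_loss - best_fixed

# Gradients with a constant non-zero direction: regret stays bounded.
print(ftl_unit_ball([[1.0, 0.0]] * 10))  # 1.0

# Alternating (adversarial) gradients in one dimension: regret grows linearly.
adversarial = [[0.5]] + [[-1.0], [1.0]] * 5
print(ftl_unit_ball(adversarial))  # 10.5
```

In the first run the leader locks onto the boundary point opposite the gradient after one round. In the second, the leader flips sign every round and pays a loss of 1 each time, which is the failure mode in adversarial environments that the abstract mentions.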
Perspectives on Incorporating Expert Feedback into Model Updates
Machine learning (ML) practitioners are increasingly tasked with developing
models that are aligned with non-technical experts' values and goals. However,
there has been insufficient consideration of how practitioners should translate
domain expertise into ML updates. In this paper, we consider how to capture
interactions between practitioners and experts systematically. We devise a
taxonomy to match expert feedback types with practitioner updates. A
practitioner may receive feedback from an expert at the observation- or
domain-level, and convert this feedback into updates to the dataset, loss
function, or parameter space. We review existing work from ML and
human-computer interaction to describe this feedback-update taxonomy, and
highlight the insufficient consideration given to incorporating feedback from
non-technical experts. We end with a set of open questions that naturally arise
from our proposed taxonomy and subsequent survey.
Online Caching with no Regret: Optimistic Learning via Recommendations
The design of effective online caching policies is an increasingly important
problem for content distribution networks, online social networks and edge
computing services, among other areas. This paper proposes a new algorithmic
toolbox for tackling this problem through the lens of optimistic online
learning. We build upon the Follow-the-Regularized-Leader (FTRL) framework,
which is developed further here to include predictions for the file requests,
and we design online caching algorithms for bipartite networks with fixed-size
caches or elastic leased caches subject to time-average budget constraints. The
predictions are provided by a content recommendation system that influences the
users' viewing activity and hence can naturally reduce the caching network's
uncertainty about future requests. We also extend the framework to learn and
utilize the best request predictor in cases where many are available. We prove
that the proposed optimistic learning caching policies can achieve sub-zero
performance loss (regret) for perfect predictions, and maintain the sub-linear
regret bound $O(\sqrt{T})$, which is the best achievable bound for policies that
do not use predictions, even for arbitrarily bad predictions. The performance of
the proposed algorithms is evaluated with detailed trace-driven numerical
tests.
Comment: arXiv admin note: substantial text overlap with arXiv:2202.1059
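The idea of adding request predictions to FTRL can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's policy: it uses a single cache with fractional caching decisions, a fixed Euclidean regularizer with weight `eta` (the paper's regularization may differ), and one predicted file per slot. The names `project_capped_simplex` and `optimistic_ftrl_cache` are hypothetical. With a Euclidean regularizer, the FTRL step reduces to projecting the scaled sum of past requests plus the predicted request onto the set of feasible cache states.

```python
import numpy as np

def project_capped_simplex(v, capacity, iters=60):
    """Euclidean projection onto {y : 0 <= y <= 1, sum(y) = capacity}, by bisection."""
    lo, hi = v.min() - 1.0, v.max()  # shifts giving sum = len(v) and sum = 0
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        if np.clip(v - tau, 0.0, 1.0).sum() > capacity:
            lo = tau  # too much mass cached: shift further down
        else:
            hi = tau
    return np.clip(v - 0.5 * (lo + hi), 0.0, 1.0)

def optimistic_ftrl_cache(requests, predictions, n_files, capacity, eta=1.0):
    """Fractional caching via optimistic FTRL (sketch; fixed eta is an assumption).

    requests[t] is the file requested in slot t; predictions[t] is the file the
    recommender predicts for slot t, available before the cache is configured.
    Returns the accumulated (fractional) hits and the last cache state.
    """
    counts = np.zeros(n_files)  # cumulative request vector
    y = np.full(n_files, capacity / n_files)
    hits = 0.0
    for req, pred in zip(requests, predictions):
        hint = np.zeros(n_files)
        hint[pred] = 1.0  # predicted request for the coming slot
        y = project_capped_simplex(eta * (counts + hint), capacity)
        hits += y[req]    # fraction of the requested file served from cache
        counts[req] += 1.0
    return hits, y

hits, y = optimistic_ftrl_cache([0] * 20, [0] * 20, n_files=3, capacity=1)
```

With perfect predictions, as in the usage line, the predicted file is fully cached before every request. With wrong predictions the hint is only one term among the accumulated counts, which gives an intuition for why bad predictions cannot dominate the decision in the long run.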
Strong Convexity of Sets in Riemannian Manifolds
Convex curvature properties are important in designing and analyzing convex
optimization algorithms in the Hilbertian or Riemannian settings. In the case
of the Hilbertian setting, strongly convex sets are well studied. Herein, we
propose various definitions of strong convexity for uniquely geodesic sets in a
Riemannian manifold. We study their relationship, propose tools to determine
the geodesic strongly convex nature of sets, and analyze the convergence of
optimization algorithms over those sets. In particular, we demonstrate that the
Riemannian Frank-Wolfe algorithm enjoys a global linear convergence rate when
the Riemannian scaling inequalities hold.
Online Learning and Bandits with Queried Hints
We consider the classic online learning and stochastic multi-armed bandit
(MAB) problems, when at each step, the online policy can probe and find out
which of a small number ($k$) of choices has better reward (or loss) before
making its choice. In this model, we derive algorithms whose regret bounds have
exponentially better dependence on the time horizon compared to the classic
regret bounds. In particular, we show that probing with $k = 2$ suffices to
achieve time-independent regret bounds for online linear and convex
optimization. The same number of probes improves the regret bound of stochastic
MAB with independent arms from $O(\sqrt{nT})$ to $O(n^2 \log T)$, where $n$ is
the number of arms and $T$ is the horizon length. For stochastic MAB, we also
consider a stronger model where a probe reveals the reward values of the probed
arms, and show that in this case, $k = 3$ probes suffice to achieve
parameter-independent constant regret, $O(n^2)$. Such regret bounds cannot be
achieved even with full feedback after the play, showcasing the power of
limited "advice" via probing before making the play. We also present
extensions to the setting where the hints can be imperfect, and to the case of
stochastic MAB where the rewards of the arms can be correlated.
Comment: To appear in ITCS 202
Optimistic No-regret Algorithms for Discrete Caching
We take a systematic look at the problem of storing whole files in a cache
with limited capacity in the context of optimistic learning, where the caching
policy has access to a prediction oracle (provided by, e.g., a Neural Network).
The successive file requests are assumed to be generated by an adversary, and
no assumption is made on the accuracy of the oracle. In this setting, we
provide a universal lower bound for prediction-assisted online caching and
proceed to design a suite of policies with a range of performance-complexity
trade-offs. All proposed policies offer sublinear regret bounds commensurate
with the accuracy of the oracle. Our results substantially improve upon all
recently proposed online caching policies, which, being unable to exploit the
oracle predictions, offer only $O(\sqrt{T})$ regret. In this pursuit, we
design, to the best of our knowledge, the first comprehensive optimistic
Follow-the-Perturbed-Leader policy, which generalizes beyond the caching
problem. We also study the problem of caching files with different sizes and
the bipartite network caching problem. Finally, we evaluate the efficacy of the
proposed policies through extensive numerical experiments using real-world
traces.
Comment: Accepted to ACM SIGMETRICS 202