Online and Stochastic Gradient Methods for Non-decomposable Loss Functions
Modern applications in sensitive domains such as biometrics and medicine
frequently require the use of non-decomposable loss functions such as
precision@k and the F-measure. Compared to point loss functions such as the
hinge loss, these offer much finer-grained control over predictions, but at
the same time present novel challenges in terms of algorithm design and
analysis. In this work we initiate a study of online learning techniques for
such non-decomposable loss functions with an aim to enable incremental learning
as well as design scalable solvers for batch problems. To this end, we propose
an online learning framework for such loss functions. Our model enjoys several
nice properties, chief amongst them being the existence of efficient online
learning algorithms with sublinear regret and online to batch conversion
bounds. Our model is a provable extension of existing online learning models
for point loss functions. We instantiate two popular losses, prec@k and pAUC,
in our model and prove sublinear regret bounds for both of them. Our proofs
require a novel structural lemma over ranked lists which may be of independent
interest. We then develop scalable stochastic gradient descent solvers for
non-decomposable loss functions. We show that for a large family of loss
functions satisfying a certain uniform convergence property (that includes
prec@k, pAUC, and F-measure), our methods provably converge to the empirical
risk minimizer. Such uniform convergence results were not known for these
losses and we establish these using novel proof techniques. We then use
extensive experiments on real-life and benchmark datasets to establish that
our method can be orders of magnitude faster than a recently proposed cutting
plane method.

Comment: 25 pages, 3 figures. To appear in the proceedings of the 28th Annual
Conference on Neural Information Processing Systems (NIPS 2014).
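As a minimal illustration of why a loss such as prec@k is non-decomposable (this is a toy sketch, not the paper's algorithm): its value depends on the joint ranking of an entire batch of examples, so it cannot be written as an average of independent per-example losses the way the hinge loss can.

```python
# Toy illustration: precision@k is a function of the whole batch's
# ranking, not a sum of per-example terms.

def prec_at_k(scores, labels, k):
    """Fraction of positives among the k highest-scoring examples."""
    topk = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    return sum(labels[i] for i in topk) / k

scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
print(prec_at_k(scores, labels, 2))  # 0.5: one of the top-2 is positive
```

Changing the score of a single example can reshuffle the top-k set and change the loss contribution of every other example, which is exactly what makes incremental (online) updates for such losses non-trivial.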
Achieving Better Regret against Strategic Adversaries
We study online learning problems in which the learner has extra knowledge
about the adversary's behaviour, i.e., in game-theoretic settings where
opponents typically follow some no-external regret learning algorithms. Under
this assumption, we propose two new online learning algorithms, Accurate Follow
the Regularized Leader (AFTRL) and Prod-Best Response (Prod-BR), that
intensively exploit this extra knowledge while maintaining the no-regret
property in the worst-case scenario of having inaccurate extra information.
Specifically, AFTRL achieves $O(1)$ external regret or $O(1)$ \emph{forward
regret} against a no-external regret adversary, in comparison with the
$O(\sqrt{T})$ \emph{dynamic regret} of Prod-BR. To the best of our knowledge,
our algorithm is the first to consider forward regret and to achieve $O(1)$
regret against strategic adversaries. When playing zero-sum games with Accurate Multiplicative
Weights Update (AMWU), a special case of AFTRL, we achieve \emph{last round
convergence} to the Nash Equilibrium. We also provide numerical experiments to
further support our theoretical results. In particular, we demonstrate that our
methods achieve significantly better regret bounds and a faster rate of last
round convergence compared to the state of the art (e.g., Multiplicative
Weights Update (MWU) and its optimistic counterpart, OMWU).
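For reference, the MWU baseline the experiments compare against admits a very short sketch (a standard textbook form of the update, not code from the paper): each action's weight is scaled down exponentially in its observed loss, then the weights are renormalized to a probability distribution.

```python
import math

def mwu_step(weights, losses, eta=0.1):
    """One Multiplicative Weights Update step: down-weight each action
    exponentially in its observed loss, then renormalize."""
    new = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    z = sum(new)
    return [w / z for w in new]

# Two-action example: action 0 incurs higher loss, so probability mass
# shifts toward action 1.
w = mwu_step([0.5, 0.5], [1.0, 0.0])
print(w)  # weight on action 1 now exceeds weight on action 0
```

In self-play on zero-sum games, the time-averaged MWU iterates approach equilibrium while the last iterates can cycle; results such as last round convergence concern the behaviour of these final iterates rather than the average.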