422 research outputs found
Learning with SGD and Random Features
Sketching and stochastic gradient methods are arguably the most common
techniques to derive efficient large scale learning algorithms. In this paper,
we investigate their application in the context of nonparametric statistical
learning. More precisely, we study the estimator defined by stochastic gradient
with mini batches and random features. The latter can be seen as a form of
nonlinear sketching and used to define approximate kernel methods. The
considered estimator is not explicitly penalized/constrained and regularization
is implicit. Indeed, our study highlights how different parameters, such as
number of features, iterations, step-size and mini-batch size control the
learning properties of the solutions. We do this by deriving optimal finite
sample bounds, under standard assumptions. The obtained results are
corroborated and illustrated by numerical experiments.
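The estimator described in this abstract can be illustrated with a minimal sketch, assuming a squared loss and random Fourier features for a Gaussian kernel (neither choice is stated in the abstract). The function names `make_random_features`, `sgd_random_features`, and `predict` are hypothetical. There is no penalty term: the number of features, the number of iterations, the step size, and the mini-batch size are the only knobs.

```python
import math
import random


def make_random_features(dim, num_features, bandwidth, rng):
    """Return a feature map phi(x) of random Fourier features.

    Assumption: a Gaussian kernel with the given bandwidth is being approximated.
    """
    freqs = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def phi(x):
        return [scale * math.cos(sum(w * v for w, v in zip(freq, x)) + phase)
                for freq, phase in zip(freqs, phases)]

    return phi


def sgd_random_features(xs, ys, phi, num_features, step_size, batch_size,
                        num_iters, rng):
    """Mini-batch SGD on the squared loss, with no explicit regularization."""
    feats = [phi(x) for x in xs]          # precompute the sketched inputs
    theta = [0.0] * num_features
    n = len(xs)
    for _ in range(num_iters):
        batch = [rng.randrange(n) for _ in range(batch_size)]
        grad = [0.0] * num_features
        for i in batch:
            resid = sum(t * f for t, f in zip(theta, feats[i])) - ys[i]
            for j in range(num_features):
                grad[j] += resid * feats[i][j] / batch_size
        theta = [t - step_size * g for t, g in zip(theta, grad)]
    return theta


def predict(theta, phi, x):
    """Evaluate the learned function at a single input x."""
    return sum(t * f for t, f in zip(theta, phi(x)))
```

Stopping after fewer iterations or using fewer features yields a smoother fit, which is the sense in which regularization is implicit here.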
Second-Order Kernel Online Convex Optimization with Adaptive Sketching
Kernel online convex optimization (KOCO) is a framework combining the
expressiveness of non-parametric kernel models with the regret guarantees of
online learning. First-order KOCO methods such as functional gradient descent
require only $\mathcal{O}(t)$ time and space per iteration, and, when the only
information on the losses is their convexity, achieve a minimax optimal
$\mathcal{O}(\sqrt{T})$ regret. Nonetheless, many common losses in kernel
problems, such as squared loss, logistic loss, and squared hinge loss, possess
stronger curvature that can be exploited. In this case, second-order KOCO
methods achieve $\mathcal{O}(\log(\text{Det}(K)))$ regret, which
we show scales as $\mathcal{O}(d_{eff}\log T)$, where $d_{eff}$
is the effective dimension of the problem and is usually much smaller than
$\mathcal{O}(\sqrt{T})$. The main drawback of second-order methods is their
much higher $\mathcal{O}(t^2)$ space and time complexity. In this paper, we
introduce kernel online Newton step (KONS), a new second-order KOCO method that
also achieves $\mathcal{O}(d_{eff}\log T)$ regret. To address the
computational complexity of second-order methods, we introduce a new matrix
sketching algorithm for the kernel matrix $K_t$, and show that for
a chosen parameter $\gamma \leq 1$ our Sketched-KONS reduces the space and time
complexity by a factor of $\gamma^2$ to $\mathcal{O}(t^2\gamma^2)$ space and
time per iteration, while incurring only $1/\gamma$ times more regret.
Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret
Gaussian processes (GP) are a well-studied Bayesian approach for the
optimization of black-box functions. Despite their effectiveness in simple
problems, GP-based algorithms hardly scale to high-dimensional functions, as
their per-iteration time and space cost is at least quadratic in the number of
dimensions $d$ and iterations $t$. Given a set of $A$ alternatives to choose
from, the overall runtime $O(t^3A)$ is prohibitive. In this paper we introduce
BKB (budgeted kernelized bandit), a new approximate GP algorithm for
optimization under bandit feedback that achieves near-optimal regret (and hence
near-optimal convergence rate) with near-constant per-iteration complexity and
remarkably no assumption on the input space or covariance of the GP.
We combine a kernelized linear bandit algorithm (GP-UCB) with randomized
matrix sketching based on leverage score sampling, and we prove that randomly
sampling inducing points based on their posterior variance gives an accurate
low-rank approximation of the GP, preserving variance estimates and confidence
intervals. As a consequence, BKB does not suffer from variance starvation, an
important problem faced by many previous sparse GP approximations. Moreover, we
show that our procedure selects at most $\tilde{O}(d_{eff})$ points, where
$d_{eff}$ is the effective dimension of the explored space, which is typically
much smaller than both $d$ and $t$. This greatly reduces the dimensionality of
the problem, thus leading to an $O(TAd_{eff}^2)$ runtime and $O(Ad_{eff})$
space complexity.
Comment: Accepted at COLT 2019. Corrected typos and improved comparison with
existing method
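The idea of sampling inducing points according to their posterior variance can be sketched as follows. This is a simplified, one-shot illustration and not the BKB algorithm itself: it computes exact posterior variances on a small set of one-dimensional points, assumes a Gaussian kernel, and includes each point independently with probability `min(1, oversampling * variance / noise)`. The names `posterior_variances`, `sample_inducing_points`, and the `oversampling` parameter are hypothetical. For observed points, the posterior variance divided by the noise level equals the ridge leverage score, and the sum of these scores is one common definition of the effective dimension.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel on scalars (an assumption; the abstract fixes no kernel)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def solve(a, b):
    """Solve a @ x = b (b is a matrix) by Gauss-Jordan with partial pivoting."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def posterior_variances(points, noise):
    """GP posterior variance at each observed point, given all the points."""
    n = len(points)
    k = [[gaussian_kernel(p, q) for q in points] for p in points]
    k_reg = [[k[i][j] + (noise if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    z = solve(k_reg, k)  # z = (K + noise * I)^{-1} K
    return [k[i][i] - sum(k[i][j] * z[j][i] for j in range(n))
            for i in range(n)]


def sample_inducing_points(points, noise, oversampling, rng):
    """Keep each point with probability proportional to its posterior variance."""
    variances = posterior_variances(points, noise)
    probs = [min(1.0, oversampling * v / noise) for v in variances]
    chosen = [p for p, pr in zip(points, probs) if rng.random() < pr]
    return chosen, variances
```

With many near-duplicate points, each one has a small posterior variance, so only a few of them are kept. This is why the number of selected points tracks the effective dimension and not the number of observations.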
Reward Imputation with Sketching for Contextual Batched Bandits
Contextual batched bandit (CBB) is a setting where a batch of rewards is
observed from the environment at the end of each episode, but the rewards of
the non-executed actions are unobserved, resulting in partial-information
feedback. Existing approaches for CBB often ignore the rewards of the
non-executed actions, leading to underutilization of feedback information. In
this paper, we propose an efficient approach called Sketched Policy Updating
with Imputed Rewards (SPUIR) that completes the unobserved rewards using
sketching, which approximates full-information feedback. We formulate
reward imputation as an imputation regularized ridge regression problem that
captures the feedback mechanisms of both executed and non-executed actions. To
reduce time complexity, we solve the regression problem using randomized
sketching. We prove that our approach achieves an instantaneous regret with
controllable bias and smaller variance than approaches without reward
imputation. Furthermore, our approach enjoys a sublinear regret bound against
the optimal policy. We also present two extensions, a rate-scheduled version
and a version for nonlinear rewards, making our approach more practical.
Experimental results show that SPUIR outperforms state-of-the-art baselines on
synthetic, public benchmark, and real-world datasets.
Comment: Accepted by NeurIPS 202
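Solving a ridge regression problem with randomized sketching can be illustrated with a generic sketch-and-solve example. It is not the SPUIR procedure: the imputation regularizer is omitted, and the choice of a count sketch (each row hashed to a random bucket with a random sign) is an assumption, since the abstract does not name a sketching matrix. The function names `count_sketch`, `ridge`, and `sketched_ridge` are hypothetical.

```python
import random


def solve(a, b):
    """Solve a @ x = b (b is a matrix) by Gauss-Jordan with partial pivoting."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def count_sketch(rows, targets, sketch_size, rng):
    """Compress n rows into sketch_size rows using random buckets and signs."""
    dim = len(rows[0])
    sx = [[0.0] * dim for _ in range(sketch_size)]
    sy = [0.0] * sketch_size
    for row, y in zip(rows, targets):
        bucket = rng.randrange(sketch_size)
        sign = rng.choice((-1.0, 1.0))
        for j in range(dim):
            sx[bucket][j] += sign * row[j]
        sy[bucket] += sign * y
    return sx, sy


def ridge(rows, targets, reg):
    """Closed-form ridge regression: (X^T X + reg * I) w = X^T y."""
    dim = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) + (reg if i == j else 0.0)
             for j in range(dim)] for i in range(dim)]
    rhs = [[sum(r[i] * y for r, y in zip(rows, targets))] for i in range(dim)]
    return [v[0] for v in solve(gram, rhs)]


def sketched_ridge(rows, targets, reg, sketch_size, rng):
    """Solve the ridge problem on the sketched data instead of the full data."""
    sx, sy = count_sketch(rows, targets, sketch_size, rng)
    return ridge(sx, sy, reg)
```

The Gram matrix is built from `sketch_size` rows in place of all `n` rows, which is where the time saving comes from when `sketch_size` is much smaller than `n`.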