
    Learning with SGD and Random Features

    Sketching and stochastic gradient methods are arguably the most common techniques used to derive efficient large-scale learning algorithms. In this paper, we investigate their application in the context of nonparametric statistical learning. More precisely, we study the estimator defined by stochastic gradient descent with mini-batches and random features. The latter can be seen as a form of nonlinear sketching and can be used to define approximate kernel methods. The considered estimator is not explicitly penalized/constrained and regularization is implicit. Indeed, our study highlights how different parameters, such as the number of features, the number of iterations, the step size, and the mini-batch size, control the learning properties of the solutions. We do this by deriving optimal finite-sample bounds, under standard assumptions. The obtained results are corroborated and illustrated by numerical experiments.
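    As a rough illustration of the kind of estimator the abstract studies, the sketch below pairs random Fourier features (one common instantiation of random features, here for a Gaussian kernel) with mini-batch SGD on the squared loss and no explicit penalty; the kernel choice, step size, feature count, and number of passes are illustrative assumptions rather than the paper's exact setup, and these are the knobs that act as implicit regularizers.

```python
# Minimal sketch, not the paper's exact estimator: mini-batch SGD on random
# Fourier features approximating a Gaussian kernel. No explicit penalty is
# used; the number of features, step size, mini-batch size, and number of
# passes act as implicit regularizers.
import numpy as np

def random_fourier_features(X, W, b):
    """Map X (n, d) to the random-feature space defined by W (d, m) and b (m,)."""
    m = W.shape[1]
    return np.sqrt(2.0 / m) * np.cos(X @ W + b)

def fit_sgd_random_features(X, y, n_features=200, gamma=1.0, step_size=0.5,
                            batch_size=16, n_passes=5, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies for the kernel exp(-gamma * ||x - x'||^2): W ~ N(0, 2*gamma).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    Phi = random_fourier_features(X, W, b)
    w = np.zeros(n_features)
    n = X.shape[0]
    for _ in range(n_passes):
        for idx in np.array_split(rng.permutation(n), max(1, n // batch_size)):
            residual = Phi[idx] @ w - y[idx]          # squared-loss residuals
            w -= step_size * Phi[idx].T @ residual / len(idx)
    return lambda X_new: random_fourier_features(X_new, W, b) @ w
```

    For example, `predict = fit_sgd_random_features(X_train, y_train)` returns a prediction function whose behavior depends on the four hyperparameters above, mirroring the implicit regularization discussed in the abstract.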

    Second-Order Kernel Online Convex Optimization with Adaptive Sketching

    Kernel online convex optimization (KOCO) is a framework combining the expressiveness of nonparametric kernel models with the regret guarantees of online learning. First-order KOCO methods such as functional gradient descent require only $\mathcal{O}(t)$ time and space per iteration and, when the only information on the losses is their convexity, achieve a minimax-optimal $\mathcal{O}(\sqrt{T})$ regret. Nonetheless, many common losses in kernel problems, such as the squared loss, logistic loss, and squared hinge loss, possess stronger curvature that can be exploited. In this case, second-order KOCO methods achieve $\mathcal{O}(\log(\det(\boldsymbol{K})))$ regret, which we show scales as $\mathcal{O}(d_{\text{eff}}\log T)$, where $d_{\text{eff}}$ is the effective dimension of the problem and is usually much smaller than $\mathcal{O}(\sqrt{T})$. The main drawback of second-order methods is their much higher $\mathcal{O}(t^2)$ space and time complexity. In this paper, we introduce kernel online Newton step (KONS), a new second-order KOCO method that also achieves $\mathcal{O}(d_{\text{eff}}\log T)$ regret. To address the computational complexity of second-order methods, we introduce a new matrix sketching algorithm for the kernel matrix $\boldsymbol{K}_t$, and show that for a chosen parameter $\gamma \leq 1$ our Sketched-KONS reduces the space and time complexity by a factor of $\gamma^2$ to $\mathcal{O}(t^2\gamma^2)$ space and time per iteration, while incurring only $1/\gamma$ times more regret.
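    To make the first-order versus second-order distinction concrete, the sketch below shows an online Newton step for the squared loss in a fixed finite-dimensional feature space, with the preconditioner maintained through rank-one (Sherman-Morrison) updates. This is only an illustration of second-order online updates under assumed parameter names; it is not the paper's kernel-space KONS, and it omits the kernel-matrix sketching that Sketched-KONS uses to cut the per-iteration cost.

```python
# Illustrative second-order online update (online Newton step style) for the
# squared loss in a finite feature space. Not the paper's KONS/Sketched-KONS;
# parameter names and constants are assumptions.
import numpy as np

class OnlineNewtonStep:
    def __init__(self, dim, step_size=1.0, eps=1.0):
        self.w = np.zeros(dim)
        # Inverse of the regularized curvature matrix A = eps * I (initially).
        self.A_inv = np.eye(dim) / eps
        self.step_size = step_size

    def update(self, x, y):
        # Gradient of the squared loss at the current prediction.
        g = (self.w @ x - y) * x
        # For the squared loss the per-round Hessian is x x^T; fold it into A
        # and refresh A^{-1} with the Sherman-Morrison formula.
        Ax = self.A_inv @ x
        self.A_inv -= np.outer(Ax, Ax) / (1.0 + x @ Ax)
        # Precondition the gradient by A^{-1} instead of taking a plain step.
        self.w -= self.step_size * (self.A_inv @ g)
        return self.w @ x  # prediction with the updated weights
```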

    Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret

    Gaussian processes (GP) are a well-studied Bayesian approach for the optimization of black-box functions. Despite their effectiveness in simple problems, GP-based algorithms hardly scale to high-dimensional functions, as their per-iteration time and space cost is at least quadratic in the number of dimensions $d$ and iterations $t$. Given a set of $A$ alternatives to choose from, the overall runtime $O(t^3 A)$ is prohibitive. In this paper we introduce BKB (budgeted kernelized bandit), a new approximate GP algorithm for optimization under bandit feedback that achieves near-optimal regret (and hence near-optimal convergence rate) with near-constant per-iteration complexity and, remarkably, no assumption on the input space or covariance of the GP. We combine a kernelized linear bandit algorithm (GP-UCB) with randomized matrix sketching based on leverage score sampling, and we prove that randomly sampling inducing points based on their posterior variance gives an accurate low-rank approximation of the GP, preserving variance estimates and confidence intervals. As a consequence, BKB does not suffer from variance starvation, an important problem faced by many previous sparse GP approximations. Moreover, we show that our procedure selects at most $\tilde{O}(d_{\text{eff}})$ points, where $d_{\text{eff}}$ is the effective dimension of the explored space, which is typically much smaller than both $d$ and $t$. This greatly reduces the dimensionality of the problem, thus leading to $O(T A d_{\text{eff}}^2)$ runtime and $O(A d_{\text{eff}})$ space complexity. Comment: Accepted at COLT 2019; corrected typos and improved the comparison with existing methods.
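    The core mechanism described above, resampling inducing points with probability proportional to their approximate posterior variance and rebuilding a low-rank (Nystrom-style) approximation, can be sketched as follows. This is a simplified illustration under an assumed RBF kernel with illustrative constants, not the full BKB algorithm with its confidence-interval bookkeeping.

```python
# Simplified sketch of variance-proportional inducing-point resampling,
# not the full BKB algorithm. The kernel, lam, and q are illustrative assumptions.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def approx_posterior_variance(X, inducing, lam=1.0):
    """Nystrom-style approximate posterior variance at each point of X."""
    K_xi = rbf_kernel(X, inducing)               # (n, m)
    K_ii = rbf_kernel(inducing, inducing)        # (m, m)
    solve = np.linalg.solve(K_ii + lam * np.eye(len(inducing)), K_xi.T)
    explained = np.sum(K_xi * solve.T, axis=1)
    return np.clip(1.0 - explained, 0.0, None)   # k(x, x) = 1 for the RBF kernel

def resample_inducing_points(X, inducing, q=5.0, lam=1.0, seed=0):
    """Keep each candidate independently with probability ~ q * posterior variance."""
    rng = np.random.default_rng(seed)
    p = np.clip(q * approx_posterior_variance(X, inducing, lam), 0.0, 1.0)
    return X[rng.random(len(X)) < p]
```

    Because poorly covered (high-variance) regions are sampled more often, the retained set tracks the effective dimension of the explored space rather than the raw number of iterations, which is the source of the runtime and space savings stated in the abstract.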

    Reward Imputation with Sketching for Contextual Batched Bandits

    Contextual batched bandit (CBB) is a setting where a batch of rewards is observed from the environment at the end of each episode, but the rewards of the non-executed actions are unobserved, resulting in partial-information feedback. Existing approaches for CBB often ignore the rewards of the non-executed actions, leading to underutilization of feedback information. In this paper, we propose an efficient approach called Sketched Policy Updating with Imputed Rewards (SPUIR) that completes the unobserved rewards using sketching, which approximates the full-information feedback. We formulate reward imputation as an imputation-regularized ridge regression problem that captures the feedback mechanisms of both executed and non-executed actions. To reduce time complexity, we solve the regression problem using randomized sketching. We prove that our approach achieves instantaneous regret with controllable bias and smaller variance than approaches without reward imputation. Furthermore, our approach enjoys a sublinear regret bound against the optimal policy. We also present two extensions, a rate-scheduled version and a version for nonlinear rewards, making our approach more practical. Experimental results show that SPUIR outperforms state-of-the-art baselines on synthetic, public benchmark, and real-world datasets. Comment: Accepted by NeurIPS 202
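    As a rough illustration of the imputation step, the sketch below fits a ridge regression from contexts to observed rewards after applying a Gaussian sketch to the design matrix (sketch-and-solve), then uses the fitted model to impute rewards for non-executed actions. Function names, the sketch size, and the plain Gaussian sketch are assumptions for illustration; this is not the exact SPUIR update or its regret-controlled imputation scheme.

```python
# Minimal sketch of sketched ridge regression used for reward imputation.
# Not the exact SPUIR algorithm; names and the sketch size m are assumptions.
import numpy as np

def sketched_ridge(X, y, lam=1.0, m=64, seed=0):
    """Approximate ridge solution from a Gaussian sketch S (m x n) of the data."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    S = rng.normal(size=(m, n)) / np.sqrt(m)
    SX, Sy = S @ X, S @ y
    # Solve (X^T S^T S X + lam I) w = X^T S^T S y  (sketch-and-solve).
    return np.linalg.solve(SX.T @ SX + lam * np.eye(d), SX.T @ Sy)

def impute_rewards(ctx_executed, rewards_executed, ctx_unexecuted, lam=1.0):
    """Complete the reward vector for non-executed actions with the sketched model."""
    w = sketched_ridge(ctx_executed, rewards_executed, lam=lam)
    return ctx_unexecuted @ w
```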