
Second-Order Kernel Online Convex Optimization with Adaptive Sketching

Authors: Calandriello, Daniele; Lazaric, Alessandro; Valko, Michal
Publication date: 01/01/2017

Kernel online convex optimization (KOCO) is a framework combining the expressiveness of non-parametric kernel models with the regret guarantees of online learning. First-order KOCO methods such as functional gradient descent require only $\mathcal{O}(t)$ time and space per iteration and, when the only information on the losses is their convexity, achieve a minimax optimal $\mathcal{O}(\sqrt{T})$ regret. Nonetheless, many common losses in kernel problems, such as squared loss, logistic loss, and squared hinge loss, possess stronger curvature that can be exploited. In this case, second-order KOCO methods achieve $\mathcal{O}(\log(\text{Det}(\boldsymbol{K})))$ regret, which we show scales as $\mathcal{O}(d_{\text{eff}}\log T)$, where $d_{\text{eff}}$ is the effective dimension of the problem and is usually much smaller than $\mathcal{O}(\sqrt{T})$. The main drawback of second-order methods is their much higher $\mathcal{O}(t^2)$ space and time complexity. In this paper, we introduce kernel online Newton step (KONS), a new second-order KOCO method that also achieves $\mathcal{O}(d_{\text{eff}}\log T)$ regret. To address the computational complexity of second-order methods, we introduce a new matrix sketching algorithm for the kernel matrix $\boldsymbol{K}_t$, and show that for a chosen parameter $\gamma \leq 1$ our Sketched-KONS reduces the space and time complexity by a factor of $\gamma^2$, to $\mathcal{O}(t^2\gamma^2)$ space and time per iteration, while incurring only $1/\gamma$ times more regret.
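The first-order baseline mentioned in this abstract can be illustrated with a minimal sketch of functional gradient descent for online kernel regression. The RBF kernel, its bandwidth, the step size, the squared loss, and the toy sine stream are all assumptions chosen for illustration; this is not the KONS or Sketched-KONS algorithm from the paper. It only shows why each round costs $\mathcal{O}(t)$: the predictor gains one kernel term per observation.

```python
import math


def rbf_kernel(x, y, bandwidth=0.5):
    """Gaussian (RBF) kernel between two scalars (assumed kernel choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def functional_gradient_descent(stream, step=0.5):
    """Online kernel regression with squared loss.

    The predictor is f_t(x) = sum_s coeffs[s] * k(centers[s], x), so round t
    costs O(t) kernel evaluations and stores O(t) numbers.
    """
    centers, coeffs, losses = [], [], []
    for x, y in stream:
        prediction = sum(a * rbf_kernel(c, x) for c, a in zip(centers, coeffs))
        error = prediction - y
        losses.append(0.5 * error ** 2)
        # The functional gradient of 0.5 * (f(x) - y)^2 is error * k(x, .),
        # so one gradient step appends a single new kernel term.
        centers.append(x)
        coeffs.append(-step * error)
    return centers, coeffs, losses


# Hypothetical toy stream: ten passes over 20 grid points of sin(x).
grid = [2.0 * math.pi * i / 20 for i in range(20)]
stream = [(x, math.sin(x)) for _ in range(10) for x in grid]
centers, coeffs, losses = functional_gradient_descent(stream)
```

The list of centers grows by one entry per round, which is the source of the linear per-iteration cost. A second-order method would additionally maintain curvature information over all past points, which is where the quadratic cost discussed in the abstract comes from.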

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Fast and Guaranteed Tensor Decomposition via Sketching

Authors: Anandkumar, Animashree; Smola, Alexander; Tung, Hsiao-Yu; Wang, Yining
Publication date: 01/01/2015

Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in statistical learning of latent variable models and in data mining. In this paper, we propose fast, randomized tensor CP decomposition algorithms based on sketching. We build on the idea of count sketches, but introduce many novel ideas that are unique to tensors. We develop novel methods for randomized computation of tensor contractions via FFTs, without explicitly forming the tensors. Such tensor contractions are encountered in decomposition methods such as tensor power iterations and alternating least squares. We also design novel colliding hashes for symmetric tensors to further save time in computing the sketches. We then combine these sketching ideas with existing whitening and tensor power iterative techniques to obtain the fastest algorithm on both sparse and dense tensors. The quality of approximation under our method does not depend on properties such as sparsity, uniformity of elements, etc. We apply the method to topic modeling and obtain competitive results.

Comment: 29 pages. Appeared in Proceedings of Advances in Neural Information Processing Systems (NIPS), held at Montreal, Canada in 201
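The idea of sketching a tensor without forming it can be illustrated in the simplest case, a rank-1 matrix $u v^\top$. The sketch below relies on a standard property of count sketches: if the pair $(i, j)$ is hashed to bucket $(h_1(i) + h_2(j)) \bmod b$ with sign $s_1(i)\,s_2(j)$, the sketch of the outer product equals the circular convolution of the two vector sketches, which can be computed in the Fourier domain. All function names here are hypothetical, the hash and sign tables are drawn at random for illustration, and a naive DFT stands in for a real FFT to keep the block dependency-free. It does not reproduce the paper's algorithms.

```python
import cmath
import random


def count_sketch(vec, buckets, signs, width):
    """Count sketch: add signs[i] * vec[i] into bucket buckets[i]."""
    out = [0.0] * width
    for value, b, s in zip(vec, buckets, signs):
        out[b] += s * value
    return out


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    direction = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(direction * 2j * cmath.pi * k * m / n)
                  for m, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def sketch_outer_direct(u, v, h1, s1, h2, s2, width):
    """Sketch of u v^T computed entry by entry, visiting all len(u)*len(v) entries."""
    out = [0.0] * width
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[(h1[i] + h2[j]) % width] += s1[i] * s2[j] * ui * vj
    return out


def sketch_outer_fourier(u, v, h1, s1, h2, s2, width):
    """Same sketch via circular convolution of the two vector sketches."""
    fu = dft(count_sketch(u, h1, s1, width))
    fv = dft(count_sketch(v, h2, s2, width))
    product = [a * b for a, b in zip(fu, fv)]
    return [z.real for z in dft(product, inverse=True)]


rng = random.Random(0)
width = 8
u = [rng.uniform(-1, 1) for _ in range(12)]
v = [rng.uniform(-1, 1) for _ in range(15)]
h1 = [rng.randrange(width) for _ in u]
h2 = [rng.randrange(width) for _ in v]
s1 = [rng.choice((-1, 1)) for _ in u]
s2 = [rng.choice((-1, 1)) for _ in v]

direct = sketch_outer_direct(u, v, h1, s1, h2, s2, width)
fourier = sketch_outer_fourier(u, v, h1, s1, h2, s2, width)
max_gap = max(abs(a - b) for a, b in zip(direct, fourier))
```

The two routes agree up to floating-point error, but the Fourier route never touches the individual entries of $u v^\top$. With a true FFT its cost depends on the vector lengths and the sketch width rather than on the number of tensor entries.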

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Caltech Authors

Network Sketching: Exploiting Binary Structure in Deep CNNs

Authors: Chen, Yurong; Guo, Yiwen; Yao, Anbang; Zhao, Hao
Publication date: 06/06/2017

Convolutional neural networks (CNNs) with deep architectures have substantially advanced the state of the art in computer vision tasks. However, deep networks are typically resource-intensive and thus difficult to deploy on mobile devices. Recently, CNNs with binary weights have shown compelling efficiency, but the accuracy of such models is usually unsatisfactory in practice. In this paper, we introduce network sketching as a novel technique for pursuing binary-weight CNNs, targeting more faithful inference and a better trade-off for practical applications. Our basic idea is to exploit binary structure directly in pre-trained filter banks and produce binary-weight models via tensor expansion. The whole process can be treated as a coarse-to-fine model approximation, akin to the pencil drawing steps of outlining and shading. To further speed up the generated models, namely the sketches, we also propose an associative implementation of binary tensor convolutions. Experimental results demonstrate that a proper sketch of AlexNet (or ResNet) outperforms the existing binary-weight models by large margins on the ImageNet large-scale classification task, while the memory committed to network parameters is only slightly larger.

Comment: To appear in CVPR201
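The coarse-to-fine expansion described in this abstract can be illustrated with a greedy residual scheme that approximates a real-valued filter as a sum of scaled binary tensors, $W \approx \sum_j a_j B_j$ with $B_j \in \{-1, +1\}^n$. This is a minimal sketch under assumptions: the sign-of-residual rule, the mean-absolute-value scale, the flattened filter, and all names are illustrative choices, and the paper's actual procedure may differ.

```python
def sign(x):
    """Map a real number to +1.0 or -1.0 (zero is sent to +1.0)."""
    return 1.0 if x >= 0 else -1.0


def binary_expansion(weights, num_terms):
    """Greedily expand a flat weight list into (scale, binary vector) terms."""
    residual = list(weights)
    terms = []
    for _ in range(num_terms):
        binary = [sign(r) for r in residual]
        # For a fixed sign pattern, the mean absolute residual is the
        # least-squares optimal scale.
        scale = sum(abs(r) for r in residual) / len(residual)
        terms.append((scale, binary))
        residual = [r - scale * b for r, b in zip(residual, binary)]
    return terms


def reconstruct(terms, size):
    """Sum the scaled binary vectors back into a real-valued vector."""
    out = [0.0] * size
    for scale, binary in terms:
        out = [o + scale * b for o, b in zip(out, binary)]
    return out


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical flattened filter used only for illustration.
weights = [0.8, -0.3, 0.1, -0.9, 0.4, 0.05]
errors = [
    squared_error(weights, reconstruct(binary_expansion(weights, m), len(weights)))
    for m in range(1, 5)
]
```

Each added term reduces the squared residual by $n\,a_j^2$, so the first term captures the outline and later terms add finer corrections, in line with the outlining-and-shading analogy in the abstract.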

arXiv.org e-Print Archive

Crossref