Collaborative Filtering in a Non-Uniform World: Learning with the Weighted Trace Norm

Authors: Ruslan Salakhutdinov, Nathan Srebro
Publication date: 01/01/2010

We show that matrix completion with trace-norm regularization can be significantly hurt when entries of the matrix are sampled non-uniformly. We introduce a weighted version of the trace-norm regularizer that also works well with non-uniform sampling. Our experimental results demonstrate that weighted trace-norm regularization indeed yields significant gains on the (highly non-uniformly sampled) Netflix dataset.

Comment: 9 pages
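The abstract gives no formula for the weighted regularizer, so the following is only a minimal sketch under stated assumptions: the matrix is factorized as X = U V^T, and each row factor and column factor is penalized in proportion to how often that row or column appears among the observed entries. The function names `empirical_marginals` and `weighted_factor_penalty` are hypothetical and chosen for illustration.

```python
from collections import Counter


def empirical_marginals(observed, n_rows, n_cols):
    """Return the fraction of observed entries in each row and each column.

    `observed` is a list of (row, col) index pairs. Weighting by these
    sampling frequencies is an assumption made for this sketch.
    """
    row_counts = Counter(i for i, _ in observed)
    col_counts = Counter(j for _, j in observed)
    total = len(observed)
    p = [row_counts[i] / total for i in range(n_rows)]
    q = [col_counts[j] / total for j in range(n_cols)]
    return p, q


def weighted_factor_penalty(U, V, p, q):
    """Compute 0.5 * (sum_i p[i]*||U[i]||^2 + sum_j q[j]*||V[j]||^2).

    U holds one factor vector per row, V one factor vector per column.
    With uniform p and q this is a rescaled Frobenius penalty on the factors.
    """
    row_term = sum(p_i * sum(u * u for u in U_i) for p_i, U_i in zip(p, U))
    col_term = sum(q_j * sum(v * v for v in V_j) for q_j, V_j in zip(q, V))
    return 0.5 * (row_term + col_term)


# Toy example: row 0 is sampled three times as often as row 1.
observed = [(0, 0), (0, 1), (0, 2), (1, 0)]
p, q = empirical_marginals(observed, n_rows=2, n_cols=3)
U = [[1.0, 0.0], [0.0, 2.0]]
V = [[1.0, 1.0], [0.0, 0.0], [2.0, 0.0]]
penalty = weighted_factor_penalty(U, V, p, q)
```

In this sketch, frequently sampled rows and columns contribute more to the penalty than rarely sampled ones, which is the sense in which the regularizer adapts to non-uniform sampling.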

arXiv.org e-Print Archive

CiteSeerX

On Symmetric and Asymmetric LSHs for Inner Product Search

Authors: Behnam Neyshabur, Nathan Srebro
Publication date: 08/06/2015

We consider the problem of designing locality sensitive hashes (LSH) for inner product similarity, and the power of asymmetric hashes in this context. Shrivastava and Li argue that there is no symmetric LSH for the problem and propose an asymmetric LSH based on different mappings for query and database points. However, we show that there does exist a simple symmetric LSH that enjoys stronger guarantees and better empirical performance than the asymmetric LSH they suggest. We also show a variant of the setting where asymmetry is in fact needed, but where a different asymmetric LSH is required.

Comment: 11 pages, 3 figures. In Proceedings of the 32nd International Conference on Machine Learning (ICML)
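The abstract does not spell out the symmetric construction, so the sketch below is an illustration under assumptions rather than the authors' exact method: every vector in the unit ball is lifted to the unit sphere by appending one coordinate, and the same sign-of-random-projection hash is then applied to queries and database points alike. The names `lift` and `make_hash` are hypothetical.

```python
import math
import random


def lift(x):
    """Append sqrt(1 - ||x||^2) so the result has unit norm.

    Assumes x lies in the unit ball; this lifting map is an assumption
    of the sketch, not something stated in the abstract.
    """
    sq = sum(v * v for v in x)
    if sq > 1.0 + 1e-12:
        raise ValueError("vector must lie in the unit ball")
    return list(x) + [math.sqrt(max(0.0, 1.0 - sq))]


def make_hash(dim, n_bits, seed=0):
    """Build one hash function shared by queries and database points."""
    rng = random.Random(seed)
    # One random Gaussian hyperplane per output bit, in the lifted space.
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim + 1)]
              for _ in range(n_bits)]

    def h(x):
        lx = lift(x)
        return tuple(
            1 if sum(a * b for a, b in zip(plane, lx)) >= 0 else 0
            for plane in planes
        )

    return h


h = make_hash(dim=2, n_bits=8)
query = [1.0, 0.0]   # unit-norm query
point = [0.3, 0.4]   # database point inside the unit ball
code_q, code_p = h(query), h(point)
```

For a unit-norm query the appended coordinate is zero, so the inner product of the lifted vectors equals the original inner product. Because both sides use the same map `h`, the scheme is symmetric in the sense discussed in the abstract.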

arXiv.org e-Print Archive

CiteSeerX

The Implicit Bias of Gradient Descent on Separable Data

Authors: Suriya Gunasekar, Elad Hoffer, Mor Shpigel Nacson, Daniel Soudry, Nathan Srebro
Publication date: 28/12/2018

We examine gradient descent on unregularized logistic regression problems, with homogeneous linear predictors on linearly separable datasets. We show the predictor converges to the direction of the max-margin (hard margin SVM) solution. The result also generalizes to other monotone decreasing loss functions with an infimum at infinity, to multi-class problems, and to training a weight layer in a deep network in a certain restricted setting. Furthermore, we show this convergence is very slow, and only logarithmic in the convergence of the loss itself. This can help explain the benefit of continuing to optimize the logistic or cross-entropy loss even after the training error is zero and the training loss is extremely small, and, as we show, even if the validation loss increases. Our methodology can also aid in understanding implicit regularization in more complex models and with other optimization methods.

Comment: Final JMLR version, with improved discussions over v3. Main improvements in journal version over conference version (v2 appeared in ICLR): We proved the measure zero case for main theorem (with implications for the rates), and the multi-class case
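The claim can be illustrated on a toy problem. The sketch below runs plain gradient descent on the unregularized logistic loss for three hand-picked, linearly separable 2D points (an assumption of this example, not data from the paper). For this data the hard-margin SVM direction works out by hand to (1, 0). The step size and iteration counts are arbitrary choices.

```python
import math

# Each point is z = y * x, i.e. the label is already folded into the features.
# Hypothetical toy data: the first two points are support vectors and the
# max-margin direction for this set is (1, 0).
SIGNED_POINTS = [(1.0, 2.0), (1.0, -1.0), (3.0, 1.0)]
MAX_MARGIN_DIRECTION = (1.0, 0.0)


def gradient_descent(points, steps, lr=0.1):
    """Minimize sum_i log(1 + exp(-w . z_i)) with plain GD, starting at w = 0."""
    w = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for z in points:
            margin = w[0] * z[0] + w[1] * z[1]
            s = 1.0 / (1.0 + math.exp(margin))  # sigmoid(-margin)
            grad[0] -= s * z[0]
            grad[1] -= s * z[1]
        w = [w[0] - lr * grad[0], w[1] - lr * grad[1]]
    return w


def cosine(a, b):
    """Cosine of the angle between two 2D vectors."""
    dot = a[0] * b[0] + a[1] * b[1]
    return dot / (math.hypot(a[0], a[1]) * math.hypot(b[0], b[1]))


w_early = gradient_descent(SIGNED_POINTS, steps=2000)
w_late = gradient_descent(SIGNED_POINTS, steps=20000)
cos_early = cosine(w_early, MAX_MARGIN_DIRECTION)
cos_late = cosine(w_late, MAX_MARGIN_DIRECTION)
```

The norm of `w` keeps growing as training continues, since the loss has no finite minimizer on separable data. The direction of `w` drifts toward (1, 0), but ten times more iterations buy only a small improvement in alignment, in keeping with the slow convergence described in the abstract.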

arXiv.org e-Print Archive