Search CORE

25,213 research outputs found

PU Learning for Matrix Completion

Author: Dhillon Inderjit S.
Hsieh Cho-Jui
Natarajan Nagarajan
Publication venue
Publication date: 21/11/2014
Field of study

In this paper, we consider the matrix completion problem when the observations are one-bit measurements of some underlying matrix M, and in particular the observed samples consist only of ones and no zeros. This problem is motivated by modern applications such as recommender systems and social networks where only "likes" or "friendships" are observed. The problem of learning from only positive and unlabeled examples, called PU (positive-unlabeled) learning, has been studied in the context of binary classification. We consider the PU matrix completion problem, where an underlying real-valued matrix M is first quantized to generate one-bit observations and then a subset of positive entries is revealed. Under the assumption that M has bounded nuclear norm, we provide recovery guarantees for two different observation models: 1) M parameterizes a distribution that generates a binary matrix, 2) M is thresholded to obtain a binary matrix. For the first case, we propose a "shifted matrix completion" method that recovers M using only a subset of indices corresponding to ones, while for the second case, we propose a "biased matrix completion" method that recovers the (thresholded) binary matrix. Both methods yield strong error bounds --- if M is n by n, the Frobenius error is bounded as O(1/((1-rho)n), where 1-rho denotes the fraction of ones observed. This implies a sample complexity of O(n\log n) ones to achieve a small error, when M is dense and n is large. We extend our methods and guarantees to the inductive matrix completion problem, where rows and columns of M have associated features. We provide efficient and scalable optimization procedures for both the methods and demonstrate the effectiveness of the proposed methods for link prediction (on real-world networks consisting of over 2 million nodes and 90 million links) and semi-supervised clustering tasks

arXiv.org e-Print Archive

CiteSeerX

Adaptive Matrix Completion for the Users and the Items in Tail

Author: Karypis George
Sharma Mohit
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

Recommender systems are widely used to recommend the most appealing items to users. These recommendations can be generated by applying collaborative filtering methods. The low-rank matrix completion method is the state-of-the-art collaborative filtering method. In this work, we show that the skewed distribution of ratings in the user-item rating matrix of real-world datasets affects the accuracy of matrix-completion-based approaches. Also, we show that the number of ratings that an item or a user has positively correlates with the ability of low-rank matrix-completion-based approaches to predict the ratings for the item or the user accurately. Furthermore, we use these insights to develop four matrix completion-based approaches, i.e., Frequency Adaptive Rating Prediction (FARP), Truncated Matrix Factorization (TMF), Truncated Matrix Factorization with Dropout (TMF + Dropout) and Inverse Frequency Weighted Matrix Factorization (IFWMF), that outperforms traditional matrix-completion-based approaches for the users and the items with few ratings in the user-item rating matrix.Comment: 7 pages, 3 figures, ACM WWW'1

arXiv.org e-Print Archive

Crossref

Guarantees of Riemannian Optimization for Low Rank Matrix Completion

Author: Cai Jian-Feng
Chan Tony F.
Leung Shingyu
Wei Ke
Publication venue
Publication date: 21/03/2016
Field of study

We study the Riemannian optimization methods on the embedded manifold of low rank matrices for the problem of matrix completion, which is about recovering a low rank matrix from its partial entries. Assume

m

entries of an

n\times n

rank

r

matrix are sampled independently and uniformly with replacement. We first prove that with high probability the Riemannian gradient descent and conjugate gradient descent algorithms initialized by one step hard thresholding are guaranteed to converge linearly to the measured matrix provided \begin{align*} m\geq C_\kappa n^{1.5}r\log^{1.5}(n), \end{align*} where

C_\kappa

is a numerical constant depending on the condition number of the underlying matrix. The sampling complexity has been further improved to \begin{align*} m\geq C_\kappa nr^2\log^{2}(n) \end{align*} via the resampled Riemannian gradient descent initialization. The analysis of the new initialization procedure relies on an asymmetric restricted isometry property of the sampling operator and the curvature of the low rank matrix manifold. Numerical simulation shows that the algorithms are able to recover a low rank matrix from nearly the minimum number of measurements

arXiv.org e-Print Archive

eScholarship - University of California