207 research outputs found
Guaranteed Rank Minimization via Singular Value Projection
Minimizing the rank of a matrix subject to affine constraints is a
fundamental problem with many important applications in machine learning and
statistics. In this paper we propose a simple and fast algorithm SVP (Singular
Value Projection) for rank minimization with affine constraints (ARMP) and show
that SVP recovers the minimum rank solution for affine constraints that satisfy
the "restricted isometry property" and show robustness of our method to noise.
Our results improve upon a recent breakthrough by Recht, Fazel and Parillo
(RFP07) and Lee and Bresler (LB09) in three significant ways:
1) our method (SVP) is significantly simpler to analyze and easier to
implement,
2) we give recovery guarantees under strictly weaker isometry assumptions
3) we give geometric convergence guarantees for SVP even in presense of noise
and, as demonstrated empirically, SVP is significantly faster on real-world and
synthetic problems.
In addition, we address the practically important problem of low-rank matrix
completion (MCP), which can be seen as a special case of ARMP. We empirically
demonstrate that our algorithm recovers low-rank incoherent matrices from an
almost optimal number of uniformly sampled entries. We make partial progress
towards proving exact recovery and provide some intuition for the strong
performance of SVP applied to matrix completion by showing a more restricted
isometry property. Our algorithm outperforms existing methods, such as those of
\cite{RFP07,CR08,CT09,CCS08,KOM09,LB09}, for ARMP and the matrix-completion
problem by an order of magnitude and is also significantly more robust to
noise.Comment: An earlier version of this paper was submitted to NIPS-2009 on June
5, 200
Multi-Scale Link Prediction
The automated analysis of social networks has become an important problem due
to the proliferation of social networks, such as LiveJournal, Flickr and
Facebook. The scale of these social networks is massive and continues to grow
rapidly. An important problem in social network analysis is proximity
estimation that infers the closeness of different users. Link prediction, in
turn, is an important application of proximity estimation. However, many
methods for computing proximity measures have high computational complexity and
are thus prohibitive for large-scale link prediction problems. One way to
address this problem is to estimate proximity measures via low-rank
approximation. However, a single low-rank approximation may not be sufficient
to represent the behavior of the entire network. In this paper, we propose
Multi-Scale Link Prediction (MSLP), a framework for link prediction, which can
handle massive networks. The basis idea of MSLP is to construct low rank
approximations of the network at multiple scales in an efficient manner. Based
on this approach, MSLP combines predictions at multiple scales to make robust
and accurate predictions. Experimental results on real-life datasets with more
than a million nodes show the superior performance and scalability of our
method.Comment: 20 pages, 10 figure
A Divide-and-Conquer Solver for Kernel Support Vector Machines
The kernel support vector machine (SVM) is one of the most widely used
classification methods; however, the amount of computation required becomes the
bottleneck when facing millions of samples. In this paper, we propose and
analyze a novel divide-and-conquer solver for kernel SVMs (DC-SVM). In the
division step, we partition the kernel SVM problem into smaller subproblems by
clustering the data, so that each subproblem can be solved independently and
efficiently. We show theoretically that the support vectors identified by the
subproblem solution are likely to be support vectors of the entire kernel SVM
problem, provided that the problem is partitioned appropriately by kernel
clustering. In the conquer step, the local solutions from the subproblems are
used to initialize a global coordinate descent solver, which converges quickly
as suggested by our analysis. By extending this idea, we develop a multilevel
Divide-and-Conquer SVM algorithm with adaptive clustering and early prediction
strategy, which outperforms state-of-the-art methods in terms of training
speed, testing accuracy, and memory usage. As an example, on the covtype
dataset with half-a-million samples, DC-SVM is 7 times faster than LIBSVM in
obtaining the exact SVM solution (to within relative error) which
achieves 96.15% prediction accuracy. Moreover, with our proposed early
prediction strategy, DC-SVM achieves about 96% accuracy in only 12 minutes,
which is more than 100 times faster than LIBSVM
- β¦