Approximating a Gram Matrix for Improved Kernel-Based Learning
A problem for many kernel-based methods is that the amount of computation required to find the solution scales as O(n³), where n is the number of training examples. We develop and analyze an algorithm to compute an easily-interpretable low-rank approximation to an n × n Gram matrix G such that computations of interest may be performed more rapidly. The approximation is of the form G̃ = CW_k⁺Cᵀ, where C is a matrix consisting of a small number c of columns of G and W_k is the best rank-k approximation to W, the matrix formed by the intersection between those c columns of G and the corresponding c rows of G. An important aspect of the algorithm is the probability distribution used to randomly sample the columns; we will use a judiciously-chosen and data-dependent nonuniform probability distribution. Let ‖·‖₂ and ‖·‖_F denote the spectral norm and the Frobenius norm, respectively, of a matrix, and let G_k be the best rank-k approximation to G. We prove that by choosing O(k/ε⁴) columns, ‖G − CW_k⁺Cᵀ‖_ξ ≤ ‖G − G_k‖_ξ + ε Σᵢ G_ii², both in expectation and with high probability, for both ξ = 2, F, and for all k: 0 ≤ k ≤ rank(W). This approximation can be computed using O(n) additional space and time, after making two passes over the data from external storage.
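The construction above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's exact algorithm: it assumes the data-dependent sampling probabilities are p_i = G_ii² / Σ_j G_jj² (proportional to squared diagonal entries, matching the error term ε Σᵢ G_ii²), samples c columns with replacement, rescales by 1/√(c·p_i), and forms CW_k⁺Cᵀ via an eigendecomposition of the small c × c intersection matrix W.

```python
import numpy as np

def approximate_gram(G, c, k, rng=None):
    """Sketch of a Nystrom-style approximation G ~ C @ pinv(W_k) @ C.T.

    Assumes sampling probabilities p_i proportional to G_ii^2 (a guess at
    the paper's data-dependent distribution, consistent with the bound's
    epsilon * sum_i G_ii^2 term).
    """
    rng = np.random.default_rng(rng)
    n = G.shape[0]
    # Data-dependent, nonuniform sampling probabilities.
    diag_sq = np.diag(G) ** 2
    p = diag_sq / diag_sq.sum()
    idx = rng.choice(n, size=c, replace=True, p=p)
    # Rescale each sampled column by 1/sqrt(c * p_i) so that the sampled
    # product matches the full product in expectation.
    scale = 1.0 / np.sqrt(c * p[idx])
    C = G[:, idx] * scale            # n x c: sampled, rescaled columns
    W = C[idx, :] * scale[:, None]   # c x c: rescaled intersection block
    # Best rank-k approximation to the symmetric W, then its pseudoinverse.
    vals, vecs = np.linalg.eigh(W)
    order = np.argsort(np.abs(vals))[::-1][:k]
    vals_k, vecs_k = vals[order], vecs[:, order]
    keep = np.abs(vals_k) > 1e-12    # drop numerically zero eigenvalues
    Wk_pinv = (vecs_k[:, keep] / vals_k[keep]) @ vecs_k[:, keep].T
    return C @ Wk_pinv @ C.T         # n x n rank-<=k approximation to G
```

Only the c sampled columns of G are ever touched after the diagonal is read, which is what makes the two-pass, O(n) extra-space behavior described in the abstract plausible: one pass to accumulate the diagonal (hence the probabilities), one pass to pull the sampled columns.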