    Gaussian Sketching yields a J-L Lemma in RKHS

    The main contribution of the paper is to show that Gaussian sketching of a kernel Gram matrix $\boldsymbol K$ yields an operator whose counterpart in an RKHS $\mathcal H$ is a \emph{random projection} operator, in the spirit of the Johnson-Lindenstrauss (J-L) lemma. To be precise, given a random matrix $Z$ with i.i.d. Gaussian entries, we show that a sketch $Z\boldsymbol{K}$ corresponds to a particular random operator in the (infinite-dimensional) Hilbert space $\mathcal H$ that maps functions $f \in \mathcal H$ to a low-dimensional space $\mathbb R^d$, while preserving a weighted RKHS inner-product of the form $\langle f, g \rangle_{\Sigma} \doteq \langle f, \Sigma^3 g \rangle_{\mathcal H}$, where $\Sigma$ is the \emph{covariance} operator induced by the data distribution. In particular, under similar assumptions as in kernel PCA (KPCA) or kernel $k$-means (K-$k$-means), well-separated subsets of feature space $\{K(\cdot, x): x \in \mathcal X\}$ remain well-separated after such an operation, which suggests similar benefits as in KPCA and/or K-$k$-means, albeit at the much cheaper cost of a random projection. In particular, our convergence rates suggest that, given a large dataset $\{X_i\}_{i=1}^N$ of size $N$, we can build the Gram matrix $\boldsymbol K$ on a much smaller subsample of size $n \ll N$, so that the sketch $Z\boldsymbol K$ is very cheap to obtain and subsequently apply as a projection operator on the original data $\{X_i\}_{i=1}^N$. We verify these insights empirically on synthetic data and on real-world clustering applications.
    Comment: 16 pages
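    To make the subsample-then-sketch recipe in the abstract concrete, the following minimal numpy sketch (not the authors' implementation) builds the Gram matrix on a subsample of size $n \ll N$, forms the Gaussian sketch $Z\boldsymbol K$, and applies $Z$ to kernel evaluations to embed the full dataset in $\mathbb R^d$. The RBF kernel, the bandwidth gamma, the sizes N, n, d, and the 1/sqrt(d) scaling of Z are illustrative assumptions; the paper specifies the exact operator and constants.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    # Pairwise RBF kernel values K(a, b) = exp(-gamma * ||a - b||^2).
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

rng = np.random.default_rng(0)
N, n, d = 2000, 200, 20                 # dataset size, subsample size n << N, sketch dimension d
X = rng.normal(size=(N, 10))            # synthetic stand-in for the dataset {X_i}_{i=1}^N

sub = X[rng.choice(N, size=n, replace=False)]   # subsample of size n
K = rbf_kernel(sub, sub)                        # n x n Gram matrix K built on the subsample
Z = rng.normal(size=(d, n)) / np.sqrt(d)        # Gaussian sketching matrix Z (i.i.d. entries)
ZK = Z @ K                                      # the sketch ZK, a d x n matrix, cheap to form

# Apply the sketch as a projection of the full dataset: each X_i enters through its
# kernel evaluations against the subsample, giving a d-dimensional embedding per point.
embedding = (Z @ rbf_kernel(sub, X)).T          # N x d low-dimensional representation
print(ZK.shape, embedding.shape)                # (20, 200) (2000, 20)
```

    Note that the columns of ZK are exactly the embeddings of the subsampled points themselves, so the sketch is formed once on the small Gram matrix and then reused as a projection across all $N$ points, e.g. as input to a standard k-means step in place of K-$k$-means.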