174 research outputs found

    Polynomial Tensor Sketch for Element-wise Function of Low-Rank Matrix

    Get PDF
    This paper studies how to sketch element-wise functions of low-rank matrices. Formally, given low-rank matrix A = [Aij] and scalar non-linear function f, we aim for finding an approximated low-rank representation of the (possibly high-rank) matrix [f(Aij)]. To this end, we propose an efficient sketching-based algorithm whose complexity is significantly lower than the number of entries of A, i.e., it runs without accessing all entries of [f(Aij)] explicitly. The main idea underlying our method is to combine a polynomial approximation of f with the existing tensor sketch scheme for approximating monomials of entries of A. To balance the errors of the two approximation components in an optimal manner, we propose a novel regression formula to find polynomial coefficients given A and f. In particular, we utilize a coreset-based regression with a rigorous approximation guarantee. Finally, we demonstrate the applicability and superiority of the proposed scheme under various machine learning tasks

    Coresets-Methods and History: A Theoreticians Design Pattern for Approximation and Streaming Algorithms

    Get PDF
    We present a technical survey on the state of the art approaches in data reduction and the coreset framework. These include geometric decompositions, gradient methods, random sampling, sketching and random projections. We further outline their importance for the design of streaming algorithms and give a brief overview on lower bounding techniques

    Quantized Fourier and Polynomial Features for more Expressive Tensor Network Models

    Full text link
    In the context of kernel machines, polynomial and Fourier features are commonly used to provide a nonlinear extension to linear models by mapping the data to a higher-dimensional space. Unless one considers the dual formulation of the learning problem, which renders exact large-scale learning unfeasible, the exponential increase of model parameters in the dimensionality of the data caused by their tensor-product structure prohibits to tackle high-dimensional problems. One of the possible approaches to circumvent this exponential scaling is to exploit the tensor structure present in the features by constraining the model weights to be an underparametrized tensor network. In this paper we quantize, i.e. further tensorize, polynomial and Fourier features. Based on this feature quantization we propose to quantize the associated model weights, yielding quantized models. We show that, for the same number of model parameters, the resulting quantized models have a higher bound on the VC-dimension as opposed to their non-quantized counterparts, at no additional computational cost while learning from identical features. We verify experimentally how this additional tensorization regularizes the learning problem by prioritizing the most salient features in the data and how it provides models with increased generalization capabilities. We finally benchmark our approach on large regression task, achieving state-of-the-art results on a laptop computer

    Complex-to-Real Random Features for Polynomial Kernels

    Full text link
    Polynomial kernels are among the most popular kernels in machine learning, since their feature maps model the interactions between the dimensions of the input data. However, these features correspond to tensor products of the input with itself, which makes their dimension grow exponentially with the polynomial degree. We address this issue by proposing Complexto-Real (CtR) sketches for tensor products that can be used as random feature approximations of polynomial kernels. These sketches leverage intermediate complex random projections, leading to better theoretical guarantees and potentially much lower variances than analogs using real projections. Our sketches are simple to construct and their final output is real-valued, which makes their downstream use straightforward. Finally, we show that they achieve state-of-the-art performance in terms of accuracy and speed.Comment: 33 page

    Toward a unified theory of sparse dimensionality reduction in Euclidean space

    Get PDF
    Let ΦRm×n\Phi\in\mathbb{R}^{m\times n} be a sparse Johnson-Lindenstrauss transform [KN14] with ss non-zeroes per column. For a subset TT of the unit sphere, ε(0,1/2)\varepsilon\in(0,1/2) given, we study settings for m,sm,s required to ensure EΦsupxTΦx221<ε, \mathop{\mathbb{E}}_\Phi \sup_{x\in T} \left|\|\Phi x\|_2^2 - 1 \right| < \varepsilon , i.e. so that Φ\Phi preserves the norm of every xTx\in T simultaneously and multiplicatively up to 1+ε1+\varepsilon. We introduce a new complexity parameter, which depends on the geometry of TT, and show that it suffices to choose ss and mm such that this parameter is small. Our result is a sparse analog of Gordon's theorem, which was concerned with a dense Φ\Phi having i.i.d. Gaussian entries. We qualitatively unify several results related to the Johnson-Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. Our work also implies new results in using the sparse Johnson-Lindenstrauss transform in numerical linear algebra, classical and model-based compressed sensing, manifold learning, and constrained least squares problems such as the Lasso

    Faster Linear Algebra for Distance Matrices

    Full text link
    The distance matrix of a dataset XX of nn points with respect to a distance function ff represents all pairwise distances between points in XX induced by ff. Due to their wide applicability, distance matrices and related families of matrices have been the focus of many recent algorithmic works. We continue this line of research and take a broad view of algorithm design for distance matrices with the goal of designing fast algorithms, which are specifically tailored for distance matrices, for fundamental linear algebraic primitives. Our results include efficient algorithms for computing matrix-vector products for a wide class of distance matrices, such as the 1\ell_1 metric for which we get a linear runtime, as well as an Ω(n2)\Omega(n^2) lower bound for any algorithm which computes a matrix-vector product for the \ell_{\infty} case, showing a separation between the 1\ell_1 and the \ell_{\infty} metrics. Our upper bound results, in conjunction with recent works on the matrix-vector query model, have many further downstream applications, including the fastest algorithm for computing a relative error low-rank approximation for the distance matrix induced by 1\ell_1 and 22\ell_2^2 functions and the fastest algorithm for computing an additive error low-rank approximation for the 2\ell_2 metric, in addition to applications for fast matrix multiplication among others. We also give algorithms for constructing distance matrices and show that one can construct an approximate 2\ell_2 distance matrix in time faster than the bound implied by the Johnson-Lindenstrauss lemma.Comment: Selected as Oral for NeurIPS 202
    corecore