Polynomial Tensor Sketch for Element-wise Function of Low-Rank Matrix
This paper studies how to sketch element-wise functions of low-rank matrices.
Formally, given a low-rank matrix A = [Aij] and a scalar non-linear function f, we
aim to find an approximate low-rank representation of the (possibly
high-rank) matrix [f(Aij)]. To this end, we propose an efficient
sketching-based algorithm whose complexity is significantly lower than the
number of entries of A, i.e., it runs without accessing all entries of [f(Aij)]
explicitly. The main idea underlying our method is to combine a polynomial
approximation of f with the existing tensor sketch scheme for approximating
monomials of entries of A. To balance the errors of the two approximation
components in an optimal manner, we propose a novel regression formula to find
polynomial coefficients given A and f. In particular, we utilize a
coreset-based regression with a rigorous approximation guarantee. Finally, we
demonstrate the applicability and superiority of the proposed scheme on
various machine learning tasks.
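As a rough illustration of the idea (not the paper's exact method, which additionally fits the polynomial coefficients via a coreset-based regression), the following NumPy sketch combines a CountSketch-based TensorSketch with fixed polynomial coefficients to build low-rank factors of [f(Aij)]; the function names, the seeding scheme, and the coefficient handling are illustrative assumptions:

```python
import numpy as np

def count_sketch(X, m, seed):
    """CountSketch each row of X (n x d) into m dimensions."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    h = rng.integers(0, m, size=d)         # hash bucket for each input coordinate
    s = rng.choice([-1.0, 1.0], size=d)    # random sign for each input coordinate
    C = np.zeros((X.shape[0], m))
    for j in range(d):
        C[:, h[j]] += s[j] * X[:, j]
    return C

def tensor_sketch(X, degree, m, seed):
    """Degree-k TensorSketch of the rows of X: <TS(u), TS(v)> ~ <u, v>**degree."""
    if degree == 0:
        return np.ones((X.shape[0], 1))
    F = np.ones((X.shape[0], m), dtype=complex)
    for q in range(degree):
        F *= np.fft.fft(count_sketch(X, m, seed + q), axis=1)
    return np.real(np.fft.ifft(F, axis=1))

def poly_tensor_sketch(U, V, coeffs, m, seed=0):
    """Factors (L, R) with L @ R.T ~ [f(<u_i, v_j>)] for f(x) ~ sum_k coeffs[k] * x**k."""
    L_blocks, R_blocks = [], []
    for k, c in enumerate(coeffs):
        TU = tensor_sketch(U, k, m, seed + 1000 * k)   # same hashes for U and V
        TV = tensor_sketch(V, k, m, seed + 1000 * k)
        L_blocks.append(c * TU)
        R_blocks.append(TV)
    return np.hstack(L_blocks), np.hstack(R_blocks)

# toy check: f(x) = exp(x), approximated by its cubic Taylor polynomial
rng = np.random.default_rng(0)
U, V = rng.normal(size=(50, 5)) / 5, rng.normal(size=(60, 5)) / 5
L, R = poly_tensor_sketch(U, V, coeffs=[1.0, 1.0, 0.5, 1.0 / 6.0], m=64)
print(np.abs(L @ R.T - np.exp(U @ V.T)).max())          # small approximation error
```

Here the factors satisfy (L @ R.T)[i, j] ≈ Σ_k c_k <u_i, v_j>^k ≈ f(A_ij) without ever forming [f(Aij)] entry by entry, which is the point of the sketching approach.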
Coresets - Methods and History: A Theoretician's Design Pattern for Approximation and Streaming Algorithms
We present a technical survey of state-of-the-art approaches to data reduction and the coreset framework. These include geometric decompositions, gradient methods, random sampling, sketching, and random projections. We further outline their importance for the design of streaming algorithms and give a brief overview of lower-bounding techniques.
Quantized Fourier and Polynomial Features for more Expressive Tensor Network Models
In the context of kernel machines, polynomial and Fourier features are
commonly used to provide a nonlinear extension to linear models by mapping the
data to a higher-dimensional space. Unless one considers the dual formulation
of the learning problem, which renders exact large-scale learning infeasible,
the exponential growth of the number of model parameters with the dimensionality
of the data, caused by their tensor-product structure, makes it prohibitive to
tackle high-dimensional problems. One possible approach to circumvent this exponential scaling
is to exploit the tensor structure present in the features by constraining the
model weights to be an underparametrized tensor network. In this paper we
quantize, i.e. further tensorize, polynomial and Fourier features. Based on
this feature quantization we propose to quantize the associated model weights,
yielding quantized models. We show that, for the same number of model
parameters, the resulting quantized models have a higher bound on the
VC-dimension than their non-quantized counterparts, at no additional
computational cost while learning from identical features. We verify
experimentally how this additional tensorization regularizes the learning
problem by prioritizing the most salient features in the data and how it
provides models with increased generalization capabilities. We finally
benchmark our approach on a large regression task, achieving state-of-the-art
results on a laptop computer.
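To make the quantization step concrete, here is a minimal NumPy sketch, an illustration only and not the paper's full tensor-network model, of how a pure-power polynomial feature vector of length 2^Q factors into Q small Kronecker factors; this is the kind of further tensorization the abstract refers to, and the helper name is hypothetical:

```python
import numpy as np
from functools import reduce

def quantized_poly_factors(x, Q):
    """Q factors [1, x**(2**q)] whose Kronecker product is [1, x, x**2, ..., x**(2**Q - 1)]."""
    return [np.array([1.0, x ** (2 ** q)]) for q in range(Q - 1, -1, -1)]

x, Q = 0.7, 3
factors = quantized_poly_factors(x, Q)            # three vectors of length 2
full = reduce(np.kron, factors)                   # length 2**Q = 8 feature vector
assert np.allclose(full, x ** np.arange(2 ** Q))  # equals [1, x, ..., x**7]
```

The paper quantizes both polynomial and Fourier features as well as the associated model weights; this snippet only shows the feature-side factorization for the polynomial case.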
Complex-to-Real Random Features for Polynomial Kernels
Polynomial kernels are among the most popular kernels in machine learning,
since their feature maps model the interactions between the dimensions of the
input data. However, these features correspond to tensor products of the input
with itself, which makes their dimension grow exponentially with the polynomial
degree.
We address this issue by proposing Complex-to-Real (CtR) sketches for tensor
products that can be used as random feature approximations of polynomial
kernels. These sketches leverage intermediate complex random projections,
leading to better theoretical guarantees and potentially much lower variances
than analogs using real projections. Our sketches are simple to construct and
their final output is real-valued, which makes their downstream use
straightforward. Finally, we show that they achieve state-of-the-art
performance in terms of accuracy and speed.
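As a hedged illustration of the complex-to-real idea (the paper's actual constructions include Gaussian, Rademacher, and structured complex sketches with sharper guarantees), a generic complex-Rademacher sketch for the homogeneous polynomial kernel <x, y>^p can be turned into real-valued features by stacking real and imaginary parts; the function name and parameter choices below are assumptions:

```python
import numpy as np

def ctr_poly_features(X, degree, D, seed=0):
    """Complex-to-Real random features: phi(x).phi(y) is an unbiased estimate of <x, y>**degree."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Z = np.ones((n, D), dtype=complex)
    for _ in range(degree):
        # complex Rademacher projection: entries uniform on {1, -1, i, -i}
        W = rng.choice([1, -1, 1j, -1j], size=(d, D))
        Z *= X @ W
    Z /= np.sqrt(D)
    return np.hstack([Z.real, Z.imag])             # real-valued features of dimension 2D

# usage: kernel estimate for the degree-3 homogeneous polynomial kernel
X = np.random.default_rng(1).normal(size=(5, 8))
Phi = ctr_poly_features(X, degree=3, D=2000)
K_hat = Phi @ Phi.T                                # approximates (X @ X.T) ** 3
```

Stacking real and imaginary parts keeps the downstream model entirely real-valued while the inner product of the stacked features equals the real part of the complex sketch's Hermitian inner product.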
Toward a unified theory of sparse dimensionality reduction in Euclidean space
Let $\Phi \in \mathbb{R}^{m \times n}$ be a sparse Johnson-Lindenstrauss
transform [KN14] with $s$ non-zeroes per column. For a subset $T$ of the unit
sphere and $\varepsilon \in (0, 1/2)$ given, we study settings for $m, s$ required to
ensure $\mathbb{E}_\Phi \sup_{x \in T} \left| \|\Phi x\|_2^2 - 1 \right| < \varepsilon$, i.e. so that $\Phi$ preserves the norm of every $x \in T$
simultaneously and multiplicatively up to $1 + \varepsilon$. We
introduce a new complexity parameter, which depends on the geometry of $T$, and
show that it suffices to choose $m$ and $s$ such that this parameter is small.
Our result is a sparse analog of Gordon's theorem, which was concerned with a
dense $\Phi$ having i.i.d. Gaussian entries. We qualitatively unify several
results related to the Johnson-Lindenstrauss lemma, subspace embeddings, and
Fourier-based restricted isometries. Our work also implies new results in using
the sparse Johnson-Lindenstrauss transform in numerical linear algebra,
classical and model-based compressed sensing, manifold learning, and
constrained least squares problems such as the Lasso.
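For concreteness, a minimal NumPy construction of a sparse Johnson-Lindenstrauss transform with $s$ non-zeroes per column could look like the sketch below; this is a simple variant in the spirit of the [KN14] construction, with the exact distribution of non-zero locations chosen here only for illustration:

```python
import numpy as np

def sparse_jl(n, m, s, seed=0):
    """Sparse JL matrix: each column has exactly s non-zeros equal to +/- 1/sqrt(s)."""
    rng = np.random.default_rng(seed)
    Phi = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)    # s distinct rows per column
        Phi[rows, j] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    return Phi

Phi = sparse_jl(n=1000, m=64, s=8)
x = np.random.default_rng(1).normal(size=1000)
x /= np.linalg.norm(x)                                 # x on the unit sphere
print(np.linalg.norm(Phi @ x))                         # close to 1 with high probability
```

Applying $\Phi$ to a vector costs only $O(s n)$ operations instead of $O(m n)$, which is why the trade-off between $m$ and $s$ studied in the paper matters.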
Faster Linear Algebra for Distance Matrices
The distance matrix of a dataset $X$ of $n$ points with respect to a distance
function $f$ represents all pairwise distances between points in $X$ induced by
$f$. Due to their wide applicability, distance matrices and related families of
matrices have been the focus of many recent algorithmic works. We continue this
line of research and take a broad view of algorithm design for distance
matrices with the goal of designing fast algorithms, which are specifically
tailored for distance matrices, for fundamental linear algebraic primitives.
Our results include efficient algorithms for computing matrix-vector products
for a wide class of distance matrices, such as the $\ell_1$ metric for which we
get a linear runtime, as well as an $\Omega(n^2)$ lower bound for any algorithm
which computes a matrix-vector product for the $\ell_\infty$ case, showing a
separation between the $\ell_1$ and the $\ell_\infty$ metrics. Our upper
bound results, in conjunction with recent works on the matrix-vector query
model, have many further downstream applications, including the fastest
algorithm for computing a relative error low-rank approximation for the
distance matrix induced by the $\ell_1$ and $\ell_2^2$ functions and the fastest
algorithm for computing an additive error low-rank approximation for the $\ell_2$
metric, in addition to applications for fast matrix multiplication
among others. We also give algorithms for constructing distance matrices and
show that one can construct an approximate $\ell_2$ distance matrix in time
faster than the bound implied by the Johnson-Lindenstrauss lemma.
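To illustrate the flavor of such structure-exploiting matrix-vector products, here is the standard prefix-sum trick for the one-dimensional $\ell_1$ distance matrix; the paper covers a much broader class of metrics, so this sketch is only an assumed illustrative special case (multi-dimensional $\ell_1$ follows by summing the result over coordinates):

```python
import numpy as np

def l1_distmat_matvec_1d(x, v):
    """Compute y_i = sum_j |x_i - x_j| * v_j in O(n log n) without forming the matrix."""
    order = np.argsort(x)
    xs, vs = x[order], v[order]
    pv = np.cumsum(vs)                  # prefix sums of v in sorted order
    pxv = np.cumsum(xs * vs)            # prefix sums of x * v in sorted order
    total_v, total_xv = pv[-1], pxv[-1]
    # split each sum into points <= x_i and points > x_i (in sorted order)
    y_sorted = xs * pv - pxv + (total_xv - pxv) - xs * (total_v - pv)
    y = np.empty_like(y_sorted)
    y[order] = y_sorted
    return y

# check against the explicit O(n^2) computation
rng = np.random.default_rng(0)
x, v = rng.normal(size=200), rng.normal(size=200)
D = np.abs(x[:, None] - x[None, :])
assert np.allclose(l1_distmat_matvec_1d(x, v), D @ v)
```

The point is that the product is computed from sorted prefix sums rather than from the $n \times n$ matrix itself, which is the kind of speedup the lower bound for $\ell_\infty$ rules out in general.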