Random Projections For Large-Scale Regression
Fitting linear regression models can be computationally expensive in
large-scale data analysis tasks when both the sample size and the number of
variables are large. Random projections are widely used as a dimension
reduction tool in machine learning and statistics. We discuss the applications
of random projections in linear regression problems, developed to decrease
computational costs, and give an overview of the theoretical guarantees of the
generalization error. It can be shown that combining random projections with
least squares regression yields recovery guarantees similar to those of ridge
regression and principal component regression. We also discuss possible
improvements when averaging over multiple random projections, an approach that
lends itself easily to parallel implementation.

Comment: 13 pages, 3 figures
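The row-sketching scheme described above can be illustrated with a minimal sketch; the problem sizes, the Gaussian sketching matrix, and the noise level below are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n samples, d variables, sketch size k (n >> k >= d).
n, d, k = 10_000, 20, 500

X = rng.normal(size=(n, d))
beta_true = rng.normal(size=d)
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Gaussian random projection applied to the samples (row compression),
# reducing the least-squares problem from n rows to k rows.
S = rng.normal(size=(k, n)) / np.sqrt(k)
SX, Sy = S @ X, S @ y

# Ordinary least squares on the much smaller sketched problem.
beta_sketch, *_ = np.linalg.lstsq(SX, Sy, rcond=None)

print(np.linalg.norm(beta_sketch - beta_true))
```

Averaging the estimates from several independent sketches, as the abstract suggests, is embarrassingly parallel: each sketch-and-solve step is independent.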
Privacy-Utility Trade-off of Linear Regression under Random Projections and Additive Noise
Data privacy is an important concern in machine learning, and is
fundamentally at odds with the task of training useful learning models, which
typically require the acquisition of large amounts of private user data. One
possible way of fulfilling the machine learning task while preserving user
privacy is to train the model on a transformed, noisy version of the data,
which does not reveal the data itself directly to the training procedure. In
this work, we analyze the privacy-utility trade-off of two such schemes for the
problem of linear regression: additive noise and random projections. In
contrast to previous work, we consider a recently proposed notion of
differential privacy that is based on conditional mutual information (MI-DP),
which is stronger than the conventional (ε, δ)-differential
privacy, and use relative objective error as the utility metric. We find that
projecting the data to a lower-dimensional subspace before adding noise attains
a better trade-off in general. We also draw a connection between the privacy
problem and the (non-coherent) SIMO channel, which has been extensively studied
in wireless communication, and borrow tools from that literature for the
analysis. We present numerical results demonstrating the performance of both
schemes.

Comment: A short version is published in ISIT 201
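The two schemes compared in the abstract can be sketched in a toy experiment; the problem sizes and the noise scale below are illustrative assumptions and are not calibrated to any actual privacy level, and the utility metric is the relative excess least-squares objective over the non-private solution:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 2000, 10, 200
X = rng.normal(size=(n, d))
beta = rng.normal(size=d)
y = X @ beta + 0.1 * rng.normal(size=n)

def rel_objective_error(b_hat):
    # Relative excess least-squares objective vs. the non-private solution.
    b_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
    f = lambda b: np.linalg.norm(X @ b - y) ** 2
    return (f(b_hat) - f(b_ls)) / f(b_ls)

sigma = 0.5  # additive noise scale; a placeholder, not a privacy calibration

# Scheme 1: additive noise directly on the raw data.
b1, *_ = np.linalg.lstsq(X + sigma * rng.normal(size=(n, d)),
                         y + sigma * rng.normal(size=n), rcond=None)

# Scheme 2: random projection to k rows, then additive noise.
S = rng.normal(size=(k, n)) / np.sqrt(k)
b2, *_ = np.linalg.lstsq(S @ X + sigma * rng.normal(size=(k, d)),
                         S @ y + sigma * rng.normal(size=k), rcond=None)

print(rel_objective_error(b1), rel_objective_error(b2))
```

The sketch only compares utility at a fixed noise scale; quantifying the privacy side (the MI-DP level each scheme attains) is the substance of the paper's analysis and is not reproduced here.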
The Noisy Power Method: A Meta Algorithm with Applications
We provide a new robust convergence analysis of the well-known power method
for computing the dominant singular vectors of a matrix; we call this variant
the noisy power method. Our result characterizes the convergence behavior of
the algorithm when a significant amount of noise is introduced after each
matrix-vector multiplication. The noisy power method can be seen as a
meta-algorithm that has recently found a number of important applications in a
broad range of machine learning problems including alternating minimization for
matrix completion, streaming principal component analysis (PCA), and
privacy-preserving spectral analysis. Our general analysis subsumes several
existing ad-hoc convergence bounds and resolves a number of open problems in
multiple applications including streaming PCA and privacy-preserving singular
vector computation.

Comment: NIPS 201
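The noisy power method described above can be sketched minimally; the test matrix, its spectral gap, the noise level, and the iteration count below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 50

# Symmetric test matrix with top eigenvalue 1.0 and a clear spectral gap.
U, _ = np.linalg.qr(rng.normal(size=(d, d)))
evals = np.concatenate(([1.0], np.full(d - 1, 0.3)))
A = U @ np.diag(evals) @ U.T

# Power iteration where each matrix-vector product is corrupted by noise.
x = rng.normal(size=d)
x /= np.linalg.norm(x)
noise_scale = 1e-3
for _ in range(200):
    y = A @ x + noise_scale * rng.normal(size=d)  # noisy matvec
    x = y / np.linalg.norm(y)

# Alignment with the true dominant eigenvector (sign-invariant).
alignment = abs(x @ U[:, 0])
print(alignment)
```

As long as the per-step noise is small relative to the spectral gap, the iterate stays aligned with the dominant eigenvector instead of converging exactly; the paper's analysis makes this noise-tolerance condition precise.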