20,858 research outputs found
An exact sinΘ formula for matrix perturbation analysis and its applications
In this paper, we establish a useful set of formulae for the
distance between the original and the perturbed singular subspaces. These
formulae explicitly show how the perturbation of the original matrix
propagates into singular vectors and singular subspaces, thus providing a
direct way of analyzing them. Following this, we derive a collection of new
results on SVD perturbation related problems, including a tighter bound on the
norm of the singular vector perturbation errors under
Gaussian noise, a new stability analysis of Principal Component Analysis,
and an error bound on the singular value thresholding operator. For the latter
two, we consider the most general rectangular matrices with full rank.
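For orientation, here is the kind of bound such exact formulae sharpen. This is the classical Davis-Kahan / Wedin sinΘ bound, quoted only as background and not as the paper's result: if A is perturbed to Ã = A + E and U, Ũ are corresponding singular (or eigen) subspaces separated from the rest of the spectrum by a gap δ, then, up to constants and the precise definition of the gap,
\[
\bigl\|\sin\Theta(U,\widetilde{U})\bigr\| \;\lesssim\; \frac{\|E\|}{\delta}.
\]
An exact formula, as announced above, would replace such a one-sided bound with an identity for the subspace distance.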
Asymmetry Helps: Eigenvalue and Eigenvector Analyses of Asymmetrically Perturbed Low-Rank Matrices
This paper is concerned with the interplay between statistical asymmetry and
spectral methods. Suppose we are interested in estimating a rank-1
symmetric matrix, yet only a randomly perturbed version of it is observed.
The noise matrix
is composed of zero-mean independent (but not
necessarily homoscedastic) entries and is, therefore, not symmetric in general.
This might arise, for example, when we have two independent samples for each
entry of the unknown matrix and arrange them into an asymmetric data
matrix. The aim is to estimate the leading eigenvalue and
eigenvector of the unknown matrix. We demonstrate that the leading eigenvalue
of the data matrix can be substantially more accurate (up
to some log factor) than its (unadjusted) leading singular value in
eigenvalue estimation. Further, the perturbation of any linear form of the
leading eigenvector of the data matrix (say, the entrywise eigenvector perturbation)
is provably well-controlled. This eigen-decomposition approach is fully
adaptive to heteroscedasticity of the noise without the need for careful bias
correction or any prior knowledge about the noise variance. We also provide
partial theory for the more general rank-$r$ case. The takeaway message is
this: arranging the data samples in an asymmetric manner and performing
eigen-decomposition could sometimes be beneficial.
Comment: accepted to the Annals of Statistics, 2020. 37 pages
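As a rough numerical illustration of the phenomenon described above (a minimal sketch, not the paper's estimator: the rank-1 signal, noise level, and NumPy routines below are assumptions chosen for the demo), one can compare the leading eigenvalue of an asymmetrically perturbed rank-1 matrix with its leading singular value:

```python
# Illustrative sketch only: compare the leading eigenvalue of an asymmetrically
# perturbed rank-1 matrix with its leading singular value.
import numpy as np

rng = np.random.default_rng(0)
n = 500
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
M = 5.0 * np.outer(u, u)                        # rank-1 symmetric ground truth, eigenvalue 5

sigma = 0.05
A = M + sigma * rng.standard_normal((n, n))     # independent entrywise noise -> asymmetric data matrix

# Leading eigenvalue of the (asymmetric) data matrix: take the eigenvalue of
# largest magnitude and keep its real part for this rank-1 setting.
eigvals = np.linalg.eigvals(A)
lead_eig = eigvals[np.argmax(np.abs(eigvals))].real

lead_sv = np.linalg.svd(A, compute_uv=False)[0]  # leading singular value of the same matrix

print("true leading eigenvalue:  ", 5.0)
print("eigenvalue estimate:      ", lead_eig)
print("singular value estimate:  ", lead_sv)
```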
Disturbance Grassmann Kernels for Subspace-Based Learning
In this paper, we focus on subspace-based learning problems, where data
elements are linear subspaces instead of vectors. To handle this kind of data,
Grassmann kernels were proposed to measure the space structure and used with
classifiers, e.g., Support Vector Machines (SVMs). However, the existing
discriminative algorithms mostly ignore the instability of subspaces, which
can leave the classifiers misled by disturbed instances. We therefore propose
accounting for all potential disturbances of the subspaces during learning to
obtain more robust classifiers. First, we derive the dual optimization of
linear classifiers with disturbance drawn from a known distribution, resulting
in a new kernel, the Disturbance Grassmann (DG) kernel. Second, we investigate
two kinds of disturbance, affecting the subspace matrix and the singular values
of the bases, with which we extend the Projection kernel on Grassmann manifolds to
two new kernels. Experiments on action data indicate that the proposed kernels
outperform state-of-the-art subspace-based methods, even in a worse
environment.
Comment: This paper includes 3 figures and 10 pages, and has been accepted to
SIGKDD'1
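For reference, here is a minimal sketch of the standard Projection kernel on the Grassmann manifold that the abstract extends (the DG kernels themselves are not reproduced here; the function name and dimensions are illustrative):

```python
# Minimal sketch of the standard Projection kernel on the Grassmann manifold.
import numpy as np

def projection_kernel(Y1: np.ndarray, Y2: np.ndarray) -> float:
    """Projection kernel k(Y1, Y2) = ||Y1^T Y2||_F^2 for orthonormal bases
    Y1, Y2 of shape (d, p) spanning p-dimensional subspaces of R^d."""
    return float(np.linalg.norm(Y1.T @ Y2, "fro") ** 2)

# Usage: build orthonormal bases via QR and evaluate the kernel; a Gram matrix
# of such values can then be passed to a kernel SVM.
rng = np.random.default_rng(0)
d, p = 20, 3
Y1, _ = np.linalg.qr(rng.standard_normal((d, p)))
Y2, _ = np.linalg.qr(rng.standard_normal((d, p)))
print(projection_kernel(Y1, Y2))
```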
Beating Randomized Response on Incoherent Matrices
Computing accurate low rank approximations of large matrices is a fundamental
data mining task. In many applications, however, the matrix contains sensitive
information about individuals. In such cases we would like to release a low rank
approximation that satisfies a strong privacy guarantee such as differential
privacy. Unfortunately, to date the best known algorithm for this task that
satisfies differential privacy is based on naive input perturbation or
randomized response: Each entry of the matrix is perturbed independently by a
sufficiently large random noise variable, and a low rank approximation is then
computed on the resulting matrix.
We give (the first) significant improvements in accuracy over randomized
response under the natural and necessary assumption that the matrix has low
coherence. Our algorithm is also very efficient and finds a constant rank
approximation of an m x n matrix in time O(mn). Note that even generating the
noise matrix required for randomized response already takes time O(mn).
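A hedged sketch of the randomized-response baseline described in the abstract, i.e., entrywise Gaussian perturbation followed by a truncated SVD. The noise calibration assumes each entry can change by at most 1 between neighboring matrices; parameter names are illustrative, and this is the baseline, not the improved algorithm from the paper:

```python
# Randomized-response baseline sketch: perturb every entry, then truncate the SVD.
import numpy as np

def rr_low_rank(A: np.ndarray, k: int, eps: float, delta: float, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    # Gaussian-mechanism scale for entrywise sensitivity 1 (an assumption on the data).
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    noisy = A + sigma * rng.standard_normal(A.shape)
    U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]   # rank-k approximation of the noisy matrix

# Example: a private rank-2 approximation of a 100 x 80 matrix.
A = np.random.default_rng(1).standard_normal((100, 80))
print(rr_low_rank(A, k=2, eps=1.0, delta=1e-6).shape)
```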
The Noisy Power Method: A Meta Algorithm with Applications
We provide a new robust convergence analysis of the well-known power method
for computing the dominant singular vectors of a matrix; we call this variant
the noisy power method. Our result characterizes the convergence behavior of the
algorithm when a significant amount of noise is introduced after each
matrix-vector multiplication. The noisy power method can be seen as a
meta-algorithm that has recently found a number of important applications in a
broad range of machine learning problems including alternating minimization for
matrix completion, streaming principal component analysis (PCA), and
privacy-preserving spectral analysis. Our general analysis subsumes several
existing ad-hoc convergence bounds and resolves a number of open problems in
multiple applications including streaming PCA and privacy-preserving singular
vector computation.
Comment: NIPS 201
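A minimal sketch of a noisy block power iteration in the spirit of the abstract (the Gaussian noise model, parameter names, and the restriction to a square symmetric input are assumptions for the demo, not the paper's exact meta-algorithm):

```python
# Noisy block power iteration sketch: inject noise after each multiplication,
# then re-orthonormalize.
import numpy as np

def noisy_power_method(A: np.ndarray, k: int, iters: int = 50,
                       noise_scale: float = 1e-3, seed: int = 0) -> np.ndarray:
    """Approximate the dominant k-dimensional invariant subspace of a
    symmetric matrix A, with noise added after every matrix product."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    X, _ = np.linalg.qr(rng.standard_normal((n, k)))       # random orthonormal start
    for _ in range(iters):
        Y = A @ X
        Y += noise_scale * rng.standard_normal(Y.shape)    # noise after each multiply
        X, _ = np.linalg.qr(Y)                              # re-orthonormalize
    return X

# Usage on a symmetric test matrix.
rng = np.random.default_rng(1)
B = rng.standard_normal((200, 200))
A = B @ B.T / 200.0
print(noisy_power_method(A, k=3).shape)
```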