A Matrix Hyperbolic Cosine Algorithm and Applications
In this paper, we generalize Spencer's hyperbolic cosine algorithm to the
matrix-valued setting. We apply the proposed algorithm to several problems by
analyzing its computational efficiency under two special cases of matrices: one
in which the matrices have a group structure and another in which they have
rank one. As an application of the former case, we present a deterministic
algorithm that, given the multiplication table of a finite group of size n,
constructs an expanding Cayley graph of logarithmic degree in near-optimal
O(n^2 log^3 n) time. For the latter case, we present a fast deterministic
algorithm for spectral sparsification of positive semi-definite matrices, which
implies an improved deterministic algorithm for spectral graph sparsification
of dense graphs. In addition, we give an elementary connection between spectral
sparsification of positive semi-definite matrices and element-wise matrix
sparsification. As a consequence, we obtain improved element-wise
sparsification algorithms for diagonally dominant-like matrices.
Comment: 16 pages, simplified proof and corrected acknowledgment of prior work
in (current) Section
Block CUR: Decomposing Matrices using Groups of Columns
A common problem in large-scale data analysis is to approximate a matrix
using a combination of specifically sampled rows and columns, known as CUR
decomposition. Unfortunately, in many real-world environments, the ability to
sample specific individual rows or columns of the matrix is limited by either
system constraints or cost. In this paper, we consider matrix approximation by
sampling predefined blocks of columns (or rows) from the matrix. We
present an algorithm for sampling useful column blocks and provide novel
guarantees for the quality of the approximation. This algorithm has applications
in problems as diverse as biometric data analysis and distributed computing. We
demonstrate the effectiveness of the proposed algorithms for computing the
Block CUR decomposition of large matrices in a distributed setting with
multiple nodes in a compute cluster, where such blocks correspond to columns
(or rows) of the matrix stored on the same node, which can be retrieved with
much less overhead than retrieving individual columns stored across different
nodes. In the biometric setting, the rows correspond to different users and
columns correspond to users' biometric reaction to external stimuli, e.g.,
watching video content, at a particular time instant. There is
significant cost in acquiring each user's reaction to lengthy content so we
sample a few important scenes to approximate the biometric response. An
individual time sample in this use case cannot be queried in isolation due to
the lack of context that caused that biometric reaction. Instead, collections
of time segments (i.e., blocks) must be presented to the user. The
practical application of these algorithms is shown via experimental results
using real-world user biometric data from a content testing environment.
Comment: shorter version to appear in ECML-PKDD 201
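The column-block idea in the abstract above can be illustrated with a minimal sketch. It assumes uniform sampling of blocks and uses only the column side of the decomposition (A is approximated by C X, with X fitted by least squares); the paper's actual sampling probabilities and its row-side factor are not given in the abstract, so the function name `block_column_projection` and all parameters are hypothetical.

```python
import numpy as np

def block_column_projection(A, blocks, num_blocks, rng):
    """Approximate A using a few sampled blocks of columns.

    `blocks` is a list of index arrays, one per predefined column block.
    Blocks are drawn uniformly without replacement (an assumption made
    for illustration, not the sampling rule of the paper).
    """
    chosen = rng.choice(len(blocks), size=num_blocks, replace=False)
    cols = np.concatenate([blocks[b] for b in chosen])
    C = A[:, cols]                                # sampled column blocks
    X, *_ = np.linalg.lstsq(C, A, rcond=None)     # least-squares coefficients
    return C, X, np.sort(chosen)

rng = np.random.default_rng(0)
# Rank-10 test matrix whose 40 columns are grouped into 8 blocks of 5.
A = rng.standard_normal((30, 10)) @ rng.standard_normal((10, 40))
blocks = [np.arange(5 * b, 5 * (b + 1)) for b in range(8)]

# Two blocks give 10 columns, which generically span the column space of A.
C, X, chosen = block_column_projection(A, blocks, num_blocks=2, rng=rng)
rel_err = np.linalg.norm(A - C @ X) / np.linalg.norm(A)
```

Here whole blocks are retrieved at once, mirroring the setting where columns stored on the same node are cheaper to fetch together than individually.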
Small ball probability, Inverse theorems, and applications
Let $\xi$ be a real random variable with mean zero and variance one and
$A = \{a_1, \dots, a_n\}$ be a multi-set in $\mathbb{R}^d$. The random sum
$S_A := a_1 \xi_1 + \dots + a_n \xi_n$, where the $\xi_i$ are i.i.d. copies of $\xi$,
is of fundamental importance in probability and its applications.
We discuss the small ball problem, the aim of which is to estimate the
maximum probability that $S_A$ belongs to a ball with given small radius,
following the discovery made by Littlewood-Offord and Erdős almost 70 years
ago. We will mainly focus on recent developments that characterize the
structure of those sets $A$ where the small ball probability is relatively
large. Applications of these results include full solutions to, or significant
progress on, many open problems in different areas.
Comment: 47 pages
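As a small worked example of the quantity discussed above, the following sketch computes the exact maximum point probability of a signed sum with i.i.d. Rademacher (uniform +/-1) signs and integer steps. This is a special case chosen for illustration (one dimension, radius small enough that the ball contains a single attainable value); the helper name `max_point_probability` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import comb

def max_point_probability(a):
    """Exact max over x of P(a_1 s_1 + ... + a_n s_n = x), s_i uniform in {-1, +1}."""
    a = list(a)
    dist = Counter({0: 1})              # number of sign patterns reaching each sum
    for ai in a:
        nxt = Counter()
        for s, c in dist.items():
            nxt[s + ai] += c
            nxt[s - ai] += c
        dist = nxt
    return Fraction(max(dist.values()), 2 ** len(a))

n = 10
p_equal = max_point_probability([1] * n)             # all steps equal to 1
p_distinct = max_point_probability(range(1, n + 1))  # distinct steps 1..n
central_binomial = Fraction(comb(n, n // 2), 2 ** n)
```

With all steps equal, the maximum matches the central binomial probability, which is the extremal case in the classical Littlewood-Offord-Erdős bound; distinct steps spread the sum out and give a smaller value.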
On the linear independence of spikes and sines
The purpose of this work is to survey what is known about the linear
independence of spikes and sines. The paper provides new results for the case
where the locations of the spikes and the frequencies of the sines are chosen
at random. This problem is equivalent to studying the spectral norm of a random
submatrix drawn from the discrete Fourier transform matrix. The proof
depends on an extrapolation argument of Bourgain and Tzafriri.
Comment: 16 pages, 4 figures. Revision with new proof of major theorem
On the nontrivial projection problem
The Nontrivial Projection Problem asks whether every finite-dimensional
normed space of dimension greater than one admits a well-bounded projection of
non-trivial rank and corank or, equivalently, whether every centrally symmetric
convex body (of arbitrary dimension greater than one) is approximately affinely
equivalent to a direct product of two bodies of non-trivial dimension. We show
that this is true "up to a logarithmic factor."
Comment: 17 pages
Sparsity and Incoherence in Compressive Sampling
We consider the problem of reconstructing a sparse signal $x^0 \in \mathbb{R}^n$ from a
limited number of linear measurements. Given $m$ randomly selected samples of
$U x^0$, where $U$ is an orthonormal matrix, we show that $\ell_1$ minimization
recovers $x^0$ exactly when the number of measurements satisfies
$m \geq \mathrm{Const} \cdot \mu^2(U) \cdot S \cdot \log n$, where $S$ is the number of
nonzero components in $x^0$, and $\mu$ is the largest entry in $U$ properly
normalized: $\mu(U) = \sqrt{n} \cdot \max_{k,j} |U_{k,j}|$. The smaller $\mu$,
the fewer samples needed.
The result holds for ``most'' sparse signals $x^0$ supported on a fixed (but
arbitrary) set $T$. Given $T$, if the sign of $x^0$ for each nonzero entry on
$T$ and the observed values of $U x^0$ are drawn at random, the signal is
recovered with overwhelming probability. Moreover, there is a sense in which
this is nearly optimal since any method succeeding with the same probability
would require just about this many samples.
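To make the role of the "largest entry, properly normalized" concrete, here is a minimal sketch that evaluates it for two extreme orthonormal matrices. It assumes the usual coherence definition from the compressive sampling literature, mu(U) = sqrt(n) * max |U[k, j]|; the function name `coherence` is chosen for illustration.

```python
import numpy as np

def coherence(U):
    """mu(U) = sqrt(n) * max_{k,j} |U[k, j]| for an n x n orthonormal matrix U."""
    n = U.shape[0]
    return np.sqrt(n) * np.abs(U).max()

n = 64
F = np.fft.fft(np.eye(n)) / np.sqrt(n)   # unitary discrete Fourier matrix
I = np.eye(n)                            # identity (spike) basis

mu_fourier = coherence(F)    # all entries have modulus 1/sqrt(n)
mu_identity = coherence(I)   # each column has a single entry of modulus 1
```

Under this definition the value ranges from 1 (maximally flat, as for the Fourier matrix) to sqrt(n) (maximally concentrated, as for the identity), so the flat case needs the fewest samples.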
Quantization and Compressive Sensing
Quantization is an essential step in digitizing signals, and, therefore, an
indispensable component of any modern acquisition system. This book chapter
explores the interaction of quantization and compressive sensing and examines
practical quantization strategies for compressive acquisition systems.
Specifically, we first provide a brief overview of quantization and examine
fundamental performance bounds applicable to any quantization approach. Next,
we consider several forms of scalar quantizers, namely uniform, non-uniform,
and 1-bit. We provide performance bounds and fundamental analysis, as well as
practical quantizer designs and reconstruction algorithms that account for
quantization. Furthermore, we provide an overview of Sigma-Delta
($\Sigma\Delta$) quantization in the compressed sensing context, and also
discuss implementation issues, recovery algorithms and performance bounds. As
we demonstrate, proper accounting for quantization and careful quantizer design
have a significant impact on the performance of a compressive acquisition system.
Comment: 35 pages, 20 figures, to appear in Springer book "Compressed Sensing
and Its Applications", 201
On Deterministic Sketching and Streaming for Sparse Recovery and Norm Estimation
We study classic streaming and sparse recovery problems using deterministic
linear sketches, including l1/l1 and linf/l1 sparse recovery problems (the
latter also being known as l1-heavy hitters), norm estimation, and approximate
inner product. We focus on devising a fixed matrix A in R^{m x n} and a
deterministic recovery/estimation procedure which work for all possible input
vectors simultaneously. Our results improve upon existing work, the following
being our main contributions:
* A proof that linf/l1 sparse recovery and inner product estimation are
equivalent, and that incoherent matrices can be used to solve both problems.
Our upper bound for the number of measurements is m=O(eps^{-2}*min{log n, (log
n / log(1/eps))^2}). We can also obtain fast sketching and recovery algorithms
by making use of the Fast Johnson-Lindenstrauss transform. Both our running
times and number of measurements improve upon previous work. We can also obtain
better error guarantees than previous work in terms of a smaller tail of the
input vector.
* A new lower bound for the number of linear measurements required to solve
l1/l1 sparse recovery. We show Omega(k/eps^2 + k log(n/k)/eps) measurements are
required to recover an x' with |x - x'|_1 <= (1+eps)|x_{tail(k)}|_1, where
x_{tail(k)} is x projected onto all but its largest k coordinates in magnitude.
* A tight bound of m = Theta(eps^{-2}log(eps^2 n)) on the number of
measurements required to solve deterministic norm estimation, i.e., to recover
|x|_2 +/- eps|x|_1.
For all the problems we study, tight bounds are already known for the
randomized complexity from previous work, except in the case of l1/l1 sparse
recovery, where a nearly tight bound is known. Our work thus aims to study the
deterministic complexities of these problems.
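The l1/l1 guarantee stated in the abstract above can be checked numerically for a given pair of vectors. The sketch below only evaluates the error condition |x - x'|_1 <= (1+eps)|x_{tail(k)}|_1; it does not implement any sketching matrix or recovery procedure, and the helper names `tail_l1` and `satisfies_l1_l1` are hypothetical.

```python
import numpy as np

def tail_l1(x, k):
    """l1 norm of x after zeroing its k largest-magnitude coordinates."""
    mags = np.sort(np.abs(x))            # ascending order
    return mags[: len(mags) - k].sum()

def satisfies_l1_l1(x, x_hat, k, eps):
    """Check |x - x_hat|_1 <= (1 + eps) * |x_tail(k)|_1."""
    return np.abs(x - x_hat).sum() <= (1 + eps) * tail_l1(x, k)

x = np.array([10.0, -7.0, 0.5, 0.25, -0.25, 0.0])
x_top2 = np.array([10.0, -7.0, 0.0, 0.0, 0.0, 0.0])   # keeps the two largest entries
x_zero = np.zeros_like(x)                              # trivial estimate

ok_top2 = satisfies_l1_l1(x, x_top2, k=2, eps=0.1)
ok_zero = satisfies_l1_l1(x, x_zero, k=2, eps=0.1)
```

Keeping the two heaviest coordinates meets the bound with k = 2, while the all-zero estimate does not, because its error includes the heavy coordinates themselves.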
Restricted Isometries for Partial Random Circulant Matrices
In the theory of compressed sensing, restricted isometry analysis has become
a standard tool for studying how efficiently a measurement matrix acquires
information about sparse and compressible signals. Many recovery algorithms are
known to succeed when the restricted isometry constants of the sampling matrix
are small. Many potential applications of compressed sensing involve a
data-acquisition process that proceeds by convolution with a random pulse
followed by (nonrandom) subsampling. At present, the theoretical analysis of
this measurement technique is lacking. This paper demonstrates that the $s$th
order restricted isometry constant is small when the number $m$ of samples
satisfies $m \gtrsim (s \log n)^{3/2}$, where $n$ is the length of the pulse.
This bound improves on previous estimates, which exhibit quadratic scaling.
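The measurement process described above (convolution with a random pulse followed by nonrandom subsampling) can be sketched as follows. The +/-1 pulse distribution, the regular subsampling pattern, and the function name `partial_circulant_measure` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def partial_circulant_measure(x, pulse, rows):
    """Circularly convolve x with `pulse`, then keep only the samples in `rows`."""
    conv = np.fft.ifft(np.fft.fft(pulse) * np.fft.fft(x)).real
    return conv[rows]

rng = np.random.default_rng(1)
n, m = 32, 8
pulse = rng.choice([-1.0, 1.0], size=n)   # random +/-1 pulse (assumed distribution)
rows = np.arange(0, n, n // m)            # fixed, nonrandom subsampling pattern
x = np.zeros(n)
x[[3, 17]] = [1.0, -2.0]                  # a 2-sparse test signal

y = partial_circulant_measure(x, pulse, rows)

# Same measurement via the explicit circulant matrix, for comparison:
# column j of Phi is the pulse cyclically shifted by j.
Phi = np.array([np.roll(pulse, j) for j in range(n)]).T
y_explicit = (Phi @ x)[rows]
```

The FFT route costs O(n log n) per measurement vector, whereas the explicit matrix is shown only to confirm that the two descriptions agree.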
Spectrum of non-Hermitian heavy tailed random matrices
Let (X_{jk})_{j,k>=1} be i.i.d. complex random variables such that |X_{jk}|
is in the domain of attraction of an alpha-stable law, with 0< alpha <2. Our
main result is a heavy tailed counterpart of Girko's circular law. Namely,
under some additional smoothness assumptions on the law of X_{jk}, we prove
that there exists a deterministic sequence a_n ~ n^{1/alpha} and a probability
measure mu_alpha on C depending only on alpha such that with probability one,
the empirical distribution of the eigenvalues of the rescaled matrix a_n^{-1}
(X_{jk})_{1<=j,k<=n} converges weakly to mu_alpha as n tends to infinity. Our
approach combines Aldous & Steele's objective method with Girko's Hermitization
using logarithmic potentials. The underlying limiting object is defined on a
bipartized version of Aldous' Poisson Weighted Infinite Tree. Recursive
relations on the tree provide some properties of mu_alpha. In contrast with the
Hermitian case, we find that mu_alpha is not heavy tailed.
Comment: Expanded version of a paper published in Communications in
Mathematical Physics 307, 513-560 (2011)