Time for dithering: fast and quantized random embeddings via the restricted isometry property
Recently, many works have focused on the characterization of non-linear
dimensionality reduction methods obtained by quantizing linear embeddings,
e.g., to reach fast processing time, efficient data compression procedures,
novel geometry-preserving embeddings or to estimate the information/bits stored
in this reduced data representation. In this work, we prove that many linear
maps known to respect the restricted isometry property (RIP) can induce a
quantized random embedding with controllable multiplicative and additive
distortions with respect to the pairwise distances of the data points being
considered. In other words, linear matrices having fast matrix-vector
multiplication algorithms (e.g., based on partial Fourier ensembles or on the
adjacency matrix of unbalanced expanders) can be readily used in the definition
of fast quantized embeddings with small distortions. This implication is made
possible by applying right after the linear map an additive and random "dither"
that stabilizes the impact of the uniform scalar quantization operator applied
afterwards. For different categories of RIP matrices, i.e., for different linear embeddings of a metric space $(\mathcal{K} \subset \mathbb{R}^n, \ell_q)$ in $(\mathbb{R}^m, \ell_p)$ with $p, q \geq 1$, we derive upper bounds on the additive distortion induced by quantization, showing that it decays either when the embedding dimension $m$ increases or when the distance of a pair of embedded vectors in $\mathcal{K}$ decreases. Finally, we develop a novel
"bi-dithered" quantization scheme, which allows for a reduced distortion that
decreases when the embedding dimension $m$ grows, independently of the considered pair of vectors.

Comment: Keywords: random projections, non-linear embeddings, quantization, dither, restricted isometry property, dimensionality reduction, compressive sensing, low-complexity signal models, fast and structured sensing matrices, quantized rank-one projections. (31 pages)
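As a concrete illustration of the construction described above, here is a minimal sketch in Python (the Gaussian matrix, the dimensions, and the resolution delta are assumptions, standing in for the fast RIP maps the paper actually considers): a linear map, followed by an additive random dither, followed by a componentwise uniform scalar quantizer.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantized_embedding(X, A, delta, dither):
    """Map each row x of X to Q(A x + u): A is the linear map, u a dither
    drawn uniformly in [0, delta)^m once and shared by all points, and Q a
    uniform scalar quantizer of resolution delta applied componentwise."""
    return delta * np.floor((X @ A.T + dither) / delta)

# Hypothetical sizes: n-dimensional data, m-dimensional quantized embedding.
n, m, delta = 256, 64, 0.5
A = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian stand-in for a RIP map
u = rng.uniform(0, delta, size=m)             # the additive random dither

X = rng.standard_normal((10, n))
Y = quantized_embedding(X, A, delta, u)
# Pairwise distances between rows of Y track those of X up to controllable
# multiplicative and additive distortions.
```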
Quantized Compressive K-Means
The recent framework of compressive statistical learning aims at designing
tractable learning algorithms that use only a heavily compressed
representation, or sketch, of massive datasets. Compressive K-Means (CKM) is such
a method: it estimates the centroids of data clusters from pooled, non-linear,
random signatures of the learning examples. While this approach significantly
reduces computational time on very large datasets, its digital implementation
wastes acquisition resources because the learning examples are compressed only
after the sensing stage. The present work generalizes the sketching procedure
initially defined in Compressive K-Means to a large class of periodic
nonlinearities including hardware-friendly implementations that compressively
acquire entire datasets. This idea is exemplified in a Quantized Compressive
K-Means procedure, a variant of CKM that leverages 1-bit universal quantization
(i.e. retaining the least significant bit of a standard uniform quantizer) as
the periodic sketch nonlinearity. Trading the original nonlinearity for this resource-efficient signature (standard in most acquisition schemes) has almost no impact on clustering performance, as illustrated by numerical experiments.
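To make the 1-bit universal quantization and the pooled sketch concrete, here is a minimal sketch (shapes, resolution, and dither distribution are assumptions, not the paper's exact parametrization):

```python
import numpy as np

def universal_1bit(t, delta):
    """1-bit universal quantization: the least significant bit of a uniform
    quantizer of resolution delta, i.e., a binary periodic (square-wave) map."""
    return (np.floor(t / delta) % 2).astype(np.int8)   # values in {0, 1}

def quantized_sketch(X, W, xi, delta):
    """Pooled sketch of a dataset: average, over all examples, of the periodic
    nonlinearity applied to dithered random projections of each example."""
    return universal_1bit(X @ W.T + xi, delta).mean(axis=0)

rng = np.random.default_rng(1)
n, m, delta = 2, 32, 1.0
W = rng.standard_normal((m, n))            # random projection directions
xi = rng.uniform(0, 2 * delta, size=m)     # dither over one square-wave period
X = rng.standard_normal((1000, n))         # the dataset to be sketched
z = quantized_sketch(X, W, xi, delta)      # one m-dimensional sketch for all X
```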
Quantization and Compressive Sensing
Quantization is an essential step in digitizing signals, and, therefore, an
indispensable component of any modern acquisition system. This book chapter
explores the interaction of quantization and compressive sensing and examines
practical quantization strategies for compressive acquisition systems.
Specifically, we first provide a brief overview of quantization and examine
fundamental performance bounds applicable to any quantization approach. Next,
we consider several forms of scalar quantizers, namely uniform, non-uniform,
and 1-bit. We provide performance bounds and fundamental analysis, as well as
practical quantizer designs and reconstruction algorithms that account for
quantization. Furthermore, we provide an overview of Sigma-Delta ($\Sigma\Delta$) quantization in the compressed sensing context, and also
discuss implementation issues, recovery algorithms and performance bounds. As
we demonstrate, proper accounting for quantization and careful quantizer design have a significant impact on the performance of a compressive acquisition system.

Comment: 35 pages, 20 figures, to appear in the Springer book "Compressed Sensing and Its Applications", 2015
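For readers unfamiliar with $\Sigma\Delta$ quantization, here is a minimal sketch of a generic first-order $\Sigma\Delta$ scalar quantizer (a textbook form under assumed parameters, not this chapter's specific design): each sample is quantized together with the accumulated quantization error, so the error is noise-shaped rather than independent across samples.

```python
import numpy as np

def sigma_delta_first_order(y, delta):
    """First-order Sigma-Delta quantization of a sequence y with a uniform
    quantizer of resolution delta. The state u accumulates the quantization
    error, pushing that error toward high frequencies (noise shaping)."""
    q = np.empty_like(y)
    u = 0.0
    for i, yi in enumerate(y):
        v = u + yi                           # add the accumulated error
        q[i] = delta * np.round(v / delta)   # uniform scalar quantizer
        u = v - q[i]                         # update the error state
    return q

y = np.sin(np.linspace(0, 4 * np.pi, 200))
q = sigma_delta_first_order(y, delta=0.5)
# Running sums of q track the running sums of y; the error state u stays
# within delta/2 of zero at every step.
```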
Small Width, Low Distortions: Quantized Random Embeddings of Low-complexity Sets
Under which conditions and with which distortions can we preserve the
pairwise-distances of low-complexity vectors, e.g., for structured sets such as
the set of sparse vectors or the one of low-rank matrices, when these are
mapped in a finite set of vectors? This work addresses this general question
through the specific use of a quantized and dithered random linear mapping
which combines, in the following order, a sub-Gaussian random projection in $\mathbb{R}^M$ of vectors in $\mathbb{R}^N$, a random translation, or "dither", of the projected vectors, and a uniform scalar quantizer of resolution $\delta > 0$ applied componentwise. Thanks to this quantized mapping we are first able to show that, with high probability, an embedding of a bounded set $\mathcal{K} \subset \mathbb{R}^N$ in $\delta\mathbb{Z}^M$ can be achieved when distances in the quantized and in the original domains are measured with the $\ell_1$- and $\ell_2$-norm, respectively, and provided the number of quantized observations $M$ is large compared to the square of the "Gaussian mean width" of $\mathcal{K}$. In this case, we show that the embedding is actually
"quasi-isometric" and only suffers of both multiplicative and additive
distortions whose magnitudes decrease as for general sets, and as
for structured set, when increases. Second, when one is only
interested in characterizing the maximal distance separating two elements of $\mathcal{K}$ mapped to the same quantized vector, i.e., the "consistency width" of the mapping, we show that for a similar number of measurements and with high probability this width also decays polynomially as $M$ increases, again faster for structured sets than for general ones. Finally, as an important aspect of our
work, we also establish how the non-Gaussianity of the mapping impacts the
class of vectors that can be embedded or whose consistency width provably
decays when $M$ increases.

Comment: Keywords: quantization, restricted isometry property, compressed sensing, dimensionality reduction. 31 pages, 1 figure
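A small numerical sanity check of the quasi-isometry claim (the Gaussian matrix and all sizes are assumptions): the per-coordinate $\ell_1$ distance between the quantized, dithered projections of two vectors concentrates around a fixed multiple of their $\ell_2$ distance.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, delta = 128, 2048, 0.8
A = rng.standard_normal((M, N))          # sub-Gaussian (here Gaussian) projection
u = rng.uniform(0, delta, size=M)        # random dither

def psi(x):
    """Quantized dithered mapping from R^N into delta * Z^M."""
    return delta * np.floor((A @ x + u) / delta)

x, y = rng.standard_normal(N), rng.standard_normal(N)
d_quant = np.abs(psi(x) - psi(y)).mean()             # (1/M) * l1-distance
d_orig = np.linalg.norm(x - y) * np.sqrt(2 / np.pi)  # E|g| scaling, Gaussian A
print(d_quant, d_orig)  # close, up to small multiplicative/additive distortions
```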
Breaking the waves: asymmetric random periodic features for low-bitrate kernel machines
Many signal processing and machine learning applications are built from
evaluating a kernel on pairs of signals, e.g. to assess the similarity of an
incoming query to a database of known signals. This nonlinear evaluation can be
simplified to a linear inner product of the random Fourier features of those
signals: random projections followed by a periodic map, the complex
exponential. It is known that a simple quantization of those features
(corresponding to replacing the complex exponential by a different periodic map
that takes binary values, which is appealing for their transmission and
storage) distorts the approximated kernel, which may be undesirable in
practice. Our take-home message is that when the features of only one of the
two signals are quantized, the original kernel is recovered without distortion;
this is of practical interest in several cases where the kernel evaluations
are asymmetric by nature, such as a client-server scheme. Concretely, we
introduce the general framework of asymmetric random periodic features, where
the two signals of interest are observed through random periodic features:
random projections followed by a general periodic map, which is allowed to be
different for both signals. We derive the influence of those periodic maps on
the approximated kernel, and prove uniform probabilistic error bounds holding
for all signal pairs from an infinite low-complexity set. Interestingly, our
results allow the periodic maps to be discontinuous, thanks to a new
mathematical tool, i.e. the mean Lipschitz smoothness. We then apply this
generic framework to semi-quantized kernel machines (where only one signal has
quantized features and the other has classical random Fourier features), for
which we show theoretically that the approximated kernel remains unchanged
(with the associated error bound), and confirm the power of the approach with
numerical simulations.
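A minimal numerical sketch of the semi-quantized scheme (dimensions, the Gaussian kernel bandwidth, and the choice of a square wave are assumptions; the paper's framework allows general periodic maps): one signal keeps classical random Fourier features, the other replaces the cosine by a binary square wave. Rescaling by pi/4, since the square wave's first Fourier harmonic has amplitude 4/pi, leaves the kernel estimate unbiased.

```python
import numpy as np

rng = np.random.default_rng(3)
d, m, sigma = 8, 4096, 2.0
W = rng.standard_normal((m, d)) / sigma   # frequencies for the Gaussian kernel
b = rng.uniform(0, 2 * np.pi, size=m)     # random phase (dither)

def rff(x):
    """Classical random Fourier features (real cosine form)."""
    return np.sqrt(2 / m) * np.cos(W @ x + b)

def rpf_binary(x):
    """The same features with the cosine replaced by a binary square wave."""
    return np.sqrt(2 / m) * np.sign(np.cos(W @ x + b))

x, y = rng.standard_normal(d), rng.standard_normal(d)
k_true = np.exp(-np.linalg.norm(x - y) ** 2 / (2 * sigma ** 2))
k_semi = (np.pi / 4) * rff(x) @ rpf_binary(y)  # pi/4 cancels the square wave's
                                               # first-harmonic amplitude 4/pi
print(k_true, k_semi)  # the semi-quantized estimate matches k_true on average
```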
Analysis of SparseHash: an efficient embedding of set-similarity via sparse projections
Embeddings provide compact representations of signals in order to perform
efficient inference in a wide variety of tasks. In particular, random
projections are common tools to construct Euclidean distance-preserving
embeddings, while hashing techniques are extensively used to embed
set-similarity metrics, such as the Jaccard coefficient. In this letter, we
theoretically prove that a class of random projections based on sparse
matrices, called SparseHash, can preserve the Jaccard coefficient between the
supports of sparse signals, which can be used to estimate set similarities.
Moreover, besides the analysis, we provide an efficient implementation and we
test the performance in several numerical experiments, both on synthetic and
real datasets.

Comment: 25 pages, 6 figures
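As context for the claim above, here is a minimal sketch of the quantity SparseHash is shown to preserve (the signals and the sparsity level are arbitrary choices): the Jaccard coefficient between the supports of two sparse vectors.

```python
import numpy as np

def jaccard_of_supports(x, y):
    """Jaccard coefficient |supp(x) & supp(y)| / |supp(x) | supp(y)|
    between the supports of two sparse vectors."""
    sx, sy = set(np.flatnonzero(x)), set(np.flatnonzero(y))
    return len(sx & sy) / len(sx | sy) if (sx | sy) else 1.0

rng = np.random.default_rng(4)
n, k = 1000, 50                                # ambient dimension, sparsity
x = np.zeros(n); x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = np.zeros(n); y[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
print(jaccard_of_supports(x, y))  # the set similarity SparseHash estimates
```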
A Quantized Johnson Lindenstrauss Lemma: The Finding of Buffon's Needle
In 1733, Georges-Louis Leclerc, Comte de Buffon in France, set the ground of
geometric probability theory by defining an enlightening problem: What is the
probability that a needle thrown randomly on a ground made of equispaced
parallel strips lies on two of them? In this work, we show that the solution to this problem, and its generalization to $N$ dimensions, allows us to discover a quantized form of the Johnson-Lindenstrauss (JL) Lemma, i.e., one that combines a linear dimensionality reduction procedure with a uniform quantization of precision $\delta > 0$. In particular, given a finite set $\mathcal{S} \subset \mathbb{R}^N$ of $S$ points and a distortion level $\epsilon > 0$, as soon as the embedding dimension $M$ exceeds $O(\epsilon^{-2} \log S)$, we can (randomly) construct a mapping from $(\mathcal{S}, \ell_2)$ to $(\delta\mathbb{Z}^M, \ell_1)$ that approximately preserves the pairwise distances between the points of $\mathcal{S}$.
Interestingly, compared to the common JL Lemma, the mapping is quasi-isometric and we observe both an additive and a multiplicative distortion on the embedded distances. These two distortions, however, decay as $O(\sqrt{(\log S)/M})$ when $M$ increases. Moreover, for coarse quantization, i.e., for $\delta$ high compared to the set radius, the distortion is mainly additive, while for small $\delta$ we tend to a Lipschitz isometric embedding. Finally, we prove the existence of a "nearly" quasi-isometric embedding of $(\mathcal{S}, \ell_2)$ into $(\delta\mathbb{Z}^M, \ell_2)$. This one involves a non-linear distortion of the $\ell_2$-distance in $\mathcal{S}$ that vanishes for distant points in this set. Noticeably, the additive distortion in this case decays more slowly, as $O(((\log S)/M)^{1/4})$.

Comment: 27 pages, 2 figures (note: this version corrects a few typos in the abstract)
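The classical Buffon result behind the title, as a quick Monte Carlo sanity check: a needle of length l <= delta dropped on parallel strips of spacing delta crosses a boundary with probability 2*l / (pi*delta). The needle length and spacing below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(5)
delta, ell, trials = 1.0, 0.7, 1_000_000   # strip spacing, needle length

# Position of the needle's center within one strip, and its orientation.
center = rng.uniform(0, delta, trials)
theta = rng.uniform(0, np.pi, trials)

# The needle crosses a boundary iff its extent perpendicular to the strips
# reaches the nearest line on either side.
half_span = (ell / 2) * np.sin(theta)
crosses = (center < half_span) | (center > delta - half_span)

print(crosses.mean())                      # empirical crossing probability
print(2 * ell / (np.pi * delta))           # Buffon's formula
```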
Binary Adaptive Embeddings from Order Statistics of Random Projections
We use some of the largest order statistics of the random projections of a reference signal to construct a binary embedding that is adapted to signals correlated with that reference. The embedding is characterized analytically and shown to provide improved performance on tasks such as classification in a reduced-dimensionality space.
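A minimal sketch of the construction as described (all sizes are assumptions, and keeping the largest magnitudes is one reading of "largest order statistics"): project the reference signal, keep the indices of its k largest-magnitude projections, and embed any signal by the signs of its projections along those selected directions.

```python
import numpy as np

rng = np.random.default_rng(6)
n, m, k = 512, 256, 64                    # signal dim, projections, kept indices

A = rng.standard_normal((m, n))
ref = rng.standard_normal(n)              # the reference signal

# Indices of the k largest order statistics of |A @ ref|.
idx = np.argsort(np.abs(A @ ref))[-k:]

def adaptive_binary_embedding(x):
    """Binary embedding adapted to the reference: signs of the projections
    of x on the k directions where the reference projects the largest."""
    return np.signbit(A[idx] @ x)         # k bits per signal

x_corr = ref + 0.3 * rng.standard_normal(n)    # correlated with the reference
x_rand = rng.standard_normal(n)                # unrelated signal
b_ref = adaptive_binary_embedding(ref)
print(np.mean(b_ref != adaptive_binary_embedding(x_corr)))  # small Hamming dist.
print(np.mean(b_ref != adaptive_binary_embedding(x_rand)))  # around 0.5
```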