
    Time for dithering: fast and quantized random embeddings via the restricted isometry property

    Recently, many works have focused on the characterization of non-linear dimensionality reduction methods obtained by quantizing linear embeddings, e.g., to reach fast processing times, efficient data compression procedures, novel geometry-preserving embeddings, or to estimate the information/bits stored in this reduced data representation. In this work, we prove that many linear maps known to respect the restricted isometry property (RIP) can induce a quantized random embedding with controllable multiplicative and additive distortions with respect to the pairwise distances of the data points being considered. In other words, linear matrices having fast matrix-vector multiplication algorithms (e.g., based on partial Fourier ensembles or on the adjacency matrix of unbalanced expanders) can be readily used in the definition of fast quantized embeddings with small distortions. This implication is made possible by applying, right after the linear map, an additive and random "dither" that stabilizes the impact of the uniform scalar quantization operator applied afterwards. For different categories of RIP matrices, i.e., for different linear embeddings of a metric space $(\mathcal K \subset \mathbb R^n, \ell_q)$ in $(\mathbb R^m, \ell_p)$ with $p,q \geq 1$, we derive upper bounds on the additive distortion induced by quantization, showing that it decays either when the embedding dimension $m$ increases or when the distance of a pair of embedded vectors in $\mathcal K$ decreases. Finally, we develop a novel "bi-dithered" quantization scheme, which allows for a reduced distortion that decreases when the embedding dimension grows, independently of the considered pair of vectors.
    Comment: Keywords: random projections, non-linear embeddings, quantization, dither, restricted isometry property, dimensionality reduction, compressive sensing, low-complexity signal models, fast and structured sensing matrices, quantized rank-one projections. 31 pages.
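
    The construction described above is simple to state: apply a linear RIP map, add a random uniform dither, then quantize each coordinate with a uniform scalar quantizer. The sketch below is a minimal illustration of that pipeline, with a dense Gaussian matrix standing in for a fast RIP matrix (e.g., a partial Fourier ensemble); all names and parameter choices are illustrative, not taken from the paper.

```python
import numpy as np

def dithered_quantized_embedding(X, m, delta, rng=np.random.default_rng(0)):
    """Quantized random embedding x -> Q_delta(Phi x + xi).

    Phi is a dense Gaussian matrix standing in for any RIP matrix (a partial
    Fourier ensemble or an expander adjacency matrix would allow fast
    multiplication); xi is a uniform dither on [0, delta)^m, drawn once and
    reused for every input so that codes remain comparable across inputs.
    """
    n = X.shape[1]
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # normalized random projection
    xi = rng.uniform(0.0, delta, size=m)             # additive random dither
    Y = X @ Phi.T + xi                                # linear map, then dither
    return delta * np.floor(Y / delta)                # uniform scalar quantizer, step delta
```

    Pairwise distances between the returned codes then approximate the original pairwise distances up to the multiplicative and additive distortions bounded in the paper.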

    Quantized Compressive K-Means

    The recent framework of compressive statistical learning aims at designing tractable learning algorithms that use only a heavily compressed representation, or sketch, of massive datasets. Compressive K-Means (CKM) is such a method: it estimates the centroids of data clusters from pooled, non-linear, random signatures of the learning examples. While this approach significantly reduces computational time on very large datasets, its digital implementation wastes acquisition resources because the learning examples are compressed only after the sensing stage. The present work generalizes the sketching procedure initially defined in Compressive K-Means to a large class of periodic nonlinearities, including hardware-friendly implementations that compressively acquire entire datasets. This idea is exemplified in a Quantized Compressive K-Means procedure, a variant of CKM that leverages 1-bit universal quantization (i.e., retaining the least significant bit of a standard uniform quantizer) as the periodic sketch nonlinearity. Trading the usual sketch nonlinearity for this resource-efficient signature (which is standard in most acquisition schemes) has almost no impact on the clustering performance, as illustrated by numerical experiments.
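
    For concreteness, the 1-bit universal quantizer mentioned above keeps only the least significant bit of a uniform quantizer, i.e., a binary square wave. Below is a minimal sketch of how a dataset could be pooled into such a signature; the Gaussian frequency matrix, the dither, and the plain averaging are illustrative assumptions rather than the exact QCKM recipe.

```python
import numpy as np

def universal_1bit(t, delta):
    """1-bit universal quantization: the least significant bit of a uniform
    quantizer of step delta, mapped to {-1, +1} (a binary square wave)."""
    return 1.0 - 2.0 * (np.floor(t / delta).astype(int) & 1)

def quantized_sketch(X, m, delta, rng=np.random.default_rng(0)):
    """Pooled 1-bit signature of a dataset X (rows are learning examples):
    the empirical average of universal_1bit(Omega x + xi) over the examples.
    Omega (random frequencies) and xi (random dither) are illustrative."""
    n = X.shape[1]
    Omega = rng.standard_normal((m, n))
    xi = rng.uniform(0.0, 2.0 * delta, size=m)
    return universal_1bit(X @ Omega.T + xi, delta).mean(axis=0)
```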

    Quantization and Compressive Sensing

    Quantization is an essential step in digitizing signals and, therefore, an indispensable component of any modern acquisition system. This book chapter explores the interaction of quantization and compressive sensing and examines practical quantization strategies for compressive acquisition systems. Specifically, we first provide a brief overview of quantization and examine fundamental performance bounds applicable to any quantization approach. Next, we consider several forms of scalar quantizers, namely uniform, non-uniform, and 1-bit. We provide performance bounds and fundamental analysis, as well as practical quantizer designs and reconstruction algorithms that account for quantization. Furthermore, we provide an overview of Sigma-Delta ($\Sigma\Delta$) quantization in the compressed sensing context, and also discuss implementation issues, recovery algorithms, and performance bounds. As we demonstrate, properly accounting for quantization and carefully designing the quantizer have a significant impact on the performance of a compressive acquisition system.
    Comment: 35 pages, 20 figures, to appear in Springer book "Compressed Sensing and Its Applications", 201
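
    As one concrete example of the noise-shaping quantizers discussed in the chapter, a first-order Sigma-Delta scheme quantizes each measurement together with an internal state and feeds the quantization error back into that state. The sketch below is a generic first-order loop with a mid-rise uniform quantizer; it illustrates the principle and is not the chapter's specific design.

```python
import numpy as np

def first_order_sigma_delta(y, delta):
    """First-order Sigma-Delta quantization of a measurement vector y:
    each sample is quantized together with the running state u, and the
    quantization error is fed back into u (noise shaping)."""
    q = np.empty_like(y, dtype=float)
    u = 0.0                                      # internal state (accumulated error)
    for i, yi in enumerate(y):
        q[i] = delta * np.floor((u + yi) / delta) + delta / 2   # mid-rise uniform quantizer
        u = u + yi - q[i]                                       # error feedback
    return q
```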

    Small Width, Low Distortions: Quantized Random Embeddings of Low-complexity Sets

    Under which conditions, and with which distortions, can we preserve the pairwise distances of low-complexity vectors, e.g., for structured sets such as the set of sparse vectors or that of low-rank matrices, when these are mapped into a finite set of vectors? This work addresses this general question through the specific use of a quantized and dithered random linear mapping which combines, in the following order, a sub-Gaussian random projection in $\mathbb R^M$ of vectors in $\mathbb R^N$, a random translation, or "dither", of the projected vectors, and a uniform scalar quantizer of resolution $\delta>0$ applied componentwise. Thanks to this quantized mapping we are first able to show that, with high probability, an embedding of a bounded set $\mathcal K \subset \mathbb R^N$ in $\delta \mathbb Z^M$ can be achieved when distances in the quantized and in the original domains are measured with the $\ell_1$- and $\ell_2$-norm, respectively, provided the number of quantized observations $M$ is large compared to the square of the "Gaussian mean width" of $\mathcal K$. In this case, we show that the embedding is actually "quasi-isometric" and only suffers from both multiplicative and additive distortions whose magnitudes decrease as $M^{-1/5}$ for general sets, and as $M^{-1/2}$ for structured sets, when $M$ increases. Second, when one is only interested in characterizing the maximal distance separating two elements of $\mathcal K$ mapped to the same quantized vector, i.e., the "consistency width" of the mapping, we show that for a similar number of measurements and with high probability this width decays as $M^{-1/4}$ for general sets and as $1/M$ for structured ones when $M$ increases. Finally, as an important aspect of our work, we also establish how the non-Gaussianity of the mapping impacts the class of vectors that can be embedded or whose consistency width provably decays when $M$ increases.
    Comment: Keywords: quantization, restricted isometry property, compressed sensing, dimensionality reduction. 31 pages, 1 figure.
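
    A quick numerical check of the announced $\ell_2$-to-$\ell_1$ quasi-isometry can be run with a Gaussian projection (one admissible sub-Gaussian choice), a uniform dither, and a uniform quantizer. The rescaling constant $\sqrt{\pi/2}/M$ below is the standard Gaussian $\ell_1$/$\ell_2$ calibration; the parameter values are arbitrary and this is only an illustration, not the paper's experiment.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, delta = 128, 4096, 0.5
A = rng.standard_normal((M, N))                    # Gaussian (hence sub-Gaussian) projection
u = rng.uniform(0.0, delta, size=M)                # uniform dither

def embed(x):
    return delta * np.floor((A @ x + u) / delta)   # uniform quantizer of resolution delta

for _ in range(3):                                 # a few random (unstructured) pairs
    x, y = rng.standard_normal(N), rng.standard_normal(N)
    d_l2 = np.linalg.norm(x - y)                                            # original l2 distance
    d_l1 = np.sqrt(np.pi / 2) / M * np.linalg.norm(embed(x) - embed(y), 1)  # rescaled l1 code distance
    print(f"l2 = {d_l2:.3f}   rescaled l1 = {d_l1:.3f}")
```

    Up to the additive and multiplicative distortions quantified in the paper, the two printed values should be close.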

    Breaking the waves: asymmetric random periodic features for low-bitrate kernel machines

    Many signal processing and machine learning applications are built from evaluating a kernel on pairs of signals, e.g., to assess the similarity of an incoming query to a database of known signals. This nonlinear evaluation can be simplified to a linear inner product of the random Fourier features of those signals: random projections followed by a periodic map, the complex exponential. It is known that a simple quantization of those features (corresponding to replacing the complex exponential by a different periodic map that takes binary values, which is appealing for transmission and storage) distorts the approximated kernel, which may be undesirable in practice. Our take-home message is that when the features of only one of the two signals are quantized, the original kernel is recovered without distortion; its practical interest appears in several cases where the kernel evaluations are asymmetric by nature, such as a client-server scheme. Concretely, we introduce the general framework of asymmetric random periodic features, where the two signals of interest are observed through random periodic features: random projections followed by a general periodic map, which is allowed to be different for the two signals. We derive the influence of those periodic maps on the approximated kernel and prove uniform probabilistic error bounds holding for all signal pairs from an infinite low-complexity set. Interestingly, our results allow the periodic maps to be discontinuous, thanks to a new mathematical tool, i.e., the mean Lipschitz smoothness. We then apply this generic framework to semi-quantized kernel machines (where only one signal has quantized features and the other has classical random Fourier features), for which we show theoretically that the approximated kernel remains unchanged (with the associated error bound), and confirm the power of the approach with numerical simulations.
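
    The semi-quantized setting can be illustrated numerically: keep classical random Fourier features on one side, a binary square-wave feature on the other, and rescale the inner product by the first Fourier coefficient of the square wave. The sketch below does this for a Gaussian kernel; the $\pi/2$ correction and all parameters are assumptions made for the illustration, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, sigma = 16, 20000, 1.0
Omega = rng.standard_normal((m, n)) / sigma        # frequencies for a Gaussian kernel
b = rng.uniform(0.0, 2.0 * np.pi, size=m)          # random phases (dither)

def rff(x):                                        # classical random Fourier feature
    return np.cos(Omega @ x + b)

def binary_feature(x):                             # square-wave (1-bit) periodic feature
    return np.sign(np.cos(Omega @ x + b))

x = rng.standard_normal(n)
y = x + 0.2 * rng.standard_normal(n)               # a nearby signal, so the kernel is not tiny
k_true = np.exp(-np.linalg.norm(x - y) ** 2 / (2 * sigma ** 2))   # Gaussian kernel value
k_rff  = 2.0 / m * rff(x) @ rff(y)                                # unquantized estimate
k_full = 1.0 / m * binary_feature(x) @ binary_feature(y)          # both sides quantized: distorted
k_semi = np.pi / m * binary_feature(x) @ rff(y)                   # one side quantized, rescaled
print(k_true, k_rff, k_full, k_semi)
```

    With these (assumed) scalings, the fully quantized estimate tracks a distorted kernel, while the semi-quantized one concentrates around the true Gaussian kernel value.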

    Analysis of SparseHash: an efficient embedding of set-similarity via sparse projections

    Embeddings provide compact representations of signals in order to perform efficient inference in a wide variety of tasks. In particular, random projections are common tools to construct Euclidean distance-preserving embeddings, while hashing techniques are extensively used to embed set-similarity metrics, such as the Jaccard coefficient. In this letter, we theoretically prove that a class of random projections based on sparse matrices, called SparseHash, can preserve the Jaccard coefficient between the supports of sparse signals, which can be used to estimate set similarities. Moreover, besides the analysis, we provide an efficient implementation and test the performance in several numerical experiments, both on synthetic and real datasets.
    Comment: 25 pages, 6 figures.
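
    For reference, the set-similarity metric that SparseHash is shown to preserve is the Jaccard coefficient between supports; a minimal helper computing it is given below (the SparseHash construction itself is not reproduced here).

```python
import numpy as np

def support_jaccard(x, y, tol=0.0):
    """Jaccard coefficient between the supports of two (sparse) vectors:
    |supp(x) & supp(y)| / |supp(x) | supp(y)|."""
    sx, sy = np.abs(x) > tol, np.abs(y) > tol
    union = np.count_nonzero(sx | sy)
    return np.count_nonzero(sx & sy) / union if union else 1.0
```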

    A Quantized Johnson Lindenstrauss Lemma: The Finding of Buffon's Needle

    In 1733, Georges-Louis Leclerc, Comte de Buffon, in France, laid the foundations of geometric probability theory by posing an enlightening problem: What is the probability that a needle thrown randomly on a ground made of equispaced parallel strips lies on two of them? In this work, we show that the solution to this problem, and its generalization to $N$ dimensions, allows us to discover a quantized form of the Johnson-Lindenstrauss (JL) Lemma, i.e., one that combines a linear dimensionality reduction procedure with a uniform quantization of precision $\delta>0$. In particular, given a finite set $\mathcal S \subset \mathbb R^N$ of $S$ points and a distortion level $\epsilon>0$, as soon as $M > M_0 = O(\epsilon^{-2} \log S)$, we can (randomly) construct a mapping from $(\mathcal S, \ell_2)$ to $(\delta\mathbb Z^M, \ell_1)$ that approximately preserves the pairwise distances between the points of $\mathcal S$. Interestingly, compared to the common JL Lemma, the mapping is quasi-isometric and we observe both an additive and a multiplicative distortion on the embedded distances. These two distortions, however, decay as $O(\sqrt{(\log S)/M})$ when $M$ increases. Moreover, for coarse quantization, i.e., for large $\delta$ compared to the set radius, the distortion is mainly additive, while for small $\delta$ we tend to a Lipschitz isometric embedding. Finally, we prove the existence of a "nearly" quasi-isometric embedding of $(\mathcal S, \ell_2)$ into $(\delta\mathbb Z^M, \ell_2)$. This one involves a non-linear distortion of the $\ell_2$-distance in $\mathcal S$ that vanishes for distant points in this set. Noticeably, the additive distortion in this case decays more slowly, as $O(\sqrt[4]{(\log S)/M})$.
    Comment: 27 pages, 2 figures (note: this version corrects a few typos in the abstract).
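
    The geometric starting point above is the classical Buffon needle probability, $2\ell/(\pi d)$ for a needle of length $\ell$ thrown on strips of width $d \geq \ell$. A short Monte Carlo check of that closed form is sketched below (parameter values arbitrary).

```python
import numpy as np

def buffon_crossing_probability(ell, d, trials=1_000_000, rng=np.random.default_rng(3)):
    """Monte Carlo estimate of the probability that a needle of length ell,
    thrown on parallel strips of width d >= ell, crosses a strip boundary.
    Closed form (Buffon, 1733): 2 * ell / (pi * d)."""
    x = rng.uniform(0.0, d / 2, trials)            # distance from needle centre to nearest line
    theta = rng.uniform(0.0, np.pi / 2, trials)    # needle angle with respect to the lines
    return np.mean(x <= (ell / 2) * np.sin(theta))

print(buffon_crossing_probability(1.0, 2.0), 2 * 1.0 / (np.pi * 2.0))
```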

    Binary Adaptive Embeddings from Order Statistics of Random Projections

    We use some of the largest order statistics of the random projections of a reference signal to construct a binary embedding that is adapted to signals correlated with that reference. The embedding is characterized analytically and shown to provide improved performance on tasks such as classification in a reduced-dimensionality space.
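
    The abstract does not spell out the construction, but one plausible reading is: draw random projections, keep the directions on which the reference signal responds most strongly (its largest order statistics), and binarize other signals on those directions. The sketch below follows that reading and should be taken as a hypothetical illustration, not the paper's exact scheme.

```python
import numpy as np

def adaptive_binary_embedding(reference, X, m, k, rng=np.random.default_rng(4)):
    """Hypothetical construction: draw m random projections, keep the k
    directions on which the reference signal has the largest responses
    (its largest order statistics), and binarize other signals there."""
    n = reference.shape[0]
    A = rng.standard_normal((m, n))
    keep = np.argsort(A @ reference)[-k:]    # indices of the k largest projections
    return np.sign(X @ A[keep].T)            # k-bit codes adapted to the reference
```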