2,221 research outputs found

    Binary Adaptive Embeddings from Order Statistics of Random Projections

    Get PDF
    We use some of the largest order statistics of the random projections of a reference signal to construct a binary embedding that is adapted to signals correlated with such signal. The embedding is characterized from the analytical standpoint and shown to provide improved performance on tasks such as classification in a reduced-dimensionality space

    Hashing for Similarity Search: A Survey

    Full text link
    Similarity search (nearest neighbor search) is a problem of pursuing the data items whose distances to a query item are the smallest from a large database. Various methods have been developed to address this problem, and recently a lot of efforts have been devoted to approximate search. In this paper, we present a survey on one of the main solutions, hashing, which has been widely studied since the pioneering work locality sensitive hashing. We divide the hashing algorithms two main categories: locality sensitive hashing, which designs hash functions without exploring the data distribution and learning to hash, which learns hash functions according the data distribution, and review them from various aspects, including hash function design and distance measure and search scheme in the hash coding space

    Time for dithering: fast and quantized random embeddings via the restricted isometry property

    Full text link
    Recently, many works have focused on the characterization of non-linear dimensionality reduction methods obtained by quantizing linear embeddings, e.g., to reach fast processing time, efficient data compression procedures, novel geometry-preserving embeddings or to estimate the information/bits stored in this reduced data representation. In this work, we prove that many linear maps known to respect the restricted isometry property (RIP) can induce a quantized random embedding with controllable multiplicative and additive distortions with respect to the pairwise distances of the data points beings considered. In other words, linear matrices having fast matrix-vector multiplication algorithms (e.g., based on partial Fourier ensembles or on the adjacency matrix of unbalanced expanders) can be readily used in the definition of fast quantized embeddings with small distortions. This implication is made possible by applying right after the linear map an additive and random "dither" that stabilizes the impact of the uniform scalar quantization operator applied afterwards. For different categories of RIP matrices, i.e., for different linear embeddings of a metric space (K⊂Rn,ℓq)(\mathcal K \subset \mathbb R^n, \ell_q) in (Rm,ℓp)(\mathbb R^m, \ell_p) with p,q≥1p,q \geq 1, we derive upper bounds on the additive distortion induced by quantization, showing that it decays either when the embedding dimension mm increases or when the distance of a pair of embedded vectors in K\mathcal K decreases. Finally, we develop a novel "bi-dithered" quantization scheme, which allows for a reduced distortion that decreases when the embedding dimension grows and independently of the considered pair of vectors.Comment: Keywords: random projections, non-linear embeddings, quantization, dither, restricted isometry property, dimensionality reduction, compressive sensing, low-complexity signal models, fast and structured sensing matrices, quantized rank-one projections (31 pages
    • …
    corecore