
    Anti-sparse coding for approximate nearest neighbor search

    This paper proposes a binarization scheme for high-dimensional vectors based on the recent concept of anti-sparse coding, and shows its excellent performance for approximate nearest neighbor search. Unlike other binarization schemes, this framework allows the original vector to be explicitly reconstructed, up to a scaling factor, from its binary representation. The paper also shows that random projections, which are used in Locality Sensitive Hashing algorithms, are significantly outperformed by regular frames on both synthetic and real data if the number of bits exceeds the vector dimensionality, i.e., when high precision is required. Comment: submitted to ICASSP'2012; RR-7771 (2011

    Hashing for Similarity Search: A Survey

    Similarity search (nearest neighbor search) is the problem of finding, in a large database, the data items whose distances to a query item are smallest. Various methods have been developed to address this problem, and recently much effort has been devoted to approximate search. In this paper, we present a survey of one of the main solutions, hashing, which has been widely studied since the pioneering work on locality sensitive hashing. We divide the hashing algorithms into two main categories: locality sensitive hashing, which designs hash functions without exploring the data distribution, and learning to hash, which learns hash functions according to the data distribution. We review them from various aspects, including hash function design, distance measure, and search scheme in the hash coding space
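
    The data-independent category described in this abstract can be illustrated with sign random projections, one common locality sensitive hashing family for angular similarity. The following is a minimal sketch and is not taken from the survey: the function names, the 8-dimensional toy vectors, and the 32-bit code length are all assumptions chosen for illustration.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes; no training data is used."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_vector(x, hyperplanes):
    """One bit per hyperplane: 1 if x lies on its non-negative side, else 0."""
    return [
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) >= 0 else 0
        for w in hyperplanes
    ]


def hamming(a, b):
    """Number of positions where two binary codes differ."""
    return sum(u != v for u, v in zip(a, b))


# Hypothetical toy data: a query, a slightly perturbed copy, and its negation.
planes = make_hyperplanes(dim=8, n_bits=32)
query = [1.0, 0.5, -0.3, 2.0, 0.0, -1.2, 0.7, 0.1]
near = [q + 0.01 for q in query]
far = [-q for q in query]

code_q = hash_vector(query, planes)
print("Hamming distance to near vector:", hamming(code_q, hash_vector(near, planes)))
print("Hamming distance to far vector: ", hamming(code_q, hash_vector(far, planes)))
```

    Searching then amounts to comparing binary codes by Hamming distance instead of comparing the original vectors. A learning-to-hash method would replace the random hyperplanes with ones fitted to the data distribution.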

    Democratic Representations

    Minimization of the $\ell_{\infty}$ (or maximum) norm subject to a constraint that imposes consistency to an underdetermined system of linear equations finds use in a large number of practical applications, including vector quantization, approximate nearest neighbor search, peak-to-average power ratio (or "crest factor") reduction in communication systems, and peak force minimization in robotics and control. This paper analyzes the fundamental properties of signal representations obtained by solving such a convex optimization problem. We develop bounds on the maximum magnitude of such representations using the uncertainty principle (UP) introduced by Lyubarskii and Vershynin, and study the efficacy of $\ell_{\infty}$-norm-based dynamic range reduction. Our analysis shows that matrices satisfying the UP, such as randomly subsampled Fourier or i.i.d. Gaussian matrices, enable the computation of what we call democratic representations, whose entries all have small and similar magnitude, as well as low dynamic range. To compute democratic representations at low computational complexity, we present two new, efficient convex optimization algorithms. We finally demonstrate the efficacy of democratic representations for dynamic range reduction in a DVB-T2-based broadcast system. Comment: Submitted to a Journa
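
    The optimization problem this abstract describes can be written out as follows. This is a sketch of the simplest, noiseless form only. The symbols are assumptions introduced here and may differ from the paper's notation, and the paper's exact consistency constraint may be more general: $y$ denotes the signal, $D$ a wide matrix with fewer rows than columns, and $\hat{x}$ the resulting representation.

```latex
\[
  \hat{x} \;=\; \arg\min_{x \in \mathbb{R}^{N}} \; \|x\|_{\infty}
  \quad \text{subject to} \quad y = D x,
  \qquad D \in \mathbb{R}^{M \times N}, \; M < N,
\]
\[
  \text{where} \quad \|x\|_{\infty} \;=\; \max_{1 \le i \le N} |x_i| .
\]
```

    Because the system is underdetermined, many vectors $x$ satisfy $y = Dx$. Minimizing the largest entry magnitude selects one whose energy is spread across its entries, which corresponds to the small, similar magnitudes and low dynamic range mentioned in the abstract.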