Anti-sparse coding for approximate nearest neighbor search
This paper proposes a binarization scheme for high-dimensional vectors based
on the recent concept of anti-sparse coding, and shows its excellent
performance for approximate nearest neighbor search. Unlike other binarization
schemes, this framework allows, up to a scaling factor, the explicit
reconstruction of the original vector from its binary representation. The paper
also shows that random projections, which are used in Locality Sensitive Hashing
algorithms, are significantly outperformed by regular frames on both synthetic
and real data if the number of bits exceeds the vector dimensionality, i.e.,
when high precision is required.
Comment: submitted to ICASSP'2012; RR-7771 (2011
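As a rough illustration of what "reconstruction up to a scaling factor" from sign bits can mean, the sketch below binarizes a vector against a frame with more vectors than dimensions and then estimates its direction from the bits alone. It is not the paper's anti-sparse coding procedure: the random Gaussian frame, the plain signed-sum decoder, and all function names are assumptions chosen for illustration.

```python
import math
import random

def random_frame(dim, n_bits, rng):
    """Return n_bits random Gaussian directions in R^dim (an assumed stand-in frame)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def binarize(frame, x):
    """One bit per frame vector: the sign of its inner product with x."""
    return [1 if sum(a * v for a, v in zip(row, x)) >= 0 else -1 for row in frame]

def reconstruct(frame, bits):
    """Estimate the direction of x as the signed sum of the frame vectors."""
    dim = len(frame[0])
    est = [0.0] * dim
    for row, b in zip(frame, bits):
        for j in range(dim):
            est[j] += b * row[j]
    return est

def cosine(u, v):
    """Cosine similarity; insensitive to the (lost) scale of the estimate."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

rng = random.Random(0)
dim, n_bits = 8, 256  # more bits than dimensions: the "high precision" regime
frame = random_frame(dim, n_bits, rng)
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
bits = binarize(frame, x)
x_hat = reconstruct(frame, bits)
print(round(cosine(x, x_hat), 3))  # direction is compared; the norm of x is not recovered
```

The cosine similarity is used because the bits carry no information about the norm of `x`, so only its direction can be compared with the estimate.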
Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is the problem of finding, in a
large database, the data items whose distances to a query item are the smallest.
Various methods have been developed to address this problem, and recently a lot
of effort has been devoted to approximate search. In this paper, we present a
survey of one of the main solutions, hashing, which has been widely studied
since the pioneering work on locality sensitive hashing. We divide the hashing
algorithms into two main categories: locality sensitive hashing, which designs
hash functions without exploring the data distribution, and learning to hash,
which learns hash functions according to the data distribution. We review them
from various aspects, including hash function design, distance measure, and
search scheme in the hash coding space.
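The first category can be sketched in a few lines. The example below assumes one common data-independent hash function (signs of random hyperplane projections), Hamming distance as the measure, and exhaustive Hamming ranking as the search scheme. These choices, the function names, and the toy data are illustrative assumptions, not taken from the survey.

```python
import random

def make_hash(dim, n_bits, rng):
    """Draw n_bits random hyperplanes; no data is inspected (data-independent)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def encode(planes, x):
    """Pack the sign pattern of x against each hyperplane into an int code."""
    code = 0
    for row in planes:
        bit = sum(a * v for a, v in zip(row, x)) >= 0
        code = (code << 1) | int(bit)
    return code

def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")

def search(codes, query_code, k):
    """Return indices of the k database codes closest in Hamming distance."""
    order = sorted(range(len(codes)), key=lambda i: hamming(codes[i], query_code))
    return order[:k]

rng = random.Random(1)
dim, n_bits = 16, 64
planes = make_hash(dim, n_bits, rng)
database = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(200)]
codes = [encode(planes, x) for x in database]

# Query: a slightly perturbed copy of item 42, so item 42 should rank first.
query = [v + 0.01 * rng.gauss(0.0, 1.0) for v in database[42]]
top = search(codes, encode(planes, query), k=3)
print(top)
```

A learning-to-hash method would replace `make_hash` with a procedure fitted to the database, while `encode`, `hamming`, and `search` could stay the same.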
Democratic Representations
Minimization of the $\ell_\infty$ (or maximum) norm subject to a constraint
that imposes consistency to an underdetermined system of linear equations finds
use in a large number of practical applications, including vector quantization,
approximate nearest neighbor search, peak-to-average power ratio (or "crest
factor") reduction in communication systems, and peak force minimization in
robotics and control. This paper analyzes the fundamental properties of signal
representations obtained by solving such a convex optimization problem. We
develop bounds on the maximum magnitude of such representations using the
uncertainty principle (UP) introduced by Lyubarskii and Vershynin, and study
the efficacy of $\ell_\infty$-norm-based dynamic range reduction. Our
analysis shows that matrices satisfying the UP, such as randomly subsampled
Fourier or i.i.d. Gaussian matrices, enable the computation of what we call
democratic representations, whose entries all have small and similar magnitude,
as well as low dynamic range. To compute democratic representations at low
computational complexity, we present two new, efficient convex optimization
algorithms. We finally demonstrate the efficacy of democratic representations
for dynamic range reduction in a DVB-T2-based broadcast system.
Comment: Submitted to a Journa
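Written out, the optimization problem described in this abstract could take the following form. The symbols $D$, $y$, $x$, $M$ and $N$ are notation assumed here for illustration, since the abstract fixes none, and the exact-equality constraint is only one possible way of imposing consistency.

```latex
\begin{equation*}
\hat{x} \;=\; \arg\min_{x \in \mathbb{R}^{N}} \; \|x\|_{\infty}
\quad \text{subject to} \quad y = D x,
\qquad D \in \mathbb{R}^{M \times N},\; M < N,
\end{equation*}
where $\|x\|_{\infty} = \max_{i} |x_i|$.
```

Because $M < N$, many vectors $x$ satisfy $y = Dx$. The $\ell_\infty$ objective selects one whose largest entry is as small as possible, which is consistent with the abstract's description of entries that "all have small and similar magnitude".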