2,221 research outputs found
Binary Adaptive Embeddings from Order Statistics of Random Projections
We use some of the largest order statistics of the random projections of a
reference signal to construct a binary embedding that is adapted to signals
correlated with such signal. The embedding is characterized from the analytical
standpoint and shown to provide improved performance on tasks such as
classification in a reduced-dimensionality space
Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is a problem of pursuing the data
items whose distances to a query item are the smallest from a large database.
Various methods have been developed to address this problem, and recently a lot
of efforts have been devoted to approximate search. In this paper, we present a
survey on one of the main solutions, hashing, which has been widely studied
since the pioneering work locality sensitive hashing. We divide the hashing
algorithms two main categories: locality sensitive hashing, which designs hash
functions without exploring the data distribution and learning to hash, which
learns hash functions according the data distribution, and review them from
various aspects, including hash function design and distance measure and search
scheme in the hash coding space
Time for dithering: fast and quantized random embeddings via the restricted isometry property
Recently, many works have focused on the characterization of non-linear
dimensionality reduction methods obtained by quantizing linear embeddings,
e.g., to reach fast processing time, efficient data compression procedures,
novel geometry-preserving embeddings or to estimate the information/bits stored
in this reduced data representation. In this work, we prove that many linear
maps known to respect the restricted isometry property (RIP) can induce a
quantized random embedding with controllable multiplicative and additive
distortions with respect to the pairwise distances of the data points beings
considered. In other words, linear matrices having fast matrix-vector
multiplication algorithms (e.g., based on partial Fourier ensembles or on the
adjacency matrix of unbalanced expanders) can be readily used in the definition
of fast quantized embeddings with small distortions. This implication is made
possible by applying right after the linear map an additive and random "dither"
that stabilizes the impact of the uniform scalar quantization operator applied
afterwards. For different categories of RIP matrices, i.e., for different
linear embeddings of a metric space
in with , we derive upper bounds on the
additive distortion induced by quantization, showing that it decays either when
the embedding dimension increases or when the distance of a pair of
embedded vectors in decreases. Finally, we develop a novel
"bi-dithered" quantization scheme, which allows for a reduced distortion that
decreases when the embedding dimension grows and independently of the
considered pair of vectors.Comment: Keywords: random projections, non-linear embeddings, quantization,
dither, restricted isometry property, dimensionality reduction, compressive
sensing, low-complexity signal models, fast and structured sensing matrices,
quantized rank-one projections (31 pages
- …