1,906 research outputs found
Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is the problem of finding, in a large
database, the data items whose distances to a query item are the smallest.
Various methods have been developed to address this problem, and recently much
effort has been devoted to approximate search. In this paper, we present a
survey on one of the main solutions, hashing, which has been widely studied
since the pioneering work on locality sensitive hashing. We divide the hashing
algorithms into two main categories: locality sensitive hashing, which designs hash
functions without exploring the data distribution, and learning to hash, which
learns hash functions according to the data distribution. We review them from
various aspects, including hash function design, distance measure, and search
scheme in the hash coding space.
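As an illustration of the first category (hash functions chosen without looking at the data), the following is a minimal sketch of random-hyperplane hashing, one well-known locality sensitive scheme for angular similarity. It is not taken from the survey; the function names, the 16-bit code length, and the example vectors are assumptions made for this example.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals; the data is never consulted."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_code(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of positions where two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical example vectors (illustrative only).
planes = make_hyperplanes(num_bits=16, dim=3)
query = [1.0, 0.9, 0.1]
same_direction = [2.0, 1.8, 0.2]       # same angle as the query
opposite_direction = [-1.0, -0.9, -0.1]  # opposite angle

query_code = lsh_code(query, planes)
print(hamming(query_code, lsh_code(same_direction, planes)))
print(hamming(query_code, lsh_code(opposite_direction, planes)))
```

Vectors pointing in the same direction receive identical codes, while vectors pointing in opposite directions differ in every bit, so Hamming distance in the code space tracks angular distance in the original space.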
In-packet Bloom filters: Design and networking applications
The Bloom filter (BF) is a well-known space-efficient data structure that
answers set membership queries with some probability of false positives. In an
attempt to solve many of the limitations of current inter-networking
architectures, some recent proposals rely on including small BFs in packet
headers for routing, security, accountability or other purposes that move
application states into the packets themselves. In this paper, we consider the
design of such in-packet Bloom filters (iBF). Our main contributions are
an exploration of the design space and an evaluation of a series of extensions (1) to
increase the practicality and performance of iBFs, (2) to enable
false-negative-free element deletion, and (3) to provide security enhancements.
In addition to the theoretical estimates, extensive simulations of the multiple
design parameters and implementation alternatives validate the usefulness of
the extensions, providing for enhanced and novel iBF networking applications.
Comment: 15 pages, 11 figures, preprint submitted to Elsevier COMNET Journal
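For readers unfamiliar with the underlying data structure, here is a minimal sketch of a plain Bloom filter small enough to fit in a header-sized bit field. It does not reproduce the paper's iBF extensions; the class name, the 256-bit size, the four hash functions, and the link identifiers are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """A basic Bloom filter stored in a single integer bit field."""

    def __init__(self, num_bits=256, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# Hypothetical link identifiers encoded into one filter.
header_filter = BloomFilter()
for link in ["link-A", "link-B", "link-C"]:
    header_filter.add(link)

print("link-B" in header_filter)
```

A membership test on an inserted element always succeeds, whereas a test on an element that was never inserted can occasionally succeed as well; that false-positive probability is the quantity that filter size and hash count trade off.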
GPUs as Storage System Accelerators
Massively multicore processors, such as Graphics Processing Units (GPUs),
provide, at a comparable price, peak performance one order of magnitude higher
than that of traditional CPUs. This drop in the cost of computation, like any
order-of-magnitude drop in the cost per unit of performance for a class of
system components, triggers the opportunity to redesign systems and to explore
new ways to engineer them to recalibrate the cost-to-performance relation. This
project explores the feasibility of harnessing GPUs' computational power to
improve the performance, reliability, or security of distributed storage
systems. In this context, we present the design of a storage system prototype
that uses GPU offloading to accelerate a number of computationally intensive
primitives based on hashing, and introduce techniques to efficiently leverage
the processing power of GPUs. We evaluate the performance of this prototype
under two configurations: as a content addressable storage system that
facilitates online similarity detection between successive versions of the same
file and as a traditional system that uses hashing to preserve data integrity.
Further, we evaluate the impact of offloading to the GPU on competing
applications' performance. Our results show that this technique can bring
tangible performance gains without negatively impacting the performance of
concurrently running applications.
Comment: IEEE Transactions on Parallel and Distributed Systems, 201
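The two hashing-based uses mentioned in the abstract can be sketched on the CPU as follows. This is only an illustration of the idea, not the prototype's design: fixed-size chunking, SHA-256, the four-byte chunk size, and all function names are assumptions made for this example, and no GPU offloading is shown.

```python
import hashlib


def chunk_hashes(data, chunk_size=4):
    """Hash each fixed-size chunk; a chunk's hash serves as its content address."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def shared_fraction(old, new, chunk_size=4):
    """Fraction of the new version's chunks already present in the old version."""
    old_set = set(chunk_hashes(old, chunk_size))
    new_hashes = chunk_hashes(new, chunk_size)
    if not new_hashes:
        return 0.0
    return sum(h in old_set for h in new_hashes) / len(new_hashes)


def is_intact(data, expected_digest):
    """Integrity check: recompute the hash and compare with the stored one."""
    return hashlib.sha256(data).hexdigest() == expected_digest


# Two hypothetical versions of a file that differ in one chunk.
version_1 = b"aaaabbbbccccdddd"
version_2 = b"aaaabbbbXXXXdddd"

print(shared_fraction(version_1, version_2))

stored_digest = hashlib.sha256(version_1).hexdigest()
print(is_intact(version_1, stored_digest), is_intact(version_2, stored_digest))
```

Only chunks whose hashes are not already stored need to be written for the new version, and every chunk must be hashed on each write, which is the computationally intensive step that such a system could hand off to an accelerator.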
Optimized Entanglement Purification
We investigate novel protocols for entanglement purification of qubit Bell
pairs. Employing genetic algorithms for the design of the purification circuit,
we obtain shorter circuits achieving higher success rates and better final
fidelities than what is currently available in the literature. We provide a
software tool for analytical and numerical study of the generated purification
circuits, under customizable error models. These new purification protocols
pave the way to practical implementations of modular quantum computers and
quantum repeaters. Our approach is particularly attentive to the effects of
finite resources and imperfect local operations - phenomena neglected in the
usual asymptotic approach to the problem. The choice of the building blocks
permitted in the construction of the circuits is based on a thorough
enumeration of the local Clifford operations that act as permutations on the
basis of Bell states.