1,010 research outputs found
Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is the problem of finding, in a
large database, the data items whose distances to a query item are the smallest.
Various methods have been developed to address this problem, and recently much
effort has been devoted to approximate search. In this paper, we present a
survey on one of the main solutions, hashing, which has been widely studied
since the pioneering work on locality sensitive hashing. We divide the hashing
algorithms into two main categories: locality sensitive hashing, which designs
hash functions without exploring the data distribution, and learning to hash,
which learns hash functions according to the data distribution. We review them
from various aspects, including hash function design, distance measures, and
search schemes in the hash coding space.
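The data-independent category described in the abstract above can be illustrated with random-hyperplane hashing, one well-known locality sensitive scheme for angular similarity. This is a minimal sketch and is not taken from the survey itself; the function names (`make_hyperplanes`, `hash_code`, `hamming`) and the example vectors are hypothetical.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes; no data is inspected."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(vec, planes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )

def hamming(a, b):
    """Number of positions where two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

planes = make_hyperplanes(dim=3, n_bits=16)

query = [1.0, 0.5, -0.2]
near = [2.0, 1.0, -0.4]    # same direction as the query, scaled by 2
far = [-1.0, -0.5, 0.2]    # opposite direction to the query

code_q = hash_code(query, planes)
code_near = hash_code(near, planes)
code_far = hash_code(far, planes)

print(hamming(code_q, code_near), hamming(code_q, code_far))
```

Vectors separated by a small angle tend to agree on most bits, so Hamming distance between codes serves as a cheap proxy for angular distance. A learning-to-hash method would instead fit the hyperplanes to the data.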
Using Big Data to Enhance the Bosch Production Line Performance: A Kaggle Challenge
This paper describes our approach to the Bosch production line performance
challenge run by Kaggle.com. Maximizing the production yield is at the heart of
the manufacturing industry. At the Bosch assembly line, data is recorded for
products as they progress through each stage. Data science methods are applied
to this huge data repository consisting of records of tests and measurements made
for each component along the assembly line to predict internal failures. We
found that it is possible to train a model that predicts which parts are most
likely to fail. Thus a smarter failure detection system can be built and the
parts tagged likely to fail can be salvaged to decrease operating costs and
increase the profit margins.
Comment: IEEE Big Data 2016 Conference
The Road From Classical to Quantum Codes: A Hashing Bound Approaching Design Procedure
Powerful Quantum Error Correction Codes (QECCs) are required for stabilizing
and protecting fragile qubits against the undesirable effects of quantum
decoherence. As with classical codes, QECCs that approach the hashing bound may be
designed by exploiting a concatenated code structure, which invokes iterative
decoding. Therefore, in this paper we provide an extensive step-by-step
tutorial for designing EXtrinsic Information Transfer (EXIT) chart aided
concatenated quantum codes based on the underlying quantum-to-classical
isomorphism. These design lessons are then exemplified in the context of our
proposed Quantum Irregular Convolutional Code (QIRCC), which constitutes the
outer component of a concatenated quantum code. The proposed QIRCC can be
dynamically adapted to match any given inner code using EXIT charts, hence
achieving a performance close to the hashing bound. It is demonstrated that our
QIRCC-based optimized design is capable of operating within 0.4 dB of the noise
limit.
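For reference, the hashing bound mentioned in the abstract above has a simple closed form for one commonly studied noise model. The abstract does not state the channel, so the depolarizing channel used here is an assumption, as are the symbols $p$ and $C_Q$.

```latex
% Assumption: depolarizing channel with total error probability p,
% where X, Y and Z errors each occur with probability p/3.
C_Q(p) = 1 - H_2(p) - p \log_2 3,
\qquad
H_2(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)
```

Under this assumption, $C_Q(p)$ is a coding rate achievable with vanishing error probability, and it falls to zero near $p \approx 0.189$.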
Robust hashing for image authentication using quaternion discrete Fourier transform and log-polar transform
In this work, a novel robust image hashing scheme for image authentication is proposed based on the combination of the quaternion discrete Fourier transform (QDFT) with the log-polar transform. QDFT offers a sound way to jointly deal with the three channels of color images. The key features of the present method rely on (i) the computation of a secondary image using a log-polar transform; and (ii) the extraction from this image of low frequency QDFT coefficients' magnitude. The final image hash is generated according to the correlation of these magnitude coefficients and is scrambled by a secret key to enhance the system security. Experiments were conducted in order to analyze and identify the most appropriate parameter values of the proposed method and also to compare its performance to some reference methods in terms of receiver operating characteristics curves. The results show that the proposed scheme offers a good sensitivity to image content alterations and is robust to the common content-preserving operations, and especially to large angle rotation operations.
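One way to see why pairing a log-polar transform with Fourier magnitudes helps against rotation: in log-polar coordinates a rotation of the image becomes a circular shift along the angle axis, and DFT magnitudes do not change under circular shifts. The sketch below shows only that second property, on a one-dimensional real-valued signal. It is a simplification for illustration, not the paper's QDFT-based scheme, and the names `dft_magnitudes`, `circular_shift`, and `angular_profile` are hypothetical.

```python
import cmath

def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform of a real sequence."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]

def circular_shift(signal, s):
    """Rotate the sequence right by s positions, wrapping around."""
    s %= len(signal)
    return signal[-s:] + signal[:-s]

# Hypothetical samples along the angle axis of a log-polar image.
angular_profile = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]

# A rotation of the original image corresponds to a circular shift here.
rotated = circular_shift(angular_profile, 3)

mags_original = dft_magnitudes(angular_profile)
mags_rotated = dft_magnitudes(rotated)

# The shift changes only the phases, so the magnitudes agree.
max_diff = max(abs(a - b) for a, b in zip(mags_original, mags_rotated))
print(max_diff < 1e-9)
```

A hash built from such magnitudes is therefore unchanged by the shift, although the samples themselves differ.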