Fast Supervised Hashing with Decision Trees for High-Dimensional Data
Supervised hashing aims to map the original features to compact binary codes
that are able to preserve label based similarity in the Hamming space.
Non-linear hash functions have demonstrated an advantage over linear ones due
to their powerful generalization capability. In the literature, kernel
functions are typically used to achieve non-linearity in hashing; they achieve
encouraging retrieval performance at the price of slow evaluation and training
time. Here we propose to use boosted decision trees for achieving non-linearity
in hashing, which are fast to train and evaluate, hence more suitable for
hashing with high dimensional data. In our approach, we first propose
sub-modular formulations for the hashing binary code inference problem and an
efficient GraphCut based block search method for solving large-scale inference.
Then we learn hash functions by training boosted decision trees to fit the
binary codes. Experiments demonstrate that our proposed method significantly
outperforms most state-of-the-art methods in retrieval precision and training
time. Especially for high-dimensional data, our method is orders of magnitude
faster than many methods in terms of training time.
Comment: Appearing in Proc. IEEE Conf. Computer Vision and Pattern
Recognition, 2014, Ohio, US
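To make the retrieval step concrete, here is a minimal sketch of ranking database items by Hamming distance once binary codes exist. It does not reproduce the paper's code inference or boosted-tree training; the function names and the 8-bit toy codes are assumptions chosen for illustration.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: list[int]) -> list[int]:
    """Return database indices sorted by Hamming distance to the query."""
    return sorted(
        range(len(database)),
        key=lambda i: hamming_distance(query, database[i]),
    )


# Toy 8-bit codes stored as integers (illustrative values, not from the paper).
codes = [0b10110010, 0b10110011, 0b01001101, 0b10100010]
query = 0b10110010

ranking = rank_by_hamming(query, codes)
print(ranking)
```

Because the codes are compact, each comparison is an XOR plus a bit count, which is why retrieval in the Hamming space is cheap once the hash functions have been learned.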
Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is the problem of finding the data
items whose distances to a query item are the smallest in a large database.
Various methods have been developed to address this problem, and recently much
effort has been devoted to approximate search. In this paper, we present a
survey on one of the main solutions, hashing, which has been widely studied
since the pioneering work on locality sensitive hashing. We divide the hashing
algorithms into two main categories: locality sensitive hashing, which designs
hash functions without exploring the data distribution, and learning to hash,
which learns hash functions according to the data distribution. We review them
from various aspects, including hash function design, distance measure, and
search scheme in the hash coding space
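As an illustration of the first category (hash functions designed without looking at the data), the sketch below uses random hyperplanes, a well-known locality sensitive hashing scheme for angular similarity. It is one example of the category, not a method singled out by the survey; the function names, seed, and dimensions are assumptions.

```python
import random


def make_hyperplanes(num_bits: int, dim: int, seed: int = 0) -> list[list[float]]:
    """Draw one random Gaussian hyperplane normal per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_vector(vec: list[float], hyperplanes: list[list[float]]) -> tuple[int, ...]:
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


planes = make_hyperplanes(num_bits=16, dim=4)
v = [0.5, -1.2, 3.0, 0.7]
code = hash_vector(v, planes)
print(code)
```

The hyperplanes are sampled independently of any dataset, which is the defining property of this category. Vectors separated by a small angle agree on most bits, and rescaling a vector by a positive factor leaves its code unchanged.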
On the Error Resilience of Ordered Binary Decision Diagrams
Ordered Binary Decision Diagrams (OBDDs) are a data structure that is used in
an increasing number of fields of Computer Science (e.g., logic synthesis,
program verification, data mining, bioinformatics, and data protection) for
representing and manipulating discrete structures and Boolean functions. The
purpose of this paper is to study the error resilience of OBDDs and to design a
resilient version of this data structure, i.e., a self-repairing OBDD. In
particular, we describe some strategies that make reduced ordered OBDDs
resilient to errors in the indexes that are associated with the input variables,
or in the pointers (i.e., OBDD edges) of the nodes. These strategies exploit
the inherent redundancy of the data structure, as well as the redundancy
introduced by its efficient implementations. The solutions we propose allow the
exact restoring of the original OBDD and are suitable to be applied to
classical software packages for the manipulation of OBDDs currently in use.
Another result of the paper is the definition of a new canonical OBDD model,
called Index-resilient Reduced OBDD, which guarantees that a node with a
faulty index has a reconstruction cost , where is the number of nodes
with corrupted index
Implementing and reasoning about hash-consed data structures in Coq
We report on four different approaches to implementing hash-consing in Coq
programs. The use cases include execution inside Coq, or execution of the
extracted OCaml code. We explore the different trade-offs between faithful use
of pristine extracted code, and code that is fine-tuned to make use of OCaml
programming constructs not available in Coq. We discuss the possible
consequences in terms of performance and guarantees. We use the running
example of binary decision diagrams and then demonstrate the generality of our
solutions by applying them to other examples of hash-consed data structures
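The idea of hash-consing binary decision diagram nodes can be sketched as follows. This is a Python illustration only, not the paper's Coq or extracted OCaml code; the `BDD` class, its `mk` and `evaluate` methods, and the integer node ids are all hypothetical names introduced here.

```python
class BDD:
    """Tiny hash-consed BDD node store (illustrative sketch)."""

    def __init__(self) -> None:
        # Maps (var, low_id, high_id) to the id of the unique node with that shape.
        self._table: dict[tuple[int, int, int], int] = {}
        # Ids 0 and 1 are reserved for the terminals False and True.
        self._nodes: list = [None, None]

    def mk(self, var: int, low: int, high: int) -> int:
        """Return the unique node testing `var`, creating it only if needed."""
        if low == high:  # both branches agree, so the test is redundant
            return low
        key = (var, low, high)
        node = self._table.get(key)
        if node is None:
            node = len(self._nodes)
            self._nodes.append(key)
            self._table[key] = node
        return node

    def evaluate(self, node: int, assignment: dict[int, bool]) -> bool:
        """Follow low/high edges according to the assignment until a terminal."""
        while node > 1:
            var, low, high = self._nodes[node]
            node = high if assignment[var] else low
        return bool(node)


bdd = BDD()
x1 = bdd.mk(1, 0, 1)                     # the function x1
conj = bdd.mk(0, 0, x1)                  # x0 AND x1
again = bdd.mk(0, 0, bdd.mk(1, 0, 1))    # built a second time
print(conj == again)
```

Since structurally equal nodes always receive the same id, equality of two diagrams reduces to comparing ids, which is the sharing property that hash-consing is meant to provide.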
- …