2,012 research outputs found
Locality-Sensitive Hashing with Margin Based Feature Selection
We propose a learning method with feature selection for Locality-Sensitive
Hashing. Locality-Sensitive Hashing converts feature vectors into bit arrays.
These bit arrays can be used to perform similarity searches and personal
authentication. The proposed method uses bit arrays longer than those used in
the end for similarity and other searches and by learning selects the bits that
will be used. We demonstrated this method can effectively perform optimization
for cases such as fingerprint images with a large number of labels and
extremely few data that share the same labels, as well as verifying that it is
also effective for natural images, handwritten digits, and speech features.Comment: 9 pages, 6 figures, 3 table
Hyperplane Arrangements and Locality-Sensitive Hashing with Lift
Locality-sensitive hashing converts high-dimensional feature vectors, such as
image and speech, into bit arrays and allows high-speed similarity calculation
with the Hamming distance. There is a hashing scheme that maps feature vectors
to bit arrays depending on the signs of the inner products between feature
vectors and the normal vectors of hyperplanes placed in the feature space. This
hashing can be seen as a discretization of the feature space by hyperplanes. If
labels for data are given, one can determine the hyperplanes by using learning
algorithms. However, many proposed learning methods do not consider the
hyperplanes' offsets. Not doing so decreases the number of partitioned regions,
and the correlation between Hamming distances and Euclidean distances becomes
small. In this paper, we propose a lift map that converts learning algorithms
without the offsets to the ones that take into account the offsets. With this
method, the learning methods without the offsets give the discretizations of
spaces as if it takes into account the offsets. For the proposed method, we
input several high-dimensional feature data sets and studied the relationship
between the statistical characteristics of data, the number of hyperplanes, and
the effect of the proposed method.Comment: 9 pages, 7 figure
- …