Improved Densification of One Permutation Hashing

Authors: Li Ping, Shrivastava Anshumali
Publication date: 18/06/2014

The existing work on densification of one permutation hashing reduces the query processing cost of the (K, L)-parameterized Locality Sensitive Hashing (LSH) algorithm with minwise hashing from O(dKL) to merely O(d + KL), where d is the number of nonzeros of the data vector, K is the number of hashes in each hash table, and L is the number of hash tables. While that is a substantial improvement, our analysis reveals that the existing densification scheme is sub-optimal. In particular, there is not enough randomness in that procedure, which affects its accuracy on very sparse datasets. In this paper, we provide a new densification procedure which is provably better than the existing scheme. This improvement is more significant for very sparse datasets, which are common on the web. The improved technique has the same cost of O(d + KL) for query processing, thereby making it strictly preferable to the existing procedure. Experimental evaluations on public datasets, in the task of hashing-based near neighbor search, support our theoretical findings.
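To make the idea in this abstract concrete, here is a minimal Python sketch of one permutation hashing followed by a densification step in which each empty bin borrows from a neighbouring non-empty bin in a randomly chosen direction. It is an illustration under stated assumptions, not the authors' exact procedure: the function names, the explicit permutation list, the offset constant `C`, and the requirement that K divides the dimension D are all choices made here for readability.

```python
import random

def densified_oph(nonzeros, D, K, seed=0):
    """Return K hash values for a set of nonzero indices drawn from range(D).

    Hypothetical helper for illustration; assumes K divides D.
    """
    if D % K != 0:
        raise ValueError("this sketch assumes K divides D")
    if not nonzeros:
        raise ValueError("need at least one nonzero index")

    # The same seed yields the same permutation and direction bits,
    # so hashes of different vectors are comparable.
    rng = random.Random(seed)
    perm = list(range(D))
    rng.shuffle(perm)                                   # the single permutation
    go_right = [rng.random() < 0.5 for _ in range(K)]   # one random bit per bin
    bin_size = D // K

    # Step 1: one permutation hashing.
    # Split the permuted range into K bins and keep the smallest offset per bin.
    bins = [None] * K
    for i in nonzeros:
        b, offset = divmod(perm[i], bin_size)
        if bins[b] is None or offset < bins[b]:
            bins[b] = offset

    # Step 2: densification (assumed variant).
    # An empty bin walks circularly in its random direction until it finds a
    # non-empty bin, then copies that value plus an offset t * C, where t is
    # the number of bins travelled.
    C = bin_size + 1
    hashes = []
    for j in range(K):
        step = 1 if go_right[j] else -1
        t = 0
        while bins[(j + t * step) % K] is None:
            t += 1
        hashes.append(bins[(j + t * step) % K] + t * C)
    return hashes

def estimate_resemblance(h1, h2):
    """Fraction of positions at which two hash vectors agree."""
    return sum(a == b for a, b in zip(h1, h2)) / len(h1)
```

Calling `densified_oph({1, 5, 9, 200}, D=256, K=16)` returns 16 values even though at most four bins are non-empty before densification. The explicit permutation costs O(D) per call and is used only to keep the sketch short; it does not reproduce the O(d + KL) query cost stated in the abstract.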

arXiv.org e-Print Archive

CiteSeerX

DeepPermNet: Visual Permutation Learning

Authors: Cherian Anoop, Cruz Rodrigo Santa, Fernando Basura, Gould Stephen
Publication date: 10/04/2017

We present a principled approach to uncover the structure of visual data by solving a novel deep learning task coined visual permutation learning. The goal of this task is to find the permutation that recovers the structure of data from shuffled versions of it. In the case of natural images, this task boils down to recovering the original image from patches shuffled by an unknown permutation matrix. Unfortunately, permutation matrices are discrete, thereby posing difficulties for gradient-based methods. To this end, we resort to a continuous approximation of these matrices using doubly-stochastic matrices which we generate from standard CNN predictions using Sinkhorn iterations. Unrolling these iterations in a Sinkhorn network layer, we propose DeepPermNet, an end-to-end CNN model for this task. The utility of DeepPermNet is demonstrated on two challenging computer vision problems, namely, (i) relative attributes learning and (ii) self-supervised representation learning. Our results show state-of-the-art performance on the Public Figures and OSR benchmarks for (i) and on the classification and segmentation tasks on the PASCAL VOC dataset for (ii).

Comment: Accepted in IEEE International Conference on Computer Vision and Pattern Recognition CVPR 201
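The Sinkhorn iterations mentioned in this abstract can be sketched in a few lines of plain Python: a positive square matrix is alternately row-normalized and column-normalized until it is approximately doubly stochastic. This is a minimal illustration, not the DeepPermNet layer itself. Exponentiating the raw scores to make them positive, the fixed iteration count, and the helper names `sinkhorn` and `hard_assignment` are assumptions made here.

```python
import math

def sinkhorn(scores, n_iters=50):
    """Map a square score matrix to an approximately doubly-stochastic matrix.

    Hypothetical helper; exp() is one assumed way to obtain positive entries.
    """
    n = len(scores)
    m = [[math.exp(v) for v in row] for row in scores]
    for _ in range(n_iters):
        # Row normalization: every row sums to 1.
        m = [[v / sum(row) for v in row] for row in m]
        # Column normalization: every column sums to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m

def hard_assignment(m):
    """Pick the largest entry in each row as a crude discrete reading."""
    return [max(range(len(row)), key=row.__getitem__) for row in m]

# Example scores, standing in for network predictions (made-up values).
scores = [
    [2.0, 0.1, 0.3],
    [0.2, 1.5, 0.1],
    [0.1, 0.4, 1.8],
]
relaxed = sinkhorn(scores)
```

Each step is a division by a sum, so the whole loop is differentiable with respect to the input scores, which is what lets the iterations be unrolled inside a network trained by gradient descent. Taking the row-wise argmax is only a rough way to read off a permutation and is not guaranteed to return a valid one in general.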

arXiv.org e-Print Archive