Anti-sparse coding for approximate nearest neighbor search
This paper proposes a binarization scheme for vectors of high dimension based
on the recent concept of anti-sparse coding, and shows its excellent
performance for approximate nearest neighbor search. Unlike other binarization
schemes, this framework allows, up to a scaling factor, explicit
reconstruction of the original vector from its binary representation. The paper
also shows that random projections, which are used in Locality Sensitive Hashing
algorithms, are significantly outperformed by regular frames for both synthetic
and real data if the number of bits exceeds the vector dimensionality, i.e.,
when high precision is required. Comment: submitted to ICASSP'2012; RR-7771 (2011)
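The sign-of-projection idea behind such binarization schemes can be sketched as follows. This is a minimal illustration using a random Gaussian frame and a pseudo-inverse decoder, not the paper's anti-sparse solver or its regular frames; all names here are illustrative:

```python
import numpy as np

def binarize(x, frame):
    """One bit per frame row: the sign of the projection of x
    onto an overcomplete frame (more rows than input dimensions)."""
    return np.sign(frame @ x) >= 0

def reconstruct(code, frame):
    """Approximate reconstruction, up to scale, by applying the
    frame's pseudo-inverse to the +/-1 code."""
    signs = np.where(code, 1.0, -1.0)
    return np.linalg.pinv(frame) @ signs

rng = np.random.default_rng(0)
d, m = 8, 32                        # dimension d, number of bits m > d
frame = rng.standard_normal((m, d)) # random frame (illustrative only)
x = rng.standard_normal(d)
x /= np.linalg.norm(x)

code = binarize(x, frame)
x_hat = reconstruct(code, frame)
x_hat /= np.linalg.norm(x_hat)      # compare directions, i.e. up to scale
print(np.dot(x, x_hat))             # close to 1 when m is well above d
```

The point of the sketch is the reconstruction property: unlike a plain hash, the binary code retains enough information to recover the direction of the original vector, which improves as the bit budget m grows past the dimensionality d.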
ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms
This paper describes ANN-Benchmarks, a tool for evaluating the performance of
in-memory approximate nearest neighbor algorithms. It provides a standard
interface for measuring the performance and quality achieved by nearest
neighbor algorithms on different standard data sets. It supports several
different ways of integrating k-NN algorithms, and its configuration system
automatically tests a range of parameter settings for each algorithm.
Algorithms are compared with respect to many different (approximate) quality
measures, and adding more is easy and fast; the included plotting front-ends
can visualise these as images, plots, and websites with interactive
plots. ANN-Benchmarks aims to provide a constantly updated overview of the
current state of the art of k-NN algorithms. In the short term, this overview
allows users to choose the correct k-NN algorithm and parameters for their
similarity search task; in the longer term, algorithm designers will be able to
use this overview to test and refine automatic parameter tuning. The paper
gives an overview of the system, evaluates the results of the benchmark, and
points out directions for future work. Interestingly, very different approaches
to k-NN search yield comparable quality-performance trade-offs. The system is
available at http://ann-benchmarks.com. Comment: Full version of the SISAP 2017 conference paper. v2: Updated the
abstract to avoid arXiv linking to the wrong URL.
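The central quality measure in this kind of benchmark is recall against a brute-force ground truth. A minimal sketch of recall@k follows; the toy "approximate" index here, which searches only a random half of the data, is purely illustrative and not one of the benchmarked algorithms:

```python
import numpy as np

def recall_at_k(true_ids, approx_ids):
    """Fraction of the true k nearest neighbors that the
    approximate method returned."""
    return len(set(true_ids) & set(approx_ids)) / len(true_ids)

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 16))
query = rng.standard_normal(16)

# Exact k-NN by brute force: the ground truth.
k = 10
dists = np.linalg.norm(data - query, axis=1)
true_ids = np.argsort(dists)[:k]

# Toy "approximate" index: brute force over a random half of the data.
subset = rng.choice(1000, size=500, replace=False)
sub_dists = np.linalg.norm(data[subset] - query, axis=1)
approx_ids = subset[np.argsort(sub_dists)[:k]]

print(recall_at_k(true_ids, approx_ids))  # roughly 0.5 on average here
```

A real benchmark plots this recall against queries per second across many parameter settings, which is exactly the trade-off curve the tool automates.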
Learning to compress and search visual data in large-scale systems
The problem of high-dimensional and large-scale representation of visual data
is addressed from an unsupervised learning perspective. The emphasis is put on
discrete representations, where the description length can be measured in bits
and hence the model capacity can be controlled. The algorithmic infrastructure
is developed based on the synthesis and analysis prior models whose
rate-distortion properties, as well as capacity vs. sample complexity
trade-offs, are carefully optimized. These models are then extended to
multiple layers, namely the RRQ and ML-STC frameworks, where the latter is
further evolved as a powerful deep neural network architecture with fast and
sample-efficient training and discrete representations. For the developed
algorithms, three important applications are developed. First, the problem of
large-scale similarity search in retrieval systems is addressed, where a
double-stage solution is proposed leading to faster query times and shorter
database storage. Second, the problem of learned image compression is targeted,
where the proposed models can capture more redundancies from the training
images than the conventional compression codecs. Finally, the proposed
algorithms are used to solve ill-posed inverse problems. In particular, the
problems of image denoising and compressive sensing are addressed with
promising results. Comment: PhD thesis dissertation
Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks
Bilateral filters have widespread use due to their edge-preserving
properties. The common use case is to manually choose a parametric filter type,
usually a Gaussian filter. In this paper, we will generalize the
parametrization and in particular derive a gradient descent algorithm so the
filter parameters can be learned from data. This derivation allows learning
high dimensional linear filters that operate in sparsely populated feature
spaces. We build on the permutohedral lattice construction for efficient
filtering. The ability to learn more general forms of high-dimensional filters
can be used in several diverse applications. First, we demonstrate its use in
settings where a single filter application is desired for runtime reasons.
Further, we show how this algorithm can be used to learn the pairwise
potentials in densely connected conditional random fields and apply these to
different image segmentation tasks. Finally, we introduce layers of bilateral
filters in CNNs and propose bilateral neural networks for the use of
high-dimensional sparse data. This view provides new ways to encode model
structure into network architectures. A diverse set of experiments empirically
validates the use of general forms of filters.
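Before learning enters the picture, the baseline bilateral filter with hand-chosen Gaussian weights can be sketched in 1-D. This is a brute-force illustration of the edge-preserving behavior; the paper's permutohedral-lattice filtering is the efficient high-dimensional version of this operation:

```python
import numpy as np

def bilateral_filter_1d(signal, sigma_s, sigma_r):
    """Brute-force 1-D bilateral filter: each output sample is a
    weighted average of all samples, with weights that fall off both
    with spatial distance (sigma_s) and with intensity difference
    (sigma_r). The range term is what preserves edges."""
    n = len(signal)
    pos = np.arange(n)
    out = np.empty(n)
    for i in range(n):
        spatial = np.exp(-((pos - i) ** 2) / (2 * sigma_s ** 2))
        range_w = np.exp(-((signal - signal[i]) ** 2) / (2 * sigma_r ** 2))
        w = spatial * range_w
        out[i] = (w * signal).sum() / w.sum()
    return out

# A noisy step edge: the filter smooths each side of the step
# without blurring the jump itself.
rng = np.random.default_rng(0)
x = np.concatenate([np.zeros(50), np.ones(50)]) + 0.05 * rng.standard_normal(100)
y = bilateral_filter_1d(x, sigma_s=5.0, sigma_r=0.2)
print(abs(y[49] - y[50]))  # the jump stays close to its original height
```

Making `sigma_s`, `sigma_r`, and the Gaussian shape itself learnable parameters, rather than hand-chosen constants, is the generalization the paper pursues.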