An Efficient Index for Visual Search in Appearance-based SLAM
Vector quantization can be a computationally expensive step in visual
bag-of-words (BoW) search when the vocabulary is large. A BoW-based appearance
SLAM system must tackle this problem to operate efficiently in real time. We
propose an effective method to speed up the vector-quantization process in
BoW-based visual SLAM. We employ a graph-based nearest neighbor search (GNNS)
algorithm for this purpose, and experimentally show that it can outperform the
state of the art. The graph-based search structure used in GNNS can be
integrated efficiently into the BoW model and the SLAM framework. The graph-based
index, which is a k-NN graph, is built over the vocabulary words and can be
extracted from the BoW vocabulary construction procedure by adding one iteration
to the k-means clustering, at a small extra cost. Moreover, because the images
acquired for appearance-based SLAM are sequential, the GNNS search can be
initialized judiciously, which considerably increases the speedup of the
quantization process.
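The graph-based quantization idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the k-NN graph is built by brute force rather than extracted from k-means, and the search is a single greedy walk without the restarts or candidate expansion a full GNNS would typically use.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_knn_graph(words, k):
    """Brute-force k-NN graph: graph[i] lists the k words nearest to word i."""
    graph = []
    for i, w in enumerate(words):
        others = [j for j in range(len(words)) if j != i]
        others.sort(key=lambda j: sq_dist(w, words[j]))
        graph.append(others[:k])
    return graph


def gnns_quantize(query, words, graph, start):
    """Greedy walk on the k-NN graph: step to the neighbor closest to the
    query until no neighbor is closer than the current word."""
    current = start
    current_d = sq_dist(query, words[current])
    while True:
        best = min(graph[current], key=lambda j: sq_dist(query, words[j]))
        best_d = sq_dist(query, words[best])
        if best_d >= current_d:
            return current  # local minimum: used as the (approximate) visual word
        current, current_d = best, best_d


# Toy vocabulary: ten "visual words" on a line (assumed data for illustration).
words = [(float(i), 0.0) for i in range(10)]
graph = build_knn_graph(words, k=2)

descriptor = (7.2, 0.1)
print(gnns_quantize(descriptor, words, graph, start=0))  # 7, after walking across the graph
print(gnns_quantize(descriptor, words, graph, start=7))  # 7, found without any move
```

The second call mirrors the sequential-image observation: when the walk starts from a word that was a good match for the previous frame, far fewer distance computations are needed. A greedy walk can stop at a local minimum, so the result is approximate in general.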
Fast Approximate Algorithms for k-NN Search and k-NN Graph Construction
Doctoral dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2015. Advisor: Sang-goo Lee.
Finding k-nearest neighbors (k-NN) is an essential part of recommender systems, information retrieval, and many data mining and machine learning algorithms. However, there are two main problems in finding k-nearest neighbors: 1) existing approaches require a huge amount of time when the number of objects or dimensions scales up; 2) k-NN computation methods do not perform consistently across different search tasks and types of data. In this dissertation, we present fast and versatile algorithms for finding k-nearest neighbors to cope with these problems. The main contributions are summarized as follows. First, we present an efficient and scalable algorithm for finding an approximate k-NN graph by filtering out node pairs whose large-value dimensions do not match at all. Second, we present a fast collaborative filtering algorithm that utilizes a k-NN graph. The main idea of this approach is to reverse the process of finding k-nearest neighbors in item-based collaborative filtering. Last, we propose a fast approximate algorithm for k-NN search that selects query-specific signatures from a signature pool to pick high-quality k-NN candidates. The experimental results show that the proposed algorithms achieve a high level of accuracy while also being much faster than the other algorithms across different types of search tasks and datasets.
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation and Challenges
1.1.1 Fast Approximation
1.1.2 Versatility
1.2 Our Solutions
1.2.1 Greedy Filtering
1.2.2 Signature Selection LSH
1.2.3 Reversed CF
1.3 Contributions
1.4 Outline
Chapter 2 Background and Related Work
2.1 k-NN Search
2.1.1 Locality Sensitive Hashing
2.1.2 LSH-based k-NN Search
2.2 k-NN Graph Construction
2.2.1 LSH-based Approach
2.2.2 Clustering-based Approach
2.2.3 Heuristic-based Approach
2.2.4 Similarity Join
2.3 Summary
Chapter 3 Fast Approximate k-NN Graph Construction
3.1 Introduction
3.2 Problem Formulation
3.3 Constructing a k-Nearest Neighbor Graph
3.3.1 Greedy Filtering
3.3.2 Prefix Selection Scheme
3.3.3 Optimization
3.4 Theoretical Analysis
3.4.1 Preliminaries
3.4.2 Graph Construction Time
3.4.3 Graph Accuracy
3.5 Experiments
3.5.1 Experimental Setup
3.5.2 Performance Comparison
3.6 Summary
Chapter 4 Fast Collaborative Filtering
4.1 Introduction
4.2 Related Work
4.3 Fast Collaborative Filtering
4.3.1 Nearest Neighbor Graph Construction
4.3.2 Fast Recommendation Algorithm
4.4 Experiments
4.4.1 Experimental Setup
4.4.2 Overall Comparison
4.4.3 Effects of Parameter Changes
4.5 Summary
Chapter 5 Fast Approximate k-NN Search
5.1 Introduction
5.2 Signature Selection LSH
5.2.1 Data-dependent LSH
5.2.2 Signature Pool Generation
5.2.3 Signature Selection
5.2.4 Optimization Techniques
5.3 S2LSH for Graph Construction
5.3.1 Feature Selection
5.3.2 Signature Selection
5.3.3 Optimization Techniques
5.4 Theoretical Analysis
5.5 Experiments
5.5.1 Experimental Setup
5.5.2 Experimental Results
5.5.3 Performance Analysis
5.6 Summary
Chapter 6 Conclusion
Bibliography
Abstract (in Korean)
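The first contribution in the dissertation abstract above, filtering out node pairs whose large-value dimensions do not match, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the fixed `prefix_size` parameter, and the use of cosine similarity are hypothetical, and the dissertation's own prefix selection scheme is not described in the abstract.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {dimension: value}."""
    dot = sum(val * v[dim] for dim, val in u.items() if dim in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def approx_knn_graph(vectors, k, prefix_size):
    """Compare only pairs that share at least one of their top-valued dimensions."""
    # Inverted index: dimension -> ids of vectors whose prefix contains it.
    index = defaultdict(list)
    for vid, vec in enumerate(vectors):
        prefix = sorted(vec, key=vec.get, reverse=True)[:prefix_size]
        for dim in prefix:
            index[dim].append(vid)

    # Candidate pairs are those that co-occur in some posting list.
    candidates = defaultdict(set)
    for ids in index.values():
        for a in ids:
            for b in ids:
                if a != b:
                    candidates[a].add(b)

    # Score candidates only, then keep the k most similar per node.
    graph = {}
    for vid, vec in enumerate(vectors):
        scored = sorted(candidates[vid],
                        key=lambda other: cosine(vec, vectors[other]),
                        reverse=True)
        graph[vid] = scored[:k]
    return graph


vectors = [
    {"a": 0.9, "b": 0.1},
    {"a": 0.8, "c": 0.2},
    {"c": 0.9, "d": 0.3},
    {"d": 0.7, "b": 0.1},
]
print(approx_knn_graph(vectors, k=1, prefix_size=1))
# {0: [1], 1: [0], 2: [], 3: []}
```

With `prefix_size=1`, only vectors 0 and 1 share a top dimension, so only that pair is ever scored. A larger prefix trades more similarity computations for fewer missed neighbors.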
Active Semi-Supervised Learning Using Sampling Theory for Graph Signals
We consider the problem of offline, pool-based active semi-supervised
learning on graphs. This problem is important when the labeled data is scarce
and expensive whereas unlabeled data is easily available. The data points are
represented by the vertices of an undirected graph with the similarity between
them captured by the edge weights. Given a target number of nodes to label, the
goal is to choose those nodes that are most informative and then predict the
unknown labels. We propose a novel framework for this problem based on our
recent results on sampling theory for graph signals. A graph signal is a
real-valued function defined on each node of the graph. A notion of frequency
for such signals can be defined using the spectrum of the graph Laplacian
matrix. The sampling theory for graph signals aims to extend the traditional
Nyquist-Shannon sampling theory by allowing us to identify the class of graph
signals that can be reconstructed from their values on a subset of vertices.
This approach allows us to define a criterion for active learning based on
sampling set selection which aims at maximizing the frequency of the signals
that can be reconstructed from their samples on the set. Experiments show the
effectiveness of our method.
Comment: 10 pages, 6 figures, To appear in KDD'1
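The notion of graph frequency mentioned in the abstract above can be made concrete with the Laplacian quadratic form. The sketch below assumes the combinatorial Laplacian L = D - W of an undirected graph, for which x^T L x equals the sum over edges of w_ij (x_i - x_j)^2; the paper may use a different (for example, normalized) Laplacian, and the function name is hypothetical.

```python
def laplacian_quadratic_form(edges, signal):
    """Return x^T L x = sum over edges of w_ij * (x_i - x_j)^2, with L = D - W."""
    return sum(w * (signal[i] - signal[j]) ** 2 for i, j, w in edges)


# Path graph 0 - 1 - 2 - 3 with unit edge weights (toy data).
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]

smooth = [1.0, 1.0, 1.0, 1.0]         # constant signal: no variation across edges
oscillating = [1.0, -1.0, 1.0, -1.0]  # sign flips across every edge

print(laplacian_quadratic_form(edges, smooth))       # 0.0
print(laplacian_quadratic_form(edges, oscillating))  # 12.0
```

A small value means the signal varies little between connected nodes (low frequency), while a large value means neighboring nodes disagree (high frequency). Smooth label signals are the ones that can plausibly be recovered from few labeled nodes.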
An Efficient Approximate kNN Graph Method for Diffusion on Image Retrieval
The application of diffusion in many computer vision and artificial
intelligence projects has been shown to give excellent improvements in
performance. One of the main bottlenecks of this technique is the quadratic
growth of the kNN graph size due to the large number of new connections
between nodes in the graph, resulting in long computation times. Several
strategies have been proposed to address this, but none is both effective and
efficient. Our novel technique, based on LSH projections, obtains the same
performance as the exact kNN graph after diffusion, but in less time
(approximately 18 times faster on a dataset of a hundred thousand images). The
proposed method was validated and compared with other state-of-the-art methods
on several public image datasets, including Oxford5k, Paris6k, and Oxford105k.
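To show why hashing can avoid the quadratic cost of an exact kNN graph, here is a generic random-hyperplane LSH sketch. It is not the method of the paper above, whose details are not given in the abstract; the function name and the `num_bits` and `seed` parameters are assumptions for illustration.

```python
import random
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lsh_knn_graph(points, k, num_bits, seed=0):
    """Approximate kNN graph: hash points with random hyperplanes, then
    search for neighbors only inside each point's own bucket."""
    rng = random.Random(seed)
    dim = len(points[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

    # The signature records the side of each hyperplane the point falls on.
    buckets = defaultdict(list)
    for idx, p in enumerate(points):
        signature = tuple(
            sum(a * b for a, b in zip(plane, p)) >= 0.0 for plane in planes
        )
        buckets[signature].append(idx)

    # Exact kNN restricted to each bucket: cost is quadratic per bucket only.
    graph = {}
    for members in buckets.values():
        for i in members:
            nearest = sorted(
                (j for j in members if j != i),
                key=lambda j: sq_dist(points[i], points[j]),
            )
            graph[i] = nearest[:k]
    return graph


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
# With num_bits=0 every point shares one bucket, so the result is the exact graph.
print(lsh_knn_graph(points, k=1, num_bits=0))  # {0: [1], 1: [0], 2: [1]}
```

More bits produce smaller buckets and faster construction, at the risk of separating true neighbors. Practical systems usually offset that risk by combining several independent hash tables.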
Inference of single-cell phylogenies from lineage tracing data using Cassiopeia.
The pairing of CRISPR/Cas9-based gene editing with massively parallel single-cell readouts now enables large-scale lineage tracing. However, the rapid growth in complexity of data from these assays has outpaced our ability to accurately infer phylogenetic relationships. First, we introduce Cassiopeia, a suite of scalable maximum parsimony approaches for tree reconstruction. Second, we provide a simulation framework for evaluating algorithms and exploring lineage tracer design principles. Finally, we generate the most complex experimental lineage tracing dataset to date, 34,557 human cells continuously traced over 15 generations, and use it for benchmarking phylogenetic inference approaches. We show that Cassiopeia outperforms traditional methods by several metrics and under a wide variety of parameter regimes, and provide insight into the principles for the design of improved Cas9-enabled recorders. Together, these should broadly enable large-scale mammalian lineage tracing efforts. Cassiopeia and its benchmarking resources are publicly available at www.github.com/YosefLab/Cassiopeia.