10,029 research outputs found
Fast k-means based on KNN Graph
In the era of big data, k-means clustering has been widely adopted as a basic
processing tool in various contexts. However, its computational cost could be
prohibitively high as the data size and the cluster number are large. It is
well known that the processing bottleneck of k-means lies in the operation of
seeking closest centroid in each iteration. In this paper, a novel solution
towards the scalability issue of k-means is presented. In the proposal, k-means
is supported by an approximate k-nearest neighbors graph. In the k-means
iteration, each data sample is only compared to clusters that its nearest
neighbors reside. Since the number of nearest neighbors we consider is much
less than k, the processing cost in this step becomes minor and irrelevant to
k. The processing bottleneck is therefore overcome. The most interesting thing
is that k-nearest neighbor graph is constructed by iteratively calling the fast
-means itself. Comparing with existing fast k-means variants, the proposed
algorithm achieves hundreds to thousands times speed-up while maintaining high
clustering quality. As it is tested on 10 million 512-dimensional data, it
takes only 5.2 hours to produce 1 million clusters. In contrast, to fulfill the
same scale of clustering, it would take 3 years for traditional k-means
Multi-Object Classification and Unsupervised Scene Understanding Using Deep Learning Features and Latent Tree Probabilistic Models
Deep learning has shown state-of-art classification performance on datasets
such as ImageNet, which contain a single object in each image. However,
multi-object classification is far more challenging. We present a unified
framework which leverages the strengths of multiple machine learning methods,
viz deep learning, probabilistic models and kernel methods to obtain
state-of-art performance on Microsoft COCO, consisting of non-iconic images. We
incorporate contextual information in natural images through a conditional
latent tree probabilistic model (CLTM), where the object co-occurrences are
conditioned on the extracted fc7 features from pre-trained Imagenet CNN as
input. We learn the CLTM tree structure using conditional pairwise
probabilities for object co-occurrences, estimated through kernel methods, and
we learn its node and edge potentials by training a new 3-layer neural network,
which takes fc7 features as input. Object classification is carried out via
inference on the learnt conditional tree model, and we obtain significant gain
in precision-recall and F-measures on MS-COCO, especially for difficult object
categories. Moreover, the latent variables in the CLTM capture scene
information: the images with top activations for a latent node have common
themes such as being a grasslands or a food scene, and on on. In addition, we
show that a simple k-means clustering of the inferred latent nodes alone
significantly improves scene classification performance on the MIT-Indoor
dataset, without the need for any retraining, and without using scene labels
during training. Thus, we present a unified framework for multi-object
classification and unsupervised scene understanding
An Efficient Index for Visual Search in Appearance-based SLAM
Vector-quantization can be a computationally expensive step in visual
bag-of-words (BoW) search when the vocabulary is large. A BoW-based appearance
SLAM needs to tackle this problem for an efficient real-time operation. We
propose an effective method to speed up the vector-quantization process in
BoW-based visual SLAM. We employ a graph-based nearest neighbor search (GNNS)
algorithm to this aim, and experimentally show that it can outperform the
state-of-the-art. The graph-based search structure used in GNNS can efficiently
be integrated into the BoW model and the SLAM framework. The graph-based index,
which is a k-NN graph, is built over the vocabulary words and can be extracted
from the BoW's vocabulary construction procedure, by adding one iteration to
the k-means clustering, which adds small extra cost. Moreover, exploiting the
fact that images acquired for appearance-based SLAM are sequential, GNNS search
can be initiated judiciously which helps increase the speedup of the
quantization process considerably
Fast Approximate -Means via Cluster Closures
-means, a simple and effective clustering algorithm, is one of the most
widely used algorithms in multimedia and computer vision community. Traditional
-means is an iterative algorithm---in each iteration new cluster centers are
computed and each data point is re-assigned to its nearest center. The cluster
re-assignment step becomes prohibitively expensive when the number of data
points and cluster centers are large.
In this paper, we propose a novel approximate -means algorithm to greatly
reduce the computational complexity in the assignment step. Our approach is
motivated by the observation that most active points changing their cluster
assignments at each iteration are located on or near cluster boundaries. The
idea is to efficiently identify those active points by pre-assembling the data
into groups of neighboring points using multiple random spatial partition
trees, and to use the neighborhood information to construct a closure for each
cluster, in such a way only a small number of cluster candidates need to be
considered when assigning a data point to its nearest cluster. Using complexity
analysis, image data clustering, and applications to image retrieval, we show
that our approach out-performs state-of-the-art approximate -means
algorithms in terms of clustering quality and efficiency
- …