Fast k-means based on KNN Graph
In the era of big data, k-means clustering has been widely adopted as a basic
processing tool in various contexts. However, its computational cost can be
prohibitively high when the data size and the number of clusters are large. It is
well known that the processing bottleneck of k-means lies in the operation of
seeking the closest centroid in each iteration. In this paper, a novel solution
towards the scalability issue of k-means is presented. In the proposal, k-means
is supported by an approximate k-nearest neighbors graph. In the k-means
iteration, each data sample is only compared to the clusters in which its nearest
neighbors reside. Since the number of nearest neighbors we consider is much
smaller than k, the processing cost of this step becomes minor and independent of
k. The processing bottleneck is therefore overcome. The most interesting thing
is that the k-nearest neighbor graph is constructed by iteratively calling the fast
k-means itself. Compared with existing fast k-means variants, the proposed
algorithm achieves a speed-up of hundreds to thousands of times while maintaining high
clustering quality. When tested on 10 million 512-dimensional data samples, it
takes only 5.2 hours to produce 1 million clusters. In contrast, to perform
clustering at the same scale, traditional k-means would take 3 years.
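The assignment step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the brute-force exact k-NN graph (the abstract uses an approximate graph built by calling the fast k-means itself), the random initialization, and the synchronous label update are all assumptions made for illustration. The point it shows is that each sample is compared only to the centroids of the clusters its graph neighbors currently belong to, so the per-sample cost depends on the number of neighbors rather than on k.

```python
import numpy as np

def brute_force_knn(X, n_neighbors):
    # Exact k-NN graph, used here only to keep the demo self-contained
    # (assumption: the abstract's method uses an approximate graph instead).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    return np.argsort(d2, axis=1)[:, :n_neighbors]

def assign_with_knn_graph(X, centroids, labels, knn):
    # Compare each sample only to the clusters of its neighbors
    # (plus its own current cluster), not to all k centroids.
    new_labels = labels.copy()
    for i in range(X.shape[0]):
        candidates = np.unique(np.append(labels[knn[i]], labels[i]))
        d2 = ((centroids[candidates] - X[i]) ** 2).sum(axis=1)
        new_labels[i] = candidates[np.argmin(d2)]
    return new_labels

def update_centroids(X, labels, centroids):
    # Standard k-means centroid update; empty clusters keep their old centroid.
    new_centroids = centroids.copy()
    for c in range(centroids.shape[0]):
        members = X[labels == c]
        if len(members) > 0:
            new_centroids[c] = members.mean(axis=0)
    return new_centroids

def knn_graph_kmeans(X, k, n_neighbors=5, n_iter=10, seed=0):
    # Hypothetical driver: random initial labels, then alternate the
    # graph-restricted assignment with the usual centroid update.
    rng = np.random.default_rng(seed)
    knn = brute_force_knn(X, n_neighbors)
    labels = rng.integers(0, k, size=X.shape[0])
    init = X[rng.choice(X.shape[0], k, replace=False)].copy()
    centroids = update_centroids(X, labels, init)
    for _ in range(n_iter):
        labels = assign_with_knn_graph(X, centroids, labels, knn)
        centroids = update_centroids(X, labels, centroids)
    return labels, centroids
```

Because the candidate set is limited to clusters already present among a sample's neighbors, a centroid that is closer but not represented in the neighborhood is never considered. This is the trade-off that makes the step approximate.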
Object Discovery via Cohesion Measurement
Color and intensity are two important components in an image. Usually, groups
of image pixels that are similar in color or intensity form an informative
representation of an object. They are therefore particularly suitable for
computer vision tasks, such as saliency detection and object proposal
generation. However, image pixels that share a similar real-world color may
appear quite different, since colors are often distorted by intensity. In this
paper, we reinvestigate the affinity matrices originally used in image
segmentation methods based on spectral clustering. A new affinity matrix, which
is robust to color distortions, is formulated for object discovery. Moreover, a
Cohesion Measurement (CM) for object regions is also derived based on the
formulated affinity matrix. Based on the new Cohesion Measurement, a novel
object discovery method is proposed to discover objects latent in an image by
utilizing the eigenvectors of the affinity matrix. Then we apply the proposed
method to both saliency detection and object proposal generation. Experimental
results on several evaluation benchmarks demonstrate that the proposed CM-based
method achieves promising performance on these two tasks.
Comment: 14 pages, 14 figures