10,034 research outputs found
Fast k-means based on KNN Graph
In the era of big data, k-means clustering has been widely adopted as a basic
processing tool in various contexts. However, its computational cost could be
prohibitively high as the data size and the cluster number are large. It is
well known that the processing bottleneck of k-means lies in the operation of
seeking closest centroid in each iteration. In this paper, a novel solution
towards the scalability issue of k-means is presented. In the proposal, k-means
is supported by an approximate k-nearest neighbors graph. In the k-means
iteration, each data sample is only compared to clusters that its nearest
neighbors reside. Since the number of nearest neighbors we consider is much
less than k, the processing cost in this step becomes minor and irrelevant to
k. The processing bottleneck is therefore overcome. The most interesting thing
is that k-nearest neighbor graph is constructed by iteratively calling the fast
-means itself. Comparing with existing fast k-means variants, the proposed
algorithm achieves hundreds to thousands times speed-up while maintaining high
clustering quality. As it is tested on 10 million 512-dimensional data, it
takes only 5.2 hours to produce 1 million clusters. In contrast, to fulfill the
same scale of clustering, it would take 3 years for traditional k-means
Mathematical approaches to digital color image denoising
Many mathematical models have been designed to remove noise from images. Most of them focus on grey value images with additive artificial noise. Only very few specifically target natural color photos taken by a digital camera with real noise. Noise in natural color photos have special characteristics that are substantially different
from those that have been added artificially.
In this thesis previous denoising models are reviewed. We analyze the strengths and weakness of existing denoising models by showing where they perform well and where they don't. We put special focus on two models: The steering kernel regression model and the non-local model. For Kernel Regression model, an adaptive bilateral
filter is introduced as complementary to enhance it. Also a non-local bilateral filter is proposed as an application of the idea of non-local means filter. Then the idea of cross-channel denoising is proposed in this thesis. It is effective in
denoising monochromatic images by understanding the characteristics of digital noise in natural color images. A non-traditional color space is also introduced specifically for this purpose. The cross-channel paradigm can be applied to most of the exisiting models to greatly improve their performance for denoising natural color images.Ph.D.Committee Chair: Haomin Zhou; Committee Member: Luca Dieci; Committee Member: Ronghua Pan; Committee Member: Sung Ha Kang; Committee Member: Yang Wan
- …