On Metric DBSCAN with Low Doubling Dimension
The density-based clustering method Density-Based Spatial Clustering of
Applications with Noise (DBSCAN) is a popular method for outlier recognition
and has received tremendous attention from many different areas. A major issue
of the original DBSCAN is that the time complexity could be as large as
quadratic. Most existing DBSCAN algorithms focus on developing efficient
index structures to speed up the procedure in low-dimensional Euclidean space.
However, research on DBSCAN in high-dimensional Euclidean space or general
metric space is still quite limited, to the best of our knowledge. In this
paper, we consider the metric DBSCAN problem under the assumption that the
inliers (excluding the outliers) have a low doubling dimension. We apply a
novel randomized k-center clustering idea to reduce the complexity of range
query, which is the most time-consuming step in the whole DBSCAN procedure. Our
proposed algorithms do not need to build any complicated data structures and
are easy to implement in practice. The experimental results show that our
algorithms can significantly outperform the existing DBSCAN algorithms in terms
of running time.
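
To illustrate how a center-based partition can cut the cost of range queries in a general metric space, here is a minimal sketch. It is not the paper's algorithm: it uses a plain greedy farthest-point heuristic, since the abstract does not detail the randomized variant. The names `build_cells` and `range_query`, the cell radius `r`, and the query radius `eps` are all hypothetical choices made for this example.

```python
import math
import random


def build_cells(points, r, dist, seed=0):
    """Greedy farthest-point center selection (an assumed stand-in).

    Adds the currently farthest point as a new center until every point
    lies within distance r of its nearest center. Returns the list of
    center indices and a dict mapping each center to its member indices.
    """
    n = len(points)
    first = random.Random(seed).randrange(n)
    centers = [first]
    nearest = [first] * n
    gap = [dist(points[first], p) for p in points]
    while True:
        far = max(range(n), key=gap.__getitem__)
        if gap[far] <= r:
            break
        centers.append(far)
        for i, p in enumerate(points):
            d = dist(points[far], p)
            if d < gap[i]:
                gap[i] = d
                nearest[i] = far
    members = {c: [] for c in centers}
    for i, c in enumerate(nearest):
        members[c].append(i)
    return centers, members


def range_query(q, eps, points, centers, members, r, dist):
    """Return indices of all points within eps of points[q].

    Every member x of cell c satisfies dist(c, x) <= r, so by the
    triangle inequality a cell with dist(q, c) > eps + r cannot contain
    a point within eps of q and is skipped without scanning it.
    """
    hits = []
    for c in centers:
        if dist(points[q], points[c]) <= eps + r:
            hits.extend(
                i for i in members[c] if dist(points[q], points[i]) <= eps
            )
    return hits


if __name__ == "__main__":
    rng = random.Random(42)
    data = [(rng.random(), rng.random()) for _ in range(500)]
    centers, members = build_cells(data, r=0.1, dist=math.dist)
    neighbors = range_query(0, 0.05, data, centers, members, 0.1, math.dist)
    print(len(centers), "cells;", len(neighbors), "neighbors of point 0")
```

The sketch only relies on the triangle inequality, so any metric can be passed as `dist`. When the data has low doubling dimension, the number of centers within `eps + r` of a query point stays small, which is the intuition behind pruning most cells.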