27,181 research outputs found
ADBSCAN: Adaptive Density-Based Spatial Clustering of Applications with Noise for Identifying Clusters with Varying Densities
Density-based spatial clustering of applications with noise (DBSCAN) is a
data clustering algorithm that performs well on datasets whose clusters have
a roughly constant density of data points. One of its significant attributes
is its ability to separate noise from clusters. However, DBSCAN performs
worse when clusters have different densities. This paper therefore proposes
an adaptive DBSCAN that works well for identifying clusters with varying
densities.
Comment: To be published in the 4th IEEE International Conference on
Electrical Engineering and Information & Communication Technology (iCEEiCT
2018)
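As an illustration of the density limitation described in this abstract, here is a minimal sketch of plain DBSCAN. It is not the adaptive variant the paper proposes, whose details the abstract does not give. The function names, the toy points, and the `eps` and `min_pts` values are all assumptions chosen for illustration.

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN with one global eps; returns a cluster label per point."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


# Toy data (made up): a dense group, one stray point near it, a sparse group.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # dense group
    (1.2, 0.0),                                      # stray point
    (5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0),  # sparse group
]

tight = dbscan(points, eps=0.2, min_pts=3)
loose = dbscan(points, eps=1.5, min_pts=3)
print(tight)
print(loose)
```

With the small `eps` the sparse group is labelled as noise; with the large `eps` the sparse group is found, but the stray point is absorbed into the dense group. No single global `eps` handles both densities in this toy example, which is the kind of situation an adaptive scheme targets.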
MOA: Massive Online Analysis, a framework for stream classification and clustering.
Massive Online Analysis (MOA) is a software environment for implementing algorithms and running experiments for online learning from evolving data streams. MOA is designed to deal with the challenging problem of scaling up the implementation of state-of-the-art algorithms to real-world dataset sizes. It contains a collection of offline and online methods for both classification and clustering, as well as tools for evaluation. In particular, for classification it implements boosting, bagging, and Hoeffding Trees, all with and without Naive Bayes classifiers at the leaves. For clustering, it implements StreamKM++, CluStream, ClusTree, Den-Stream, D-Stream and CobWeb. Researchers benefit from MOA by gaining insight into the workings and problems of different approaches, while practitioners can easily apply and compare several algorithms on real-world datasets and settings. MOA supports bi-directional interaction with WEKA, the Waikato Environment for Knowledge Analysis, and is released under the GNU GPL license.
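The Hoeffding Trees mentioned in this abstract decide when a leaf has seen enough stream examples to split by using the Hoeffding bound. The sketch below shows that test only. MOA itself is a Java framework whose implementation has further options not shown here, and the function names and numeric values are hypothetical.

```python
from math import log, sqrt


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with the given range lies within this distance of the mean
    observed over n independent samples."""
    return sqrt(value_range ** 2 * log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split when the best attribute beats the runner-up by more than the
    bound, so the ranking is unlikely to change with more data."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Illustrative values (assumptions): gains in [0, 1], 1000 examples at the leaf.
eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=1000)
clear_winner = should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=1000)
close_call = should_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=1000)
print(round(eps, 4), clear_winner, close_call)
```

Because the bound shrinks as `n` grows, a close call at 1000 examples can become a confident split once more of the stream has been observed.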
SDSS-RASS: Next Generation of Cluster-Finding Algorithms
We outline here the next generation of cluster-finding algorithms. We show
how advances in Computer Science and Statistics have helped develop robust,
fast algorithms for finding clusters of galaxies in large multi-dimensional
astronomical databases like the Sloan Digital Sky Survey (SDSS). Specifically,
this paper presents four new advances: (1) A new semi-parametric algorithm -
nicknamed ``C4'' - for jointly finding clusters of galaxies in the SDSS and
ROSAT All-Sky Survey databases; (2) The introduction of the False Discovery
Rate into Astronomy; (3) The role of kernel shape in optimizing cluster
detection; (4) A new determination of the X-ray Cluster Luminosity Function
which has bearing on the existence of a ``deficit'' of high redshift, high
luminosity clusters. This research is part of our ``Computational
AstroStatistics'' collaboration (see Nichol et al. 2000) and the algorithms and
techniques discussed herein will form part of the ``Virtual Observatory''
analysis toolkit.
Comment: To appear in Proceedings of MPA/MPE/ESO Conference "Mining the Sky",
July 31 - August 4, 2000, Garching, Germany
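The abstract does not say which procedure is used to control the False Discovery Rate. As an assumption, the sketch below shows the Benjamini-Hochberg step-up procedure, a standard way to control FDR across many simultaneous tests. The function name and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k * alpha / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


# Made-up p-values, e.g. one per candidate detection.
p_values = [0.20, 0.001, 0.04, 0.008, 0.5]
detections = benjamini_hochberg(p_values, alpha=0.05)
print(detections)
```

Unlike a fixed per-test threshold, the cut-off here adapts to the observed distribution of p-values, which is what makes FDR control attractive when scanning a large survey for candidate detections.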