
    ADBSCAN: Adaptive Density-Based Spatial Clustering of Applications with Noise for Identifying Clusters with Varying Densities

    Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm that performs well on datasets whose clusters have a constant density of data points. One of its significant attributes is its handling of noise. However, DBSCAN performs worse when clusters have different densities. Therefore, this paper proposes an adaptive DBSCAN that works well for identifying clusters with varying densities. Comment: To be published in the 4th IEEE International Conference on Electrical Engineering and Information & Communication Technology (iCEEiCT 2018)
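    For orientation, here is a minimal sketch of standard DBSCAN, not the adaptive variant the paper proposes (the abstract gives no details of that method). The names `dbscan`, `eps`, `min_pts`, and `NOISE` are illustrative assumptions, and the brute-force neighbor search is chosen for brevity, not speed.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label for points that belong to no cluster (assumed convention)


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    Plain DBSCAN with one global eps; this is NOT the paper's adaptive method.
    """
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, so keep expanding
    return labels


# A dense group (spacing 0.1) and a sparse group (spacing 1.0).
pts = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.3, 0.0),
       (10.0, 0.0), (11.0, 0.0), (12.0, 0.0), (13.0, 0.0)]
tight = dbscan(pts, eps=0.15, min_pts=3)
loose = dbscan(pts, eps=1.5, min_pts=3)
```

    With `eps` tuned to the dense group, every point of the sparse group is labeled as noise, which illustrates why a single global density threshold is a poor fit when cluster densities differ.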

    MOA: Massive Online Analysis, a framework for stream classification and clustering.

    Massive Online Analysis (MOA) is a software environment for implementing algorithms and running experiments for online learning from evolving data streams. MOA is designed to deal with the challenging problem of scaling up the implementation of state-of-the-art algorithms to real-world dataset sizes. It contains a collection of offline and online algorithms for both classification and clustering, as well as tools for evaluation. In particular, for classification it implements boosting, bagging, and Hoeffding Trees, all with and without Naive Bayes classifiers at the leaves. For clustering, it implements StreamKM++, CluStream, ClusTree, Den-Stream, D-Stream and CobWeb. Researchers benefit from MOA by gaining insight into the workings and problems of different approaches, while practitioners can easily apply and compare several algorithms on real-world datasets and settings. MOA supports bi-directional interaction with WEKA, the Waikato Environment for Knowledge Analysis, and is released under the GNU GPL license.

    SDSS-RASS: Next Generation of Cluster-Finding Algorithms

    We outline here the next generation of cluster-finding algorithms. We show how advances in Computer Science and Statistics have helped develop robust, fast algorithms for finding clusters of galaxies in large multi-dimensional astronomical databases like the Sloan Digital Sky Survey (SDSS). Specifically, this paper presents four new advances: (1) A new semi-parametric algorithm, nicknamed "C4", for jointly finding clusters of galaxies in the SDSS and ROSAT All-Sky Survey databases; (2) The introduction of the False Discovery Rate into Astronomy; (3) The role of kernel shape in optimizing cluster detection; (4) A new determination of the X-ray Cluster Luminosity Function which has bearing on the existence of a "deficit" of high redshift, high luminosity clusters. This research is part of our "Computational AstroStatistics" collaboration (see Nichol et al. 2000) and the algorithms and techniques discussed herein will form part of the "Virtual Observatory" analysis toolkit. Comment: To appear in Proceedings of MPA/MPE/ESO Conference "Mining the Sky", July 31 - August 4, 2000, Garching, Germany
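    As a rough illustration of what controlling the False Discovery Rate involves, the sketch below implements the Benjamini-Hochberg step-up procedure. That choice is an assumption: the abstract does not say which FDR procedure the authors use. The function name `benjamini_hochberg` and the sample p-values are likewise illustrative.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Benjamini-Hochberg step-up procedure (assumed here; the abstract does not
    name a specific FDR method). Keeps the expected fraction of false
    discoveries among all rejections at or below alpha for independent tests.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k_max = rank

    # Reject every hypothesis whose p-value ranks at or below k_max.
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


# Hypothetical p-values, e.g. one per candidate detection.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
flags = benjamini_hochberg(p, alpha=0.05)
```

    The threshold scales with rank rather than being fixed, so with many simultaneous tests it is typically less conservative than a Bonferroni correction.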