125 research outputs found

    An Efficient Approach to Clustering in Large Multimedia Databases with Noise

    Several clustering algorithms can be applied to clustering in large multimedia databases. The effectiveness and efficiency of the existing algorithms, however, are somewhat limited, since clustering in multimedia databases requires clustering high-dimensional feature vectors and since multimedia databases often contain large amounts of noise. In this paper, we therefore introduce a new algorithm for clustering in large multimedia databases called DENCLUE (DENsity-based CLUstEring). The basic idea of our new approach is to model the overall point density analytically as the sum of influence functions of the data points. Clusters can then be identified by determining density-attractors, and clusters of arbitrary shape can be easily described by a simple equation based on the overall density function. The advantages of our new approach are (1) it has a firm mathematical basis, (2) it has good clustering properties in data sets with large amounts of noise, (3) it allows a compact mathematical description of arbitrarily shaped clusters in high-dimensional data sets, and (4) it is significantly faster than existing algorithms. To demonstrate the effectiveness and efficiency of DENCLUE, we perform a series of experiments on a number of different data sets from CAD and molecular biology. A comparison with DBSCAN shows the superiority of our new approach.
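
The idea described in this abstract (overall density as a sum of influence functions, clusters found via density-attractors) can be illustrated with a minimal sketch. This is not the authors' implementation. The Gaussian influence function, the mean-shift style hill-climbing update, and the names `sigma`, `xi`, and `merge_tol` are assumptions made here for illustration only.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def density(x, points, sigma=1.0):
    """Overall density at x: the sum of Gaussian influence functions
    (an assumed choice of influence function) of all data points."""
    return sum(math.exp(-sq_dist(x, p) / (2 * sigma ** 2)) for p in points)


def density_attractor(x, points, sigma=1.0, tol=1e-6, max_iter=200):
    """Climb from x to a local maximum of the density.

    Uses a mean-shift style fixed-point update, swapped in here for
    simplicity; it is an assumption, not the paper's own procedure.
    """
    x = tuple(x)
    for _ in range(max_iter):
        weights = [math.exp(-sq_dist(x, p) / (2 * sigma ** 2)) for p in points]
        total = sum(weights)
        if total == 0.0:  # no point has measurable influence here
            return x
        new_x = tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(len(x))
        )
        if math.dist(x, new_x) < tol:
            return new_x
        x = new_x
    return x


def cluster(points, sigma=1.0, xi=1.5, merge_tol=1e-2):
    """Label each point by its density-attractor.

    Points whose attractor has density below the hypothetical noise
    threshold xi are labelled -1 (noise).
    """
    attractors, labels = [], []
    for p in points:
        a = density_attractor(p, points, sigma)
        if density(a, points, sigma) < xi:
            labels.append(-1)
            continue
        for i, b in enumerate(attractors):
            if math.dist(a, b) < merge_tol:
                labels.append(i)
                break
        else:
            attractors.append(a)
            labels.append(len(attractors) - 1)
    return labels


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
          (20.0, 20.0)]
print(cluster(points, sigma=0.5, xi=1.5))  # → [0, 0, 0, 1, 1, 1, -1]
```

In this toy data set the two tight groups each share one attractor, while the isolated point at (20, 20) has a density of about 1 at its own attractor and therefore falls below the assumed threshold and is treated as noise.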

    Detecting Stops from GPS Trajectories: A Comparison of Different GPS Indicators for Raster Sampling Methods

    With the increasing prevalence of GPS tracking capabilities on smartphones, GPS trajectories have proven to be useful for an extensive range of research topics. Stop detection, which estimates activity locations, is fundamental for organizing GPS trajectories into semantically meaningful journeys. With previous methods overwhelmingly dependent on thresholds, contextual information or a pre-understanding of the GPS records, this paper addresses the challenge by contributing a ‘top-down’ raster sampling method which samples pre-calculated GPS indicators and clusters the raster cells with significantly different values as stops. We report a comparison of a set of pre-calculated GPS indicators with two baseline methods. By referencing a ground truth travel diary, the raster sampling method demonstrates good and reliable capabilities in producing high accuracy, low redundancy and close proximity to the ground truth in three distinct travel use cases. This further indicates that it is a good generic stop detection method.

    ADBSCAN: Adaptive Density-Based Spatial Clustering of Applications with Noise for Identifying Clusters with Varying Densities

    Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm that achieves a high performance rate on datasets where clusters have a constant density of data points. One of the significant attributes of this algorithm is noise cancellation. However, DBSCAN demonstrates reduced performance for clusters with different densities. Therefore, in this paper, an adaptive DBSCAN is proposed which can work significantly well for identifying clusters with varying densities. Comment: To be published in the 4th IEEE International Conference on Electrical Engineering and Information & Communication Technology (iCEEiCT 2018).

    Partitioning Clustering Based on Support Vector Ranking

    Postprint