2 research outputs found

    Data Stream Clustering: Challenges and Issues

    Very large databases are required to store massive amounts of data that are continuously inserted and queried. Analyzing huge data sets and extracting valuable patterns is of interest to researchers in many applications. We can identify two main groups of techniques for mining huge databases. One group treats the data as a stream and applies mining techniques to it, whereas the second group attempts to solve the problem directly with efficient algorithms. Recently, many researchers have focused on data streams as an efficient strategy for mining huge databases, instead of mining the entire database. The main problem in data stream mining is that evolving data is more difficult to detect with these techniques; therefore, unsupervised methods should be applied. Clustering techniques, in particular, can lead us to discover hidden information. In this survey, we try to clarify: first, the different problem definitions related to data stream clustering in general; second, the specific difficulties encountered in this field of research; third, the varying assumptions, heuristics, and intuitions forming the basis of different approaches; and fourth, how several prominent solutions tackle different problems.
    Index Terms: Data Stream, Clustering, K-Means, Concept drift
    Comment: IMECS201

    A New Augmented K-Means Algorithm for Seed Segmentation in Microscopic Images of the Colon Cancer

    In this study, we analyze histologic human colon tissue images that we captured with a camera-mounted microscope. We propose the Augmented K-Means Clustering algorithm as a method of segmenting cell nuclei in such colon images. We then compare the proposed algorithm to the Weighted K-Means Clustering algorithm. We observe that the Augmented K-Means Clustering algorithm decreased the number of iterations needed and shortened the duration of the segmentation process. Moreover, the proposed algorithm appears more consistent than the Weighted K-Means Clustering algorithm. We also assess the similarity of the segmented images to the original images using the Histogram-Based Similarity method. Our assessment indicates that the images segmented by the Augmented K-Means Clustering algorithm are more frequently similar to the original images than the images segmented by the Weighted K-Means Clustering algorithm.