33 research outputs found

    Large-scale parallel data clustering

    A Parallel Fuzzy C-Mean algorithm for Image Segmentation

    This paper proposes a parallel Fuzzy C-Mean (FCM) algorithm for image segmentation. The sequential FCM algorithm is computationally intensive and has significant memory requirements. For many applications that deal with large images, such as medical image segmentation and geographical image analysis, sequential FCM is very slow. Our parallel FCM algorithm divides the computations among the processors and minimizes the need to access secondary storage, which improves the performance and efficiency of the image segmentation task compared to the sequential algorithm.
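
    The abstract does not say how the work is divided, so the following is only a minimal sketch under stated assumptions: the standard FCM update on 1-D pixel intensities, with the pixels split into chunks so that each chunk's partial sums could be computed by a separate processor. The names `partial_sums`, `fcm`, and `n_chunks` are hypothetical, and a plain loop stands in for the processors.

```python
def partial_sums(pixels, centers, m):
    """Per-chunk work: accumulate sum(u^m * x) and sum(u^m) for each cluster."""
    num = [0.0] * len(centers)
    den = [0.0] * len(centers)
    exp = 2.0 / (m - 1.0)  # exponent in the standard FCM membership formula
    for x in pixels:
        dists = [abs(x - c) for c in centers]
        zeros = sum(1 for d in dists if d == 0.0)
        if zeros:
            # Pixel coincides with one or more centers: share membership among them.
            u = [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
        else:
            u = [1.0 / sum((di / dj) ** exp for dj in dists) for di in dists]
        for k, uk in enumerate(u):
            w = uk ** m
            num[k] += w * x
            den[k] += w
    return num, den


def fcm(pixels, n_clusters=2, m=2.0, n_chunks=4, iters=100, tol=1e-9):
    """Fuzzy c-means on scalar intensities; requires m > 1."""
    lo, hi = min(pixels), max(pixels)
    # Deterministic start: centers spread evenly across the intensity range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    # Assumed data-parallel split: each chunk would live on one processor.
    chunks = [pixels[i::n_chunks] for i in range(n_chunks)]
    for _ in range(iters):
        # In a parallel run, each call below would execute on its own processor.
        parts = [partial_sums(chunk, centers, m) for chunk in chunks]
        new_centers = []
        for k in range(n_clusters):
            # Reduction step: combine the per-chunk partial sums.
            num = sum(p[0][k] for p in parts)
            den = sum(p[1][k] for p in parts)
            new_centers.append(num / den if den > 0 else centers[k])
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return sorted(centers)


if __name__ == "__main__":
    intensities = [10, 12, 11, 9, 200, 205, 198, 202]
    print(fcm(intensities))
```

    Only the small per-cluster sums cross chunk boundaries in each iteration, which is why this kind of split keeps the pixel data local to each worker; whether the paper uses this exact decomposition is not stated in the abstract.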

    Clustering Without Knowing How To: Application and Evaluation

    Crowdsourcing allows running simple human intelligence tasks on a large crowd of workers, making it possible to solve problems for which it is difficult to formulate an algorithm or train a machine learning model in reasonable time. One such problem is data clustering by an under-specified criterion that is simple for humans but difficult for machines. In this demonstration paper, we build a crowdsourced system for image clustering and release its code under a free license at https://github.com/Toloka/crowdclustering. Our experiments on two different image datasets, dresses from Zalando's FEIDEGGER and shoes from the Toloka Shoes Dataset, confirm that one can obtain meaningful clusters purely with crowdsourcing, without any machine learning algorithms. Comment: accepted at ECIR 2023 Demonstration Track

    The Simulation Model Partitioning Problem: an Adaptive Solution Based on Self-Clustering (Extended Version)

    This paper is about partitioning in parallel and distributed simulation, that is, decomposing the simulation model into a number of components and properly allocating them to the execution units. An adaptive solution based on self-clustering, which considers both communication reduction and computational load-balancing, is proposed. The implementation of the proposed mechanism is tested using a simulation model that is challenging in terms of both structure and dynamicity. Various configurations of the simulation model and the execution environment have been considered. The obtained performance results are analyzed using a reference cost model. The results demonstrate that the proposed approach is promising and that it can reduce the simulation execution time in both parallel and distributed architectures.