Parallelization of Partitioning Around Medoids (PAM) in K-Medoids Clustering on GPU
K-medoids clustering is a partitional clustering method. It offers better results when dealing with outliers, when an arbitrary distance metric is used, and when a mean or median does not exist within the data. However, k-medoids suffers from high computational complexity. Partitioning Around Medoids (PAM) was developed to improve k-medoids clustering; it consists of build and swap steps and uses the entire dataset to find the best potential medoids. As a result, PAM produces better medoids than other algorithms. This research proposes parallelizing PAM in k-medoids clustering on the GPU to reduce the computation time of the swap step. The parallelization scheme uses shared memory, a reduction algorithm, and a thread block configuration optimized to maximize occupancy. In the experiments, the proposed parallelized PAM k-medoids is faster than the CPU and Matlab implementations and is efficient for large datasets.
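For readers unfamiliar with PAM, the build and swap steps can be illustrated with a minimal sequential sketch in plain Python. This is not the authors' GPU implementation: the names `pam`, `total_cost`, and `manhattan` are hypothetical, and the textbook greedy build and exhaustive swap search are assumed here for illustration only.

```python
from itertools import product


def manhattan(p, q):
    """Example metric; any distance function can be passed to pam()."""
    return sum(abs(a - b) for a, b in zip(p, q))


def total_cost(dist, medoids):
    """Sum of every point's distance to its nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def pam(points, k, metric):
    """Sequential PAM sketch: returns (medoid indices, labels, cost)."""
    n = len(points)
    # Precompute the full pairwise distance matrix (uses the entire dataset).
    dist = [[metric(p, q) for q in points] for p in points]

    # Build step: greedily add the point that lowers the total cost the most.
    medoids = []
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        best = min(candidates, key=lambda c: total_cost(dist, medoids + [c]))
        medoids.append(best)

    # Swap step: try every (medoid, non-medoid) exchange and apply the best
    # one; stop when no exchange lowers the total cost.
    cost = total_cost(dist, medoids)
    while True:
        best_cost, best_trial = cost, None
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            trial_cost = total_cost(dist, trial)
            if trial_cost < best_cost:
                best_cost, best_trial = trial_cost, trial
        if best_trial is None:
            break
        medoids, cost = best_trial, best_cost

    # Label each point with the index of its nearest medoid.
    labels = [min(medoids, key=lambda m: row[m]) for row in dist]
    return sorted(medoids), labels, cost


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(pam(data, 2, manhattan))
```

In this sketch each candidate exchange in the swap loop is evaluated independently of the others, and each evaluation is a sum over all points. That structure is presumably why the swap step is a natural target for parallel evaluation followed by a reduction, though the sketch itself runs sequentially.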