An efficient automated incremental density-based algorithm for clustering and classification
Authors
E. Azhir
A. Darwesh
M. Hosseinzadeh
N. Jafari Navimipour
A. Sharifi
Publication date
1 January 2021
Publisher
Elsevier B.V.
Abstract
Data clustering divides a dataset into groups. Incremental Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a well-known density-based clustering technique that can find clusters of varying sizes and shapes. The quality of its results depends on two input parameters: MinPts (minimum points) and Eps (epsilon). Parameter setting is therefore one of the major problems of incremental DBSCAN. This article presents an improved incremental DBSCAN based on the Non-dominated Sorting Genetic Algorithm II (NSGA-II) to address the issue. The proposed algorithm adjusts the two parameters (MinPts and Eps) of incremental DBSCAN iteratively, guided by fitness functions, to enhance clustering precision. The method introduces suitable fitness functions for both labeled and unlabeled datasets. The efficiency of the proposed hybrid algorithm is further improved by parallelizing the optimization process. The method was evaluated on textual and numerical datasets of different shapes, sizes, and dimensions. According to the experimental results, the proposed algorithm provides better results than Multi-Objective Particle Swarm Optimization (MOPSO) based incremental DBSCAN and a few well-known techniques, particularly on shape and balanced datasets. A good speed-up is also achieved by the parallel model compared with the serial version of the algorithm. © 2020 Elsevier B.V.
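The idea of tuning MinPts and Eps against more than one fitness function can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses plain (non-incremental) DBSCAN, replaces the NSGA-II search with an exhaustive grid followed by a non-dominated filter, and runs serially. The two objectives (noise fraction and mean distance to the cluster centroid, both minimized), the toy dataset, and all function names are assumptions chosen for illustration; the abstract does not specify the paper's fitness functions.

```python
import math
from itertools import product

def dbscan(points, eps, min_pts):
    """Plain DBSCAN (not incremental). Returns one label per point, -1 = noise."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        # Brute-force region query; includes the point itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:   # only core points expand the cluster
                queue.extend(nb)
    return labels

def objectives(points, labels):
    """Assumed unlabeled-data objectives, both minimized:
    (noise fraction, mean distance of clustered points to their centroid)."""
    noise = sum(1 for l in labels if l == -1) / len(points)
    clusters = {}
    for p, l in zip(points, labels):
        if l != -1:
            clusters.setdefault(l, []).append(p)
    if not clusters:
        return (noise, math.inf)
    total, count = 0.0, 0
    for members in clusters.values():
        centroid = tuple(sum(c) / len(members) for c in zip(*members))
        total += sum(math.dist(p, centroid) for p in members)
        count += len(members)
    return (noise, total / count)

def dominates(a, b):
    """True if objective vector a is no worse everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def evaluate_grid(points, eps_values, min_pts_values):
    """Score every (eps, min_pts) pair; a stand-in for a genetic search."""
    return [
        {"eps": e, "min_pts": m, "obj": objectives(points, dbscan(points, e, m))}
        for e, m in product(eps_values, min_pts_values)
    ]

def pareto_front(candidates):
    """Keep candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(o["obj"], c["obj"]) for o in candidates)]

# Toy data: two 4x4 grids of points far apart, plus one outlier (last point).
BLOB = [(x * 0.1, y * 0.1) for x in range(4) for y in range(4)]
POINTS = BLOB + [(x + 5.0, y + 5.0) for x, y in BLOB] + [(2.5, 2.5)]

if __name__ == "__main__":
    for c in pareto_front(evaluate_grid(POINTS, [0.05, 0.15, 1.0, 10.0], [3, 5])):
        print(c)
```

The non-dominated filter is the part this sketch shares with NSGA-II: instead of collapsing the objectives into one score, it keeps every parameter pair that represents a distinct trade-off. On this toy data, settings that label every point as noise are dominated and dropped, while a setting that merges everything into one cluster survives alongside the settings that separate the two groups, because it has the lowest noise fraction.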
Available Versions
eprints Iran University of Medical Sciences
oai:eprints.iums.ac.ir:33180
Last time updated on 15/04/2021