171 research outputs found

Document clustering for knowledge discovery using nature-inspired algorithm

Author: Husni Husniza
Mohammed Athraa Jasim
Yusof Yuhanis
Publication date: 12/08/2014

As the internet is overloaded with information, various knowledge-based systems are now equipped with data analytics features that facilitate knowledge discovery. This includes the use of optimization algorithms that mimic the behavior of insects or animals. This paper presents an experiment on document clustering using the Gravitation Firefly Algorithm (GFA). The advantage of GFA is that clustering can be performed without a predefined number of clusters k. GFA determines the cluster centers by identifying documents with high force. Once the centers are identified, clusters are created based on cosine similarity. Experimental results demonstrate that GFA with random positioning of documents outperforms existing clustering algorithms such as Particle Swarm Optimization (PSO) and K-means.
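
The final step described in this abstract, grouping documents around already chosen centers by cosine similarity, can be illustrated with a minimal sketch. It does not implement the GFA force computation; the center indices, the toy term-weight vectors, and the function names are hypothetical and chosen only for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)

def assign_to_centers(docs, center_ids):
    """Label each document with the index of its most similar center document."""
    return [
        max(center_ids, key=lambda c: cosine_similarity(doc, docs[c]))
        for doc in docs
    ]

# Toy term-weight vectors (hypothetical data, not from the paper).
docs = [
    [2.0, 1.0, 0.0, 0.0],
    [3.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 2.0, 2.0],
]
centers = [0, 2]  # assumed to have been selected beforehand (e.g. by force)
labels = assign_to_centers(docs, centers)
print(labels)
```

Because the centers are supplied rather than counted in advance, the number of clusters follows from however many centers the selection step returns.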

UUM Repository

Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach

Author: Bouhafer Fadwa
Boulouard Zakaria
Dousset Bernard
El Haddadi Amine
El Haddadi Anass
Koutti Lahcen
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/04/2018

Defining the correct number of clusters is one of the most fundamental tasks in graph clustering. For large graphs, this task becomes more challenging because of the lack of prior information. This paper presents an approach to this problem based on the Bat Algorithm, one of the most promising swarm-intelligence-based algorithms. We call our solution “Bat-Cluster (BC).” This approach automates graph clustering by balancing global and local search processes. Simulations on four benchmark graphs of different sizes show that the proposed algorithm is efficient, can provide higher precision, and exceeds some best-known values.

IAES journal

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science

Consensus clustering with differential evolution

Author: Sabo Miroslav
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014

Consensus clustering algorithms are used to improve the properties of traditional clustering methods, especially their accuracy and robustness. In this article, we introduce an approach that is based on a refinement of the set of initial partitions and uses a differential evolution algorithm to find the most valid solution. Properties of the algorithm are demonstrated on four benchmark datasets.
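
For readers unfamiliar with differential evolution, the following is a minimal sketch of the generic DE/rand/1/bin scheme applied to a continuous test function. It is not the author's consensus clustering method: the fitness function, the bounds, and the parameter values (`pop_size`, `f`, `cr`) are assumptions for illustration, and a clustering application would replace the fitness with a cluster validity measure.

```python
import random

def differential_evolution(fitness, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=100, seed=0):
    """Minimize `fitness` over box `bounds` with the DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    history = [min(scores)]  # best score after each generation
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated coordinate
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = fitness(trial)
            # Greedy selection: keep the trial only if it is not worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
        history.append(min(scores))
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best], history

def sphere(x):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best_x, best_score, history = differential_evolution(sphere, bounds)
print(best_x, best_score)
```

The greedy selection step means the best score never gets worse from one generation to the next, which is why `history` is non-increasing.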

Institute of Mathematics AS CR, v. v. i.

Parallel swarm intelligence strategies for large-scale clustering based on MapReduce with application to epigenetics of aging

Author: Batouche M
Benmounah Z
Lio P
Meshoul S
Publication venue: Applied Soft Computing
Publication date: 01/01/2018

Clustering is an important technique for data analysis and knowledge discovery. In the context of big data, it becomes a challenging issue because the huge amount of data recently collected makes conventional clustering algorithms inappropriate. Swarm intelligence algorithms have shown promising results when applied to clustering data of moderate size, owing to their decentralized and self-organized behavior. However, these algorithms exhibit limited capabilities when large data sets are involved. In this paper, we developed a decentralized, distributed big data clustering solution using three swarm intelligence algorithms within the MapReduce framework. The developed framework allows cooperation between the three algorithms, namely particle swarm optimization, ant colony optimization, and artificial bee colony, to achieve highly scalable data partitioning through a migration strategy. This strategy takes advantage of the combined exploration and exploitation capabilities of these algorithms to foster diversity. The framework is tested using the Amazon Elastic MapReduce (EMR) service, deploying up to 192 computer nodes and 30 gigabytes of data. Parallel metrics such as speed-up, size-up, and scale-up are used to measure the elasticity and scalability of the framework. Our results are compared with those of counterpart big data clustering approaches and show a significant improvement in terms of time and convergence to a good-quality solution. The developed model has been applied to epigenetics data clustering according to methylation features in CpG islands, gene bodies, and gene promoters in order to study the impact of epigenetics on aging. Experimental results reveal that DNA methylation changes slightly and not aberrantly with aging, corroborating previous studies.
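
The three parallel metrics named in this abstract can be sketched as simple ratios of running times. The definitions below are the ones commonly used in the parallel data mining literature and are assumed here, since the abstract does not spell them out; the timings are hypothetical values for illustration, not results from the paper.

```python
def speed_up(time_one_node, time_m_nodes):
    """Same data, more nodes: T(1 node, D) / T(m nodes, D). Ideal value is m."""
    return time_one_node / time_m_nodes

def size_up(time_scaled_data, time_base_data):
    """Same nodes, k times more data: T(m, k*D) / T(m, D). Ideal value is at most k."""
    return time_scaled_data / time_base_data

def scale_up(time_one_node_base, time_m_nodes_scaled):
    """Nodes and data grow together: T(1, D) / T(m, m*D). Ideal value is 1."""
    return time_one_node_base / time_m_nodes_scaled

# Hypothetical timings in seconds for m = 4 nodes (illustrative only).
t_1_node_base = 120.0      # 1 node, data set D
t_4_nodes_base = 40.0      # 4 nodes, data set D
t_4_nodes_4x_data = 150.0  # 4 nodes, data set 4*D

print(speed_up(t_1_node_base, t_4_nodes_base))
print(size_up(t_4_nodes_4x_data, t_4_nodes_base))
print(scale_up(t_1_node_base, t_4_nodes_4x_data))
```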

Apollo (Cambridge)

Multilevel thresholding hyperspectral image segmentation based on independent component analysis and swarm optimization methods

Author: Mardhia Murein Miksa
Murinto Murinto
Puji Astuti Nur Rochmah Dyah
Publication venue: Universitas Ahmad Dahlan, Kampus 3
Publication date: 01/03/2019

High-dimensional problems are often encountered in studies of hyperspectral data. One of the challenges is how to find accurate representations so that important structures can be seen clearly and easily. This study aims to segment hyperspectral images using swarm optimization techniques. The experiments use the AVIRIS Indian Pines hyperspectral image dataset, which consists of 103 bands. The segmentation methods used are particle swarm optimization (PSO), Darwinian particle swarm optimization (DPSO), and fractional-order Darwinian particle swarm optimization (FODPSO). Before segmentation, the dimensionality of the hyperspectral image dataset is first reduced using independent component analysis (ICA) to obtain the first independent component. The experiments show that FODPSO is better than PSO and DPSO in terms of average CPU processing time and best fitness value. The PSNR and SSIM values obtained with FODPSO are also better than those of the other two swarm optimization methods. It can be concluded that FODPSO yields better segmentation results than the other methods.
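
One of the quality measures named in this abstract, the peak signal-to-noise ratio (PSNR), follows the standard formula PSNR = 10 log10(MAX^2 / MSE). The sketch below assumes 8-bit intensities (MAX = 255) and uses tiny hypothetical images; it is not taken from the paper's evaluation code.

```python
import math

def mean_squared_error(image_a, image_b):
    """Mean squared difference between two equally sized 2-D intensity grids."""
    diffs = [
        (x - y) ** 2
        for row_a, row_b in zip(image_a, image_b)
        for x, y in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mean_squared_error(reference, test)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)

# Hypothetical 2x2 images: every pixel of `segmented` is off by 5 levels.
reference = [[10, 20], [30, 40]]
segmented = [[15, 25], [35, 45]]
value = psnr(reference, segmented)
print(value)
```

A higher PSNR means the segmented image is closer to the reference, which is the sense in which one method is reported as "better" than another.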

Universitas Ahmad Dahlan Repository

International Journal of Advances in Intelligent Informatics

Directory of Open Access Journals

International Journal of Advances in Intelligent Informatics (IJAIN)

A review of clustering techniques and developments

Author: Bharill N
Ding W
Er MJ
Gupta A
Lin CT
Patel OP
Prasad M
Saxena A
Tiwari A
Publication venue: Elsevier BV
Publication date: 06/12/2017

© 2017 Elsevier B.V. This paper presents a comprehensive study of clustering: existing methods and developments made at various times. Clustering is defined as unsupervised learning in which objects are grouped on the basis of some similarity inherent among them. There are different methods for clustering objects, such as hierarchical, partitional, grid-based, density-based, and model-based methods. The approaches used in these methods are discussed along with their respective state of the art and applicability. The measures of similarity and the evaluation criteria, which are the central components of clustering, are also presented. Applications of clustering in fields such as image segmentation, object and character recognition, and data mining are highlighted.

OPUS - University of Technology Sydney