6,470 research outputs found
On the Suitability of Genetic-Based Algorithms for Data Mining
Data mining has as goal to extract knowledge from large databases. A database may be considered as a search space consisting of an enormous number of elements, and a mining algorithm as a search strategy. In general, an exhaustive search of the space is infeasible. Therefore, efficient search strategies are of vital importance. Search strategies on genetic-based algorithms have been applied successfully in a wide range of applications. We focus on the suitability of genetic-based algorithms for data mining. We discuss the design and implementation of a genetic-based algorithm for data mining and illustrate its potentials
A customizable multi-agent system for distributed data mining
We present a general Multi-Agent System framework for
distributed data mining based on a Peer-to-Peer model. Agent
protocols are implemented through message-based asynchronous
communication. The framework adopts a dynamic load balancing
policy that is particularly suitable for irregular search algorithms. A modular design allows a separation of the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation has been carried out on a parallel frequent subgraph mining algorithm, which has shown good scalability performances
A Survey on Soft Subspace Clustering
Subspace clustering (SC) is a promising clustering technology to identify
clusters based on their associations with subspaces in high dimensional spaces.
SC can be classified into hard subspace clustering (HSC) and soft subspace
clustering (SSC). While HSC algorithms have been extensively studied and well
accepted by the scientific community, SSC algorithms are relatively new but
gaining more attention in recent years due to better adaptability. In the
paper, a comprehensive survey on existing SSC algorithms and the recent
development are presented. The SSC algorithms are classified systematically
into three main categories, namely, conventional SSC (CSSC), independent SSC
(ISSC) and extended SSC (XSSC). The characteristics of these algorithms are
highlighted and the potential future development of SSC is also discussed.Comment: This paper has been published in Information Sciences Journal in 201
- …