Clustering analysis of railway driving missions with niching
A wide range of applications requires classifying or grouping data into a set of categories or clusters. The most popular clustering techniques for this purpose are K-means clustering and hierarchical clustering. However, both of these methods require the number of clusters to be set a priori. In this paper, a clustering method based on a niching genetic algorithm is presented, with the aim of finding the best compromise between maximizing the inter-cluster distance and minimizing the intra-cluster distance. This method is applied to three clustering benchmarks and to the classification of driving missions for railway applications.
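The abstract does not give the paper's exact objective function, but the trade-off it describes can be sketched as a single fitness score that rewards well-separated centroids and compact clusters (the ratio form and the helper name below are illustrative assumptions, not the paper's formulation):

```python
import numpy as np

def clustering_fitness(points, centroids):
    """Score a candidate clustering: reward large inter-cluster
    distances and small intra-cluster distances. Hypothetical
    ratio formulation; the paper's exact objective may differ."""
    # assign each point to its nearest centroid
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    intra = d[np.arange(len(points)), labels].mean()
    # mean pairwise distance between centroids
    k = len(centroids)
    inter = np.mean([np.linalg.norm(centroids[i] - centroids[j])
                     for i in range(k) for j in range(i + 1, k)])
    return inter / (intra + 1e-12)
```

A niching GA would evolve centroid sets under a fitness like this while preserving distinct peaks, so that candidate clusterings with different structures survive in the same population.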
A niching memetic algorithm for simultaneous clustering and feature selection
Clustering is inherently a difficult task, and it is made even more difficult when the selection of relevant features is also an issue. In this paper we propose an approach for simultaneous clustering and feature selection using a niching memetic algorithm. Our approach (which we call NMA_CFS) makes feature selection an integral part of the global clustering search procedure and attempts to avoid becoming trapped in less promising, locally optimal solutions in both clustering and feature selection, without making any a priori assumption about the number of clusters. Within the NMA_CFS procedure, a variable composite representation is devised to encode both feature selection and cluster centers with different numbers of clusters. Further, local search operations are introduced to refine the feature selection and cluster centers encoded in the chromosomes. Finally, a niching method is integrated to preserve population diversity and prevent premature convergence. In an experimental evaluation we demonstrate the effectiveness of the proposed approach and compare it with other related approaches, using both synthetic and real data.
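The "variable composite representation" described above can be pictured as a chromosome carrying both a feature mask and a variable-length list of cluster centres. The names and layout below are an illustrative sketch, not the paper's actual encoding:

```python
import random
from dataclasses import dataclass

@dataclass
class Chromosome:
    """Composite encoding in the spirit of NMA_CFS: a binary feature
    mask plus a variable-length list of cluster centres. Hypothetical
    layout; the paper's exact representation may differ."""
    feature_mask: list   # 1 = feature selected, 0 = feature ignored
    centers: list        # one centre per cluster; length is the current k

def random_chromosome(n_features, k_max, bounds=(0.0, 1.0)):
    """Sample an individual with a random number of clusters,
    so k itself is evolved rather than fixed a priori."""
    k = random.randint(2, k_max)
    mask = [random.randint(0, 1) for _ in range(n_features)]
    centers = [[random.uniform(*bounds) for _ in range(n_features)]
               for _ in range(k)]
    return Chromosome(mask, centers)
```

Local search would then refine `centers` (e.g. one K-means-style update) and flip bits in `feature_mask`, while the niching method keeps chromosomes with different k values alive in the population.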
Learning Appropriate Contexts
Genetic Programming is extended so that the solutions being evolved develop in the context of local domains within the total problem domain. This produces a situation where different species of solution develop to exploit different niches of the problem, indicating exploitable solutions. It is argued that for context to be fully learnable a further step of abstraction is necessary. Such contexts, abstracted from clusters of solution/model domains, make sense of the problem of how to identify when it is the content of a model that is wrong and when it is the context. Some principles of learning to identify useful contexts are proposed.
Fitness sharing and niching methods revisited
Interest in multimodal function optimization is expanding rapidly, since real-world optimization problems often require the location of multiple optima in the search space. In this context, fitness sharing has been used widely to maintain population diversity and permit the investigation of many peaks in the feasible domain. This paper reviews various sharing strategies and proposes new recombination schemes to improve their efficiency. Some empirical results are presented for both high and limited numbers of fitness function evaluations. Finally, the study compares the sharing method with other niching techniques.
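Fitness sharing itself has a standard form (due to Goldberg and Richardson): each individual's raw fitness is divided by its niche count, the sum of a sharing function over its distances to every member of the population. A minimal sketch:

```python
def sharing(d, sigma, alpha=1.0):
    """Standard sharing function: sh(d) = 1 - (d/sigma)**alpha
    for d < sigma, and 0 otherwise."""
    return 1.0 - (d / sigma) ** alpha if d < sigma else 0.0

def shared_fitness(raw, distance, sigma, alpha=1.0):
    """Divide each raw fitness by its niche count. Crowded peaks are
    penalised, so isolated optima keep more of their fitness and
    population diversity is maintained across multiple peaks."""
    n = len(raw)
    return [raw[i] / sum(sharing(distance(i, j), sigma, alpha)
                         for j in range(n))
            for i in range(n)]
```

Note that the niche count includes the individual itself (sh(0) = 1), so the denominator is always at least 1. Choosing the niche radius `sigma` is the delicate part in practice, which is one motivation for the alternative recombination schemes the paper proposes.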
Local search: A guide for the information retrieval practitioner
There are a number of combinatorial optimisation problems in information retrieval in which the use of local search methods is worthwhile. The purpose of this paper is to show how local search can be used to solve some well-known tasks in information retrieval (IR), to show how previous research in the field is piecemeal, bereft of structure and methodologically flawed, and to suggest more rigorous ways of applying local search methods to IR problems. We provide a query-based taxonomy for analysing the use of local search in IR tasks and an overview of issues such as fitness functions, statistical significance and test collections when conducting experiments on combinatorial optimisation problems. The paper gives a guide to the pitfalls and problems for IR practitioners who wish to use local search in their research, and gives practical advice on the use of such methods. The query-based taxonomy is a novel structure which the IR practitioner can use to examine the use of local search in IR.
Regulatory motif discovery using a population clustering evolutionary algorithm
This paper describes a novel evolutionary algorithm for regulatory motif discovery in DNA promoter sequences. The algorithm uses data clustering to logically distribute the evolving population across the search space. Mating then takes place within local regions of the population, promoting overall solution diversity and encouraging the discovery of multiple solutions. Experiments using synthetic data sets have demonstrated the algorithm's capacity to find position frequency matrix models of known regulatory motifs in relatively long promoter sequences. These experiments have also shown the algorithm's ability to maintain diversity during search and discover multiple motifs within a single population. The utility of the algorithm for discovering motifs in real biological data is demonstrated by its ability to find meaningful motifs within muscle-specific regulatory sequences.
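The core mechanism described above — cluster the population, then restrict mating to within a cluster — can be sketched generically. The greedy leader clustering and the helper names here are illustrative assumptions, not the paper's actual procedure:

```python
import random

def cluster_population(population, distance, radius):
    """Greedy leader clustering: assign each individual to the first
    cluster whose leader is within `radius`, else start a new cluster.
    (Illustrative stand-in for the paper's clustering step.)"""
    clusters = []
    for ind in population:
        for c in clusters:
            if distance(ind, c[0]) <= radius:
                c.append(ind)
                break
        else:
            clusters.append([ind])
    return clusters

def select_parents(clusters):
    """Restrict mating to a single cluster, so recombination happens
    within a local region of the search space and distinct regions
    (candidate motifs) are preserved across the population."""
    eligible = [c for c in clusters if len(c) >= 2] or clusters
    c = random.choice(eligible)
    if len(c) < 2:
        return c[0], c[0]
    return tuple(random.sample(c, 2))
```

Because parents always come from the same local region, offspring stay near an existing niche instead of averaging two distant motif candidates into something that matches neither.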
Dynamic instability transitions in 1D driven diffusive flow with nonlocal hopping
One-dimensional directed driven stochastic flow with competing nonlocal and local hopping events has an instability threshold from a populated phase into an empty-road (ER) phase. We implement this in the context of the asymmetric exclusion process. The nonlocal skids promote strong clustering in the stationary populated phase. Such clusters drive the dynamic phase transition and determine its scaling properties. We numerically establish that the instability transition into the ER phase is second order in the regime where the entry-point reservoir controls the current, and first order in the regime where the bulk is in control. The first-order transition originates from a turn-about of the cluster drift velocity. At the critical line, the current remains analytic, the road density vanishes linearly, and fluctuations scale as uncorrelated noise. A self-consistent cluster-dynamics analysis explains why these scaling properties remain that simple.
Comment: 11 pages, 14 figures (25 eps files); revised as the published version
Multi-objective evolutionary algorithms for data clustering
In this work we investigate the use of Multi-Objective metaheuristics for the data-mining task of clustering. We first investigate methods of evaluating the quality of clustering solutions, then propose a new Multi-Objective clustering algorithm driven by multiple measures of cluster quality, and finally investigate the performance of different Multi-Objective clustering algorithms.
In the context of clustering, a robust measure for evaluating clustering solutions is an important component of an algorithm. These Cluster Quality Measures (CQMs) should rely solely on the structure of the clustering solution. A robust CQM should have three properties: it should reward a "good" clustering solution; it should decrease in value monotonically as the solution quality deteriorates; and it should be able to evaluate clustering solutions with varying numbers of clusters. We review existing CQMs and present an experimental evaluation of their robustness. We find that measures based on connectivity are more robust than other measures for cluster evaluation.
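A widely used connectivity CQM (in the style of Handl and Knowles) penalises each point whose nearest neighbours fall outside its own cluster; the sketch below illustrates that idea, though the thesis may use a different variant:

```python
import numpy as np

def connectivity(points, labels, L=5):
    """Connectivity measure: for each point, add a penalty of 1/j
    whenever its j-th nearest neighbour lies in a different cluster.
    Lower is better; 0 means every point's L nearest neighbours share
    its cluster. (Common formulation; the thesis may use a variant.)"""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    order = d.argsort(axis=1)[:, 1:L + 1]   # column 0 is the point itself
    penalty = 0.0
    for i, neighbours in enumerate(order):
        for j, nb in enumerate(neighbours, start=1):
            if labels[nb] != labels[i]:
                penalty += 1.0 / j
    return penalty
```

This measure satisfies the three properties above in an intuitive way: well-separated, coherent clusters score 0, the penalty grows as points are split from their neighbours, and the definition never references a fixed number of clusters.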
We then introduce a new Multi-Objective Clustering algorithm (MOCA). The use of Multi-Objective optimisation in clustering is desirable because it permits the incorporation of multiple measures of cluster quality. Since the definition of what constitutes a good clustering is far from clear, it is beneficial to develop algorithms that allow multiple CQMs to be accommodated. The selection of the cluster quality measures to use as objectives for MOCA is informed by our previous work with internal evaluation measures. We explain the implementation details and perform experimental work to establish the algorithm's worth. We compare MOCA with k-means and find some promising results: MOCA can generate a pool of clustering solutions that is more likely to contain the optimal clustering solution than the pool of solutions generated by k-means.
We also perform an investigation into the performance of different implementations of MOEA algorithms for clustering. We find that representations of clusterings based around centroids and medoids produce more desirable clustering solutions and Pareto fronts. We also find that mutation operators that greatly disrupt the clustering solutions lead to better exploration of the Pareto front, whereas mutation operators that modify the clustering solutions more moderately lead to higher-quality clustering solutions.
We then perform more specific investigations into the performance of mutation operators, focusing on operators that promote clustering solution quality, operators that promote exploration of the Pareto front, and a hybrid combination of the two. We use a number of techniques to assess the performance of the mutation operators as the algorithms execute. We confirm that a disruptive mutation operator leads to better exploration of the Pareto front and that more moderate mutation operators lead to the discovery of higher-quality clustering solutions. We find that our implementation of a hybrid mutation operator does not yield a clear improvement over the other mutation operators, but it does show promise for future work.