Variable neighbourhood search based heuristic for K-harmonic means clustering
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Although technology has developed rapidly and computation speeds have increased, most real-world optimization problems still cannot be solved in a reasonable time. Sometimes it is impossible to solve them optimally, as there are many instances of real problems which cannot be addressed by computers at their present speed. In such cases, a heuristic approach can be used. Many researchers have used heuristics to meet this need, since they give a sufficiently good solution in reasonable time. The clustering problem is one example, and it arises in many applications.
In this thesis, I suggest a Variable Neighbourhood Search (VNS) to improve a recent clustering local search called K-Harmonic Means (KHM). Many experiments are presented to show the strength of my code compared with some algorithms from the literature.
Some counter-examples are introduced to show that KHM may degenerate entirely, in either one or more runs. Furthermore, it degenerates and then stops on some familiar datasets, which significantly affects the final solution. Hence, I present degeneracy-removal code for KHM. I also apply VNS to improve the KHM code after the degeneracy has been removed.
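The thesis does not spell out its VNS variant in this abstract, so the following is only a minimal sketch of the basic VNS scheme (shake, local search, move or widen the neighbourhood). The names `basic_vns`, `shake`, `descend`, and the toy integer cost function are hypothetical and chosen purely for illustration; they are not the author's code.

```python
import random

def basic_vns(x0, cost, shake, local_search, k_max=3, max_iters=100, seed=0):
    """Generic basic VNS loop (illustrative sketch, not the thesis code)."""
    rng = random.Random(seed)
    best, best_cost = x0, cost(x0)
    for _ in range(max_iters):
        k = 1
        while k <= k_max:
            # Shake: jump to a random point in the k-th neighbourhood,
            # then improve it with the local search.
            candidate = local_search(shake(best, k, rng))
            c = cost(candidate)
            if c < best_cost:
                best, best_cost = candidate, c
                k = 1          # improvement: restart from the smallest neighbourhood
            else:
                k += 1         # no improvement: try a larger neighbourhood
    return best, best_cost

# Toy problem (assumption): a bumpy cost over the integers.
def cost(x):
    return abs(x - 7) + 5 * (x % 2)

def descend(x):
    """Greedy descent over the neighbours x - 1 and x + 1."""
    while True:
        nxt = min((x - 1, x + 1), key=cost)
        if cost(nxt) >= cost(x):
            return x
        x = nxt

def shake(x, k, rng):
    """Random jump whose maximum size grows with k."""
    return x + rng.randint(-2 * k, 2 * k)

best, best_cost = basic_vns(0, cost, shake, descend)
```

In a clustering setting, `local_search` would be played by a heuristic such as KHM and `shake` by a perturbation of the cluster centres; both choices are left open here.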
Optimal mathematical programming and variable neighborhood search for k-modes categorical data clustering
The conventional k-modes algorithm and its variants have been extensively used for categorical data clustering. However, these algorithms have some drawbacks, e.g., they can become trapped in local optima and are sensitive to the initial clusters/modes. Our numerical experiments even showed that the k-modes algorithm could not identify the optimal clustering results for some special datasets regardless of the selection of the initial centers. In this paper, we developed an integer linear programming (ILP) approach for k-modes clustering, which is independent of the initial solution and can directly obtain the optimal results for small-sized datasets. We also developed a heuristic algorithm that implements iterative partial optimization in the ILP approach based on a framework of variable neighborhood search, known as IPO-ILP-VNS, to search for near-optimal results on medium and large sized datasets with controlled computing time. Experiments on 38 datasets, including 27 synthesized small datasets and 11 known benchmark datasets from the UCI site, were carried out to test the proposed ILP approach and the IPO-ILP-VNS algorithm. The proposed methods outperformed the conventional and other existing enhanced k-modes algorithms in the literature and produced new, improved results for 9 of the UCI benchmark datasets.
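For readers unfamiliar with k-modes, the quantity being minimised can be sketched as follows. This assumes the usual formulation (simple-matching, i.e. Hamming, dissimilarity and per-attribute modes); the function names are hypothetical and the paper's ILP model is not reproduced here.

```python
from collections import Counter

def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(1 for u, v in zip(a, b) if u != v)

def cluster_mode(rows):
    """Per-attribute most frequent category of a non-empty cluster."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))

def kmodes_cost(rows, modes):
    """Total dissimilarity of each record to its nearest mode."""
    return sum(min(hamming(r, m) for m in modes) for r in rows)

rows = [("a", "x"), ("a", "y"), ("b", "y")]
mode = cluster_mode(rows)
total = kmodes_cost(rows, [mode])
```

The conventional algorithm alternates between assigning records to the nearest mode and recomputing modes, which only guarantees a local optimum of this cost; that is the limitation the abstract refers to.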
New heuristic for harmonic means clustering
It is well known that some local search heuristics for K-clustering problems, such as the k-means heuristic for minimum sum-of-squares clustering, occasionally stop at a solution with a smaller number of clusters than the desired number K. Such solutions are called degenerate. In this paper, we reveal that degeneracy also exists in the K-harmonic means (KHM) method, which was proposed as an alternative to the K-means heuristic that is less sensitive to the initial solution. In addition, we discover two types of degenerate solutions and provide examples of both. Based on these findings, we give a simple method to remove degeneracy during the execution of the KHM heuristic; it can be used as a part of any other heuristic for the KHM clustering problem. We use the KHM heuristic within a recent variant of a variable neighborhood search (VNS) based heuristic. Extensive computational analysis, performed on test instances usually used in the literature, shows that significant improvements are obtained if our simple degeneracy-correcting method is used within both KHM and VNS. Moreover, the VNS-based heuristic suggested here may be considered a new state-of-the-art heuristic for solving the KHM clustering problem.
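As a rough illustration of the objects discussed above, the sketch below evaluates the commonly used KHM objective (for each point, K divided by the sum of inverse p-th powers of its distances to the centers) and flags a solution as degenerate when fewer than K centers are the nearest center of some point. The nearest-center test, the function names, and the default `p` are assumptions made for illustration; the paper's two degeneracy types and its correction method are not reproduced.

```python
import math

def khm_objective(points, centers, p=2.0, eps=1e-12):
    """Sum over points of K / sum_k (1 / d(x, c_k)**p)."""
    k = len(centers)
    total = 0.0
    for x in points:
        # eps guards against division by zero when a point sits on a center.
        inv_sum = sum(1.0 / max(math.dist(x, c), eps) ** p for c in centers)
        total += k / inv_sum
    return total

def used_clusters(points, centers):
    """Indices of centers that are the nearest center of at least one point."""
    used = set()
    for x in points:
        nearest = min(range(len(centers)), key=lambda j: math.dist(x, centers[j]))
        used.add(nearest)
    return used

def is_degenerate(points, centers):
    """True when the solution effectively has fewer than K clusters."""
    return len(used_clusters(points, centers)) < len(centers)

points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
good = [(0.5, 0.0), (10.5, 0.0)]
bad = [(0.5, 0.0), (10.5, 0.0), (100.0, 0.0)]  # third center attracts no point
```

Here `is_degenerate(points, good)` is false and `is_degenerate(points, bad)` is true. Two coincident centers are also flagged, because ties go to the first center and the duplicate is never the nearest.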
Predicting Interactions between Virus and Host Proteins Using Repeat Patterns and Composition of Amino Acids
Previous methods for predicting protein-protein interactions (PPIs) focused mainly on PPIs within a single species, but PPIs across different species have recently emerged as an important issue in areas such as viral infection. The primary focus of this study is to predict PPIs between a virus and its targeted host, which are involved in viral infection. We developed a general method that predicts interactions between virus and host proteins using the repeat patterns and composition of amino acids. In independent testing with PPIs of new viruses and hosts, the method showed high performance comparable to the best performance of other methods for single virus-host PPIs. In a comparison on the same datasets, our method outperformed the others. The repeat patterns and composition of amino acids are simple yet powerful features for predicting virus-host PPIs. The method developed in this study will help in finding new virus-host PPIs for which little information is available.
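The abstract does not define its features precisely. As a hedged illustration of the simpler of the two, amino acid composition is usually taken to be the relative frequency of each of the 20 standard residues in a sequence; the sketch below assumes that definition, and the function name and example sequence are hypothetical. The repeat-pattern feature is not sketched because its definition is not given here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def composition(seq):
    """20-dimensional vector of residue frequencies; non-standard letters are ignored."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    valid = 0
    for ch in seq.upper():
        if ch in counts:
            counts[ch] += 1
            valid += 1
    return [counts[aa] / valid if valid else 0.0 for aa in AMINO_ACIDS]

vec = composition("AACX")  # "X" is skipped as a non-standard residue
```

A feature vector for a virus-host protein pair could then be formed by concatenating the two proteins' vectors, though whether the paper does so is not stated in the abstract.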