Search CORE

6,223 research outputs found

Recommended from our members

A niching memetic algorithm for simultaneous clustering and feature selection

Author: Fairhurst M
Liu X
Sheng W
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2008
Field of study

Clustering is inherently a difficult task, and is made even more difficult when the selection of relevant features is also an issue. In this paper we propose an approach for simultaneous clustering and feature selection using a niching memetic algorithm. Our approach (which we call NMA_CFS) makes feature selection an integral part of the global clustering search procedure and attempts to overcome the problem of identifying less promising locally optimal solutions in both clustering and feature selection, without making any a priori assumption about the number of clusters. Within the NMA_CFS procedure, a variable composite representation is devised to encode both feature selection and cluster centers with different numbers of clusters. Further, local search operations are introduced to refine feature selection and cluster centers encoded in the chromosomes. Finally, a niching method is integrated to preserve the population diversity and prevent premature convergence. In an experimental evaluation we demonstrate the effectiveness of the proposed approach and compare it with other related approaches, using both synthetic and real data

Brunel University Research Archive

Batch and median neural gas

Author: Alexander Hasenfuß
Barbara Hammer
Belkin
Blake
Borg
Bottou
Bunke
Cheng
Cottrell
Cottrell
Duda
Fort
Graepel
Guenter
Hammer
Heskes
Kaski
Kohonen
Kohonen
Lundsteen
Marie Cottrell
Martinetz
Martinetz
Mevissen
Murty
Ripley
Seo
Somervuo
Thomas Villmann
Villmann
Zhong
Publication venue: 'Elsevier BV'
Publication date: 01/01/2006
Field of study

Neural Gas (NG) constitutes a very robust clustering algorithm given euclidian data which does not suffer from the problem of local minima like simple vector quantization, or topological restrictions like the self-organizing map. Based on the cost function of NG, we introduce a batch variant of NG which shows much faster convergence and which can be interpreted as an optimization of the cost function by the Newton method. This formulation has the additional benefit that, based on the notion of the generalized median in analogy to Median SOM, a variant for non-vectorial proximity data can be introduced. We prove convergence of batch and median versions of NG, SOM, and k-means in a unified formulation, and we investigate the behavior of the algorithms in several experiments.Comment: In Special Issue after WSOM 05 Conference, 5-8 september, 2005, Pari

arXiv.org e-Print Archive

CiteSeerX

Crossref

Publications at Bielefeld University

HAL-Paris1

Denver Groups Classification of Human Chromosomes Using Fuzzy C-Means Clustering

Author: Can Mehmet
Publication venue: International University of Sarajevo
Publication date: 30/03/2013
Field of study

Unbanded human chromosome can be classified into seven Denver Groups (A-G) based their lengths and the ratio of the length of the shorter arm to the whole length of the chromosome, which is called the centromere index (CI). In this article, the fuzzy c-means method will be used to perform the Denver Group classification of a given set of human chromosomes. The objective in clustering is to partition a given human chromosome set into homogeneous clusters; by homogeneous we mean that all points in the same cluster share similar attributes and they do not share similar attributes with points in other clusters. However, the separation of clusters and the meaning of similarity are fuzzy notions and can be described as such. It is found that the clusters iterations converge, highly depend on the initial partition matrix

Inquiry (E-Journal - Faculty of Business and Administration, International University of Sarajevo)

The detection of globular clusters in galaxies as a data mining problem

Author: Bassino
Bishop
Broyden
Byrd
Carlson
Chang
Davidon
Dirsch
Duda
Dunn
Fletcher
Giuseppe Longo
Goldfarb
Holland
Kotsiantis
Kundu
Massimo Brescia
Maurizio Paolillo
Meng
Paolillo
Peng
Rubinstein
Shanno
Stefano Cavuoti
Sutton
Thomas Puzia
Yang
Zhu
Publication venue: 'Wiley'
Publication date: 16/12/2011
Field of study

We present an application of self-adaptive supervised learning classifiers derived from the Machine Learning paradigm, to the identification of candidate Globular Clusters in deep, wide-field, single band HST images. Several methods provided by the DAME (Data Mining & Exploration) web application, were tested and compared on the NGC1399 HST data described in Paolillo 2011. The best results were obtained using a Multi Layer Perceptron with Quasi Newton learning rule which achieved a classification accuracy of 98.3%, with a completeness of 97.8% and 1.6% of contamination. An extensive set of experiments revealed that the use of accurate structural parameters (effective radius, central surface brightness) does improve the final result, but only by 5%. It is also shown that the method is capable to retrieve also extreme sources (for instance, very extended objects) which are missed by more traditional approaches.Comment: Accepted 2011 December 12; Received 2011 November 28; in original form 2011 October 1

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

Crossref

OA@INAF - Istituto Nazionale di Astrofisica

Caltech Authors

Recommended from our members

Recursive Percentage based Hybrid Pattern Training for Supervised Learning

Author: Guan SU
Kiruthika R
Publication venue: 'Dynamic Publishers'
Publication date: 01/01/2007
Field of study

Supervised learning algorithms, often used to find the I/O relationship in data, have the tendency to be trapped in local optima as opposed to the desirable global optima. In this paper, we discuss the RPHP learning algorithm. The algorithm uses Real Coded Genetic Algorithm based global and local searches to find a set of pseudo global optimal solutions. Each pseudo global optimum is a local optimal solution from the point of view of all the patterns but globally optimal from the point of view of a subset of patterns. Together with RPHP, a Kth nearest neighbor algorithm is used as a second level pattern distributor to solve a test pattern. We also show theoretically the condition under which finding several pseudo global optimal solutions requires a shorter training time than finding a single global optimal solution. As the difficulty of curve fitting problems is easily estimated, we verify the capability of the RPHP algorithm against them and compare the RPHP algorithm with three counterparts to show the benefits of hybrid learning and active recursive subset selection. The RPHP shows a clear superiority in performance. We conclude our paper by identifying possible loopholes in the RPHP algorithm and proposing possible solutions

Brunel University Research Archive