Techniques for clustering gene expression data

Author: Crane Martin
Doolan Padraig
Kerr Gráinne
Ruskin Heather J.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of a suitable method (or methods) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.

CiteSeerX

Irish Universities

DCU Online Research Access Service

A similarity-based community detection method with multiple prototype representation

Author: Martin Arnaud
Pan Quan
Zhou Kuang
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

Communities are of great importance for understanding graph structures in social networks. Some existing community detection algorithms use a single prototype to represent each group. In real applications, this may not adequately model the different types of communities and hence limits the clustering performance on social networks. To address this problem, a Similarity-based Multi-Prototype (SMP) community detection approach is proposed in this paper. In SMP, vertices in each community carry various weights to describe their degree of representativeness. This mechanism enables each community to be represented by more than one node. The centrality of nodes is used to calculate prototype weights, while similarity is used to guide the partitioning of the graph. Experimental results on computer-generated and real-world networks clearly show that SMP performs well at detecting communities. Moreover, with the help of prototype weights, the method can provide richer information about the inner structure of the detected communities than existing community detection models.

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

A hierarchical Mamdani-type fuzzy modelling approach with new training data selection and multi-objective optimisation mechanisms: A special application for the prediction of mechanical properties of alloy steels

Author: Mahdi Mahfouf
Qian Zhang
Publication venue: 'Elsevier BV'
Publication date: 01/03/2011

In this paper, a systematic data-driven fuzzy modelling methodology is proposed, which allows Mamdani fuzzy models to be constructed with regard to both the accuracy (precision) and the transparency (interpretability) of fuzzy systems. The new methodology employs a fast hierarchical clustering algorithm to generate an initial fuzzy model efficiently; a training data selection mechanism is developed to identify appropriate and efficient data as learning samples; a high-performance Particle Swarm Optimisation (PSO) based multi-objective optimisation mechanism is developed to further improve the fuzzy model in terms of both the structure and the parameters; and a new tolerance analysis method is proposed to derive the confidence bands relating to the final elicited models. The proposed modelling approach is evaluated using two benchmark problems and is shown to outperform other modelling approaches. Furthermore, the proposed approach is successfully applied to complex high-dimensional modelling problems in the manufacturing of alloy steels, using 'real' industrial data. These problems concern the prediction of the mechanical properties of alloy steels by correlating them with the heat treatment process conditions as well as the weight percentages of the chemical compositions.

Crossref

Kent Academic Repository

Simple Measures of Individual Cluster-Membership Certainty for Hard Partitional Clustering

Author: Graham Jinko
Liu Dongmeng
Publication date: 21/01/2018

We propose two probability-like measures of individual cluster-membership certainty which can be applied to a hard partition of the sample such as that obtained from the Partitioning Around Medoids (PAM) algorithm, hierarchical clustering or k-means clustering. One measure extends the individual silhouette widths and the other is obtained directly from the pairwise dissimilarities in the sample. Unlike the classic silhouette, however, the measures behave like probabilities and can be used to investigate an individual's tendency to belong to a cluster. We also suggest two possible ways to evaluate the hard partition. We evaluate the performance of both measures in individuals with ambiguous cluster membership, using simulated binary datasets that have been partitioned by the PAM algorithm or continuous datasets that have been partitioned by hierarchical clustering and k-means clustering. For comparison, we also present results from soft clustering algorithms such as fuzzy analysis clustering (FANNY) and two model-based clustering methods. Our proposed measures perform comparably to the posterior-probability estimators from either FANNY or the model-based clustering methods. We also illustrate the proposed measures by applying them to Fisher's classic iris data set.
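For orientation, the classic individual silhouette width that the abstract takes as its starting point can be sketched as below. This is only the standard silhouette, s(i) = (b - a) / max(a, b), computed from pairwise dissimilarities; it is not the probability-like measures proposed in the paper, whose definitions are not given in the abstract. The function name `silhouette_widths` and the toy data are illustrative assumptions.

```python
def silhouette_widths(dist, labels):
    """Classic silhouette width for every point of a hard partition.

    dist   : square matrix (list of lists) of pairwise dissimilarities
    labels : cluster label of each point; at least two clusters are assumed
    """
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)

    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean dissimilarity to the other members of the point's own cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean dissimilarity to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


# Toy data (assumed): five points on a line, the middle one is ambiguous.
points = [0.0, 1.0, 5.5, 10.0, 11.0]
labels = [0, 0, 0, 1, 1]
dist = [[abs(p - q) for q in points] for p in points]
widths = silhouette_widths(dist, labels)
```

In this toy partition the middle point is equally far, on average, from both clusters, so its silhouette width is zero. A width near zero flags ambiguous membership but is not a probability, which is the gap the abstract says the proposed measures address.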

arXiv.org e-Print Archive

FigShare

A Short Survey on Data Clustering Algorithms

Author: Wong Ka-Chun
Publication date: 25/11/2015

With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains, for instance bioinformatics, speech recognition, and financial analysis. Formally speaking, given a set of data instances, a clustering algorithm is expected to divide the set into subsets that maximize intra-subset similarity and inter-subset dissimilarity, where a similarity measure is defined beforehand. In this work, state-of-the-art clustering algorithms are reviewed from design concept to methodology. Different clustering paradigms are discussed. Advanced clustering algorithms are also discussed. After that, the existing clustering evaluation metrics are reviewed. A summary with future insights is provided at the end.
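The formal criterion in this abstract (high intra-subset similarity, high inter-subset dissimilarity) can be illustrated with a minimal sketch that scores a given partition. It assumes Euclidean distance as the predefined dissimilarity measure; the function name `partition_quality` and the toy points are hypothetical and do not come from the survey.

```python
import math
from itertools import combinations


def partition_quality(points, labels):
    """Return (mean intra-cluster distance, mean inter-cluster distance).

    A good partition has a small first value (similar points together)
    and a large second value (dissimilar points apart).
    """
    intra, inter = [], []
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        d = math.dist(p, q)  # Euclidean distance, assumed as the measure
        (intra if labels[i] == labels[j] else inter).append(d)

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(intra), mean(inter)


# Toy data (assumed): two well-separated pairs of points.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
good_intra, good_inter = partition_quality(points, [0, 0, 1, 1])
bad_intra, bad_inter = partition_quality(points, [0, 1, 0, 1])
```

The partition that groups neighbouring points yields a small intra-cluster mean and a large inter-cluster mean, while the mismatched labelling reverses that ordering.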

arXiv.org e-Print Archive

Crossref

A new fuzzy set merging technique using inclusion-based fuzzy clustering

Author: Kaymak U
Nefti-Meziani S
Oussalah M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

This paper proposes a new method of merging parameterized fuzzy sets based on clustering in the parameter space, taking into account the degree of inclusion of each fuzzy set in the cluster prototypes. The merging method is applied to fuzzy rule base simplification by automatically replacing the fuzzy sets corresponding to a given cluster with the one pertaining to the cluster prototype. The feasibility and the performance of the proposed method are studied using an application in mobile robot navigation. The results indicate that the proposed merging and rule base simplification approach leads to good navigation performance in the application considered and to fuzzy models that are interpretable by experts. In this paper, we concentrate mainly on fuzzy systems with Gaussian membership functions, but the general approach can also be applied to other parameterized fuzzy sets.
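To make the notion of a degree of inclusion between Gaussian fuzzy sets concrete, here is a minimal sketch. The abstract does not state which inclusion measure the paper uses, so the sketch assumes a common subsethood-style ratio, sum(min(A, B)) / sum(A), evaluated on a discretised domain. The function names, the domain bounds and the grid size are all illustrative assumptions.

```python
import math


def gaussian_mf(x, centre, sigma):
    """Gaussian membership function with the given centre and width."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)


def inclusion_degree(a, b, lo=-10.0, hi=10.0, steps=2001):
    """Approximate degree to which fuzzy set a is included in fuzzy set b.

    a, b : (centre, sigma) parameter pairs of Gaussian fuzzy sets
    Uses the ratio sum(min(A, B)) / sum(A) on a uniform grid, which is an
    assumed measure and not necessarily the one defined in the paper.
    """
    xs = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    overlap = sum(min(gaussian_mf(x, *a), gaussian_mf(x, *b)) for x in xs)
    size_a = sum(gaussian_mf(x, *a) for x in xs)
    return overlap / size_a


narrow, wide, distant = (0.0, 0.5), (0.0, 2.0), (5.0, 0.5)
narrow_in_wide = inclusion_degree(narrow, wide)       # narrow set lies inside the wide one
wide_in_narrow = inclusion_degree(wide, narrow)       # only partly included
narrow_in_distant = inclusion_degree(narrow, distant) # almost no overlap
```

Under this assumed measure, a narrow set centred inside a wide one is fully included, while the reverse inclusion is only partial. That asymmetry is the kind of information a merging step could use when choosing which fuzzy sets a cluster prototype may replace.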

University of Salford Institutional Repository

Crossref

University of Birmingham Research Portal

Pure OAI Repository

EUR Research Repository