Comparison of Clustering Methods for Investigation of Genome-Wide Methylation Array Data

Author: Frank Wessely
Harry Clifford
Richard D Emes
Satish Pendurthi
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2011

Genome-wide methylation arrays have proved very informative for investigating both clinical and biological questions in human epigenomics. Clustering methods, used either to explore these data or to compare them against an a priori grouping (e.g., normal versus disease), allow groupings to be assessed without user bias. However, no consensus has been reached on which clustering methods to use for methylation array data. To determine the most appropriate clustering method for the analysis of Illumina array methylation data, a collection of data sets was simulated and used to compare clustering methods. Both hierarchical and non-hierarchical clustering methods (k-means, k-medoids, and fuzzy clustering algorithms) were compared using a range of distance and linkage methods. As no single method consistently outperformed the others across the simulations, we propose a method that captures the best clustering outcome based on an additional measure, the silhouette width. This approach produced consistently higher cluster accuracy than any one method used in isolation.

Directory of Open Access Journals

Frontiers - Publisher Connector
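
The selection step described in the abstract above (running several clustering methods and keeping the result with the best silhouette width) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mean_silhouette` and `pick_best`, the use of Euclidean distance, and the toy data are all assumptions made for illustration, and the candidate labellings stand in for the output of whatever clustering methods are being compared.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_silhouette(points, labels, dist=euclidean):
    """Average silhouette width of one labelling (higher is better)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        return 0.0  # undefined for a single cluster; treated as 0 here
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # usual convention: s(i) = 0 for singleton clusters
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)

def pick_best(points, candidates):
    """Return (name, labels) for the candidate with the largest mean silhouette."""
    return max(candidates.items(), key=lambda kv: mean_silhouette(points, kv[1]))

# Toy data: two well-separated groups (hypothetical values).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
# Labellings as they might come from two different clustering methods.
candidates = {
    "method_a": [0, 0, 0, 1, 1, 1],
    "method_b": [0, 1, 0, 1, 0, 1],
}
best_name, best_labels = pick_best(points, candidates)
```

Because the silhouette width needs only the data and the labels, it can rank results from hierarchical, k-means, k-medoids, or fuzzy methods on the same footing, without reference to a known grouping.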

Clustering and Analysis of Internet Traffic Using Fuzzy C-Means with Data Feature Extraction

Author: Hindayanto Bekti Cahyo
Samopa Febriliyan
Suryaputra P. Adi
Publication venue: Petra Christian University
Publication date: 01/01/2014

Internet access is now an important part of campus infrastructure and of teaching and learning activities. A key component is internet bandwidth, which is often perceived as insufficient for certain departments at certain hours, especially during active lecture hours. Addressing this requires analyzing and clustering the internet traffic at each point where bandwidth is distributed, so that the results can support decisions about bandwidth allocation at each of those points. One clustering algorithm used for this purpose is Fuzzy C-Means: internet bandwidth usage data from a given period are collected and used as input to the algorithm, which clusters bandwidth usage by the applications that use the internet and by network user. However, the initial dataset supplied to Fuzzy C-Means is not optimal, so the dataset needs to be optimized through feature extraction so that the clusters produced by the algorithm are accurate. The expected result of this study is the feature extraction best suited to clustering and analyzing internet traffic by user application and by the amount of capacity each user consumes; the clustering results can then be used to optimize internet bandwidth.

Directory of Open Access Journals
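
For readers unfamiliar with the Fuzzy C-Means algorithm named in the abstract above, the following is a minimal sketch of the standard alternating update (membership-weighted centers, then distance-based memberships). It is not the study's implementation: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the Euclidean distance, and the two traffic features (daily volume in GB and share of streaming traffic) are assumptions, and the data values are invented for illustration.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )
        # Membership update: u_ik proportional to d_ik ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            d = [math.dist(points[i], centers[k]) for k in range(c)]
            if min(d) == 0.0:
                # Point coincides with a center: give it full membership there.
                zero = [1.0 if dk == 0.0 else 0.0 for dk in d]
                new = [z / sum(zero) for z in zero]
            else:
                inv = [dk ** (-2.0 / (m - 1.0)) for dk in d]
                s = sum(inv)
                new = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        if shift < tol:
            break  # memberships have stopped changing
    return centers, u

# Hypothetical per-user features: (GB transferred per day, share of streaming traffic).
traffic = [(0.2, 0.10), (0.3, 0.20), (0.25, 0.15), (8.0, 0.90), (7.5, 0.80), (8.2, 0.85)]
centers, u = fuzzy_c_means(traffic, c=2)
# Hard assignment: index of the cluster with the highest membership for each user.
hard = [max(range(2), key=row.__getitem__) for row in u]
```

Unlike k-means, each user receives a degree of membership in every cluster, so a user with mixed usage can be seen to sit between a "light" and a "heavy" group. The quality of the result depends on which features are fed in, which is the point the abstract makes about feature extraction.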

Comparison of different strategies of utilizing fuzzy clustering in structure identification

Author: Kılıç Kemal
Türkşen I. Burhan
Uncu Özge
Publication venue: Elsevier BV
Publication date: 01/12/2007

Fuzzy systems approximate highly nonlinear systems by means of fuzzy "if-then" rules. In the literature, various algorithms are proposed for mining. These algorithms commonly utilize fuzzy clustering in structure identification. Basically, there are three different approaches in which one can utilize fuzzy clustering: the first is based on input space clustering, the second considers clustering realized in the output space, and the third is concerned with clustering realized in the combined input-output space. In this study, we analyze these three approaches. We discuss each of the algorithms in great detail and offer a thorough comparative analysis. Finally, we compare the performances of these algorithms on a medical diagnosis classification problem, namely the Aachen Aphasia Test. The experiment and the results provide valuable insight into the merits and shortcomings of these three clustering approaches.