
    Comparison of Clustering Methods for Investigation of Genome-Wide Methylation Array Data

    Genome-wide methylation arrays have proved very informative for investigating both clinical and biological questions in human epigenomics. Clustering methods, used either to explore these data or to compare them against an a priori grouping (e.g., normal versus disease), allow groupings in the data to be assessed without user bias. However, no consensus has been reached on which methods to use for clustering methylation array data. To determine the most appropriate clustering method for analysis of Illumina array methylation data, a collection of data sets was simulated and used to compare clustering methods. Both hierarchical and non-hierarchical clustering methods (k-means, k-medoids, and fuzzy clustering algorithms) were compared using a range of distance and linkage methods. As no single method consistently outperformed the others across different simulations, we propose a method to capture the best clustering outcome based on an additional measure, the silhouette width. This approach produced consistently higher cluster accuracy than using any one method in isolation.
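    The selection idea in this abstract, keeping whichever clustering outcome has the highest silhouette width, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, the toy values, and the two candidate labelings (`method_a`, `method_b`) are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_width(points, labels):
    """Mean silhouette width over all points (Euclidean distance assumed)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def pick_best(points, candidates):
    """Return the candidate labeling with the highest mean silhouette width."""
    scores = {name: silhouette_width(points, labels) for name, labels in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy values for four samples at two probes.
points = [(0.10, 0.20), (0.15, 0.10), (0.90, 0.80), (0.85, 0.95)]
# Hypothetical labelings, standing in for the output of two clustering methods.
candidates = {
    "method_a": [0, 0, 1, 1],
    "method_b": [0, 1, 0, 1],
}
best, scores = pick_best(points, candidates)
print(best, scores)
```

    In practice the candidate labelings would come from the hierarchical, k-means, k-medoids, and fuzzy clustering runs being compared; the sketch only covers the final selection step.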

    CLUSTERING AND ANALYSIS OF INTERNET TRAFFIC USING FUZZY C-MEANS WITH DATA FEATURE EXTRACTION

    Internet access is an important part of campus infrastructure and supports teaching and learning activities. A key component of that access is bandwidth, which some departments find insufficient at certain times, especially during active lecture hours. To address this, internet traffic needs to be analyzed and clustered at each point where bandwidth is distributed, so that the results can support decisions about how much bandwidth to allocate to each point. One clustering algorithm used for this purpose is Fuzzy C-Means: internet bandwidth usage data from a given period is first collected and then passed to the Fuzzy C-Means algorithm, which groups bandwidth usage into clusters based on the applications that use the internet and on the network users. However, the initial dataset given to Fuzzy C-Means is not optimal, so the dataset must be optimized through feature extraction for the algorithm to produce accurate clusters. The expected result of this study is the set of extracted features most suitable for clustering and analyzing internet traffic based on user applications and the capacity consumed by each user; the clustering results can then be used to optimize internet bandwidth.
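    For readers unfamiliar with the algorithm named in this abstract, the following is a minimal sketch of the standard Fuzzy C-Means update rules in plain Python. It is not the study's code: the fuzzifier `m = 2`, the random initialization, and the two toy features (assumed here to be megabytes used and number of connections per user) are illustrative assumptions, since the abstract does not list the extracted features.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-Means; returns (centers, membership matrix)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        s = sum(row)
        u.append([v / s for v in row])
    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership^m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            wsum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / wsum for d in range(dim)]
            )
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dists = [max(distance(points[i], c), 1e-12) for c in centers]
            new_row = [
                1.0 / sum((dists[k] / dists[j]) ** (2.0 / (m - 1.0)) for j in range(n_clusters))
                for k in range(n_clusters)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break  # memberships have stopped changing
    return centers, u


# Hypothetical per-user features: (megabytes used, number of connections).
traffic = [(1.0, 2.0), (1.2, 1.8), (8.0, 9.0), (8.2, 9.1)]
centers, memberships = fuzzy_c_means(traffic, n_clusters=2)
hard_labels = [row.index(max(row)) for row in memberships]
print(hard_labels)
```

    Unlike k-means, each user receives a degree of membership in every cluster; the hard labels above are only a convenience for reading the result.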

    Comparison of different strategies of utilizing fuzzy clustering in structure identification

    Fuzzy systems approximate highly nonlinear systems by means of fuzzy "if-then" rules. In the literature, various algorithms are proposed for mining. These algorithms commonly utilize fuzzy clustering in structure identification. Basically, there are three different approaches in which one can utilize fuzzy clustering; the first one is based on input space clustering, the second one considers clustering realized in the output space, while the third one is concerned with clustering realized in the combined input-output space. In this study, we analyze these three approaches. We discuss each of the algorithms in great detail and offer a thorough comparative analysis. Finally, we compare the performances of these algorithms in a medical diagnosis classification problem, namely Aachen Aphasia Test. The experiment and the results provide a valuable insight about the merits and the shortcomings of these three clustering approaches.
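    The three approaches differ in which vectors are handed to the clustering step. The sketch below only illustrates that difference; the function name `build_space`, the mode strings, and the toy samples are hypothetical and do not come from the paper.

```python
def build_space(samples, mode):
    """Build the vectors to be clustered from (inputs, output) pairs.

    mode is one of "input", "output", or "combined" (hypothetical names).
    """
    if mode == "input":
        return [list(x) for x, _ in samples]
    if mode == "output":
        return [[y] for _, y in samples]
    if mode == "combined":
        return [list(x) + [y] for x, y in samples]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical samples: two input features and one output value each.
samples = [((0.2, 0.7), 1.0), ((0.9, 0.1), 0.0)]
for mode in ("input", "output", "combined"):
    print(mode, build_space(samples, mode))
```

    Whichever space is chosen, the resulting vectors would then be passed to a fuzzy clustering algorithm, and the clusters used to derive the "if-then" rules.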