
2,799 research outputs found

Some Clustering Methods, Algorithms and their Applications

Author: Sivakumar K.
Velunachiyar S.
Publication venue: Auricle Global Society of Education and Research
Publication date: 12/06/2023

Clustering is a type of unsupervised learning [15]. In an unsupervised learning task, no target values, or "supervisors," are known, so the goal is to learn structure from the inputs themselves. Clustering is a core technique in data mining and machine learning. Grouping datasets according to their similarities can help predict user behavior more accurately. The purpose of this research is to compare and contrast three widely used data-clustering methods. Clustering techniques include partitioning, hierarchical, density-based, grid-based, and fuzzy clustering. Machine learning, data mining, pattern recognition, image analysis, and bioinformatics are just a few of the many fields where clustering is used as an analytical technique. In addition to defining the various algorithms, specialized forms of cluster analysis, and linking methods, the paper offers a review of the clustering techniques used in the big data setting.

International Journal on Recent and Innovation Trends in Computing and Communication

Clustering Algorithms: Their Application to Gene Expression Data

Author: Agrawal R.
Alizadeh A.A.
Bandyopadhyay S.
Bandyopadhyay S.
Bezdek J.C.
Bezdek J.C.
Bezdek† J.C.
Bhargavi M.S.
Blatt M.
Bochkov Y.A.
Brunet J.P.
Bryan K.
Buitinck L.
Bunnik E.M.
Caliński T.
Chandrasekhar T.
Cheng Y.
Costa I.G.
Cover T.M.
D'haeseleer P.
Dave R.N.
Davies D.L.
De Morsier F.
Dempster A.P.
Dharmarajan A.
Dhillon I.S.
Divina F.
Do C.B.
Domany E.
Du Z.
Dunn† J.C.
Edla D.R.
Eisen M.B.
Ferguson T.S.
Frey B.J.
Fu L.
Fukuyama Y.
Galluccio L.
Gath I.
Getz G.
Gordon G.J.
Gu J.
Guha S.
Handhayani T.
Handl J.
Hatamlou A.
Heard N.A.
Heyer L.J.
Hinneburg A.
Hinneburg A.
Hu X.
Hubert L.J.
Jain A.K.
Jiang D.
Jiang H.
Joopudi S.
Kao Y.T.
Karmilasari S.W.
Karypis G.
Kaufman L.
Kerr G.
Kluger Y.
Kohonen T.
Kohonen T.
Krzanowski W.J.
Leone M.
Lu Y.
Lu Y.
Ma'sum M.A.
MacQueen J.
Madeira S.C.
Mann A.K.
Masciari E.
Maulik U.
Milligan G.W.
Mitra S.
Moon T.K.
Moore W.C.
Müllner D.
Nagpal A.
Nasser S.
Neal R.M.
Ng R.T.
Pakhira M.K.
Pal N.R.
Pedregosa F.
Pirim H.
Pitman J.
Prelić A.
Qin Z.S.
Raman S.
Rasmussen C.E.
Rezaee B.
Rezaee M.R.
Ruspini E.H.
Saha S.
Saha S.
Saha S.
Sathishkumar K.
Sheikholeslami G.
Sheng Q.
Sirinukunwattana K.
Sokal R.R.
Sun J.
Talaat A.M.
Tamayo P.
Tanay A.
Tang C.
Thalamuthu A.
Tibshirani R.
Wan M.
Wang L.
Wang W.
Williams G.
Wu J.
Wu K.L.
Wu S.
Xie X.L.
Xu R.
Xu Y.
Yu H.
Zhang D.
Zhang T.
Zhang Y.
Zhang Z.Y.
Zhao L.
Zhong C.
Zitnik M.
Řehůřek R.
Publication venue: 'SAGE Publications'
Publication date: 01/01/2016

Gene expression data hide vital information required to understand the biological processes that take place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a major opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, and it is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has proven useful in revealing the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. Another benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to gene expression data in order to identify the appropriate clustering technique that will provide stability and a high degree of accuracy in the analysis procedure.

Covenant University Repository

Crossref

Directory of Open Access Journals

PubMed Central

Boundary Extraction in Images Using Hierarchical Clustering-based Segmentation

Author: Selvan Arul

Hierarchical organization is one of the main characteristics of human segmentation. A human subject segments a natural image by identifying physical objects and marking their boundaries up to a certain level of detail [1]. The hierarchical clustering-based segmentation (HCS) process mimics this capability of human vision. The HCS process automatically generates a hierarchy of segmented images. The hierarchy represents the continuous merging of similar regions, whether spatially adjacent or disjoint, as the allowable threshold of dissimilarity between regions for merging is gradually increased. The HCS process is unsupervised and completely data driven. This ensures that the segmentation process can be applied to any image, without any prior information about the image data and without any need for prior training of the segmentation process with the relevant image data. The implementation details of the HCS process have been described elsewhere in the author's work [2]. The purpose of the current study is to demonstrate the performance of the HCS process in outlining boundaries in images and its possible application in processing medical images. [1] P. Arbelaez. Boundary Extraction in Natural Images Using Ultrametric Contour Maps. Proceedings 5th IEEE Workshop on Perceptual Organization in Computer Vision (POCV'06). June 2006. New York, USA. [2] A. N. Selvan. Highlighting Dissimilarity in Medical Images Using Hierarchical Clustering Based Segmentation (HCS). M. Phil. dissertation, Faculty of Arts Computing Engineering and Sciences, Sheffield Hallam Univ., Sheffield, UK, 2007.
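
The threshold-driven merging described in this abstract can be illustrated with a minimal sketch. This is not the author's HCS implementation, whose details are given in [2]. It assumes that each region is a plain list of pixel intensities and that the dissimilarity between two regions is the absolute difference of their mean intensities; the function names `merge_regions` and `build_hierarchy` and the toy data are hypothetical.

```python
from statistics import mean


def merge_regions(regions, threshold):
    """Merge the two most similar regions repeatedly, stopping when the
    smallest dissimilarity exceeds the threshold.

    Assumption: dissimilarity = absolute difference of mean intensities.
    Regions need not be spatially adjacent to be merged.
    """
    regions = [list(r) for r in regions]
    while len(regions) > 1:
        best = None  # (dissimilarity, i, j) of the closest pair so far
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                d = abs(mean(regions[i]) - mean(regions[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        regions[i] = regions[i] + regions[j]
        del regions[j]
    return regions


def build_hierarchy(pixels, thresholds):
    """Return one segmentation per threshold, from fine to coarse.

    Each level starts from the previous level's regions, so the
    segmentations are nested and form a hierarchy.
    """
    regions = [[p] for p in pixels]  # start with one region per pixel
    hierarchy = []
    for t in thresholds:
        regions = merge_regions(regions, t)
        hierarchy.append((t, [sorted(r) for r in regions]))
    return hierarchy


# Toy intensities (hypothetical data, not from the study).
hierarchy = build_hierarchy([10, 12, 11, 50, 52, 200], [0, 5, 50, 200])
for t, regions in hierarchy:
    print(t, regions)
```

Raising the threshold step by step produces progressively coarser segmentations, which is the sense in which the process yields a hierarchy rather than a single segmentation.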

Sheffield Hallam University Research Archive


An Overview of the Use of Neural Networks for Data Mining Tasks

Author: Alberts B
Alpaydin E
Ando T
Blake CL
Bramer MA
Castanheira LG
Han J
Lu H
Mitchell M
Ni X
Quinlan RJ
Rumelhart DE
Shafer JC
Shendure J
Simić D
Stahl F
Steinwart I
Surjandari I
Wei JS
Widrow B
Witten IH
Zaslavsky B
Zhang D
Publication venue: 'Wiley'
Publication date: 01/01/2012

In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research activity, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies, whose classification, prediction, and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.

Central Archive at the University of Reading

Crossref

Portsmouth University Research Portal (Pure)

Bournemouth University Research Online

Comparative Analysis of Mice Protein Expression: Clustering and Classification Approach

Author: Andeswari Rachmadita
Mustapha Aida
Saringat Mohd Zainuri
Publication venue: 'Penerbit UTHM'
Publication date: 01/01/2018

The mice protein expression dataset was created to study the effect of learning in normal mice and in trisomic mice, that is, mice with Down Syndrome (DS). The extra copy of a normal chromosome in DS is believed to alter the normal pathways and normal responses to stimulation, causing learning and memory deficits. This research analyzes the protein expression dataset for protein influences that could have affected the recovery of the ability to learn among the trisomic mice. Two data mining tasks are employed: clustering and classification analysis. Clustering analysis via K-Means, Hierarchical Clustering, and Decision Tree has proven useful for identifying common critical protein responses, which in turn helps in identifying potentially more effective drug targets. Meanwhile, all classification models, including k-Nearest Neighbor, Random Forest, and Naive Bayes, classified protein samples into the given eight classes with very high accuracy.
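
As a rough illustration of the K-Means step named in this abstract, the sketch below clusters a handful of two-dimensional points. It is a generic textbook version under stated assumptions, not the study's pipeline: the points are made-up toy values rather than protein measurements, the initial centroids are supplied by hand to keep the run deterministic, and the names `squared_distance` and `k_means` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centroids, max_iter=100):
    """Alternate assignment and update steps until centroids stop moving.

    Assumption: initial centroids are passed in explicitly instead of
    being chosen at random.
    """
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster
            else centroids[k]
            for k, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Toy data (hypothetical): two visibly separated groups of points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

On real expression data the number of clusters and the initialization would need to be chosen with care, since K-Means results depend on both.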

UTHM Institutional Repository

Journals of Universiti Tun Hussein Onn Malaysia (UTHM)

International Journal of Integrated Engineering

PartSOM: A Framework for Distributed Data Clustering Using SOM and K-Means

Author: Flavius L. Gorgonio
Jose Alfredo F. Costa
Publication venue: 'IntechOpen'
Publication date: 01/04/2010

IntechOpen

Crossref