Understanding stakeholder values using cluster analysis

Author: Kaval Pamela
Publication venue: Waikato Management School
Publication date: 01/08/2007

The K-Means and Ward’s clustering procedures were used to categorize value similarities among respondents of a public land management survey. The clustering procedures resulted in two respondent groupings: an anthropocentrically focused group and an ecocentrically focused group. While previous studies have suggested that anthropocentric and ecocentric groups are very different, this study revealed many similarities. Similarities between groups included a strong feeling towards public land and national forest existence as well as the importance of considering both current and future generations when making management decisions for public land. It is recommended that land managers take these similarities into account when making management decisions. Notably, Ward’s procedure produced more consistent groupings than the K-Means procedure and is therefore recommended when clustering survey data. K-Means showed consistency only with datasets of over 500 observations.
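
To illustrate the Ward's procedure named in the abstract above, here is a minimal Python sketch of agglomerative clustering with Ward's merge cost, which is the increase in within-cluster sum of squares caused by a merge. The function name `ward_clusters`, the toy coordinates, and the naive pairwise search are assumptions made for illustration; the paper's survey data and software are not shown in this listing.

```python
def ward_clusters(points, k):
    """Merge singleton clusters until k remain, always choosing the
    pair with the smallest Ward cost. Returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def centroid(members):
        dims = zip(*(points[i] for i in members))
        return [sum(d) / len(members) for d in dims]

    def merge_cost(a, b):
        # Ward cost: |a||b| / (|a| + |b|) * squared centroid distance.
        ca, cb = centroid(a), centroid(b)
        d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * d2

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: merge_cost(clusters[ij[0]],
                                                    clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy data: two well-separated groups of three points.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
groups = ward_clusters(toy, 2)
```

For a fixed input this sketch is deterministic, because it involves no random initialization.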

Research Commons@Waikato

A survey of kernel and spectral methods for clustering

Author: Maurizio Filippone
Francesco Camastra
Francesco Masulli
Stefano Rovetta
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem, with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM and neural gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. An explicit proof that these two paradigms have the same objective is reported, since it has been proven that these two seemingly different approaches have the same mathematical foundation. Besides, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm. (C) 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
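
As a hedged illustration of how a kernel version of K-means can work, the sketch below computes the squared distance between a point and a cluster mean in feature space using only kernel evaluations (the kernel trick). The helper names `gram`, `linear_kernel`, `rbf_kernel`, and `kernel_sq_dist`, and the `gamma` value, are assumptions for illustration and are not taken from the paper.

```python
import math


def linear_kernel(p, q):
    return sum(a * b for a, b in zip(p, q))


def rbf_kernel(p, q, gamma=0.5):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))


def gram(points, kernel):
    """Full kernel (Gram) matrix as a list of lists."""
    return [[kernel(p, q) for q in points] for p in points]


def kernel_sq_dist(i, cluster, K):
    """Squared feature-space distance from point i to the mean of the
    points indexed by `cluster`:
    K[i][i] - 2 * mean_j K[i][j] + mean_{j,l} K[j][l]."""
    n = len(cluster)
    cross = sum(K[i][j] for j in cluster) / n
    within = sum(K[j][l] for j in cluster for l in cluster) / (n * n)
    return K[i][i] - 2.0 * cross + within


# Hypothetical toy data.
pts = [(0, 0), (2, 0), (0, 2), (5, 5)]
d_lin = kernel_sq_dist(3, [0, 1, 2], gram(pts, linear_kernel))
d_rbf = kernel_sq_dist(3, [0, 1, 2], gram(pts, rbf_kernel))
```

With the linear kernel this reduces to the ordinary squared Euclidean distance to the centroid, which makes the formula easy to sanity-check before switching to a nonlinear kernel.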

CiteSeerX

Archivio della ricerca - Università degli studi di Napoli "Parthenope"

Crossref

Enlighten

Archivio istituzionale della ricerca - Università di Genova

White Rose Research Online

The k-means algorithm: A comprehensive survey and performance evaluation

Author: Ahmed Mohiuddin
Islam Syed Mohammed Shamsul
Seraj Raihan
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/08/2020

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. The k-means clustering algorithm is considered one of the most powerful and popular data mining algorithms in the research community. However, despite its popularity, the algorithm has certain limitations, including problems associated with random initialization of the centroids, which leads to unexpected convergence. Additionally, such a clustering algorithm requires the number of clusters to be defined beforehand, which is responsible for different cluster shapes and outlier effects. A fundamental problem of the k-means algorithm is its inability to handle various data types. This paper provides a structured and synoptic overview of research conducted on the k-means algorithm to overcome such shortcomings. Variants of the k-means algorithm, including their recent developments, are discussed, and their effectiveness is investigated through experimental analysis of a variety of datasets. The detailed experimental analysis, along with a thorough comparison among different k-means clustering algorithms, differentiates our work from other existing survey papers. Furthermore, it outlines a clear and thorough understanding of the k-means algorithm along with its different research directions.
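
The random-initialization issue mentioned in the abstract above can be sketched with a plain Lloyd-style k-means that is restarted from several random seeds, keeping the run with the lowest within-cluster sum of squares. This is a minimal illustration under stated assumptions: the names `kmeans` and `best_of_restarts`, the seed handling, and the toy data are hypothetical and are not one of the variants evaluated in the paper.

```python
import random


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, seed=0, iters=100):
    """Plain Lloyd iterations on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initialization

    def assign(cs):
        return [min(range(k), key=lambda j: sq_dist(p, cs[j]))
                for p in points]

    for _ in range(iters):
        labels = assign(centroids)
        updated = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                updated.append(tuple(sum(c) / len(members)
                                     for c in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated

    labels = assign(centroids)
    inertia = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return labels, centroids, inertia


def best_of_restarts(points, k, n_restarts=10):
    """Run k-means from several seeds and keep the lowest-inertia result."""
    runs = [kmeans(points, k, seed=s) for s in range(n_restarts)]
    return min(runs, key=lambda run: run[2])


# Hypothetical toy data: two well-separated groups of three points.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids, inertia = best_of_restarts(toy, 2)
```

Restarting does not remove the need to choose the number of clusters beforehand; it only reduces the chance of keeping a poor local optimum.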

Research Online @ ECU

Clustering Methods and Their Applications to Adolescent Healthcare Data

Author: Mayer-Jochimsen Morgan
Publication venue: Scholarship @ Claremont
Publication date: 01/01/2013

Clustering is a mathematical method of data analysis that identifies trends in data by efficiently separating the data into a specified number of clusters, which makes it useful and widely applicable to questions about the interrelatedness of data. Two methods of clustering are considered here. K-means clustering defines clusters in relation to the centroid, or center, of a cluster. Spectral clustering establishes connections between all of the data points to be clustered, then eliminates those connections that link dissimilar points. This is represented as an eigenvector problem where the solution is given by the eigenvectors of the Normalized Graph Laplacian. Spectral clustering establishes groups so that the similarity between points of the same cluster is stronger than the similarity between points of different clusters. K-means and spectral clustering are used to analyze adolescent data from the 2009 California Health Interview Survey. Differences were observed between the results of the clustering methods on 3294 individuals and 22 health-related attributes. K-means clustered the adolescents by exercise, poverty, and variables related to psychological health, while spectral clustering groups were informed by smoking, alcohol use, low exercise, psychological distress, low parental involvement, and poverty. We posit some guesses as to this difference, observe characteristics of the clustering methods, and comment on the viability of spectral clustering on healthcare data.
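
The eigenvector formulation described in the abstract above can be sketched for the two-cluster case as follows. This is a simplified illustration, not the thesis's pipeline: the Gaussian similarity, the `sigma` value, the function name `spectral_bipartition`, and the sign-based split of the second eigenvector are all assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def spectral_bipartition(X, sigma=2.0):
    """Split points into two groups using the symmetric normalized
    graph Laplacian L_sym = I - D^{-1/2} W D^{-1/2}."""
    X = np.asarray(X, dtype=float)
    # Fully connected similarity graph with Gaussian edge weights.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(L_sym)
    second = eigvecs[:, 1]  # eigenvector of the second-smallest eigenvalue
    return (second > 0).astype(int)


# Hypothetical toy data: two loosely connected groups of three points.
toy = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = spectral_bipartition(toy)
```

For more than two clusters, a common approach is to run k-means on the rows of the first few eigenvectors instead of splitting by sign.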

Scholarship@Claremont

A Survey on Performance Improvement of Data Analysis Using Unsupervised K-Means Clustering

Author: Kumari Arpana
Raghuvanshi Monika
Publication venue: 'Revista Mexicana de Biodiversidad'
Publication date: 25/10/2021

Clustering algorithms that are implemented on machines to make them intelligent are called unsupervised machine learning algorithms. They can perform essential tasks, but the k-means clustering algorithm based on an improved quantum particle swarm optimization algorithm is often more error-prone in data analysis. As more data becomes available, more complex problems can be tackled and solved. The analysis of patient data is becoming more critical for evaluating a patient's medical condition and for taking precautions for the future. With the help of technology and computerized automation of machines, data can be analyzed more efficiently. Managing a massive volume of data raises many problems related to data security. Experiments on actual datasets show that our technique obtains results similar to those of standard methods with fewer computation tasks. Process mining and data mining techniques have opened new avenues for the diagnosis of disease. Similarly, data mining can provide effective treatment for a disease's triennial prevention; finally, an effective clustering result is obtained. The algorithm is tested with the UCI data set. The results show that the improved algorithm ensures global convergence and brings more accurate clustering results.

International Journal of Advanced Computer Technology

A survey on feature weighting based K-Means algorithms

Author: Renato Cordeiro de Amorim
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/08/2016

This is a pre-copyedited, author-produced PDF of an article accepted for publication in Journal of Classification [de Amorim, R. C., 'A survey on feature weighting based K-Means algorithms', Journal of Classification, Vol. 33(2): 210-242, August 25, 2016]. Subject to embargo. Embargo end date: 25 August 2017. The final publication is available at Springer via http://dx.doi.org/10.1007/s00357-016-9208-4 © Classification Society of North America 2016. In a real-world data set there is always the possibility, rather high in our opinion, that different features may have different degrees of relevance. Most machine learning algorithms deal with this fact by either selecting or deselecting features in the data preprocessing phase. However, we maintain that even among relevant features there may be different degrees of relevance, and this should be taken into account during the clustering process. With over 50 years of history, K-Means is arguably the most popular partitional clustering algorithm there is. The first K-Means based clustering algorithm to compute feature weights was designed just over 30 years ago. Various such algorithms have been designed since but there has not been, to our knowledge, a survey integrating empirical evidence of cluster recovery ability, common flaws, and possible directions for future research. This paper elaborates on the concept of feature weighting and addresses these issues by critically analysing some of the most popular, or innovative, feature weighting mechanisms based in K-Means. Peer reviewed. Final Accepted Version
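
To make the idea of feature weighting concrete, the sketch below shows a weighted squared distance and one possible weight update in which features with lower within-cluster dispersion receive higher weight. The update rule follows the style of weighted K-Means variants such as W-k-means, but its exact form here, the names `weighted_sq_dist` and `update_weights`, and the exponent `beta` are assumptions for illustration and are not taken from the abstract.

```python
def weighted_sq_dist(x, c, w, beta=2.0):
    """Squared distance between point x and centroid c, with each
    feature v scaled by w[v] ** beta."""
    return sum((wv ** beta) * (xv - cv) ** 2
               for xv, cv, wv in zip(x, c, w))


def update_weights(dispersions, beta=2.0):
    """Given per-feature within-cluster dispersions D_v (all > 0) and
    beta > 1, return w_v = 1 / sum_u (D_v / D_u) ** (1 / (beta - 1))."""
    if beta <= 1.0:
        raise ValueError("beta must be greater than 1 in this sketch")
    if any(d <= 0 for d in dispersions):
        raise ValueError("this sketch assumes strictly positive dispersions")
    exponent = 1.0 / (beta - 1.0)
    return [1.0 / sum((dv / du) ** exponent for du in dispersions)
            for dv in dispersions]


# Hypothetical dispersions: feature 0 is tighter within clusters than feature 1.
weights = update_weights([1.0, 4.0], beta=2.0)
dist = weighted_sq_dist((1.0, 2.0), (0.0, 0.0), weights, beta=2.0)
```

In a full algorithm this update would alternate with the usual assignment and centroid steps; only the weighting step is shown here.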

University of Essex Research Repository

Crossref

University of Hertfordshire Research Archive