A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm
K-means is undoubtedly the most widely used partitional clustering algorithm.
Unfortunately, due to its gradient descent nature, this algorithm is highly
sensitive to the initial placement of the cluster centers. Numerous
initialization methods have been proposed to address this problem. In this
paper, we first present an overview of these methods with an emphasis on their
computational efficiency. We then compare eight commonly used linear time
complexity initialization methods on a large and diverse collection of data
sets using various performance criteria. Finally, we analyze the experimental
results using non-parametric statistical tests and provide recommendations for
practitioners. We demonstrate that popular initialization methods often perform
poorly and that there are in fact strong alternatives to these methods.
Comment: 17 pages, 1 figure, 7 tables
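As context for such a comparison, here is a minimal sketch of one widely used linear-complexity seeding rule, k-means++-style D² sampling. The abstract does not list the eight methods it compares, so this is an illustrative example of the class of methods discussed, not necessarily one of the paper's eight; all names are ours.

```python
import numpy as np

def dsquared_seeding(X, k, rng=None):
    """k-means++-style D^2 seeding sketch: each subsequent center is
    sampled with probability proportional to its squared distance from
    the nearest center chosen so far. One pass over the data per center."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    centers = [X[rng.integers(n)]]            # first center: uniform random
    for _ in range(k - 1):
        # squared distance of every point to its nearest chosen center
        d2 = np.min(((X[:, None, :] - np.array(centers)[None]) ** 2).sum(-1),
                    axis=1)
        centers.append(X[rng.choice(n, p=d2 / d2.sum())])
    return np.array(centers)

# toy usage: three well-separated Gaussian blobs in 2-D
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.1, size=(50, 2))
               for m in ((0, 0), (5, 5), (0, 5))])
C = dsquared_seeding(X, 3, rng=0)
```

Because an already-chosen center has zero distance to itself, it can never be sampled again, so the k seeds are always distinct points.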
One-step preparation of cluster states in quantum dot molecules
Cluster states, a special type of highly entangled states, are a universal
resource for measurement-based quantum computation. Here, we propose an
efficient one-step generation scheme for cluster states in semiconductor
quantum dot molecules, where qubits are encoded in the singlet and triplet
states of two coupled quantum dots. By applying a collective electric field, or
by simultaneously adjusting the interdot bias voltages of all double-dot
molecules, we obtain a switchable Ising-like interaction between any two
adjacent quantum-molecule qubits. We discuss the initialization, the
single-qubit measurement, and the experimental parameters, showing that
large-scale cluster state preparation and one-way quantum computation are
implementable in semiconductor quantum dots with present techniques.
Comment: 5 pages, 3 figures
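As standard background (not the paper's specific derivation), a uniform Ising-like coupling applied for a fixed time realizes controlled-phase gates on all adjacent pairs simultaneously, which is what makes one-step cluster state generation possible:

```latex
\[
|\phi_C\rangle \;=\; \prod_{\langle a,b\rangle} S^{ab}
\bigotimes_{a} \frac{|0\rangle_a + |1\rangle_a}{\sqrt{2}},
\qquad
S^{ab} \;=\; \exp\!\Big[\,i\,\frac{\pi}{4}\,
\big(1-\sigma_z^{a}\big)\big(1-\sigma_z^{b}\big)\Big],
\]
```

where $S^{ab}$ is the controlled-Z gate generated by evolving under the Ising interaction, and the logical $|0\rangle$, $|1\rangle$ would here stand for the singlet/triplet encoding of each double-dot molecule.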
Quantum computing with antiferromagnetic spin clusters
We show that a wide range of spin clusters with antiferromagnetic
intracluster exchange interaction allows one to define a qubit. For these spin
cluster qubits, initialization, quantum gate operation, and readout are
possible using the same techniques as for single spins. Quantum gate operation
for the spin cluster qubit does not require control over the intracluster
exchange interaction. Electric and magnetic fields necessary to effect quantum
gates need only be controlled on the length scale of the spin cluster rather
than the scale for a single spin. Here, we calculate the energy gap separating
the logical qubit states from the next excited state and the matrix elements
which determine quantum gate operation times. We discuss spin cluster qubits
formed by one- and two-dimensional arrays of s=1/2 spins as well as clusters
formed by spins s>1/2. We illustrate the advantages of spin cluster qubits for
various suggested implementations of spin qubits and analyze the scaling of
decoherence time with spin cluster size.
Comment: 15 pages, 7 figures; minor changes
Modelling and Verification of a Cluster-tree Formation Protocol Implementation for the IEEE 802.15.4 TSCH MAC Operation Mode
Correct and efficient initialization of wireless sensor networks can be
challenging in the face of many uncertainties present in ad hoc wireless
networks. In this paper we examine an implementation for the formation of a
cluster-tree topology in a network which operates on top of the TSCH MAC
operation mode of the IEEE 802.15.4 standard, and investigate it using formal
methods. We show how both the mCRL2 language and toolset help us in identifying
scenarios where the implementation does not form a proper topology. More
importantly, our analysis leads to the conclusion that the cluster-tree
formation algorithm has a superlinear time complexity, so it does not scale
to large networks.
Comment: In Proceedings MARS 2017, arXiv:1703.0581
Robust EM algorithm for model-based curve clustering
Model-based clustering approaches concern the paradigm of exploratory data
analysis relying on the finite mixture model to automatically find a latent
structure governing observed data. They are one of the most popular and
successful approaches in cluster analysis. The mixture density estimation is
generally performed by maximizing the observed-data log-likelihood by using the
expectation-maximization (EM) algorithm. However, it is well-known that the EM
algorithm initialization is crucial. In addition, the standard EM algorithm
requires the number of clusters to be known a priori. Some solutions have been
provided in [31, 12] for model-based clustering with Gaussian mixture models
for multivariate data. In this paper we focus on model-based curve clustering
approaches, when the data are curves rather than vectorial data, based on
regression mixtures. We propose a new robust EM algorithm for clustering
curves. We extend the model-based clustering approach presented in [31] for
Gaussian mixture models, to the case of curve clustering by regression
mixtures, including polynomial regression mixtures as well as spline or
B-spline regression mixtures. Our approach both handles the problem of
initialization and the one of choosing the optimal number of clusters as the EM
learning proceeds, rather than in a two-fold scheme. This is achieved by
optimizing a penalized log-likelihood criterion. A simulation study confirms
the potential benefit of the proposed algorithm in terms of robustness
regarding initialization and finding the actual number of clusters.
Comment: In Proceedings of the 2013 International Joint Conference on Neural
Networks (IJCNN), 2013, Dallas, TX, US
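As an illustration of the underlying model class, the following is a plain (non-robust) EM sketch for a polynomial regression mixture. The paper's algorithm differs: it optimizes a penalized log-likelihood so that the number of clusters is selected while EM runs. All names and numerical choices here are illustrative assumptions, not the paper's code.

```python
import numpy as np

def em_poly_regression_mixture(x, Y, K, degree=1, n_iter=50, seed=0):
    """Plain EM for a mixture of K polynomial regressions fit to curves.
    Y is (n_curves, T): each row is one curve sampled at the points x."""
    rng = np.random.default_rng(seed)
    n_curves, T = Y.shape
    Phi = np.vander(x, degree + 1)                 # (T, d) polynomial design
    d = Phi.shape[1]
    beta = rng.normal(size=(K, d))                 # per-cluster coefficients
    sigma2 = np.ones(K)                            # per-cluster noise variance
    pi = np.full(K, 1.0 / K)                       # mixing proportions
    for _ in range(n_iter):
        # E-step: log-responsibility of cluster k for each whole curve
        logp = np.empty((n_curves, K))
        for k in range(K):
            resid = Y - Phi @ beta[k]              # (n_curves, T)
            logp[:, k] = (np.log(pi[k])
                          - 0.5 * T * np.log(2 * np.pi * sigma2[k])
                          - 0.5 * (resid ** 2).sum(1) / sigma2[k])
        logp -= logp.max(1, keepdims=True)         # stabilize the softmax
        tau = np.exp(logp)
        tau /= tau.sum(1, keepdims=True)
        # M-step: weighted least squares per cluster, then pi and sigma2
        for k in range(K):
            w = np.repeat(tau[:, k], T)            # curve weight on each sample
            Phi_all = np.tile(Phi, (n_curves, 1))  # stacked design, (n*T, d)
            Xw = Phi_all * w[:, None]
            A = Xw.T @ Phi_all + 1e-8 * np.eye(d)  # small ridge for stability
            beta[k] = np.linalg.solve(A, Xw.T @ Y.ravel())
            resid = Y - Phi @ beta[k]
            Nk = tau[:, k].sum() + 1e-12
            sigma2[k] = max((tau[:, k] @ (resid ** 2).sum(1)) / (Nk * T), 1e-8)
        pi = tau.mean(0) + 1e-12
        pi /= pi.sum()
    return beta, sigma2, pi, tau

# toy usage: two groups of noisy straight-line curves
x = np.linspace(0.0, 1.0, 20)
rng = np.random.default_rng(1)
Y = np.vstack([1.0 + 2.0 * x + rng.normal(0, 0.05, (30, 20)),
               3.0 - 1.0 * x + rng.normal(0, 0.05, (30, 20))])
beta, sigma2, pi, tau = em_poly_regression_mixture(x, Y, K=2)
```

Note that the E-step assigns whole curves, not individual samples, to clusters; this is what distinguishes curve clustering by regression mixtures from pointwise mixture modelling.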
Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm
Over the past five decades, k-means has become the clustering algorithm of
choice in many application domains primarily due to its simplicity, time/space
efficiency, and invariance to the ordering of the data points. Unfortunately,
the algorithm's sensitivity to the initial selection of the cluster centers
remains its most serious drawback. Numerous initialization methods have
been proposed to address this drawback. Many of these methods, however, have
time complexity superlinear in the number of data points, which makes them
impractical for large data sets. On the other hand, linear methods are often
random and/or sensitive to the ordering of the data points. These methods are
generally unreliable in that the quality of their results is unpredictable.
Therefore, it is common practice to perform multiple runs of such methods and
take the output of the run that produces the best results. Such a practice,
however, greatly increases the computational requirements of the otherwise
highly efficient k-means algorithm. In this chapter, we investigate the
empirical performance of six linear, deterministic (non-random), and
order-invariant k-means initialization methods on a large and diverse
collection of data sets from the UCI Machine Learning Repository. The results
demonstrate that two relatively unknown hierarchical initialization methods due
to Su and Dy outperform the remaining four methods with respect to two
objective effectiveness criteria. In addition, a recent method due to Erisoglu
et al. performs surprisingly poorly.
Comment: 21 pages, 2 figures, 5 tables, Partitional Clustering Algorithms
(Springer, 2014). arXiv admin note: substantial text overlap with
arXiv:1304.7465, arXiv:1209.196
An Arithmetic-Based Deterministic Centroid Initialization Method for the k-Means Clustering Algorithm
One of the greatest challenges in k-means clustering is positioning the initial cluster centers, or centroids, as close to optimal as possible, and doing so in an amount of time deemed reasonable. Traditional k-means utilizes a randomization process for initializing these centroids, and poor initialization can lead to an increased number of required clustering iterations to reach convergence, and a greater overall runtime. This research proposes a simple, arithmetic-based deterministic centroid initialization method which is much faster than randomized initialization. Preliminary experiments suggest that this collection of methods, referred to herein as the sharding centroid initialization algorithm family, often outperforms random initialization in terms of the required number of iterations for convergence and overall time-related metrics, and is competitive or better in terms of the reported mean sum of squared errors (SSE) metric. Surprisingly, the sharding algorithms often manage to report more advantageous mean SSE values in the instances where their performance is slower than random initialization.
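The abstract does not spell out the sharding procedure; the following is a hypothetical reading of the idea, with all names ours: rank instances by the sum of their attribute values, cut the ranking into k equal shards, and use each shard's mean as a starting centroid. The paper's actual family of variants may differ.

```python
import numpy as np

def sharding_init(X, k):
    """Hypothetical sketch of a sharding-style deterministic init:
    rank instances by the sum of their attribute values, split the
    ranking into k contiguous shards, and return each shard's mean
    as a starting centroid. One O(n log n) sort, no randomness."""
    order = np.argsort(X.sum(axis=1))        # rank rows by attribute-sum
    shards = np.array_split(X[order], k)     # k roughly equal shards
    return np.vstack([s.mean(axis=0) for s in shards])

# toy usage: 6 points in 2-D whose attribute-sums are already ordered
X = np.arange(12, dtype=float).reshape(6, 2)
C = sharding_init(X, 3)
# -> [[1., 2.], [5., 6.], [9., 10.]]
```

Being a deterministic function of the data (and invariant to row order after the sort), a rule like this needs only a single run, unlike randomized seeding.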