
    A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm

    K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization methods have been proposed to address this problem. In this paper, we first present an overview of these methods with an emphasis on their computational efficiency. We then compare eight commonly used linear-time initialization methods on a large and diverse collection of data sets using various performance criteria. Finally, we analyze the experimental results using non-parametric statistical tests and provide recommendations for practitioners. We demonstrate that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods. (Comment: 17 pages, 1 figure, 7 tables)
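The abstract does not name the eight methods compared. As one concrete illustration of a linear-time initialization method of the kind surveyed, here is a minimal sketch of k-means++ seeding (Arthur and Vassilvitskii, 2007), which picks each new center with probability proportional to the squared distance from the nearest center chosen so far; function and parameter names here are illustrative, not from the paper:

```python
import numpy as np

def kmeans_pp_init(X, k, rng=None):
    """k-means++ seeding: choose each new center with probability
    proportional to the squared distance to the nearest chosen center."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    centers = [X[rng.integers(n)]]          # first center: uniform at random
    for _ in range(k - 1):
        C = np.array(centers)
        # squared distance from every point to its nearest current center
        d2 = np.min(((X[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
        probs = d2 / d2.sum()
        centers.append(X[rng.choice(n, p=probs)])
    return np.array(centers)
```

The returned centers are then handed to an ordinary k-means loop; the seeding itself costs O(nkd), i.e., linear in the number of points.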

    A comparative analysis of nature-inspired optimization approaches to 2d geometric modelling for turbomachinery applications

    A vast variety of population-based optimization techniques have been formulated in recent years for use in different engineering applications, most of which are inspired by natural processes taking place in our environment. However, the mathematical and statistical analysis of these algorithms is still lacking. This paper presents a comparative performance analysis of several of the most important nature-inspired optimization algorithms, each with a different basis, on complex high-dimensional curve/surface fitting problems. As a case study, the point cloud of a physical gas turbine compressor blade, measured by touch-trigger probes, is optimally fitted using B-spline curves. To determine the optimum number and location of a set of Bezier/NURBS control points for all segments of the airfoil profiles, five dissimilar population-based evolutionary and swarm optimization techniques are employed. To examine and fairly compare the obtained results, parametric and nonparametric statistical evaluations are presented alongside the design of the experiment. The results highlight a number of advantages and disadvantages of each optimization method for the parameterization of such complex geometries, from several different points of view. In terms of application, the final parametric representation of the geometries is an essential component of aerodynamic profile optimization processes as well as of reverse engineering.
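The five optimization techniques used in the study are not named in the abstract. The following toy sketch only illustrates the general idea of evolutionary curve fitting: the single free control point of a quadratic Bezier curve is evolved by a simple population-based random search so that the curve matches sampled points (all names and parameter values are illustrative, not from the paper):

```python
import numpy as np

def bezier(P, t):
    # Quadratic Bezier curve with control points P (3 x 2), parameters t in [0, 1].
    t = t[:, None]
    return (1 - t) ** 2 * P[0] + 2 * (1 - t) * t * P[1] + t ** 2 * P[2]

def fit_inner_point(data, P0, P2, gens=200, pop=30, sigma=0.5, seed=0):
    """Evolve the free inner control point to minimise the sum of squared
    distances between the Bezier curve and the sampled points."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0, 1, len(data))
    best = data.mean(axis=0)                 # crude initial guess
    def cost(p1):
        return ((bezier(np.stack([P0, p1, P2]), t) - data) ** 2).sum()
    best_c = cost(best)
    for _ in range(gens):
        cand = best + sigma * rng.normal(size=(pop, 2))   # mutate around the best
        costs = np.array([cost(c) for c in cand])
        i = costs.argmin()
        if costs[i] < best_c:                # elitist selection
            best, best_c = cand[i], costs[i]
        sigma *= 0.99                        # slowly anneal the step size
    return best, best_c
```

Real airfoil parameterizations optimize many control points (and often knot locations) simultaneously, which is what makes population-based methods attractive there.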

    Testing for Homogeneity in Mixture Models

    Statistical models of unobserved heterogeneity are typically formalized as mixtures of simple parametric models, and interest naturally focuses on testing for homogeneity versus general mixture alternatives. Many tests of this type can be interpreted as C(α) tests, as in Neyman (1959), and shown to be locally, asymptotically optimal. These C(α) tests are contrasted with a new approach to likelihood ratio testing for general mixture models. The latter tests are based on estimation of a general nonparametric mixing distribution with the Kiefer and Wolfowitz (1956) maximum likelihood estimator. Recent developments in convex optimization have dramatically improved upon earlier EM methods for computing these estimators, and recent results on the large-sample behavior of likelihood ratios involving such estimators yield a tractable form of asymptotic inference. The improvement in computational efficiency also facilitates the use of bootstrap methods to determine critical values, which are shown to work better than the asymptotic critical values in finite samples. Consistency of the bootstrap procedure is also formally established. We compare the performance of the two approaches, identifying circumstances in which each is preferred.
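The paper's tests rest on the Kiefer-Wolfowitz nonparametric MLE, which is not reproduced here. As a simplified illustration of the bootstrap logic only, the sketch below substitutes a basic two-component normal EM fit for the nonparametric one and computes parametric-bootstrap critical values for the homogeneity likelihood ratio (all function names and settings are illustrative):

```python
import numpy as np

def em_2mix(x, iters=60):
    """EM for a two-component normal mixture with common variance;
    returns the maximised log-likelihood."""
    mu1, mu2 = np.quantile(x, [0.25, 0.75])
    s2, w = x.var(), 0.5
    for _ in range(iters):
        d1 = w * np.exp(-(x - mu1) ** 2 / (2 * s2))
        d2 = (1 - w) * np.exp(-(x - mu2) ** 2 / (2 * s2))
        r = d1 / (d1 + d2)                   # posterior responsibilities
        w = r.mean()
        mu1 = (r * x).sum() / r.sum()
        mu2 = ((1 - r) * x).sum() / (1 - r).sum()
        s2 = (r * (x - mu1) ** 2 + (1 - r) * (x - mu2) ** 2).mean()
    dens = (w * np.exp(-(x - mu1) ** 2 / (2 * s2))
            + (1 - w) * np.exp(-(x - mu2) ** 2 / (2 * s2)))
    return np.log(dens).sum() - len(x) / 2 * np.log(2 * np.pi * s2)

def ll_null(x):
    # Maximised log-likelihood of a single normal (the homogeneous null).
    return -len(x) / 2 * (np.log(2 * np.pi * x.var()) + 1)

def bootstrap_pvalue(x, B=99, seed=1):
    """Parametric bootstrap: simulate from the fitted null and compare the
    observed LR statistic with its bootstrap distribution."""
    lr_obs = 2 * (em_2mix(x) - ll_null(x))
    rng = np.random.default_rng(seed)
    mu, sd = x.mean(), x.std()
    lr_boot = [2 * (em_2mix(rng.normal(mu, sd, size=len(x))) - ll_null(xb := None) if False else
               2 * (em_2mix(xb2 := rng.normal(mu, sd, size=len(x))) - ll_null(xb2)) / 2)
               for _ in range(B)]
    return (1 + sum(l >= lr_obs for l in lr_boot)) / (B + 1)
```

The `(1 + #{exceedances}) / (B + 1)` form keeps the p-value strictly positive, a standard Monte Carlo convention.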

    Genetic learning particle swarm optimization

    Social learning in particle swarm optimization (PSO) helps collective efficiency, whereas individual reproduction in genetic algorithms (GA) facilitates global effectiveness. This observation has recently led to hybridizing PSO with GA for performance enhancement. However, existing work uses a mechanistic parallel superposition, and research has shown that the construction of superior exemplars in PSO is more effective. Hence, this paper first develops a new framework for organically hybridizing PSO with another optimization technique for "learning." This leads to a generalized "learning PSO" paradigm, *L-PSO. The paradigm is composed of two cascading layers, the first for exemplar generation and the second for particle updates as in a normal PSO algorithm. Using genetic evolution to breed promising exemplars for PSO, a specific novel *L-PSO algorithm, termed genetic learning PSO (GL-PSO), is proposed in the paper. In particular, genetic operators are used to generate exemplars from which particles learn, and, in turn, the historical search information of the particles guides the evolution of the exemplars. By performing crossover, mutation, and selection on the historical information of the particles, the constructed exemplars are not only well diversified but also of high quality. Under such guidance, both the global search ability and the search efficiency of PSO are enhanced. The proposed GL-PSO is tested on 42 benchmark functions widely adopted in the literature. Experimental results verify the effectiveness, efficiency, robustness, and scalability of GL-PSO.
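A heavily simplified sketch of the two-layer cascade described above: a genetic layer breeds one exemplar per particle from personal-best positions (tournament selection, uniform crossover, light mutation), and a PSO layer then lets each particle learn from its exemplar. Parameter values and the exact operators are illustrative, not those of the paper:

```python
import numpy as np

def sphere(x):
    return (x ** 2).sum(axis=-1)

def gl_pso(f, dim=10, n=20, iters=400, seed=0):
    """Simplified GL-PSO sketch: exemplars bred from personal bests guide
    a standard PSO velocity update."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5, 5, (n, dim))
    V = np.zeros((n, dim))
    pbest, pval = X.copy(), f(X)
    w, c = 0.7, 1.5
    for _ in range(iters):
        # --- genetic layer: build one exemplar per particle ---
        mates = rng.integers(n, size=n)
        better = pval[mates] < pval                    # tournament on pbest fitness
        partner = np.where(better[:, None], pbest[mates], pbest)
        mask = rng.random((n, dim)) < 0.5              # uniform crossover
        E = np.where(mask, pbest, partner)
        mut = rng.random((n, dim)) < 0.01              # light mutation
        E = np.where(mut, rng.uniform(-5, 5, (n, dim)), E)
        # --- PSO layer: particles learn from their exemplars ---
        V = w * V + c * rng.random((n, dim)) * (E - X)
        X = X + V
        fx = f(X)
        improved = fx < pval                           # update personal bests
        pbest[improved], pval[improved] = X[improved], fx[improved]
    return pbest[pval.argmin()], pval.min()
```

The feedback loop the abstract describes is visible here: exemplars are built from `pbest` (history guides evolution), and particles move toward exemplars (evolution guides search).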

    Multi-Factor Policy Evaluation and Selection in the One-Sample Situation

    Firms nowadays need to make decisions under fast information obsolescence. In this paper I deal with one class of decision problems in this situation, called "one-sample" problems: we have finitely many options and a single sample of the multiple criteria used to evaluate those options. I develop evaluation procedures based on bootstrapping DEA (Data Envelopment Analysis) and related decision-making methods. This paper improves the bootstrap procedure proposed by Simar and Wilson (1998) and shows how to exploit information from the bootstrap outputs for decision-making.
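The Simar-Wilson smoothed bootstrap itself is considerably more involved than can be shown here. The sketch below shows only the building block that such a bootstrap repeatedly solves: the input-oriented CCR efficiency linear program for one decision-making unit. It assumes SciPy's `linprog` is available; the data in the usage example are made up:

```python
import numpy as np
from scipy.optimize import linprog

def dea_efficiency(X, Y, j0):
    """Input-oriented CCR efficiency of unit j0.
    X: (n, m) inputs, Y: (n, s) outputs. Solves
        min theta  s.t.  sum_j lam_j x_j <= theta x_0,
                         sum_j lam_j y_j >= y_0,  lam >= 0."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.zeros(1 + n)
    c[0] = 1.0                                   # minimise theta
    # inputs:  X^T lam - theta x0 <= 0
    A_in = np.hstack([-X[j0][:, None], X.T])
    b_in = np.zeros(m)
    # outputs: y0 - Y^T lam <= 0
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    b_out = -Y[j0]
    res = linprog(c, A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.concatenate([b_in, b_out]),
                  bounds=[(0, None)] * (1 + n), method="highs")
    return res.fun
```

A bootstrap then resamples (or smooths) the efficiency scores, re-solves this LP for each pseudo-sample, and uses the resulting distribution to rank or select options.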

    Statistical Methods for Convergence Detection of Multi-Objective Evolutionary Algorithms

    In this paper, two approaches for estimating the generation in which a multi-objective evolutionary algorithm (MOEA) shows statistically significant signs of convergence are introduced. A set-based perspective is taken, where convergence is measured by performance indicators. The proposed techniques fulfill the requirements of proper statistical assessment on the one hand and efficient optimisation for real-world problems on the other. The first approach accounts for the stochastic nature of the MOEA by repeating the optimisation runs for increasing generation numbers and analysing the performance indicators using statistical tools. This technique results in a very robust offline procedure. Moreover, an online convergence detection method is introduced as well. This method automatically stops the MOEA when either the variance of the performance indicators falls below a specified threshold or stagnation of their overall trend is detected. Both methods are analysed and compared for two MOEAs and on different classes of benchmark functions. It is shown that the methods successfully operate on all stated problems, needing fewer function evaluations while preserving good approximation quality at the same time. (Article / Letter to editor, Leiden Inst. Advanced Computer Science)
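The variance-threshold stopping rule can be illustrated on a toy single-objective stand-in: a (1+1)-evolution strategy whose best objective value plays the role of the performance indicator, halted once the indicator's variance over a sliding window falls below a threshold. All parameter values are illustrative, not from the paper:

```python
import numpy as np

def run_with_online_stop(max_gens=2000, window=20, var_tol=1e-8, seed=0):
    """Toy (1+1)-ES on the 5-D sphere with an online stopping rule:
    halt when the variance of the last `window` best objective values
    (the stand-in performance indicator) drops below `var_tol`."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, 5)
    fx = (x ** 2).sum()
    history = []
    for gen in range(max_gens):
        y = x + 0.3 * rng.normal(size=5)     # single Gaussian mutation
        fy = (y ** 2).sum()
        if fy < fx:
            x, fx = y, fy
        history.append(fx)
        if len(history) >= window and np.var(history[-window:]) < var_tol:
            return gen + 1, fx               # stopped early: indicator stagnated
    return max_gens, fx                      # budget exhausted without detection
```

In the set-based MOEA setting the scalar `fx` would be replaced by an indicator such as hypervolume computed on the current approximation set, but the stopping logic is the same.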