
    A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm

    K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization methods have been proposed to address this problem. In this paper, we first present an overview of these methods with an emphasis on their computational efficiency. We then compare eight commonly used linear-time initialization methods on a large and diverse collection of data sets using various performance criteria. Finally, we analyze the experimental results using non-parametric statistical tests and provide recommendations for practitioners. We demonstrate that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods. (Comment: 17 pages, 1 figure, 7 tables)
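The abstract does not name the eight methods compared. As one concrete illustration of a linear-time initialization method of the kind surveyed, here is a minimal sketch of k-means++ seeding (Arthur and Vassilvitskii, 2007), which picks each new center with probability proportional to the squared distance from the nearest center chosen so far; function and parameter names here are illustrative, not from the paper:

```python
import numpy as np

def kmeans_pp_init(X, k, rng=None):
    """k-means++ seeding: choose each new center with probability
    proportional to the squared distance to the nearest chosen center."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    centers = [X[rng.integers(n)]]          # first center: uniform at random
    for _ in range(k - 1):
        C = np.array(centers)
        # squared distance from every point to its nearest current center
        d2 = np.min(((X[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
        probs = d2 / d2.sum()
        centers.append(X[rng.choice(n, p=probs)])
    return np.array(centers)
```

The returned centers are then handed to an ordinary k-means loop; the seeding itself costs O(nkd), i.e., linear in the number of points.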

    A comparative analysis of nature-inspired optimization approaches to 2d geometric modelling for turbomachinery applications

    A vast variety of population-based optimization techniques have been formulated in recent years for use in different engineering applications, most of which are inspired by natural processes taking place in our environment. However, the mathematical and statistical analysis of these algorithms is still lacking. This paper presents a comparative performance analysis of several of the most important nature-inspired optimization algorithms, each with a different basis, on complex high-dimensional curve/surface fitting problems. As a case study, the point cloud of a physical gas turbine compressor blade, measured by touch-trigger probes, is optimally fitted using B-spline curves. To determine the optimum number and location of a set of Bezier/NURBS control points for all segments of the airfoil profiles, five dissimilar population-based evolutionary and swarm optimization techniques are employed. To examine and fairly compare the obtained results, parametric and nonparametric statistical evaluations are presented alongside the design of the experiment. The results highlight a number of advantages and disadvantages of each optimization method for the parameterization of such complex geometries, from several different points of view. In terms of application, the final parametric representation of the geometries is an essential component of aerodynamic profile optimization processes as well as of reverse engineering.
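The five optimization techniques used in the study are not named in the abstract. The following toy sketch only illustrates the general idea of evolutionary curve fitting: the single free control point of a quadratic Bezier curve is evolved by a simple population-based random search so that the curve matches sampled points (all names and parameter values are illustrative, not from the paper):

```python
import numpy as np

def bezier(P, t):
    # Quadratic Bezier curve with control points P (3 x 2), parameters t in [0, 1].
    t = t[:, None]
    return (1 - t) ** 2 * P[0] + 2 * (1 - t) * t * P[1] + t ** 2 * P[2]

def fit_inner_point(data, P0, P2, gens=200, pop=30, sigma=0.5, seed=0):
    """Evolve the free inner control point to minimise the sum of squared
    distances between the Bezier curve and the sampled points."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0, 1, len(data))
    best = data.mean(axis=0)                 # crude initial guess
    def cost(p1):
        return ((bezier(np.stack([P0, p1, P2]), t) - data) ** 2).sum()
    best_c = cost(best)
    for _ in range(gens):
        cand = best + sigma * rng.normal(size=(pop, 2))   # mutate around the best
        costs = np.array([cost(c) for c in cand])
        i = costs.argmin()
        if costs[i] < best_c:                # elitist selection
            best, best_c = cand[i], costs[i]
        sigma *= 0.99                        # slowly anneal the step size
    return best, best_c
```

Real airfoil parameterizations optimize many control points (and often knot locations) simultaneously, which is what makes population-based methods attractive there.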

    Testing for Homogeneity in Mixture Models

    Statistical models of unobserved heterogeneity are typically formalized as mixtures of simple parametric models, and interest naturally focuses on testing for homogeneity versus general mixture alternatives. Many tests of this type can be interpreted as C(α) tests, as in Neyman (1959), and shown to be locally, asymptotically optimal. These C(α) tests are contrasted with a new approach to likelihood ratio testing for general mixture models. The latter tests are based on estimation of a general nonparametric mixing distribution with the Kiefer and Wolfowitz (1956) maximum likelihood estimator. Recent developments in convex optimization have dramatically improved upon earlier EM methods for computing these estimators, and recent results on the large-sample behavior of likelihood ratios involving such estimators yield a tractable form of asymptotic inference. The improvement in computational efficiency also facilitates the use of bootstrap methods to determine critical values, which are shown to work better than the asymptotic critical values in finite samples. Consistency of the bootstrap procedure is also formally established. We compare the performance of the two approaches, identifying circumstances in which each is preferred.
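The paper's tests rest on the Kiefer-Wolfowitz nonparametric MLE, which is not reproduced here. As a simplified illustration of the bootstrap logic only, the sketch below substitutes a basic two-component normal EM fit for the nonparametric one and computes parametric-bootstrap critical values for the homogeneity likelihood ratio (all function names and settings are illustrative):

```python
import numpy as np

def em_2mix(x, iters=60):
    """EM for a two-component normal mixture with common variance;
    returns the maximised log-likelihood."""
    mu1, mu2 = np.quantile(x, [0.25, 0.75])
    s2, w = x.var(), 0.5
    for _ in range(iters):
        d1 = w * np.exp(-(x - mu1) ** 2 / (2 * s2))
        d2 = (1 - w) * np.exp(-(x - mu2) ** 2 / (2 * s2))
        r = d1 / (d1 + d2)                   # posterior responsibilities
        w = r.mean()
        mu1 = (r * x).sum() / r.sum()
        mu2 = ((1 - r) * x).sum() / (1 - r).sum()
        s2 = (r * (x - mu1) ** 2 + (1 - r) * (x - mu2) ** 2).mean()
    dens = (w * np.exp(-(x - mu1) ** 2 / (2 * s2))
            + (1 - w) * np.exp(-(x - mu2) ** 2 / (2 * s2)))
    return np.log(dens).sum() - len(x) / 2 * np.log(2 * np.pi * s2)

def ll_null(x):
    # Maximised log-likelihood of a single normal (the homogeneous null).
    return -len(x) / 2 * (np.log(2 * np.pi * x.var()) + 1)

def bootstrap_pvalue(x, B=99, seed=1):
    """Parametric bootstrap: simulate from the fitted null and compare the
    observed LR statistic with its bootstrap distribution."""
    lr_obs = 2 * (em_2mix(x) - ll_null(x))
    rng = np.random.default_rng(seed)
    mu, sd = x.mean(), x.std()
    lr_boot = [2 * (em_2mix(rng.normal(mu, sd, size=len(x))) - ll_null(xb := None) if False else
               2 * (em_2mix(xb2 := rng.normal(mu, sd, size=len(x))) - ll_null(xb2)) / 2)
               for _ in range(B)]
    return (1 + sum(l >= lr_obs for l in lr_boot)) / (B + 1)
```

The `(1 + #{exceedances}) / (B + 1)` form keeps the p-value strictly positive, a standard Monte Carlo convention.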

    Genetic learning particle swarm optimization

    Social learning in particle swarm optimization (PSO) helps collective efficiency, whereas individual reproduction in genetic algorithms (GA) facilitates global effectiveness. This observation has recently led to hybridizing PSO with GA for performance enhancement. However, existing work uses a mechanistic parallel superposition, and research has shown that the construction of superior exemplars in PSO is more effective. Hence, this paper first develops a new framework for organically hybridizing PSO with another optimization technique for "learning." This leads to a generalized "learning PSO" paradigm, *L-PSO. The paradigm is composed of two cascading layers, the first for exemplar generation and the second for particle updates as in a normal PSO algorithm. Using genetic evolution to breed promising exemplars for PSO, a specific novel *L-PSO algorithm, termed genetic learning PSO (GL-PSO), is proposed in the paper. In particular, genetic operators are used to generate exemplars from which particles learn, and, in turn, the historical search information of the particles guides the evolution of the exemplars. By performing crossover, mutation, and selection on the historical information of the particles, the constructed exemplars are not only well diversified but also of high quality. Under such guidance, both the global search ability and the search efficiency of PSO are enhanced. The proposed GL-PSO is tested on 42 benchmark functions widely adopted in the literature. Experimental results verify the effectiveness, efficiency, robustness, and scalability of GL-PSO.
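A heavily simplified sketch of the two-layer cascade described above: a genetic layer breeds one exemplar per particle from personal-best positions (tournament selection, uniform crossover, light mutation), and a PSO layer then lets each particle learn from its exemplar. Parameter values and the exact operators are illustrative, not those of the paper:

```python
import numpy as np

def sphere(x):
    return (x ** 2).sum(axis=-1)

def gl_pso(f, dim=10, n=20, iters=400, seed=0):
    """Simplified GL-PSO sketch: exemplars bred from personal bests guide
    a standard PSO velocity update."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5, 5, (n, dim))
    V = np.zeros((n, dim))
    pbest, pval = X.copy(), f(X)
    w, c = 0.7, 1.5
    for _ in range(iters):
        # --- genetic layer: build one exemplar per particle ---
        mates = rng.integers(n, size=n)
        better = pval[mates] < pval                    # tournament on pbest fitness
        partner = np.where(better[:, None], pbest[mates], pbest)
        mask = rng.random((n, dim)) < 0.5              # uniform crossover
        E = np.where(mask, pbest, partner)
        mut = rng.random((n, dim)) < 0.01              # light mutation
        E = np.where(mut, rng.uniform(-5, 5, (n, dim)), E)
        # --- PSO layer: particles learn from their exemplars ---
        V = w * V + c * rng.random((n, dim)) * (E - X)
        X = X + V
        fx = f(X)
        improved = fx < pval                           # update personal bests
        pbest[improved], pval[improved] = X[improved], fx[improved]
    return pbest[pval.argmin()], pval.min()
```

The feedback loop the abstract describes is visible here: exemplars are built from `pbest` (history guides evolution), and particles move toward exemplars (evolution guides search).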

    Multi-Factor Policy Evaluation and Selection in the One-Sample Situation

    Firms nowadays need to make decisions under fast information obsolescence. In this paper I deal with one class of decision problems in this situation, called "one-sample" problems: we have finitely many options and a single sample of the multiple criteria used to evaluate those options. I develop evaluation procedures based on bootstrapping DEA (Data Envelopment Analysis) and related decision-making methods. This paper improves the bootstrap procedure proposed by Simar and Wilson (1998) and shows how to exploit information from the bootstrap outputs for decision-making.
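The Simar-Wilson smoothed bootstrap itself is considerably more involved than can be shown here. The sketch below shows only the building block that such a bootstrap repeatedly solves: the input-oriented CCR efficiency linear program for one decision-making unit. It assumes SciPy's `linprog` is available; the data in the usage example are made up:

```python
import numpy as np
from scipy.optimize import linprog

def dea_efficiency(X, Y, j0):
    """Input-oriented CCR efficiency of unit j0.
    X: (n, m) inputs, Y: (n, s) outputs. Solves
        min theta  s.t.  sum_j lam_j x_j <= theta x_0,
                         sum_j lam_j y_j >= y_0,  lam >= 0."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.zeros(1 + n)
    c[0] = 1.0                                   # minimise theta
    # inputs:  X^T lam - theta x0 <= 0
    A_in = np.hstack([-X[j0][:, None], X.T])
    b_in = np.zeros(m)
    # outputs: y0 - Y^T lam <= 0
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    b_out = -Y[j0]
    res = linprog(c, A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.concatenate([b_in, b_out]),
                  bounds=[(0, None)] * (1 + n), method="highs")
    return res.fun
```

A bootstrap then resamples (or smooths) the efficiency scores, re-solves this LP for each pseudo-sample, and uses the resulting distribution to rank or select options.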

    Statistical Methods for Convergence Detection of Multi-Objective Evolutionary Algorithms

    In this paper, two approaches for estimating the generation in which a multi-objective evolutionary algorithm (MOEA) shows statistically significant signs of convergence are introduced. A set-based perspective is taken, where convergence is measured by performance indicators. The proposed techniques fulfill the requirements of proper statistical assessment on the one hand and efficient optimisation for real-world problems on the other. The first approach accounts for the stochastic nature of the MOEA by repeating the optimisation runs for increasing generation numbers and analysing the performance indicators using statistical tools. This technique results in a very robust offline procedure. Moreover, an online convergence detection method is introduced as well. This method automatically stops the MOEA when either the variance of the performance indicators falls below a specified threshold or stagnation of their overall trend is detected. Both methods are analysed and compared for two MOEAs and on different classes of benchmark functions. It is shown that the methods successfully operate on all stated problems, needing fewer function evaluations while preserving good approximation quality at the same time. (Article / Letter to editor, Leiden Inst. Advanced Computer Science)
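The variance-threshold stopping rule can be illustrated on a toy single-objective stand-in: a (1+1)-evolution strategy whose best objective value plays the role of the performance indicator, halted once the indicator's variance over a sliding window falls below a threshold. All parameter values are illustrative, not from the paper:

```python
import numpy as np

def run_with_online_stop(max_gens=2000, window=20, var_tol=1e-8, seed=0):
    """Toy (1+1)-ES on the 5-D sphere with an online stopping rule:
    halt when the variance of the last `window` best objective values
    (the stand-in performance indicator) drops below `var_tol`."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, 5)
    fx = (x ** 2).sum()
    history = []
    for gen in range(max_gens):
        y = x + 0.3 * rng.normal(size=5)     # single Gaussian mutation
        fy = (y ** 2).sum()
        if fy < fx:
            x, fx = y, fy
        history.append(fx)
        if len(history) >= window and np.var(history[-window:]) < var_tol:
            return gen + 1, fx               # stopped early: indicator stagnated
    return max_gens, fx                      # budget exhausted without detection
```

In the set-based MOEA setting the scalar `fx` would be replaced by an indicator such as hypervolume computed on the current approximation set, but the stopping logic is the same.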