
    How Fast is the k-means Method

    We present polynomial upper and lower bounds on the number of iterations performed by the k-means method (a.k.a. Lloyd’s method) for k-means clustering. Our upper bounds are polynomial in the number of points, number of clusters, and the spread of the point set. We also present a lower bound, showing that in the worst case the k-means heuristic needs to perform Ω(n) iterations, for n points on the real line and two centers. Surprisingly, the spread of the point set in this construction is polynomial. This is the first construction showing that the k-means heuristic requires more than a polylogarithmic number of iterations. Furthermore, we present two alternative algorithms, with guaranteed performance, which are simple variants of the k-means method. Results of our experimental studies on these algorithms are also presented.
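
To make the quantity being bounded concrete, here is a minimal sketch of Lloyd's method on the real line that reports how many iterations it performs before the centers stop moving. The function name `lloyd_1d`, the `max_iter` cap, and the convention of counting the final confirming pass are assumptions made for illustration; this is not the paper's lower-bound construction.

```python
def lloyd_1d(points, centers, max_iter=1000):
    """Run Lloyd's method on 1-D points; return (centers, iterations)."""
    centers = list(centers)
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center
        # (ties go to the center with the lower index).
        clusters = [[] for _ in centers]
        for x in points:
            j = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[j].append(x)
        # Update step: each center moves to the mean of its cluster;
        # an empty cluster keeps its previous center (an assumption).
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            return centers, iteration  # converged: nothing moved
        centers = new_centers
    return centers, max_iter


centers, iterations = lloyd_1d([0, 1, 2, 10, 11, 12], [0, 1])
print(centers, iterations)
```

On this toy input the method needs only a few passes; the abstract's lower bound concerns carefully constructed inputs where the count grows linearly in n.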

    k-MLE: A fast algorithm for learning statistical mixture models

    We describe k-MLE, a fast and efficient local search algorithm for learning finite statistical mixtures of exponential families such as Gaussian mixture models. Mixture models are traditionally learned using the expectation-maximization (EM) soft clustering technique, which monotonically increases the incomplete (expected complete) likelihood. Given prescribed mixture weights, the hard clustering k-MLE algorithm iteratively assigns data to the most likely weighted component and updates the component models using Maximum Likelihood Estimators (MLEs). Using the duality between exponential families and Bregman divergences, we prove that the local convergence of the complete likelihood of k-MLE follows directly from the convergence of a dual additively weighted Bregman hard clustering. The inner loop of k-MLE can be implemented using any k-means heuristic, such as the celebrated Lloyd's batched updates or Hartigan's greedy swap updates. We then show how to update the mixture weights by minimizing a cross-entropy criterion, which amounts to setting each weight to the relative proportion of points in its cluster, and we repeat the mixture parameter update and the mixture weight update until convergence. Hard EM is interpreted as a special case of k-MLE in which both the component update and the weight update are performed successively in the inner loop. To initialize k-MLE, we propose k-MLE++, a careful initialization of k-MLE that probabilistically guarantees a global bound on the best possible complete likelihood. Comment: 31 pages, Extend preliminary paper presented at IEEE ICASSP 201
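
The loop structure described in this abstract can be sketched for univariate Gaussian components as follows. This is a simplified illustration, not the authors' implementation: the names `k_mle`, `assign`, and `mle`, the variance floor, and the handling of empty clusters are all assumptions, and the inner loop uses Lloyd-style batched updates.

```python
import math


def log_gauss(x, mean, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def assign(points, weights, params):
    """Hard assignment: most likely weighted component for each point."""
    def score(x, j):
        if weights[j] == 0:
            return -math.inf  # an emptied component can no longer win
        return math.log(weights[j]) + log_gauss(x, *params[j])
    return [max(range(len(params)), key=lambda j: score(x, j)) for x in points]


def mle(points, labels, params, var_floor=1e-6):
    """Per-cluster Gaussian MLE; empty clusters keep their old parameters."""
    new_params = []
    for j, old in enumerate(params):
        cluster = [x for x, lab in zip(points, labels) if lab == j]
        if not cluster:
            new_params.append(old)
            continue
        mean = sum(cluster) / len(cluster)
        var = sum((x - mean) ** 2 for x in cluster) / len(cluster)
        new_params.append((mean, max(var, var_floor)))  # floor is an assumption
    return new_params


def k_mle(points, weights, params, max_outer=50, max_inner=100):
    labels = None
    for _ in range(max_outer):
        # Inner loop: assignment and MLE updates with the weights held fixed.
        for _ in range(max_inner):
            new_labels = assign(points, weights, params)
            if new_labels == labels:
                break
            labels = new_labels
            params = mle(points, labels, params)
        # Weight update: relative proportion of points in each cluster.
        new_weights = [labels.count(j) / len(points) for j in range(len(params))]
        if new_weights == weights:
            break
        weights = new_weights
    return weights, params, labels


points = [0.0, 0.2, -0.2, 0.1, 10.0, 10.2]
weights, params, labels = k_mle(points, [0.5, 0.5], [(0.0, 1.0), (5.0, 1.0)])
print(weights, labels)
```

The initial parameters here are picked by hand; the k-MLE++ initialization mentioned in the abstract is not reproduced.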

    Mumford dendrograms and discrete p-adic symmetries

    In this article, we present an effective encoding of dendrograms by embedding them into the Bruhat-Tits trees associated to p-adic number fields. As an application, we show how strings over a finite alphabet can be encoded in cyclotomic extensions of $\mathbb{Q}_p$ and discuss p-adic DNA encoding. The application leads to fast p-adic agglomerative hierarchic algorithms similar to the ones recently used e.g. by A. Khrennikov and others. From the viewpoint of p-adic geometry, to encode a dendrogram X in a p-adic field K means to fix a set S of K-rational punctures on the p-adic projective line $\mathbb{P}^1$. To $\mathbb{P}^1\setminus S$ is associated in a natural way a subtree inside the Bruhat-Tits tree which recovers X, a method first used by F. Kato in 1999 in the classification of discrete subgroups of $\textrm{PGL}_2(K)$. Next, we show how the p-adic moduli space $\mathfrak{M}_{0,n}$ of $\mathbb{P}^1$ with n punctures can be applied to the study of time series of dendrograms and those symmetries arising from hyperbolic actions on $\mathbb{P}^1$. In this way, we can associate to certain classes of dynamical systems a Mumford curve, i.e. a p-adic algebraic curve with totally degenerate reduction modulo p. Finally, we indicate some of our results in the study of general discrete actions on $\mathbb{P}^1$, and their relation to p-adic Hurwitz spaces. Comment: 14 pages, 6 figure
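
As a much simplified illustration of why p-adic encodings suit hierarchical clustering, the sketch below encodes equal-length strings as ordinary integers in base p and measures their p-adic distance, which depends only on the length of the common prefix. The abstract works with cyclotomic extensions and Bruhat-Tits trees, which this sketch does not attempt; the three-letter alphabet and the function names are assumptions.

```python
def encode(word, alphabet):
    """Encode a word as sum(a_i * p**i), with p = len(alphabet) assumed prime."""
    p = len(alphabet)
    return sum(alphabet.index(ch) * p**i for i, ch in enumerate(word))


def valuation(n, p):
    """Largest v such that p**v divides the nonzero integer n."""
    n = abs(n)
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def padic_distance(x, y, p):
    """p-adic distance |x - y|_p = p**(-v), with distance 0 when x == y."""
    return 0.0 if x == y else float(p) ** -valuation(x - y, p)


alphabet = "abc"  # p = 3; words are assumed to have equal length
x, y, z = (encode(w, alphabet) for w in ("abca", "abcb", "abbb"))
print(padic_distance(x, y, 3), padic_distance(x, z, 3), padic_distance(y, z, 3))
```

Words sharing a longer prefix are closer, and the distance satisfies the strong triangle inequality, so nested balls of this metric form a tree that can be read as a dendrogram.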

    Combination of fast hybrid classification and k value optimization in k-nn for video face recognition

    Nowadays, face recognition is no longer limited to images but also covers videos. However, this extension brings challenges, such as how to choose the right pre-processing, feature extraction, and classification methods to obtain excellent performance. Although the k-Nearest Neighbor (k-NN) method is widely used, its high computational cost, caused by the many features of the dataset and the large amount of training data, makes adequate processing difficult. Several studies have been conducted to improve the performance of k-NN using the FHC (Fast Hybrid Classification) method by optimizing the local k values. One disadvantage of the FHC method is that it still uses a default k value. Therefore, this research proposes applying k value optimization for k-NN within FHC, thereby increasing its accuracy. Fast Hybrid Classification, which combines k-means clustering with k-NN, groups the training data into several prototypes called TLDS (Two Level Data Structure). Two classification levels are then applied to label the test data: the first checks for n prototypes of the same class, and the second, which uses k-NN with the optimized k value, sharpens the accuracy when the number of same-class prototypes does not reach n. The evaluation results show that this method provides 86% accuracy and a time performance of 3.3 seconds.
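
One possible reading of the two-level scheme is sketched below. It is an interpretation of the abstract, not the authors' FHC implementation: the prototypes are assumed to be precomputed (for example as k-means centroids with class labels), k is taken as given rather than optimized, and the function name `two_level_classify` is hypothetical.

```python
import math
from collections import Counter


def two_level_classify(x, prototypes, training, n=2, k=3):
    """Return (label, level); both data lists hold (vector, label) pairs."""
    # Level 1: if the n nearest prototypes agree on a class, use it.
    nearest = sorted(prototypes, key=lambda p: math.dist(x, p[0]))[:n]
    labels = {label for _, label in nearest}
    if len(labels) == 1:
        return labels.pop(), 1
    # Level 2: otherwise fall back to a k-NN majority vote on the training data.
    neighbours = sorted(training, key=lambda t: math.dist(x, t[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0], 2


prototypes = [((0, 0), "a"), ((1, 0), "a"), ((10, 0), "b"), ((11, 0), "b")]
training = prototypes + [((0.5, 1), "a"), ((10.5, 1), "b")]
print(two_level_classify((0.4, 0.2), prototypes, training))
print(two_level_classify((5.8, 0), prototypes, training))
```

The cheap prototype lookup settles clear cases, so the full k-NN search over the training data runs only for ambiguous points, which is where the speed-up of such a hybrid would come from.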

    A Xenopus oocyte model system to study action potentials

    Action potentials (APs) are the functional units of fast electrical signaling in excitable cells. The upstroke and downstroke of an AP are generated by the competing and asynchronous action of Na+- and K+-selective voltage-gated conductances. Although a mixture of voltage-gated channels has long been recognized to contribute to the generation and temporal characteristics of the AP, understanding how each of these proteins functions and is regulated during electrical signaling remains the subject of intense research. AP properties vary among different cellular types because of the expression diversity, subcellular location, and modulation of ion channels. These complexities, in addition to the functional coupling of these proteins by membrane potential, make it challenging to understand the roles of different channels in initiating and temporally shaping the AP. Here, to address this problem, we focus our efforts on finding conditions that allow reliable AP recordings from Xenopus laevis oocytes coexpressing Na+ and K+ channels. As a proof of principle, we show how the expression of a variety of K+ channel subtypes can modulate excitability in this minimal model system. This approach raises the prospect of studies on the modulation of APs by pharmacological or biological means with a controlled background of Na+ and K+ channel expression.