1,483,524 research outputs found

How Fast is the k-means Method

Authors: Bardia Sadri; Sariel Har-Peled

We present polynomial upper and lower bounds on the number of iterations performed by the k-means method (a.k.a. Lloyd’s method) for k-means clustering. Our upper bounds are polynomial in the number of points, number of clusters, and the spread of the point set. We also present a lower bound, showing that in the worst case the k-means heuristic needs to perform Ω(n) iterations, for n points on the real line and two centers. Surprisingly, the spread of the point set in this construction is polynomial. This is the first construction showing that the k-means heuristic requires more than a polylogarithmic number of iterations. Furthermore, we present two alternative algorithms, with guaranteed performance, which are simple variants of the k-means method. Results of our experimental studies on these algorithms are also presented.
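To make the quantity being bounded concrete, here is a minimal sketch of Lloyd's iteration on the real line that counts how many rounds move the centers. The function name `lloyd_1d`, the empty-cluster rule, and the sample points are illustrative assumptions, not taken from the paper, and the example is not the lower-bound construction the abstract refers to.

```python
from typing import List, Tuple


def lloyd_1d(points: List[float], centers: List[float],
             max_iter: int = 10_000) -> Tuple[List[float], int]:
    """Run Lloyd's k-means iteration on points of the real line.

    Returns the final centers and the number of iterations in which
    the centers actually moved.
    """
    centers = list(centers)
    for iteration in range(max_iter):
        # Assignment step: each point goes to its nearest center
        # (ties go to the center with the smaller index).
        clusters: List[List[float]] = [[] for _ in centers]
        for x in points:
            nearest = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
            clusters[nearest].append(x)
        # Update step: each center moves to the mean of its cluster.
        # Assumption of this sketch: an empty cluster keeps its old center.
        new_centers = [
            sum(c) / len(c) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            return centers, iteration
        centers = new_centers
    return centers, max_iter


points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
final_centers, iterations = lloyd_1d(points, centers=[0.0, 1.0])
print(final_centers, iterations)
```

Starting from the poor initial centers 0 and 1, the centers move twice before stabilising at 1 and 11.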

CiteSeerX

$k$-MLE: A fast algorithm for learning statistical mixture models

Author: Nielsen Frank
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/03/2012

We describe $k$-MLE, a fast and efficient local search algorithm for learning finite statistical mixtures of exponential families such as Gaussian mixture models. Mixture models are traditionally learned using the expectation-maximization (EM) soft clustering technique that monotonically increases the incomplete (expected complete) likelihood. Given prescribed mixture weights, the hard clustering $k$-MLE algorithm iteratively assigns data to the most likely weighted component and updates the component models using Maximum Likelihood Estimators (MLEs). Using the duality between exponential families and Bregman divergences, we prove that the local convergence of the complete likelihood of $k$-MLE follows directly from the convergence of a dual additively weighted Bregman hard clustering. The inner loop of $k$-MLE can be implemented using any $k$-means heuristic like the celebrated Lloyd's batched or Hartigan's greedy swap updates. We then show how to update the mixture weights by minimizing a cross-entropy criterion, which amounts to setting each weight to the relative proportion of points in its cluster, and reiterate the mixture parameter update and mixture weight update processes until convergence. Hard EM is interpreted as a special case of $k$-MLE when both the component update and the weight update are performed successively in the inner loop. To initialize $k$-MLE, we propose $k$-MLE++, a careful initialization of $k$-MLE guaranteeing probabilistically a global bound on the best possible complete likelihood.

Comment: 31 pages, Extend preliminary paper presented at IEEE ICASSP 201
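The hard-assignment loop described in the abstract can be sketched for one-dimensional Gaussian components as follows. This is a simplified illustration, not the paper's algorithm: it performs the component update and the weight update in the same loop (the variant the abstract identifies with hard EM), and the function names, the variance floor `min_var`, and the empty-cluster handling are assumptions of this sketch.

```python
import math
from typing import List, Tuple


def log_gaussian(x: float, mean: float, var: float) -> float:
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def hard_mixture_fit(data: List[float], means: List[float],
                     variances: List[float], weights: List[float],
                     rounds: int = 50, min_var: float = 1e-6
                     ) -> Tuple[List[float], List[float], List[float], List[int]]:
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    labels: List[int] = []

    def score(x: float, j: int) -> float:
        # Weighted log-likelihood; a zero-weight component is never chosen.
        log_w = math.log(weights[j]) if weights[j] > 0 else -math.inf
        return log_w + log_gaussian(x, means[j], variances[j])

    for _ in range(rounds):
        # Hard assignment: each point goes to its most likely weighted component.
        new_labels = [max(range(k), key=lambda j: score(x, j)) for x in data]
        if new_labels == labels:
            break  # assignments are stable, so the parameters are too
        labels = new_labels
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            # Weight update: relative proportion of points in the cluster.
            weights[j] = len(members) / len(data)
            if not members:
                continue  # assumption: an empty component keeps its parameters
            # Component update: Gaussian maximum likelihood estimators.
            means[j] = sum(members) / len(members)
            var = sum((x - means[j]) ** 2 for x in members) / len(members)
            variances[j] = max(var, min_var)  # floor is an assumption of this sketch
    return means, variances, weights, labels


data = [0.0, 0.2, -0.2, 5.0, 5.2, 4.8]
means, variances, weights, labels = hard_mixture_fit(
    data, means=[0.0, 4.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
print(means, weights, labels)
```

On this toy data the two groups separate in the first round and each component receives half of the points.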

arXiv.org e-Print Archive

Crossref

Mumford dendrograms and discrete p-adic symmetries

Authors: A. Yu. Khrennikov; B. Dragovich; D. Mumford; F. Kato; F. Murtagh; J. Benois-Pineau; J. Tate; L. O. Chekhov; P. E. Bradley
Publication venue: 'Pleiades Publishing Ltd'
Publication date: 09/09/2008

In this article, we present an effective encoding of dendrograms by embedding them into the Bruhat-Tits trees associated to $p$-adic number fields. As an application, we show how strings over a finite alphabet can be encoded in cyclotomic extensions of $\mathbb{Q}_p$ and discuss $p$-adic DNA encoding. The application leads to fast $p$-adic agglomerative hierarchic algorithms similar to the ones recently used e.g. by A. Khrennikov and others. From the viewpoint of $p$-adic geometry, to encode a dendrogram $X$ in a $p$-adic field $K$ means to fix a set $S$ of $K$-rational punctures on the $p$-adic projective line $\mathbb{P}^1$. To $\mathbb{P}^1\setminus S$ is associated in a natural way a subtree inside the Bruhat-Tits tree which recovers $X$, a method first used by F. Kato in 1999 in the classification of discrete subgroups of $\textrm{PGL}_2(K)$. Next, we show how the $p$-adic moduli space $\mathfrak{M}_{0,n}$ of $\mathbb{P}^1$ with $n$ punctures can be applied to the study of time series of dendrograms and those symmetries arising from hyperbolic actions on $\mathbb{P}^1$. In this way, we can associate to certain classes of dynamical systems a Mumford curve, i.e. a $p$-adic algebraic curve with totally degenerate reduction modulo $p$. Finally, we indicate some of our results in the study of general discrete actions on $\mathbb{P}^1$, and their relation to $p$-adic Hurwitz spaces.

Comment: 14 pages, 6 figure
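As a rough illustration of why $p$-adic encodings carry a hierarchy, the sketch below encodes fixed-length strings as ordinary integers in base $p$ and compares them with the $p$-adic distance. This is a deliberate simplification: the paper works with cyclotomic extensions of $\mathbb{Q}_p$ and Bruhat-Tits trees, whereas the digit encoding, the function names, and the choice $p = 5$ for a four-letter alphabet are assumptions made only for this example.

```python
from fractions import Fraction


def p_adic_valuation(n: int, p: int) -> int:
    """Largest v such that p**v divides the non-zero integer n."""
    if n == 0:
        raise ValueError("the valuation of 0 is infinite")
    n = abs(n)
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def p_adic_distance(a: int, b: int, p: int) -> Fraction:
    """p-adic distance |a - b|_p = p**(-valuation), with distance 0 when a == b."""
    if a == b:
        return Fraction(0)
    return Fraction(1, p ** p_adic_valuation(a - b, p))


def encode(word: str, alphabet: str, p: int) -> int:
    """Hypothetical encoding: letter i of the word becomes the digit of p**i.

    Digits run from 1 to len(alphabet), so len(alphabet) must be below p.
    Words sharing a longer prefix end up p-adically closer.
    """
    return sum((alphabet.index(ch) + 1) * p ** i for i, ch in enumerate(word))


alphabet, p = "ACGT", 5
x, y, z = (encode(w, alphabet, p) for w in ("ACGT", "ACGA", "ATTT"))
print(p_adic_distance(x, y, p))  # shared prefix "ACG"
print(p_adic_distance(x, z, p))  # shared prefix "A"
```

Because the $p$-adic distance is an ultrametric, the closed balls around the encoded words are nested, and that nesting is exactly a dendrogram on the words.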

arXiv.org e-Print Archive

CiteSeerX

Crossref

Combination of fast hybrid classification and k value optimization in k-nn for video face recognition

Authors: Septiana Nuning; Suciati Nanik
Publication venue: 'Universitas Pesantren Tinggi Darul Ulum (Unipdu)'
Publication date: 06/04/2020

Nowadays, face recognition is needed not only for images but also for videos. However, video brings new challenges, such as how to determine the right pre-processing, feature extraction, and classification methods to obtain excellent performance. Although the k-Nearest Neighbor (k-NN) method is widely used, the high computational cost caused by the numerous features of the dataset and the large amount of training data makes adequate processing difficult. Several studies have been conducted to improve the performance of k-NN using the FHC (Fast Hybrid Classification) method by optimizing the local k values. One of the disadvantages of the FHC method is that the k value used is still in the default form. Therefore, this research proposes the use of k-NN value optimization methods in FHC, thereby increasing its accuracy. The Fast Hybrid Classification, which combines k-means clustering with k-NN, groups the training data into several prototypes called TLDS (Two Level Data Structure). Two classification levels are then applied to label test data, with the first used to determine the number n of prototypes with the same class for the test data. The second classification, using the optimized k value in the k-NN method, is employed to sharpen the accuracy when the number of prototypes with the same class does not reach n. The evaluation results show that this method provides 86% accuracy and a time performance of 3.3 seconds.
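One possible reading of the two-level scheme is sketched below: a test point is first compared with prototypes, and only if its n nearest prototypes disagree on the class does the classifier fall back to k-NN over the raw training data. The abstract does not spell out the decision rule, so the rule, the function names, and the hand-written prototypes (which in FHC would come from k-means clustering) are all assumptions of this sketch.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Labelled = Tuple[Sequence[float], str]


def nearest_labels(x: Sequence[float], items: List[Labelled], count: int) -> List[str]:
    """Labels of the `count` items closest to x in Euclidean distance."""
    ranked = sorted(items, key=lambda item: math.dist(x, item[0]))
    return [label for _, label in ranked[:count]]


def two_level_classify(x: Sequence[float], prototypes: List[Labelled],
                       training: List[Labelled], n: int, k: int) -> Tuple[str, int]:
    """Return (predicted label, level that produced it)."""
    # Level 1: accept the prototype vote only if the n nearest prototypes agree.
    proto_labels = nearest_labels(x, prototypes, n)
    if len(set(proto_labels)) == 1:
        return proto_labels[0], 1
    # Level 2: fall back to a majority vote among the k nearest training points.
    votes = Counter(nearest_labels(x, training, k))
    return votes.most_common(1)[0][0], 2


training = [((0, 0), "a"), ((1, 1), "a"), ((2, 2), "a"),
            ((3, 3), "b"), ((4, 4), "b"), ((6, 6), "b")]
# Hypothetical prototypes; in FHC these would be cluster centres from k-means.
prototypes = [((0, 0), "a"), ((2, 2), "a"), ((3, 3), "b"), ((6, 6), "b")]

easy = two_level_classify((0.5, 0.5), prototypes, training, n=2, k=3)
hard = two_level_classify((2.6, 2.6), prototypes, training, n=2, k=3)
print(easy, hard)
```

The point near the origin is settled by the prototypes alone, while the point between the two classes needs the slower k-NN pass. Keeping most queries at level 1 is where the time saving would come from.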

Jurnal Online Unipdu Jombang (Universitas Pesantren Tinggi Darul 'Ulum)

A Xenopus oocyte model system to study action potentials

Authors: Boland Linda M; Corbin-Leftwich Aaron; Robinson Helen H; Small Hannah E; Villalba-Galea Carlos A.
Publication venue: Scholarly Commons
Publication date: 05/11/2018

Action potentials (APs) are the functional units of fast electrical signaling in excitable cells. The upstroke and downstroke of an AP are generated by the competing and asynchronous action of Na+- and K+-selective voltage-gated conductances. Although a mixture of voltage-gated channels has long been recognized to contribute to the generation and temporal characteristics of the AP, understanding how each of these proteins functions and is regulated during electrical signaling remains the subject of intense research. AP properties vary among different cellular types because of the expression diversity, subcellular location, and modulation of ion channels. These complexities, in addition to the functional coupling of these proteins by membrane potential, make it challenging to understand the roles of different channels in initiating and temporally shaping the AP. Here, to address this problem, we focus our efforts on finding conditions that allow reliable AP recordings from Xenopus laevis oocytes coexpressing Na+ and K+ channels. As a proof of principle, we show how the expression of a variety of K+ channel subtypes can modulate excitability in this minimal model system. This approach raises the prospect of studies on the modulation of APs by pharmacological or biological means with a controlled background of Na+ and K+ channel expression.

Crossref

Pacific McGeorge School of Law

Scholarly Commons