How Fast is the k-means Method
We present polynomial upper and lower bounds on the number of iterations performed by the k-means method (a.k.a. Lloyd’s method) for k-means clustering. Our upper bounds are polynomial in the number of points, number of clusters, and the spread of the point set. We also present a lower bound, showing that in the worst case the k-means heuristic needs to perform Ω(n) iterations, for n points on the real line and two centers. Surprisingly, the spread of the point set in this construction is polynomial. This is the first construction showing that the k-means heuristic requires more than a polylogarithmic number of iterations. Furthermore, we present two alternative algorithms, with guaranteed performance, which are simple variants of the k-means method. Results of our experimental studies on these algorithms are also presented.
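As a rough illustration of what is being counted, the sketch below runs Lloyd's method on points on the real line with two centers and reports the number of iterations until the centers stop moving. The function name `lloyd_1d`, the sample points, and the starting centers are illustrative assumptions; this is not the paper's lower-bound construction.

```python
def lloyd_1d(points, centers, max_iter=10_000):
    """Run Lloyd's method on 1-D points; return (centers, iterations).

    The iteration count includes the final pass that confirms convergence.
    """
    centers = list(centers)
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for x in points:
            j = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[j].append(x)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            return centers, iteration
        centers = new_centers
    return centers, max_iter


# Hypothetical input: two well-separated groups, both centers started on the left.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
final_centers, iterations = lloyd_1d(points, centers=[0.0, 1.0])
print(final_centers, iterations)  # [1.0, 11.0] 3
```

On this easy input the method stops after three passes; the abstract's lower bound concerns carefully constructed inputs where the count grows linearly with the number of points.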
k-MLE: A fast algorithm for learning statistical mixture models
We describe k-MLE, a fast and efficient local search algorithm for learning
finite statistical mixtures of exponential families such as Gaussian mixture
models. Mixture models are traditionally learned using the
expectation-maximization (EM) soft clustering technique that monotonically
increases the incomplete (expected complete) likelihood. Given prescribed
mixture weights, the hard clustering k-MLE algorithm iteratively assigns data
to the most likely weighted component and updates the component models using
Maximum Likelihood Estimators (MLEs). Using the duality between exponential
families and Bregman divergences, we prove that the local convergence of the
complete likelihood of k-MLE follows directly from the convergence of a dual
additively weighted Bregman hard clustering. The inner loop of k-MLE can be
implemented using any k-means heuristic like the celebrated Lloyd's batched
or Hartigan's greedy swap updates. We then show how to update the mixture
weights by minimizing a cross-entropy criterion, which amounts to setting each
weight to the relative proportion of points in its cluster, and we alternate the
mixture parameter update and the mixture weight update until convergence. Hard EM
is interpreted as a special case of k-MLE when both the component update and
the weight update are performed successively in the inner loop. To initialize
k-MLE, we propose k-MLE++, a careful initialization of k-MLE guaranteeing
probabilistically a global bound on the best possible complete likelihood.
Comment: 31 pages, Extend preliminary paper presented at IEEE ICASSP 201
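The alternation described in this abstract can be sketched for the simplest case, a mixture of 1-D Gaussians. Everything below is an assumption made for illustration: the function names `log_gauss` and `k_mle_1d`, the variance floor `min_var`, the rule for empty clusters, and the sample data. It uses a Lloyd-style inner loop and does not include the k-MLE++ initialization or the general exponential-family machinery.

```python
import math


def log_gauss(x, mu, var):
    """Log-density of a 1-D Gaussian with mean mu and variance var."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def k_mle_1d(data, mus, variances, weights,
             max_outer=100, max_inner=100, min_var=1e-6):
    mus, variances, weights = list(mus), list(variances), list(weights)
    k, n = len(mus), len(data)
    labels = None
    for _ in range(max_outer):
        # Inner loop (weights held fixed): hard-assign each point to its most
        # likely weighted component, then refit each component by its MLE.
        for _ in range(max_inner):
            new_labels = [
                max(range(k), key=lambda j: math.log(weights[j])
                    + log_gauss(x, mus[j], variances[j]))
                for x in data
            ]
            if new_labels == labels:
                break
            labels = new_labels
            for j in range(k):
                members = [x for x, lab in zip(data, labels) if lab == j]
                if not members:
                    continue  # assumption: an empty cluster keeps its parameters
                mus[j] = sum(members) / len(members)
                var = sum((x - mus[j]) ** 2 for x in members) / len(members)
                variances[j] = max(var, min_var)  # floor avoids zero variance
        # Weight update: relative proportion of points in each cluster
        # (assumption: an empty cluster keeps its old weight so log() stays finite).
        counts = [labels.count(j) for j in range(k)]
        new_weights = [c / n if c else weights[j] for j, c in enumerate(counts)]
        if new_weights == weights:
            break
        weights = new_weights
    return mus, variances, weights, labels


# Hypothetical data: three points near 0 and five points near 5.
data = [0.0, 0.2, -0.2, 5.0, 5.2, 4.8, 5.1, 4.9]
mus, variances, weights, labels = k_mle_1d(
    data, mus=[0.0, 5.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
```

Moving the weight update inside the inner loop would give the hard EM variant that the abstract describes as a special case.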
Mumford dendrograms and discrete p-adic symmetries
In this article, we present an effective encoding of dendrograms by embedding
them into the Bruhat-Tits trees associated to p-adic number fields. As an
application, we show how strings over a finite alphabet can be encoded in
cyclotomic extensions of $\mathbb{Q}_p$ and discuss p-adic DNA encoding. The
application leads to fast p-adic agglomerative hierarchic algorithms similar
to the ones recently used e.g. by A. Khrennikov and others. From the viewpoint
of p-adic geometry, to encode a dendrogram $X$ in a p-adic field $K$ means
to fix a set $S$ of $K$-rational punctures on the p-adic projective line
$\mathbb{P}^1$. To $\mathbb{P}^1 \setminus S$ is associated in a natural way a
subtree inside the Bruhat-Tits tree which recovers $X$, a method first used by
F. Kato in 1999 in the classification of discrete subgroups of
$\mathrm{PGL}_2(K)$.
Next, we show how the p-adic moduli space $\mathfrak{M}_{0,n}$ of $\mathbb{P}^1$
with $n$ punctures can be applied to the study of time series of
dendrograms and those symmetries arising from hyperbolic actions on
$\mathbb{P}^1$. In this way, we can associate to certain classes of dynamical
systems a Mumford curve, i.e. a p-adic algebraic curve with totally
degenerate reduction modulo $p$.
Finally, we indicate some of our results in the study of general discrete
actions on $\mathbb{P}^1$, and their relation to p-adic Hurwitz spaces.
Comment: 14 pages, 6 figure
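To give a feel for why a p-adic encoding of strings yields a tree-like (ultrametric) distance, here is a deliberately simplified sketch. It encodes DNA strings as ordinary p-adic integers with p = 5, not in the cyclotomic extensions used in the article, and the digit map `DIGIT` and all function names are assumptions made for illustration.

```python
from fractions import Fraction

P = 5  # assumed prime, chosen larger than the alphabet size
DIGIT = {"A": 1, "C": 2, "G": 3, "T": 4}  # nonzero digits, so lengths stay distinguishable


def encode(s, p=P):
    """Encode a string as the integer whose i-th base-p digit is the i-th letter."""
    return sum(DIGIT[ch] * p**i for i, ch in enumerate(s))


def valuation(n, p=P):
    """p-adic valuation of a nonzero integer; None stands for infinity (n == 0)."""
    if n == 0:
        return None
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def padic_distance(s, t, p=P):
    """p-adic distance p**(-v), where v is the valuation of the difference."""
    v = valuation(encode(s, p) - encode(t, p), p)
    return Fraction(0) if v is None else Fraction(1, p**v)


# Strings sharing a longer prefix are p-adically closer.
d_long = padic_distance("ACGT", "ACGA")   # common prefix of length 3
d_short = padic_distance("ACGT", "ATGT")  # common prefix of length 1
```

In this toy encoding the valuation of the difference equals the length of the common prefix, so merging strings by increasing p-adic distance reproduces the prefix tree, which is the kind of dendrogram an agglomerative procedure would build.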
Combination of fast hybrid classification and k value optimization in k-nn for video face recognition
Face recognition is now needed not only for still images but also for videos. However, this extension raises challenges, such as how to choose the right pre-processing, feature extraction, and classification methods to obtain excellent performance. Although the k-Nearest Neighbor (k-NN) classifier is widely used, its high computational cost, caused by the many features in the dataset and the large amount of training data, makes adequate processing difficult. Several studies have been conducted to improve the performance of k-NN using the FHC (Fast Hybrid Classification) method by optimizing the local k values. One disadvantage of the FHC method is that the k value it uses is still left at its default. Therefore, this research proposes applying k value optimization for k-NN within FHC, thereby increasing its accuracy. Fast Hybrid Classification, which combines k-means clustering with k-NN, groups the training data into several prototypes called a TLDS (Two Level Data Structure). Two classification levels are then applied to label test data. The first level determines the number n of prototypes with the same class for the test data. The second level, which uses the optimized k value in the k-NN method, is employed to sharpen the accuracy when the number of same-class prototypes does not reach n. The evaluation results show that this method provides 86% accuracy and time performance of 3.3 seconds
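One possible reading of the two-level scheme is sketched below. It is an interpretation of the abstract, not the authors' implementation: the function name `classify_two_level`, the prototype layout (a center plus its labelled member points, assumed to have been produced beforehand by k-means), and the toy data are all hypothetical, and the procedure for optimizing k is not shown.

```python
import math
from collections import Counter


def majority(labels):
    """Most common label in a non-empty sequence."""
    return Counter(labels).most_common(1)[0][0]


def classify_two_level(query, prototypes, n, k):
    """prototypes: list of (center, members); members: list of (point, label)."""
    # Level 1: look at the n prototypes whose centers are nearest to the query.
    nearest = sorted(prototypes, key=lambda pr: math.dist(query, pr[0]))[:n]
    proto_labels = [majority([lab for _, lab in members]) for _, members in nearest]
    if len(set(proto_labels)) == 1:
        return proto_labels[0]  # all n prototypes agree, so accept their class
    # Level 2: the prototypes disagree, so run k-NN over their member points only.
    candidates = [m for _, members in nearest for m in members]
    neighbours = sorted(candidates, key=lambda m: math.dist(query, m[0]))[:k]
    return majority([lab for _, lab in neighbours])


# Hypothetical two-level data structure with three prototypes.
prototypes = [
    ((0.0, 0.0), [((0.0, 0.0), "a"), ((0.0, 1.0), "a")]),
    ((5.0, 5.0), [((5.0, 5.0), "b"), ((5.0, 6.0), "b")]),
    ((2.5, 2.5), [((2.0, 2.0), "a"), ((3.0, 3.0), "b"), ((3.0, 2.5), "b")]),
]
label_level2 = classify_two_level((0.2, 0.1), prototypes, n=2, k=3)
label_level1 = classify_two_level((4.8, 5.1), prototypes, n=2, k=3)
```

Restricting the second level to the members of the nearest prototypes is what keeps the search cheap compared with plain k-NN over the full training set.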
A Xenopus oocyte model system to study action potentials
Action potentials (APs) are the functional units of fast electrical signaling in excitable cells. The upstroke and downstroke of an AP are generated by the competing and asynchronous action of Na+- and K+-selective voltage-gated conductances. Although a mixture of voltage-gated channels has long been recognized to contribute to the generation and temporal characteristics of the AP, understanding how each of these proteins functions and is regulated during electrical signaling remains the subject of intense research. AP properties vary among different cellular types because of the expression diversity, subcellular location, and modulation of ion channels. These complexities, in addition to the functional coupling of these proteins by membrane potential, make it challenging to understand the roles of different channels in initiating and temporally shaping the AP. Here, to address this problem, we focus our efforts on finding conditions that allow reliable AP recordings from Xenopus laevis oocytes coexpressing Na+ and K+ channels. As a proof of principle, we show how the expression of a variety of K+ channel subtypes can modulate excitability in this minimal model system. This approach raises the prospect of studies on the modulation of APs by pharmacological or biological means with a controlled background of Na+ and K+ channel expression