112,335 research outputs found
Improved Smoothed Analysis of the k-Means Method
The k-means method is a widely used clustering algorithm. One of its
distinguished features is its speed in practice. Its worst-case running-time,
however, is exponential, leaving a gap between practical and theoretical
performance. Arthur and Vassilvitskii (FOCS 2006) aimed at closing this gap,
and they proved a bound of \poly(n^k, \sigma^{-1}) on the smoothed
running-time of the k-means method, where n is the number of data points and
\sigma is the standard deviation of the Gaussian perturbation. This bound,
though better than the worst-case bound, is still much larger than the
running-time observed in practice.
We improve the smoothed analysis of the k-means method by showing two upper
bounds on the expected running-time of k-means. First, we prove that the
expected running-time is bounded by a polynomial in n^{\sqrt{k}} and
\sigma^{-1}. Second, we prove an upper bound of k^{kd} \cdot \poly(n,
\sigma^{-1}), where d is the dimension of the data space. The polynomial is
independent of k and d, and we obtain a polynomial bound for the expected
running-time for k, d \in O(\sqrt{\log n / \log\log n}).
Finally, we show that k-means runs in smoothed polynomial time for
one-dimensional instances.
Comment: To be presented at the 20th ACM-SIAM Symposium on Discrete Algorithms (SODA 2009).
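For concreteness, here is a minimal Python sketch of the k-means method (Lloyd's iteration) whose running time the bounds above concern; the random initialization and all names are illustrative choices, not taken from the paper:

    import numpy as np

    def kmeans(points, k, rng=np.random.default_rng(0)):
        # Start from k data points chosen uniformly at random (one common choice).
        centers = points[rng.choice(len(points), size=k, replace=False)]
        while True:
            # Assignment step: each point joins the cluster of its nearest center.
            dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Update step: each center moves to the mean of its cluster.
            new_centers = np.array([
                points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):  # no center moved: local optimum
                return labels, centers
            centers = new_centers

The bounds above count how many passes of this loop are executed; the smoothed bounds control that number in expectation over Gaussian perturbations of the input.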
Towards explaining the speed of k-means
The k-means method is a popular algorithm for clustering, known for its speed in practice. This stands in contrast to its exponential worst-case running-time. To explain the speed of the k-means method, a smoothed analysis has been conducted. We sketch this smoothed analysis and a generalization to Bregman divergences.
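Both abstracts refer to the smoothed-analysis model: an adversary fixes a worst-case instance, and every coordinate is then independently perturbed by Gaussian noise of standard deviation sigma before the algorithm runs. A minimal sketch of that perturbation step (the function name is ours):

    import numpy as np

    def smoothed_instance(points, sigma, rng=np.random.default_rng(1)):
        # The adversary fixes `points`; each coordinate is then perturbed by
        # independent N(0, sigma^2) noise, and the algorithm runs on the result.
        return points + rng.normal(scale=sigma, size=points.shape)

The smoothed running-time is the expected number of k-means iterations on such a perturbed instance, maximized over the adversary's choice of input.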
High dynamic global positioning system receiver
A Global Positioning System (GPS) receiver having a number of channels receives an aggregate of pseudorange code time-division-modulated signals. The aggregate is converted to baseband and then to digital form for separate processing in the separate channels. A fast Fourier transform processor computes the signal energy as a function of Doppler frequency for each correlation lag, and a range and frequency estimator computes estimates of pseudorange and frequency. Raw estimates from all channels are used to estimate receiver position, velocity, clock offset and clock rate offset in a conventional navigation and control unit, which, based on these, computes smoothed estimates for the next measurement interval.
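A rough Python sketch of the FFT step described above, i.e. computing signal energy as a function of Doppler frequency for each correlation lag; the single coherent block, the real-valued code replica, and all names are simplifying assumptions, not the patent's design:

    import numpy as np

    def energy_vs_doppler(samples, code, lags, fft_len=1024):
        # `samples`: complex baseband samples; `code`: PRN code replica sampled
        # at the same rate and length (an assumption made for brevity).
        energies = []
        for lag in lags:
            replica = np.roll(code, lag)            # lag-shifted code replica
            despread = samples * replica            # wipe off the code
            spectrum = np.fft.fft(despread, n=fft_len)
            energies.append(np.abs(spectrum) ** 2)  # energy per Doppler bin
        return np.array(energies)                   # shape (len(lags), fft_len)

The peak of this lag-by-Doppler energy surface is what a range and frequency estimator would then refine into the pseudorange and frequency estimates mentioned above.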
Sharpening up Galactic all-sky maps with complementary data - A machine learning approach
Galactic all-sky maps at very disparate frequencies, like in the radio and
γ-ray regime, show similar morphological structures. This mutual
information reflects the imprint of the various physical components of the
interstellar medium. We want to use multifrequency all-sky observations to test
resolution improvement and restoration of unobserved areas for maps in certain
frequency ranges. For this we aim to reconstruct or predict, from sets of other
maps, all-sky maps that, in their original form, lack a high resolution compared
to other available all-sky surveys or are incomplete in their spatial coverage.
Additionally, we want to investigate the commonalities and differences that the
ISM components exhibit over the electromagnetic spectrum. We build an
n-dimensional representation of the joint pixel-brightness distribution of n
maps using a Gaussian mixture model and see how predictive it is: How well
can one map be reproduced based on subsets of other maps? Tests with mock data
show that reconstructing the map of a certain frequency from other frequency
regimes works astonishingly well, reliably predicting small-scale details well
below the spatial resolution of the initially learned map. Applied to the
observed multifrequency data sets of the Milky Way this technique is able to
improve the resolution of, e.g., the low-resolution Fermi LAT maps as well as
to recover the sky from artifact-contaminated data like the ROSAT 0.855 keV
map. The predicted maps generally show less imaging artifacts compared to the
original ones. A comparison of predicted and original maps highlights
surprising structures, imaging artifacts (fortunately not reproduced in the
prediction), and features genuine to the respective frequency range that are
not present at other frequency bands. We discuss limitations of this machine
learning approach and ideas for how to overcome them.
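The core idea described above, i.e. fit a Gaussian mixture model to the joint pixel-brightness vectors and predict a missing map by conditioning on the observed ones, can be sketched as follows; this is a minimal illustration using scikit-learn, with our own function names, not the authors' pipeline:

    import numpy as np
    from scipy.stats import multivariate_normal
    from sklearn.mixture import GaussianMixture

    def fit_joint_gmm(maps, n_components=8):
        # Each pixel becomes one n-dimensional brightness vector (n = number of maps).
        X = np.stack([m.ravel() for m in maps], axis=1)
        return GaussianMixture(n_components=n_components, covariance_type="full").fit(X)

    def predict_missing(gmm, x_obs, obs_idx, tgt_idx):
        # Conditional mean of the target channels given the observed channels,
        # via standard Gaussian conditioning within each mixture component.
        log_w, cond_means = [], []
        for pi, mu, cov in zip(gmm.weights_, gmm.means_, gmm.covariances_):
            Soo = cov[np.ix_(obs_idx, obs_idx)]
            Sto = cov[np.ix_(tgt_idx, obs_idx)]
            # Component responsibility given the observed brightnesses.
            log_w.append(np.log(pi) + multivariate_normal.logpdf(x_obs, mu[obs_idx], Soo))
            cond_means.append(mu[tgt_idx] + Sto @ np.linalg.solve(Soo, x_obs - mu[obs_idx]))
        log_w = np.asarray(log_w)
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        return np.sum(w[:, None] * np.asarray(cond_means), axis=0)

predict_missing works one pixel at a time; looping (or vectorizing) over all pixels of the observed maps yields the predicted all-sky map.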
Reconstruction of the Dark Energy equation of state from latest data: the impact of theoretical priors
We reconstruct the Equation of State of Dark Energy (EoS) from current data
using a non-parametric approach where, rather than assuming a specific time
evolution of this function, we bin it in time. We treat the transition between
the bins with two different methods, i.e. a smoothed step function and a
Gaussian Process reconstruction, investigating whether or not the two
approaches lead to compatible results. Additionally, we include in the
reconstruction procedure a correlation between the values of the EoS at
different times in the form of a theoretical prior that takes into account a
set of viability and stability requirements that one can impose on models
alternative to ΛCDM. In that case, we necessarily specialize to broad,
but specific classes of alternative models, i.e. Quintessence and Horndeski
gravity. We use data coming from CMB, Supernovae and BAO surveys. We find an
overall agreement between the different reconstruction methods used; with both
approaches, we find a time dependence of the mean of the reconstruction, with
different trends depending on the class of model studied. The constant EoS
predicted by the ΛCDM model nevertheless falls within the bounds of
our analysis.
Comment: 17 pages, 5 figures. Prepared for submission to JCAP.
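One standard way to realize the binning described above is to join the bin values of w with smoothed (tanh) step transitions, and to encode the correlation between bins as a prior covariance that decays with their separation in the scale factor a. The sketch below is illustrative only: the tanh parametrization, the CPZ-style (Crittenden-Pogosian-Zhao) correlation form, and all numbers are our assumptions, not the paper's calibrated choices:

    import numpy as np

    def binned_eos(a, bin_edges, w_values, smoothing=0.02):
        # w(a) built from bin values joined by tanh-smoothed step transitions.
        w = np.full_like(a, w_values[0], dtype=float)
        for edge, w_left, w_right in zip(bin_edges[1:-1], w_values[:-1], w_values[1:]):
            # Each interior edge contributes its jump, switched on smoothly.
            w += 0.5 * (w_right - w_left) * (1.0 + np.tanh((a - edge) / smoothing))
        return w

    def bin_correlation_prior(a_centers, xi0=0.1, a_c=0.06):
        # A CPZ-style prior: correlation between bins decays with |a_i - a_j|.
        da = np.subtract.outer(a_centers, a_centers)
        return xi0 / (1.0 + (da / a_c) ** 2)

A Gaussian Process reconstruction plays the same role with the binning removed: the kernel then takes over both the smoothing of transitions and the correlation between epochs.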