19 research outputs found

    Fast rates for noisy clustering

    Get PDF
    The effect of errors in variables in empirical minimization is investigated. Given a loss $\ell$ and a set of decision rules $\mathcal{G}$, we prove a general upper bound for an empirical minimization based on a deconvolution kernel and a noisy sample $Z_i = X_i + \epsilon_i$, $i = 1, \ldots, n$. We apply this general upper bound to give the rate of convergence for the expected excess risk in noisy clustering. A recent bound from \citet{levrard} proves that this rate is $\mathcal{O}(1/n)$ in the direct case, under Pollard's regularity assumptions. Here the effect of noisy measurements gives a rate of the form $\mathcal{O}(1/n^{\frac{\gamma}{\gamma+2\beta}})$, where $\gamma$ is the Hölder regularity of the density of $X$ whereas $\beta$ is the degree of ill-posedness.
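
    The estimator here is built from a deconvolution kernel. A minimal sketch of that ingredient, assuming a known noise characteristic function and a Fourier-cutoff (sinc) kernel; the function name, bandwidth, and grids are illustrative, not the paper's choices:

    import numpy as np

    def deconvolution_kde(z, x_grid, h, noise_cf):
        # Deconvolution kernel density estimate of the density of X from
        # noisy observations Z_i = X_i + eps_i.  `noise_cf` is the
        # characteristic function t -> E[exp(i t eps)] of the noise.
        # A Fourier-cutoff kernel restricts integration to |t| <= 1/h.
        t = np.linspace(-1.0 / h, 1.0 / h, 512)
        dt = t[1] - t[0]
        emp_cf = np.mean(np.exp(1j * t[:, None] * z[None, :]), axis=1)
        ratio = emp_cf / noise_cf(t)                      # the deconvolution step
        integrand = ratio[None, :] * np.exp(-1j * t[None, :] * x_grid[:, None])
        f_hat = (integrand.sum(axis=1) * dt).real / (2.0 * np.pi)
        return np.maximum(f_hat, 0.0)                     # clip negative wiggles

    # toy run: X ~ N(0,1) contaminated with Laplace(0, 0.3) noise
    rng = np.random.default_rng(0)
    z = rng.normal(size=2000) + rng.laplace(scale=0.3, size=2000)
    laplace_cf = lambda t: 1.0 / (1.0 + (0.3 * t) ** 2)   # CF of Laplace(0, 0.3)
    grid = np.linspace(-4.0, 4.0, 200)
    f_hat = deconvolution_kde(z, grid, h=0.2, noise_cf=laplace_cf)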

    Fast rates for empirical vector quantization

    Get PDF
    We consider the rate of convergence of the expected loss of empirically optimal vector quantizers. Earlier results show that the mean-squared expected distortion for any fixed distribution supported on a bounded set and satisfying some regularity conditions decreases at the rate $\mathcal{O}(\log n/n)$. We prove that this rate is actually $\mathcal{O}(1/n)$. Although these conditions are hard to check, we show that well-polarized distributions with continuous densities supported on a bounded set are included in the scope of this result.
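
    A hedged numerical illustration of the quantity under study, with sklearn's KMeans standing in for the empirically optimal quantizer (which it only approximates): out-of-sample distortion of quantizers fitted on growing samples from a bounded-support distribution.

    import numpy as np
    from sklearn.cluster import KMeans

    def distortion(points, centers):
        # mean squared distance to the nearest codepoint
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return d2.min(axis=1).mean()

    rng = np.random.default_rng(1)
    test = rng.uniform(-1.0, 1.0, size=(100_000, 2))      # bounded support
    for n in (100, 1_000, 10_000):
        train = rng.uniform(-1.0, 1.0, size=(n, 2))
        km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(train)
        print(n, distortion(test, km.cluster_centers_))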

    Anisotropic oracle inequalities in noisy quantization

    Get PDF
    The effect of errors in variables in quantization is investigated. We prove general exact and non-exact oracle inequalities with fast rates for an empirical minimization based on a noisy sample $Z_i = X_i + \epsilon_i$, $i = 1, \ldots, n$, where the $X_i$ are i.i.d. with density $f$ and the $\epsilon_i$ are i.i.d. with density $\eta$. These rates depend on the geometry of the density $f$ and the asymptotic behaviour of the characteristic function of $\eta$. This general study can be applied to the problem of $k$-means clustering with noisy data. For this purpose, we introduce a deconvolution $k$-means stochastic minimization which reaches fast rates of convergence under Pollard's standard regularity assumptions.
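
    The minimization stage of such a deconvolution $k$-means scheme can be pictured as Lloyd iterations run against a density estimate rather than against the raw noisy points. A 1-D sketch under that reading, with a hand-made density standing in for the deconvolution estimate (all names illustrative):

    import numpy as np

    def weighted_lloyd(grid, weights, k, iters=50, seed=0):
        # Lloyd iterations for k-means against a density estimate on a
        # grid; in the noisy setting `weights` would come from a
        # deconvolution estimate of f, not from the noisy sample itself.
        rng = np.random.default_rng(seed)
        centers = rng.choice(grid, size=k, replace=False)
        for _ in range(iters):
            labels = np.argmin(np.abs(grid[:, None] - centers[None, :]), axis=1)
            for j in range(k):
                mask = labels == j
                if weights[mask].sum() > 0:
                    centers[j] = np.average(grid[mask], weights=weights[mask])
        return np.sort(centers)

    # toy 1-D density: mixture of two bumps; centers land near +-1.5
    grid = np.linspace(-4.0, 4.0, 400)
    f = 0.5 * np.exp(-(grid - 1.5) ** 2) + 0.5 * np.exp(-(grid + 1.5) ** 2)
    print(weighted_lloyd(grid, f / f.sum(), k=2))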

    Convergence and Rates for Fixed-Interval Multiple-Track Smoothing Using $k$-Means Type Optimization

    Get PDF
    We address the task of estimating multiple trajectories from unlabeled data. This problem arises in many settings: one could think, for example, of the construction of maps of transport networks from passive observation of travellers, or of the reconstruction of the behaviour of uncooperative vehicles from external observations. There are two coupled problems. The first is a data association problem: how to map data points onto individual trajectories. The second is, given a solution to the data association problem, to estimate those trajectories. We construct estimators as a solution to a regularized variational problem (to which approximate solutions can be obtained via the simple, efficient and widespread $k$-means method) and show that, as the number of data points $n$ increases, these estimators exhibit stable behaviour. More precisely, we show that they converge in probability in an appropriate Sobolev space, with rate $n^{-1/2}$.
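
    A toy version of the coupled scheme, under assumed structure: alternate assigning each point to its best-fitting track with refitting each track, which is exactly the $k$-means pattern. Polynomial least squares stands in here for the paper's Sobolev-regularized smoother.

    import numpy as np

    def multitrack_kmeans(t, y, k, degree=3, iters=20, seed=0):
        # alternate data association and per-track smoothing
        rng = np.random.default_rng(seed)
        labels = rng.integers(k, size=len(t))
        coefs = [np.polyfit(t[labels == j], y[labels == j], degree)
                 for j in range(k)]
        for _ in range(iters):
            preds = np.stack([np.polyval(c, t) for c in coefs])
            labels = np.argmin((preds - y[None, :]) ** 2, axis=0)   # association
            for j in range(k):                                      # smoothing
                if (labels == j).sum() > degree:
                    coefs[j] = np.polyfit(t[labels == j], y[labels == j], degree)
        return coefs, labels

    # two unlabeled, crossing tracks observed with noise
    rng = np.random.default_rng(2)
    t = rng.uniform(0.0, 1.0, 400)
    true_track = rng.integers(2, size=400)
    y = np.where(true_track == 0, np.sin(3.0 * t), 1.0 - t)
    y = y + 0.05 * rng.normal(size=400)
    coefs, labels = multitrack_kmeans(t, y, k=2)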

    Convergence of the $k$-Means Minimization Problem using $\Gamma$-Convergence

    Full text link
    The $k$-means method is an iterative clustering algorithm which associates each observation with one of $k$ clusters. It traditionally employs cluster centers in the same space as the observed data. By relaxing this requirement, it is possible to apply the $k$-means method to infinite dimensional problems, for example multiple target tracking and smoothing problems in the presence of unknown data association. Via a $\Gamma$-convergence argument, the associated optimization problem is shown to converge in the sense that both the $k$-means minimum and minimizers converge in the large data limit to quantities which depend upon the observed data only through its distribution. The theory is supplemented with two examples to demonstrate the range of problems now accessible by the $k$-means method. The first example combines a non-parametric smoothing problem with unknown data association. The second addresses tracking using sparse data from a network of passive sensors.
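
    The convergence statement can be watched numerically (illustrative only, with sklearn's KMeans as an approximate minimizer): the per-point $k$-means minimum settles down as $n$ grows, depending on the sample only through the underlying distribution.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(3)
    for n in (200, 2_000, 20_000):
        # two-component Gaussian mixture as the fixed data distribution
        sample = np.concatenate([rng.normal(-2.0, 1.0, n // 2),
                                 rng.normal(2.0, 1.0, n // 2)])
        km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(sample.reshape(-1, 1))
        print(n, km.inertia_ / n)    # empirical k-means minimum per point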

    Universal multiresolution source codes

    Get PDF
    A multiresolution source code is a single code giving an embedded source description that can be read at a variety of rates and thereby yields reproductions at a variety of resolutions. The resolution of a source reproduction here refers to the accuracy with which it approximates the original source. Thus, a reproduction with low distortion is a “high-resolution” reproduction, while a reproduction with high distortion is a “low-resolution” reproduction. This paper treats the generalization of universal lossy source coding from single-resolution source codes to multiresolution source codes. Results described in this work include new definitions for weakly minimax universal, strongly minimax universal, and weighted universal sequences of fixed- and variable-rate multiresolution source codes that extend the corresponding notions from lossless coding and (single-resolution) quantization to multiresolution quantizers. A variety of universal multiresolution source coding results follow, including necessary and sufficient conditions for the existence of universal multiresolution codes, rate of convergence bounds for universal multiresolution coding performance to the theoretical bound, and a new multiresolution approach to two-stage universal source coding.
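
    The defining property of an embedded description can be shown in miniature. Below is a toy successive-approximation scalar quantizer, not the paper's construction: one bit stream, readable at several rates, with distortion decreasing in the rate.

    def embedded_encode(x, lo=-1.0, hi=1.0, bits=8):
        # successive-approximation quantizer on [lo, hi]: each bit halves
        # the active cell, so every prefix of the code is itself a
        # (coarser) description of x
        code = []
        for _ in range(bits):
            mid = 0.5 * (lo + hi)
            bit = int(x >= mid)
            code.append(bit)
            lo, hi = (mid, hi) if bit else (lo, mid)
        return code

    def embedded_decode(code, lo=-1.0, hi=1.0):
        # reproduce from any prefix: fewer bits, lower resolution
        for bit in code:
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if bit else (lo, mid)
        return 0.5 * (lo + hi)

    code = embedded_encode(0.3127)
    for rate in (2, 4, 8):                  # read one code at several rates
        print(rate, embedded_decode(code[:rate]))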