32 research outputs found

Noisy Independent Factor Analysis Model for Density Estimation and Classification

Author: Amato U.
Antoniadis A.
Samarov A.
Tsybakov A.B.
Publication venue: Cambridge, MA; Alfred P. Sloan School of Management, Massachusetts Institute of Technology
Publication date: 09/06/2009

We consider the problem of multivariate density estimation when the unknown density is assumed to follow a particular form of dimensionality reduction, a noisy independent factor analysis (IFA) model. In this model the data are generated by a number of latent independent components having unknown distributions and are observed in Gaussian noise. We do not assume that either the number of components or the matrix mixing the components is known. We show that densities of this form can be estimated with a fast rate. Using the mirror averaging aggregation algorithm, we construct a density estimator which achieves a nearly parametric rate $(\log^{1/4} n)/\sqrt{n}$, independent of the dimensionality of the data, as the sample size $n$ tends to infinity. This estimator is adaptive to the number of components, their distributions and the mixing matrix. We then apply this density estimator to construct nonparametric plug-in classifiers and show that they achieve the best obtainable rate of the excess Bayes risk, to within a logarithmic factor independent of the dimension of the data. Applications of this classifier to simulated data sets and to real data from a remote sensing experiment show promising results. Financial support from the IAP research network of the Belgian government (Belgian Federal Science Policy) is gratefully acknowledged. Research of A. Samarov was partially supported by NSF grant DMS-0505561 and by a grant from Singapore-MIT Alliance (CSB). Research of A.B. Tsybakov was partially supported by the grant ANR-06-BLAN-0194 and by the PASCAL Network of Excellence.

DSpace@MIT

A Smirnov-Bickel-Rosenblatt theorem for compactly-supported wavelets

Author: A. Cohen
A.B. Tsybakov
A.D. Bull
Adam D. Bull
E. Giné
F. Chyzak
I. Daubechies
J. Hüsler
L.D. Brown
N.V. Smirnov
O. Rioul
P.J. Bickel
W. Härdle
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/04/2012

In nonparametric statistical problems, we wish to find an estimator of an unknown function f. We can split its error into bias and variance terms; Smirnov, Bickel and Rosenblatt have shown that, for a histogram or kernel estimate, the supremum norm of the variance term is asymptotically distributed as a Gumbel random variable. In the following, we prove a version of this result for estimators using compactly-supported wavelets, a popular tool in nonparametric statistics. Our result relies on an assumption on the nature of the wavelet, which must be verified by provably-good numerical approximations. We verify our assumption for Daubechies wavelets and symlets, with N = 6, ..., 20 vanishing moments; larger values of N, and other wavelet bases, are easily checked, and we conjecture that our assumption holds also in those cases.

arXiv.org e-Print Archive

Crossref

Uniform in bandwidth exact rates for a class of kernel estimators

Author: A.B. Owen
A.B. Tsybakov
A.W. van der Vaart
D. Varron
D.M. Mason
Davit Varron
Ingrid Van Keilegom
M.B. Priestley
O. Bousquet
P. Massart
R.M. Clark
S.X. Chen
U. Einmahl
W. Härdle
Publication date: 01/01/2011

Given an i.i.d. sample $(Y_i,Z_i)$, taking values in $\mathbb{R}^{d'}\times \mathbb{R}^d$, we consider a collection of Nadaraya-Watson kernel estimators of the conditional expectations $\mathbb{E}(\langle c_g(z),g(Y)\rangle+d_g(z)\mid Z=z)$, where $z$ belongs to a compact set $H\subset \mathbb{R}^d$, $g$ is a Borel function on $\mathbb{R}^{d'}$ and $c_g(\cdot),d_g(\cdot)$ are continuous functions on $\mathbb{R}^d$. Given two bandwidth sequences $h_n<\widetilde{h}_n$ fulfilling mild conditions, we obtain exact and explicit almost sure limit bounds for the deviations of these estimators around their expectations, uniformly in $g\in\mathcal{G},\;z\in H$ and $h_n\le h\le \widetilde{h}_n$, under mild conditions on the density $f_Z$, the class $\mathcal{G}$, the kernel $K$ and the functions $c_g(\cdot),d_g(\cdot)$. We apply this result to prove that smoothed empirical likelihood can be used to build confidence intervals for conditional probabilities $\mathbb{P}(Y\in C\mid Z=z)$ that hold uniformly in $z\in H,\; C\in \mathcal{C},\; h\in [h_n,\widetilde{h}_n]$. Here $\mathcal{C}$ is a Vapnik-Chervonenkis class of sets. Comment: Published in the Annals of the Institute of Statistical Mathematics Volume 63, p. 1077-1102 (2011)

arXiv.org e-Print Archive

HAL-uB

HAL - Université de Franche-Comté

Crossref

Research Papers in Economics

DIAL UCLouvain

Adaptive Density Estimation on the Circle by Nearly-Tight Frames

Author: A. Al-Sharadqah
A. Mayeli
A.B. Tsybakov
A.W. van der Vaart
B.W. Silverman
C. Durastanti
D. Donoho
D. Geller
D. Marinucci
E. Stein
F.J. Narcowich
H. Wu
J. Klemela
M. Abramowitz
M. Di Marzio
N.I. Fisher
P. Baldi
S. Kato
S. Rao Jammalamadaka
S. Scodeller
W. Härdle
X. Lan
Publication date: 15/03/2016

This work is concerned with the study of asymptotic properties of nonparametric density estimates in the framework of circular data. The estimation procedure applied here is based on wavelet thresholding methods: the wavelets used are the so-called Mexican needlets, which describe a nearly-tight frame on the circle. We study the asymptotic behaviour of the $L^{2}$-risk function for these estimates, in particular its adaptivity, proving that its rate of convergence is nearly optimal. Comment: 30 pages, 3 figures

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Tight Lower Bound for Linear Sketches of Moments

Author: A.B. Tsybakov
I. Csiszár
L. Le Cam
L.D. Brown
N. Alon
P. Indyk
S. Ganguly
T.T. Cai
Y.I. Ingster
Z. Bar-Yossef
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

The problem of estimating frequency moments of a data stream has attracted a lot of attention since the onset of streaming algorithms [AMS99]. While the space complexity for approximately computing the $p^{\rm th}$ moment, for $p\in(0,2]$, has been settled [KNW10], for $p>2$ the exact complexity remains open. For $p>2$ the current best algorithm uses $O(n^{1-2/p}\log n)$ words of space [AKO11,BO10], whereas the lower bound is of $\Omega(n^{1-2/p})$ [BJKS04]. In this paper, we show a tight lower bound of $\Omega(n^{1-2/p}\log n)$ words for the class of algorithms based on linear sketches, which store only a sketch $Ax$ of input vector $x$ and some (possibly randomized) matrix $A$. We note that all known algorithms for this problem are linear sketches. Comment: In Proceedings of the 40th International Colloquium on Automata, Languages and Programming (ICALP), Riga, Latvia, July 201

arXiv.org e-Print Archive

Crossref

DSpace@MIT

Agrégation d'estimateurs et optimisation stochastique

Author: Tsybakov A.B.
Publication venue: Société Française de Statistique et Société Mathématique de France
Publication date: 01/01/2008

HAL Descartes

Introduction à l'estimation non-paramétrique

Author: Tsybakov A.B.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2003

Series: Mathématiques & Applications, no. 41

Hal-Diderot

Aggregation by exponential weighting and sharp oracle inequalities

Author: Dalalyan A.
Tsybakov A.B.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2007

Hal-Diderot

Improved Matrix Uncertainty selector

Author: Rosenbaum M.
Tsybakov A.B.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2013

Hal-Diderot