Optimal Rates of Statistical Seriation
Given a matrix, the seriation problem consists in permuting its rows in such
a way that all its columns have the same shape, for example, they are monotone
increasing. We propose a statistical approach to this problem where the matrix
of interest is observed with noise and study the corresponding minimax rate of
estimation of the matrices. Specifically, when the columns are either unimodal
or monotone, we show that the least squares estimator is optimal up to
logarithmic factors and adapts to matrices with a certain natural structure.
Finally, we propose a computationally efficient estimator in the monotonic case
and study its performance both theoretically and experimentally. Our work is at
the intersection of shape-constrained estimation and recent work that involves
permutation learning, such as graph denoising and ranking.
Comment: v2 corrects an error in Lemma A.1; v3 corrects Appendix F on unimodal
regression, where the bounds now hold with polynomial rather than exponential
probability.
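To make the monotone setting concrete, here is a minimal Python sketch, assuming a naive pilot ordering of the rows by their means followed by columnwise isotonic regression. This illustrates the shape constraint only; it is not the estimator analyzed in the paper, and the ordering heuristic is an assumption of the sketch.

```python
# Illustrative sketch (not the paper's estimator): denoise a matrix whose
# columns are monotone after an unknown row permutation. Heuristic: order
# rows by their row means, then project each column onto the monotone cone.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def seriate_and_smooth(Y):
    """Y: (n, m) noisy matrix. Returns an estimated row order and the fit."""
    order = np.argsort(Y.mean(axis=1))          # pilot ordering of the rows
    Y_sorted = Y[order]
    positions = np.arange(Y.shape[0])
    iso = IsotonicRegression(increasing=True)
    # Isotonic regression per column = least squares under monotonicity.
    fit = np.column_stack([iso.fit_transform(positions, Y_sorted[:, j])
                           for j in range(Y.shape[1])])
    return order, fit

rng = np.random.default_rng(0)
truth = np.outer(np.linspace(0, 1, 50), np.ones(5)).cumsum(axis=0)
perm = rng.permutation(50)
order, fit = seriate_and_smooth(truth[perm] + 0.1 * rng.standard_normal((50, 5)))
```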
General nonexact oracle inequalities for classes with a subexponential envelope
We show that empirical risk minimization procedures and regularized empirical
risk minimization procedures satisfy nonexact oracle inequalities in an
unbounded framework, under the assumption that the class has a subexponential
envelope function. The main novelty, beyond dispensing with boundedness
assumptions, is that these inequalities can yield fast rates even in situations
in which exact oracle inequalities hold only with slower rates. We apply these
results to show that procedures based on $\ell_1$ and nuclear norm
regularization functions satisfy oracle inequalities with a residual term that
decreases like $\log n/n$ for every $L_q$-loss function ($q \geq 2$), while only
assuming that the tail behavior of the input and output variables are well
behaved. In particular, no RIP-type assumption or "incoherence condition" is
needed to obtain fast residual terms in those setups. We also apply these
results to the problems of convex aggregation and model selection.
Comment: Published at http://dx.doi.org/10.1214/11-AOS965 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
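As a concrete instance of the $\ell_1$-regularized empirical risk minimization covered by these inequalities, here is a minimal sketch using the Lasso; the design, sparsity level, and tuning constant are illustrative assumptions, not choices from the paper.

```python
# Minimal sketch: l1-regularized empirical risk minimization (the Lasso),
# one concrete instance of the regularized procedures discussed above.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p, s = 200, 500, 5                        # n samples, p features, s-sparse truth
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:s] = 1.0
y = X @ beta + 0.5 * rng.standard_normal(n)

# alpha ~ sqrt(log(p)/n) is a common theoretical scaling, used here as a
# placeholder rather than the paper's constant.
model = Lasso(alpha=np.sqrt(np.log(p) / n)).fit(X, y)
print("nonzero coefficients recovered:", np.flatnonzero(model.coef_)[:10])
```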
Sparse Regression Learning by Aggregation and Langevin Monte-Carlo
We consider the problem of regression learning for deterministic design and
independent random errors. We start by proving a sharp PAC-Bayesian type bound
for the exponentially weighted aggregate (EWA) under the expected squared
empirical loss. For a broad class of noise distributions the presented bound is
valid whenever the temperature parameter $\beta$ of the EWA is larger than or
equal to $4\sigma^2$, where $\sigma^2$ is the noise variance. A remarkable
feature of this result is that it is valid even for unbounded regression
functions and the choice of the temperature parameter depends exclusively on
the noise level. Next, we apply this general bound to the problem of
aggregating the elements of a finite-dimensional linear space spanned by a
dictionary of functions $\phi_1, \dots, \phi_M$. We allow $M$ to be much larger
than the sample size $n$, but we assume that the true regression function can be
well approximated by a sparse linear combination of the functions $\phi_j$. Under
this sparsity scenario, we propose an EWA with a heavy tailed prior and we show
that it satisfies a sparsity oracle inequality with leading constant one.
Finally, we propose several Langevin Monte-Carlo algorithms to approximately
compute such an EWA when the number $M$ of aggregated functions can be large.
We discuss in some detail the convergence of these algorithms and present
numerical experiments that confirm our theoretical findings.
Comment: A short version was published in COLT 2009.
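Here is a toy sketch of exponentially weighted aggregation over a finite dictionary, with the temperature set at the $4\sigma^2$ threshold from the bound; the dictionary, prior (uniform), and data are made-up assumptions for illustration.

```python
# Toy sketch of exponentially weighted aggregation (EWA) over a finite
# dictionary, with a uniform prior and temperature beta = 4*sigma^2.
import numpy as np

rng = np.random.default_rng(2)
n, M, sigma = 100, 20, 0.3
x = np.linspace(0, 1, n)
dictionary = np.array([np.sin((j + 1) * np.pi * x) for j in range(M)])  # (M, n)
y = dictionary[3] + sigma * rng.standard_normal(n)   # true function is f_4

beta = 4 * sigma**2                                  # temperature threshold
risks = ((dictionary - y) ** 2).mean(axis=1)         # empirical risk of each f_j
logw = -n * risks / beta
w = np.exp(logw - logw.max())
w /= w.sum()                                         # EWA weights
f_hat = w @ dictionary                               # aggregated estimator
```

With a finite dictionary the weights can be enumerated directly, as above; with a continuous (e.g., heavy-tailed sparsity) prior they cannot, which is where the Langevin Monte-Carlo approximation enters.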
Model averaging: A shrinkage perspective
Model averaging (MA), a technique for combining estimators from a set of
candidate models, has attracted increasing attention in machine learning and
statistics. In the existing literature, there is an implicit understanding that
MA can be viewed as a form of shrinkage estimation that draws the response
vector towards the subspaces spanned by the candidate models. This paper
explores this perspective by establishing connections between MA and shrinkage
in a linear regression setting with multiple nested models. We first
demonstrate that the optimal MA estimator is the best linear estimator with
monotone non-increasing weights in a Gaussian sequence model. The Mallows MA,
which estimates the weights by minimizing Mallows' $C_p$, is a variation of the
positive-part Stein estimator. Motivated by these connections, we develop a
novel MA procedure based on blockwise Stein estimation. Our resulting
Stein-type MA estimator is asymptotically optimal across a broad parameter
space when the variance is known. Numerical results support our theoretical
findings. The connections established in this paper may open up new avenues for
investigating MA from different perspectives. A discussion of some topics for
future research concludes the paper.
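The following schematic shows Mallows-style weight selection over nested least-squares fits: minimize the penalized residual criterion over the weight simplex. The setup follows the standard Mallows MA recipe under a known variance; it is a sketch of the general technique, not necessarily the paper's exact formulation.

```python
# Schematic Mallows model averaging over nested linear models: choose simplex
# weights minimizing ||y - sum_k w_k yhat_k||^2 + 2*sigma^2 * sum_k w_k * df_k.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n, p, sigma = 100, 8, 1.0
X = rng.standard_normal((n, p))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + sigma * rng.standard_normal(n)

# Fitted values and degrees of freedom of the nested models X[:, :k].
fits, dfs = [], []
for k in range(1, p + 1):
    Xk = X[:, :k]
    fits.append(Xk @ np.linalg.lstsq(Xk, y, rcond=None)[0])
    dfs.append(k)
fits, dfs = np.array(fits), np.array(dfs)

def mallows(w):
    resid = y - w @ fits
    return resid @ resid + 2 * sigma**2 * (w @ dfs)

w0 = np.full(p, 1.0 / p)
res = minimize(mallows, w0, method="SLSQP",
               bounds=[(0, 1)] * p,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1})
print("MMA weights:", np.round(res.x, 3))
```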