Sharp estimation in sup norm with random design
The aim of this paper is to recover the regression function with sup norm
loss. We construct an asymptotically sharp estimator which converges with the
spatially dependent rate r\_{n, \mu}(x) = P \big(\log n / (n \mu(x)) \big)^{s /
(2s + 1)}, where \mu is the design density, s the regression smoothness,
n the sample size, and P is a constant expressed in terms of a solution to a
problem of optimal recovery as in Donoho (1994). We prove this result under the
assumption that \mu is positive and continuous. This estimator combines
kernel and local polynomial methods, where the kernel is given by optimal
recovery, which allows us to prove the result up to the constants for any
smoothness. Moreover, the estimator does not depend on \mu. We prove that the
rate r\_{n, \mu} is optimal in a sense which is stronger than the classical minimax
lower bound. Then, an inhomogeneous confidence band is proposed. This band has
a non-constant length which depends on the local amount of data.
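The spatially dependent rate above can be evaluated numerically; the sketch below assumes an illustrative design density mu and constant P = 1 (both hypothetical, not taken from the paper) to show that the rate degrades where the design density is small:

```python
import numpy as np

def sup_norm_rate(x, n, mu, s, P=1.0):
    """Evaluate r_{n, mu}(x) = P * (log n / (n * mu(x)))^(s / (2s + 1)).

    The rate is larger (slower convergence) where the design density mu
    is small, i.e. where little data is available locally.
    """
    return P * (np.log(n) / (n * mu(x))) ** (s / (2.0 * s + 1.0))

# An assumed positive, continuous density on [0, 1] with less mass near
# the edges (it integrates to 1: 0.5 + 3 * 1/6 = 1).
mu = lambda x: 0.5 + 3.0 * x * (1.0 - x)

# The rate near the edge (scarce data) exceeds the rate at the center.
edge = sup_norm_rate(0.05, 10**4, mu, s=2.0)
center = sup_norm_rate(0.5, 10**4, mu, s=2.0)
```

Evaluating the two points shows the inhomogeneous behaviour the confidence band is built to reflect: a wider band where `mu` is small, a narrower one where data is dense.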
Convergence rates for pointwise curve estimation with a degenerate design
The nonparametric regression with a random design model is considered. We
want to recover the regression function at a point x where the design density
is vanishing or exploding. Depending on assumptions on the regression function
local regularity and on the design local behaviour, we find several minimax
rates. These rates lie in a wide range, from slow l(n) rates, where l(.) is
slowly varying (for instance (log n)^(-1)), to fast n^(-1/2) * l(n) rates. If
the continuity modulus of the regression function at x can be bounded from
above by an s-regularly varying function, and if the design density is
b-regularly varying, we prove that the minimax convergence rate at x is
n^(-s/(1+2s+b)) * l(n).
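The effect of the design exponent b on the pointwise rate n^(-s/(1+2s+b)) * l(n) can be illustrated numerically; the choice l(n) = log n below is an assumption for the sketch, not a claim about the paper's slowly varying factor:

```python
import numpy as np

def pointwise_rate(n, s, b, l=np.log):
    """Minimax rate n^(-s/(1+2s+b)) * l(n) at a point where the regression
    is s-regular and the design density is b-regularly varying."""
    return n ** (-s / (1.0 + 2.0 * s + b)) * l(n)

n = 10**6
vanishing = pointwise_rate(n, s=1.0, b=2.0)    # vanishing design: slower rate
classical = pointwise_rate(n, s=1.0, b=0.0)    # b = 0 recovers n^(-s/(1+2s))
exploding = pointwise_rate(n, s=1.0, b=-0.5)   # exploding design: faster rate
```

A vanishing design (b > 0) inflates the exponent's denominator and slows estimation, while an exploding design (b < 0) accelerates it toward the n^(-1/2) regime described above.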
Link Prediction in Graphs with Autoregressive Features
In this paper, we consider the problem of link prediction in time-evolving
graphs. We assume that certain graph features, such as the node degree, follow
a vector autoregressive (VAR) model and we propose to use this information to
improve the accuracy of prediction. Our strategy involves a joint optimization
procedure over the space of adjacency matrices and VAR matrices which takes
into account both sparsity and low rank properties of the matrices. Oracle
inequalities are derived and illustrate the trade-offs in the choice of
smoothing parameters when modeling the joint effect of sparsity and low rank
property. The estimate is computed efficiently using proximal methods through a
generalized forward-backward algorithm.
Comment: NIPS 201
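The two penalties combined in the procedure each admit a closed-form proximal operator; the sketch below shows these generic building blocks and one illustrative forward-backward step on a toy quadratic loss. It is not the paper's joint graph/VAR estimator, and composing the two proximal maps only approximates the proximal operator of the summed penalty (a generalized forward-backward algorithm keeps one auxiliary variable per penalty instead):

```python
import numpy as np

def prox_l1(A, t):
    """Entrywise soft-thresholding: proximal operator of t * ||A||_1,
    which promotes sparsity of the matrix."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def prox_nuclear(A, t):
    """Singular value thresholding: proximal operator of t * ||A||_*
    (nuclear norm), which promotes low rank."""
    U, sv, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(sv - t, 0.0)) @ Vt

# One illustrative step on the toy smooth loss 0.5 * ||A - Y||_F^2,
# whose gradient at A is (A - Y).
rng = np.random.default_rng(0)
Y = rng.standard_normal((5, 5))
A = np.zeros_like(Y)
step, lam = 0.5, 0.1
A = prox_nuclear(prox_l1(A - step * (A - Y), lam * step), lam * step)
```

Soft-thresholding zeroes small entries while singular value thresholding zeroes small singular values, which is how the trade-off between sparsity and low rank enters the iterations.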
Estimation for the Prediction of Point Processes with Many Covariates
Estimation of the intensity of a point process is considered within a
nonparametric framework. The intensity measure is unknown and depends on
covariates, possibly many more than the observed number of jumps. Only a single
trajectory of the counting process is observed. Interest lies in estimating the
intensity conditional on the covariates. The impact of the covariates is
modelled by an additive model where each component can be written as a linear
combination of possibly unknown functions. The focus is on prediction as
opposed to variable screening. Conditions are imposed on the coefficients of
this linear combination in order to control the estimation error. The rates of
convergence are optimal when the number of active covariates is large. As an
application, the intensity of the buy and sell trades of the New Zealand dollar
futures is estimated and a test for forecast evaluation is presented. A
simulation is included to provide some finite-sample intuition on the model and
asymptotic properties.
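The additive structure of the intensity can be made concrete with a small sketch; the dictionary of basis functions and the coefficient values below are illustrative assumptions, not the paper's estimator:

```python
import numpy as np

def dictionary(x):
    """Candidate basis functions evaluated at one covariate value
    (an assumed dictionary: constant, linear, and two trigonometric terms)."""
    return np.array([1.0, x, np.cos(np.pi * x), np.sin(np.pi * x)])

def additive_intensity(covariates, coefs):
    """Additive model for the conditional intensity:
    lambda(x) = sum_j f_j(x_j), where each component f_j is a linear
    combination of dictionary functions with coefficient vector coefs[j]."""
    return sum(dictionary(x_j) @ c_j for x_j, c_j in zip(covariates, coefs))

# Sparse coefficients mimic a setting with few active covariates: only the
# first two of many covariates contribute to the intensity.
coefs = [np.array([0.0, 1.0, 0.0, 0.0]),   # f_1(x) = x
         np.array([2.0, 0.0, 0.0, 0.0]),   # f_2(x) = 2
         np.zeros(4)]                      # inactive covariate
value = additive_intensity([0.5, 0.3, 0.9], coefs)
```

Conditions on such coefficient vectors (e.g. sparsity) are what control the estimation error when the number of covariates exceeds the number of observed jumps.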
Nonparametric regression and spatially inhomogeneous information
We study the nonparametric estimation of a signal based on inhomogeneous noisy data (the amount of data varies over the estimation domain). We consider the model of nonparametric regression with random design. Our aim is to understand the consequences of inhomogeneous data on the estimation problem in the minimax setup. Our approach is twofold: local and global. In the local setup, we want to recover the regression at a point with little, or much, data. By translating this property into several assumptions on the design density, we obtain a large range of new minimax rates, containing very slow and very fast rates. Then, we construct a smoothness-adaptive procedure, and we show that it converges with a minimax rate penalised by a minimal cost for local adaptation. In the global setup, we want to recover the regression with sup norm loss. We propose estimators converging with rates which are sensitive to the inhomogeneous behaviour of the information in the model. We prove the spatial optimality of these rates, which consists in an enforcement of the classical minimax lower bound for sup norm loss. In particular, we construct an asymptotically sharp estimator over Hölder balls with any smoothness, and a confidence band with a width which adapts to the local amount of data.
High dimensional matrix estimation with unknown variance of the noise
We propose a new pivotal method for estimating high-dimensional matrices. Assume that we observe a small set of entries, or linear combinations of entries, of an unknown matrix corrupted by noise. We propose a new method for estimating this matrix which does not rely on the knowledge or an estimation of the standard deviation of the noise. Our estimator achieves, up to a logarithmic factor, optimal rates of convergence under the Frobenius risk and thus has the same prediction performance as previously proposed estimators which rely on knowledge of the noise standard deviation. Our method is based on the solution of a convex optimization problem, which makes it computationally attractive.
Sharp oracle inequalities for the prediction of a high-dimensional matrix
We consider the problem of prediction of a high-dimensional matrix with noise, meaning that the number of matrix entries is much larger than the sample size. We focus on the trace norm minimization algorithm, but also on other penalizations. It is now well known that such algorithms can be used for matrix completion, as well as for other problems such as multi-task learning; see \cite{candes-plan2,candes-recht08,candes-plan1,candes-tao1, rohde-tsyb09, MR2417263}. In this work, we propose sharp oracle inequalities in a statistical learning setup.