
    On the acceleration of some empirical means with application to nonparametric regression

    Let $(X_1,\ldots,X_n)$ be an i.i.d. sequence of random variables in $\mathbb{R}^d$, $d\geq 1$. For some function $\varphi:\mathbb{R}^d\rightarrow\mathbb{R}$, under regularity conditions, we show that \begin{align*} n^{1/2} \left(n^{-1} \sum_{i=1}^n \frac{\varphi(X_i)}{\widehat{f}^{(i)}(X_i)}-\int \varphi(x)\,dx \right) \overset{\mathbb{P}}{\longrightarrow} 0, \end{align*} where $\widehat{f}^{(i)}$ is the classical leave-one-out kernel estimator of the density of $X_1$. This result is striking because it speeds up the traditional root-$n$ rates derived from the central limit theorem when $\widehat{f}^{(i)}=f$. As a consequence, it improves on the classical Monte Carlo procedure for integral approximation. The paper mainly addresses theoretical issues related to the latter result (rates of convergence, bandwidth choice, regularity of $\varphi$), but also considers statistical applications dealing with random-design regression. In particular, we provide the asymptotic normality of the estimator of linear functionals of a regression function under the sole requirement of H\"older regularity. This leads us to a new version of the \textit{average derivative estimator} introduced by H\"ardle and Stoker in \cite{hardle1989}, which allows for \textit{dimension reduction} by estimating the \textit{index space} of a regression.
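    A minimal sketch of this estimator in Python may help fix ideas: it exploits the identity $E[\varphi(X)/f(X)]=\int\varphi(x)\,dx$, with $f$ replaced by the leave-one-out kernel estimator. The Gaussian kernel, the fixed bandwidth, and the test integrand are illustrative choices of this sketch, not the ones the paper prescribes.

```python
# Sketch: leave-one-out kernel-weighted mean as an integral estimator.
# Kernel, bandwidth and integrand are illustrative assumptions.
import numpy as np

def loo_kernel_integral(X, phi, h):
    """n^{-1} sum_i phi(X_i) / f_loo(X_i), with f_loo the leave-one-out KDE."""
    n, d = X.shape
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # |X_i - X_j|^2
    K = np.exp(-D2 / (2 * h**2)) / ((2 * np.pi) ** (d / 2) * h**d)
    np.fill_diagonal(K, 0.0)               # drop observation i from its own fit
    f_loo = K.sum(axis=1) / (n - 1)
    return np.mean(phi(X) / f_loo)

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 1))             # i.i.d. design, d = 1
phi = lambda x: np.exp(-(x[:, 0] ** 2))    # its integral over R is sqrt(pi)
print(loo_kernel_integral(X, phi, h=0.3), "vs", np.sqrt(np.pi))
```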

    Parametric versus nonparametric: the fitness coefficient

    The fitness coefficient, introduced in this paper, results from a competition between parametric and nonparametric density estimators within the likelihood of the data. As illustrated on several real datasets, the fitness coefficient generally agrees with p-values but is easier to compute and interpret. Namely, the fitness coefficient can be interpreted as the proportion of data coming from the parametric model. Moreover, the fitness coefficient can be used to build a semiparametric compromise which improves inference over both the parametric and nonparametric approaches. From a theoretical perspective, the fitness coefficient is shown to converge in probability to one if the model is true and to zero if the model is false. From a practical perspective, the utility of the fitness coefficient is illustrated on real and simulated datasets.
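    The competition can be sketched as follows: a fitted parametric density and a kernel estimate are mixed inside a likelihood, and the maximizing weight acts as the fitness coefficient. The Gaussian model, the sample split used to keep the likelihood honest, and the grid search are all assumptions of this sketch, not the paper's actual construction.

```python
# Sketch of the parametric-vs-nonparametric competition idea; the
# Gaussian model, sample split and grid search are assumptions.
import numpy as np
from scipy.stats import norm, gaussian_kde

def fitness_coefficient(x):
    xf, xe = x[: len(x) // 2], x[len(x) // 2 :]                # fit / evaluate
    f_par = norm(loc=xf.mean(), scale=xf.std(ddof=1)).pdf(xe)  # parametric fit
    f_np = gaussian_kde(xf)(xe)                                # kernel estimate
    grid = np.linspace(0.0, 1.0, 201)
    loglik = [np.sum(np.log(w * f_par + (1 - w) * f_np)) for w in grid]
    return grid[int(np.argmax(loglik))]    # maximizing mixture weight

rng = np.random.default_rng(1)
print(fitness_coefficient(rng.normal(size=1000)))       # Gaussian model true
print(fitness_coefficient(rng.exponential(size=1000)))  # Gaussian model false
```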

    Bootstrap Testing of the Rank of a Matrix via Least Squared Constrained Estimation

    In order to test whether an unknown matrix has a given rank (the null hypothesis), we consider the family of statistics given by the minimum squared distance between an estimator and the manifold of fixed-rank matrices. Under the null hypothesis, every statistic of this family converges to a weighted chi-squared distribution. In this paper, we introduce the constrained bootstrap to build a bootstrap estimate of the null distribution of such statistics. As a result, the constrained bootstrap is employed to estimate the quantile used for testing the rank. We prove the consistency of the procedure, and simulations shed light on the accuracy of the constrained bootstrap compared with the traditional asymptotic approach. More generally, the results are extended to testing whether an unknown parameter belongs to a locally smooth sub-manifold. Finally, the constrained bootstrap is easy to compute, handles a large family of tests, and works under mild assumptions.
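    One plausible instantiation of this test, under assumptions not stated in the abstract (an i.i.d. sample of noisy matrices averaged into the estimator, trailing singular values as the distance to the rank-r manifold, and recentring as the constraining device), looks as follows.

```python
# Plausible instantiation: statistic = n * squared distance from an
# averaged matrix estimator to the rank-r manifold; the constrained
# bootstrap recentres the data so that the null (rank r) holds exactly.
# The i.i.d.-average estimator and recentring device are assumptions.
import numpy as np

def dist2_to_rank(M, r):
    s = np.linalg.svd(M, compute_uv=False)
    return np.sum(s[r:] ** 2)              # squared distance to rank-r manifold

def rank_test(Ms, r, B=999, seed=0):
    rng = np.random.default_rng(seed)
    n = len(Ms)
    M_hat = Ms.mean(axis=0)
    stat = n * dist2_to_rank(M_hat, r)
    U, s, Vt = np.linalg.svd(M_hat)
    M_null = (U[:, :r] * s[:r]) @ Vt[:r]   # projection onto the rank-r manifold
    Ms_null = Ms - M_hat + M_null          # recentred sample: null holds exactly
    boot = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, n)
        boot[b] = n * dist2_to_rank(Ms_null[idx].mean(axis=0), r)
    return stat, np.mean(boot >= stat)     # statistic and bootstrap p-value

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 2)) @ rng.normal(size=(2, 5))   # true rank 2
Ms = A + 0.1 * rng.normal(size=(300, 4, 5))             # noisy observations
print(rank_test(Ms, r=2))
```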

    Integral approximation by kernel smoothing

    Let $(X_1,\ldots,X_n)$ be an i.i.d. sequence of random variables in $\mathbb{R}^d$, $d\geq 1$. We show that, for any function $\varphi:\mathbb{R}^d\rightarrow\mathbb{R}$, under regularity conditions, $$n^{1/2}\Biggl(n^{-1}\sum_{i=1}^n\frac{\varphi(X_i)}{\widehat{f}(X_i)}-\int \varphi(x)\,dx\Biggr)\stackrel{\mathbb{P}}{\longrightarrow}0,$$ where $\widehat{f}$ is the classical kernel estimator of the density of $X_1$. This result is striking because it speeds up the traditional root-$n$ rates derived from the central limit theorem when $\widehat{f}=f$. Although this paper highlights some applications, we mainly address theoretical issues related to the latter result. We derive upper bounds for the rate of convergence in probability. These bounds depend on the regularity of the functions $\varphi$ and $f$, the dimension $d$, and the bandwidth of the kernel estimator $\widehat{f}$. Moreover, they are shown to be accurate since they are used as renormalizing sequences in two central limit theorems, each reflecting different degrees of smoothness of $\varphi$. As an application to regression modelling with random design, we provide the asymptotic normality of the estimation of linear functionals of a regression function. As a consequence of the above result, the asymptotic variance does not depend on the regression function. Finally, we discuss the choice of the bandwidth for integral approximation and we highlight the good behavior of our procedure through simulations.
    Comment: Published at http://dx.doi.org/10.3150/15-BEJ725 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm). arXiv admin note: text overlap with arXiv:1312.449
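    The claimed acceleration can be probed numerically: replacing the true density $f$ by its kernel estimate in the importance-sampling average tends to shrink the error below the root-$n$ Monte Carlo benchmark, with the size of the effect depending on the bandwidth. A small experiment, with scipy's default bandwidth rule standing in for the paper's bandwidth choice:

```python
# Numerical probe of the acceleration claim: importance-sampling means
# weighted by a kernel density estimate vs. by the true density f.
# Sample size, bandwidth rule (scipy default) and integrand are
# illustrative assumptions.
import numpy as np
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(3)
phi = lambda x: np.exp(-(x**2))            # target: integral equals sqrt(pi)
truth = np.sqrt(np.pi)
n, reps = 1000, 100
err_true, err_kde = [], []
for _ in range(reps):
    X = rng.normal(size=n)
    err_true.append(np.mean(phi(X) / norm.pdf(X)) - truth)     # knows f
    err_kde.append(np.mean(phi(X) / gaussian_kde(X)(X)) - truth)
print("RMSE with true density   :", np.sqrt(np.mean(np.square(err_true))))
print("RMSE with kernel estimate:", np.sqrt(np.mean(np.square(err_kde))))
```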

    Integral estimation based on Markovian design

    Suppose that a mobile sensor describes a Markovian trajectory in the ambient space. At each time, the sensor measures an attribute of interest, e.g., the temperature. Using only the location history of the sensor and the associated measurements, the aim is to estimate the average value of the attribute over the space. In contrast to classical probabilistic integration methods, e.g., Monte Carlo, the proposed approach does not require any knowledge of the distribution of the sensor trajectory. Probabilistic bounds on the convergence rates of the estimator are established. These rates are better than the traditional "root-n" rate, where n is the sample size, attached to other probabilistic integration methods. For finite sample sizes, the good behaviour of the procedure is demonstrated through simulations, and an application to the evaluation of the average temperature of the oceans is considered.
    Comment: 45 pages
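    A toy version of this setting: a reflected random walk on [0, 1] plays the sensor trajectory, and the space average of the attribute is recovered by weighting each measurement with a kernel estimate of the trajectory's occupation density, without using the law of the walk. The walk, the kernel, and the attribute below are illustrative assumptions, not the paper's estimator in full.

```python
# Toy Markovian design: a reflected random walk on [0, 1] as the sensor
# trajectory; the walk, kernel and attribute are illustrative choices.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(4)
n = 5000
x = np.empty(n)
x[0] = 0.5
for t in range(1, n):                      # Markov chain; its law is never used
    step = x[t - 1] + rng.normal(scale=0.05)
    step = abs(step)                       # reflect at 0
    x[t] = 2.0 - step if step > 1.0 else step  # reflect at 1
g = lambda u: np.sin(2 * np.pi * u) ** 2   # attribute; space average is 0.5
f_hat = gaussian_kde(x)(x)                 # occupation-density estimate
print(np.mean(g(x) / f_hat))               # estimate of the space average
```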

    Test function: A new approach for covering the central subspace

    In this paper we offer a complete methodology for sufficient dimension reduction, called the test function (TF) approach. TF provides a new family of methods for the estimation of the central subspace (CS), based on the introduction of a nonlinear transformation of the response. The theoretical background of TF is developed under weaker conditions than for existing methods. By considering order-1 and order-2 conditional moments of the predictor given the response, we divide TF into two classes. In each class we provide conditions that guarantee an exhaustive estimation of the CS. Moreover, the optimal members are calculated via the minimization of the asymptotic mean squared error derived from the distance between the CS and its estimate. This leads us to two plug-in methods which are evaluated through several simulations.
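    A sketch of the order-1 class: after standardizing the predictor, each nonlinear transform $\psi$ of the response yields a vector $E[Z\psi(Y)]$ lying in the central subspace, and stacking several such vectors recovers it via an SVD. The single-index model and the ad hoc family of test functions below are assumptions of the sketch, not the paper's optimal members; coordinatewise standardization suffices here only because the simulated predictors are independent.

```python
# Order-1 sketch: after standardizing X, each vector E[Z psi(Y)] lies in
# the central subspace; stack several and take the leading singular
# vector. Model and test-function family are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(5)
n, p = 5000, 6
beta = np.zeros(p); beta[0] = 1.0           # true central subspace: span{e_1}
X = rng.normal(size=(n, p))                 # independent predictors
y = np.sin(X @ beta) + 0.1 * rng.normal(size=n)

Z = (X - X.mean(axis=0)) / X.std(axis=0)    # coordinatewise standardization
psis = [lambda t: t, np.sin, np.cos, lambda t: t**2]  # some may be uninformative
M = np.column_stack([Z.T @ psi(y) / n for psi in psis])  # p x 4 candidate matrix
U, s, _ = np.linalg.svd(M, full_matrices=False)
b_hat = U[:, 0]                             # leading estimated CS direction
print(abs(b_hat @ beta))                    # near 1 when the CS is recovered
```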