4,147 research outputs found

    Parametric versus nonparametric: the fitness coefficient

    The fitness coefficient, introduced in this paper, results from a competition between parametric and nonparametric density estimators within the likelihood of the data. As illustrated on several real datasets, the fitness coefficient generally agrees with p-values but is easier to compute and interpret: it can be read as the proportion of the data coming from the parametric model. Moreover, the fitness coefficient can be used to build a semiparametric compromise that improves inference over both the parametric and nonparametric approaches. From a theoretical perspective, the fitness coefficient is shown to converge in probability to one if the model is true and to zero if the model is false. From a practical perspective, the utility of the fitness coefficient is illustrated on real and simulated datasets.
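
    As a toy illustration of the idea (not the authors' code), the sketch below fits a Gaussian parametric model and a leave-one-out Gaussian kernel density estimator, then maximizes the likelihood of their convex combination; the mixture weight plays the role of the fitness coefficient. The bandwidth rule, the leave-one-out choice, and the function names are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

def loo_kde(x, h):
    """Leave-one-out Gaussian kernel density estimate at each sample point."""
    d = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * d ** 2) / np.sqrt(2.0 * np.pi)
    np.fill_diagonal(k, 0.0)            # drop X_i from its own estimate
    return k.sum(axis=1) / ((len(x) - 1) * h)

def fitness_coefficient(x):
    """Weight lam in [0, 1] maximizing the likelihood of the convex combination
    lam * parametric + (1 - lam) * nonparametric, evaluated at the data."""
    f_par = norm(x.mean(), x.std(ddof=1)).pdf(x)     # Gaussian MLE density
    h = 1.06 * x.std(ddof=1) * len(x) ** (-1 / 5)    # Silverman bandwidth (assumed)
    f_non = loo_kde(x, h)
    nll = lambda lam: -np.sum(np.log(lam * f_par + (1.0 - lam) * f_non))
    return minimize_scalar(nll, bounds=(0.0, 1.0), method="bounded").x

rng = np.random.default_rng(0)
print(fitness_coefficient(rng.normal(size=500)))       # model true: near 1
print(fitness_coefficient(rng.exponential(size=500)))  # model false: near 0
```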

    On the acceleration of some empirical means with application to nonparametric regression

    Let $(X_1,\ldots,X_n)$ be an i.i.d. sequence of random variables in $\mathbb{R}^d$, $d\geq 1$. For some function $\varphi:\mathbb{R}^d\rightarrow\mathbb{R}$, under regularity conditions, we show that $$n^{1/2}\left(n^{-1}\sum_{i=1}^n\frac{\varphi(X_i)}{\widehat{f}^{(i)}(X_i)}-\int\varphi(x)\,dx\right)\stackrel{\mathbb{P}}{\longrightarrow}0,$$ where $\widehat{f}^{(i)}$ is the classical leave-one-out kernel estimator of the density of $X_1$. This result is striking because it improves on the traditional root-$n$ rate derived from the central limit theorem when $\widehat{f}^{(i)}=f$. As a consequence, it improves the classical Monte Carlo procedure for integral approximation. The paper mainly addresses theoretical issues related to the latter result (rates of convergence, bandwidth choice, regularity of $\varphi$), but also treats statistical applications dealing with random-design regression. In particular, we provide the asymptotic normality of estimators of linear functionals of a regression function, for which the only requirement is Hölder regularity. This leads us to a new version of the average derivative estimator introduced by Härdle and Stoker (1989), which allows for dimension reduction by estimating the index space of a regression.
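
    A small numerical sketch of the accelerated mean, under assumed choices (Gaussian kernel, Silverman's rule-of-thumb bandwidth, standard normal design in $d=1$): it compares the classical Monte Carlo estimate of $\int\varphi(x)\,dx$, which uses the true density, with the kernel-weighted version built on the leave-one-out estimator. The helper name loo_kde is illustrative.

```python
import numpy as np

def loo_kde(x, h):
    """Leave-one-out Gaussian kernel density estimate at each sample point."""
    d = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * d ** 2) / np.sqrt(2.0 * np.pi)
    np.fill_diagonal(k, 0.0)
    return k.sum(axis=1) / ((len(x) - 1) * h)

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)                       # X_i i.i.d. ~ N(0,1), density f
phi = lambda t: np.exp(-t ** 2)              # integral of phi equals sqrt(pi)
truth = np.sqrt(np.pi)

f_true = np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)
h = 1.06 * x.std(ddof=1) * n ** (-1 / 5)     # Silverman bandwidth (assumed)

mc = np.mean(phi(x) / f_true)                # classical Monte Carlo (root-n rate)
acc = np.mean(phi(x) / loo_kde(x, h))        # kernel-weighted version (faster rate)
print(abs(mc - truth), abs(acc - truth))     # second error is typically smaller
```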

    Bootstrap Testing of the Rank of a Matrix via Least Squared Constrained Estimation

    In order to test whether an unknown matrix has a given rank (the null hypothesis), we consider the family of statistics that are minimum squared distances between an estimator and the manifold of fixed-rank matrices. Under the null hypothesis, every statistic of this family converges to a weighted chi-squared distribution. In this paper, we introduce the constrained bootstrap to build bootstrap estimates of the null-hypothesis law of such statistics. As a result, the constrained bootstrap is employed to estimate the quantile used for testing the rank. We prove the consistency of the procedure, and simulations shed light on the accuracy of the constrained bootstrap relative to the traditional asymptotic comparison. More generally, the results are extended to testing whether an unknown parameter belongs to a locally smooth sub-manifold. Finally, the constrained bootstrap is easy to compute, handles a large family of tests, and works under mild assumptions.
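
    The following sketch conveys the constrained-bootstrap idea under simplifying assumptions, and is one reading of the abstract rather than the paper's exact procedure: the unknown matrix is estimated by an i.i.d. sample mean, the distance to the fixed-rank manifold is computed via a truncated SVD (Eckart-Young), and the bootstrap statistics are recentred on the rank-r projection so that the bootstrap world satisfies the null.

```python
import numpy as np

def rank_r_proj(m, r):
    """Nearest rank-r matrix in Frobenius norm (Eckart-Young via SVD)."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return (u[:, :r] * s[:r]) @ vt[:r]

def rank_stat(m, r, n):
    """n times the squared Frobenius distance to the rank-r manifold."""
    return n * np.linalg.norm(m - rank_r_proj(m, r)) ** 2

rng = np.random.default_rng(2)
n, r = 400, 1
truth = np.outer([1.0, 2.0, -1.0], [0.5, 1.0, 1.5])   # rank-1 target matrix
data = truth + rng.normal(scale=0.5, size=(n, 3, 3))  # i.i.d. noisy observations
m_hat = data.mean(axis=0)
t_obs = rank_stat(m_hat, r, n)

# Constrained bootstrap: recentre resampled means on the rank-r projection of
# m_hat, so each bootstrap statistic is computed under the null hypothesis.
m0 = rank_r_proj(m_hat, r)
t_boot = []
for _ in range(500):
    idx = rng.integers(0, n, n)
    m_star = data[idx].mean(axis=0)
    t_boot.append(rank_stat(m_star - m_hat + m0, r, n))
print(t_obs, np.quantile(t_boot, 0.95))  # reject H0: rank = r if t_obs exceeds the quantile
```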

    Integral approximation by kernel smoothing

    Let $(X_1,\ldots,X_n)$ be an i.i.d. sequence of random variables in $\mathbb{R}^d$, $d\geq 1$. We show that, for any function $\varphi:\mathbb{R}^d\rightarrow\mathbb{R}$, under regularity conditions, $$n^{1/2}\Biggl(n^{-1}\sum_{i=1}^n\frac{\varphi(X_i)}{\widehat{f}(X_i)}-\int\varphi(x)\,dx\Biggr)\stackrel{\mathbb{P}}{\longrightarrow}0,$$ where $\widehat{f}$ is the classical kernel estimator of the density of $X_1$. This result is striking because it speeds up the traditional root-$n$ rate derived from the central limit theorem when $\widehat{f}=f$. Although this paper highlights some applications, we mainly address theoretical issues related to the latter result. We derive upper bounds for the rate of convergence in probability. These bounds depend on the regularity of the functions $\varphi$ and $f$, the dimension $d$, and the bandwidth of the kernel estimator $\widehat{f}$. Moreover, they are shown to be accurate, since they are used as renormalizing sequences in two central limit theorems, each reflecting different degrees of smoothness of $\varphi$. As an application to regression modelling with random design, we provide the asymptotic normality of the estimation of linear functionals of a regression function. As a consequence of the above result, the asymptotic variance does not depend on the regression function. Finally, we discuss the choice of the bandwidth for integral approximation and highlight the good behavior of our procedure through simulations.
    Comment: Published at http://dx.doi.org/10.3150/15-BEJ725 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm). arXiv admin note: text overlap with arXiv:1312.449
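
    The regression application admits an equally short sketch. Assuming scipy's gaussian_kde in place of a tuned kernel estimator, the linear functional $\int\varphi(x)m(x)\,dx$ is estimated by the sample mean of $\varphi(X_i)Y_i/\widehat{f}(X_i)$; the design, regression function, and weight below are chosen so the target value is known (zero, by symmetry).

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)
n = 2000
x = rng.normal(size=n)                 # random design with density f
m = lambda t: np.sin(t)                # regression function (assumed example)
y = m(x) + 0.3 * rng.normal(size=n)    # Y_i = m(X_i) + noise

phi = lambda t: np.exp(-t ** 2)        # weight function with finite integral
# Target: integral of phi(x) m(x) dx, which is 0 here since phi is even
# and sin is odd, giving a known value to check the estimator against.
theta_hat = np.mean(phi(x) * y / gaussian_kde(x)(x))
print(theta_hat)                       # should be close to 0
```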
    • …