
    Testing the suitability of polynomial models in errors-in-variables problems

    A low-degree polynomial model for a response curve is commonly used in practice. It generally incorporates a linear or quadratic function of the covariate. In this paper we suggest methods for testing the goodness of fit of a general polynomial model when there are errors in the covariates. There, the true covariates are not directly observed, and conventional bootstrap methods for testing are not applicable. We develop a new approach, in which deconvolution methods are used to estimate the distribution of the covariates under the null hypothesis, and a "wild" or moment-matching bootstrap argument is employed to estimate the distribution of the experimental errors (distinct from the distribution of the errors in covariates). Most of our attention is directed at the case where the distribution of the errors in covariates is known, although we also discuss methods for estimation and testing when the covariate error distribution is estimated. No assumptions are made about the distribution of experimental error, and, in particular, we depart substantially from conventional parametric models for errors-in-variables problems. Comment: Published at http://dx.doi.org/10.1214/009053607000000361 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
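    The deconvolution step above is specific to the errors-in-variables setting, but the wild bootstrap ingredient can be illustrated in the error-free case. Below is a minimal sketch assuming an ordinary polynomial regression with a quadratic null tested against a richer polynomial alternative; the degrees, the RSS-ratio statistic and the Rademacher multipliers are illustrative choices, not the paper's exact construction.

```python
import numpy as np

def fit_poly(x, y, degree):
    """Least-squares polynomial fit; returns fitted values."""
    X = np.vander(x, degree + 1, increasing=True)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ beta

def gof_statistic(x, y, null_degree, alt_degree):
    """Relative drop in RSS when moving from the null polynomial to a richer one."""
    rss0 = np.sum((y - fit_poly(x, y, null_degree)) ** 2)
    rss1 = np.sum((y - fit_poly(x, y, alt_degree)) ** 2)
    return (rss0 - rss1) / rss1

def wild_bootstrap_pvalue(x, y, null_degree=2, alt_degree=6, B=999, seed=None):
    rng = np.random.default_rng(seed)
    t_obs = gof_statistic(x, y, null_degree, alt_degree)
    fitted0 = fit_poly(x, y, null_degree)
    resid0 = y - fitted0
    t_boot = np.empty(B)
    for b in range(B):
        # Rademacher multipliers preserve the conditional mean and variance
        # of the residuals; Mammen's two-point law is another common choice.
        v = rng.choice([-1.0, 1.0], size=len(y))
        y_star = fitted0 + resid0 * v
        t_boot[b] = gof_statistic(x, y_star, null_degree, alt_degree)
    return (1 + np.sum(t_boot >= t_obs)) / (B + 1)

# Toy usage: data generated from a cubic, tested against a quadratic null.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 1 + x - 2 * x**3 + 0.3 * rng.standard_normal(200)
print(wild_bootstrap_pvalue(x, y, seed=1))
```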

    Variable selection in measurement error models

    Measurement error data or errors-in-variables data have been collected in many studies. Natural criterion functions are often unavailable for general functional measurement error models because of the lack of information on the distribution of the unobservable covariates. Typically, parameter estimation is carried out by solving estimating equations. In addition, constructing such estimating equations routinely requires solving integral equations, so the computation is often much more intensive than for ordinary regression models. Because of these difficulties, traditional best-subset variable selection procedures are not applicable, and in the measurement error model context variable selection remains an unsolved issue. In this paper, we develop a framework for variable selection in measurement error models via penalized estimating equations. We first propose a class of selection procedures for general parametric measurement error models and for general semiparametric measurement error models, and study the asymptotic properties of the proposed procedures. Then, under certain regularity conditions and with a properly chosen regularization parameter, we demonstrate that the proposed procedure performs as well as an oracle procedure. We assess the finite sample performance via Monte Carlo simulation studies and illustrate the proposed methodology through the empirical analysis of a familiar data set. Comment: Published at http://dx.doi.org/10.3150/09-BEJ205 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
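    As a rough illustration of the penalized-estimating-equation idea, the sketch below applies a SCAD penalty (Fan and Li, 2001) to the simplest possible estimating equation, the least-squares score X'(y - X*beta), and solves it with the local quadratic approximation. It deliberately ignores measurement error, so it conveys only the penalization mechanism, not the paper's measurement error estimating equations; the tuning constants and thresholds are illustrative.

```python
import numpy as np

def scad_derivative(beta_abs, lam, a=3.7):
    """Derivative of the SCAD penalty."""
    return np.where(
        beta_abs <= lam,
        lam,
        np.maximum(a * lam - beta_abs, 0.0) / (a - 1),
    )

def penalized_ee_lqa(X, y, lam, n_iter=100, tol=1e-8, eps=1e-8):
    """
    Solve the penalized estimating equation
        X'(y - X beta) - n * p_lam'(|beta|) * sign(beta) = 0
    by the local quadratic approximation: at each step the penalty is
    replaced by a ridge-type term with weights p_lam'(|beta|)/|beta|.
    """
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        w = scad_derivative(np.abs(beta) + eps, lam) / (np.abs(beta) + eps)
        beta_new = np.linalg.solve(X.T @ X + n * np.diag(w), X.T @ y)
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    beta[np.abs(beta) < 1e-4] = 0.0   # set numerically tiny coefficients to zero
    return beta

# Toy usage: only the first two of eight covariates are active.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 8))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.standard_normal(200)
print(penalized_ee_lqa(X, y, lam=0.15))
```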

    Optimal variance estimation without estimating the mean function

    We study the least squares estimator in the residual variance estimation context. We show that the mean squared differences of paired observations are asymptotically normally distributed. We further establish that, by regressing the mean squared differences of these paired observations on the squared distances between paired covariates via a simple least squares procedure, the resulting variance estimator is not only asymptotically normal and root-n consistent, but also reaches the optimal bound in terms of estimation variance. We also demonstrate the advantage of the least squares estimator in comparison with existing methods in terms of second order asymptotic properties. Comment: Published at http://dx.doi.org/10.3150/12-BEJ432 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
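    The construction described above can be sketched in a few lines. In the sketch below, pairs are formed from nearby observations after sorting by the covariate (up to a small lag), half squared response differences are regressed on squared covariate distances, and the intercept serves as the variance estimate; the pairing scheme and the lag are illustrative assumptions rather than the paper's exact prescription.

```python
import numpy as np

def ls_variance_estimator(x, y, max_lag=5):
    """
    Residual variance estimate for y = m(x) + eps without estimating m.
    For nearby pairs (i, j),
        E[(y_i - y_j)^2 / 2]  ~  sigma^2 + c * (x_i - x_j)^2,
    so regressing the half squared response differences on the squared
    covariate distances gives sigma^2 as the intercept.
    """
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    half_sq_diff, sq_dist = [], []
    for lag in range(1, max_lag + 1):
        half_sq_diff.append((ys[lag:] - ys[:-lag]) ** 2 / 2.0)
        sq_dist.append((xs[lag:] - xs[:-lag]) ** 2)
    d = np.concatenate(half_sq_diff)
    s = np.concatenate(sq_dist)
    design = np.column_stack([np.ones_like(s), s])
    coef, *_ = np.linalg.lstsq(design, d, rcond=None)
    return coef[0]   # intercept = variance estimate

# Toy usage: true residual variance is 0.25.
rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 500)
y = np.sin(2 * np.pi * x) + 0.5 * rng.standard_normal(500)
print(ls_variance_estimator(x, y))   # should be close to 0.25
```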

    Fused kernel-spline smoothing for repeatedly measured outcomes in a generalized partially linear model with functional single index

    We propose a generalized partially linear functional single index risk score model for repeatedly measured outcomes where the index itself is a function of time. We fuse the nonparametric kernel method and the regression spline method, and modify the generalized estimating equation to facilitate estimation and inference. We use local kernel smoothing to estimate the unspecified coefficient functions of time, and use B-splines to estimate the unspecified function of the single index component. The covariance structure is taken into account via a working model, which yields valid estimation and inference whether or not it captures the true covariance. The estimation method is applicable to both continuous and discrete outcomes. We derive large sample properties of the estimation procedure and show a different convergence rate for each component of the model. The asymptotic properties of the kernel and regression spline methods combined in a nested fashion have not been studied prior to this work, even in the independent data case. Comment: Published at http://dx.doi.org/10.1214/15-AOS1330 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
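    The full fused kernel-spline procedure involves repeated measures, a working covariance model and time-varying coefficients, which is beyond a short sketch. The fragment below illustrates only the B-spline ingredient: estimating the unspecified function of a single index by profiling a B-spline least-squares fit over the index direction, for independent continuous data. The basis size, knot placement and optimizer are illustrative choices, not the paper's specification.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize

def bspline_basis(u, n_basis=8, degree=3):
    """B-spline design matrix with knots placed at quantiles of u."""
    inner = np.quantile(u, np.linspace(0, 1, n_basis - degree + 1))
    t = np.concatenate([[inner[0]] * degree, inner, [inner[-1]] * degree])
    B = np.empty((len(u), n_basis))
    for j in range(n_basis):
        c = np.zeros(n_basis)
        c[j] = 1.0
        B[:, j] = BSpline(t, c, degree, extrapolate=True)(u)
    return B

def profile_rss(theta, X, y):
    """Residual sum of squares after fitting the link by B-splines."""
    alpha = theta / np.linalg.norm(theta)   # identifiability: ||alpha|| = 1
    u = X @ alpha
    B = bspline_basis(u)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return np.sum((y - B @ coef) ** 2)

def fit_single_index(X, y):
    theta0 = np.ones(X.shape[1])
    res = minimize(profile_rss, theta0, args=(X, y), method="Nelder-Mead")
    alpha = res.x / np.linalg.norm(res.x)
    return alpha * np.sign(alpha[0])        # fix the sign of the first component

# Toy usage: y = sin(X'alpha) + noise.
rng = np.random.default_rng(3)
X = rng.standard_normal((400, 3))
alpha_true = np.array([0.8, 0.6, 0.0])
y = np.sin(X @ alpha_true) + 0.2 * rng.standard_normal(400)
print(fit_single_index(X, y))   # should be close to alpha_true
```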

    Flexible estimation of a semiparametric two-component mixture model with one parametric component

    We study a two-component semiparametric mixture model where one component distribution belongs to a parametric class, while the other is symmetric but otherwise arbitrary. This semiparametric model has wide applications in many areas such as large-scale simultaneous testing/multiple testing, sequential clustering, and robust modeling. We develop a class of estimators that are surprisingly simple and unique in their construction. A distinctive feature of these methods is that they do not rely on estimation of the nonparametric component of the model. Instead, they only require a working model of the unspecified distribution, which may or may not reflect the true distribution. In addition, we establish connections between the existing estimator and the new methods and further derive a semiparametric efficient estimator. We compare our estimators with the existing method and investigate the advantages and costs of the relatively simple estimation procedure.
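    To make the model concrete, the sketch below takes the parametric component to be a standard normal (as in large-scale testing applications) and fits a normal working model to the other component by EM, reporting the mixing proportion. This is only a baseline conveying the setup and the role of a working model; the paper's estimators are constructed differently, and unlike them this EM fit is generally sensitive to misspecification of the working model.

```python
import numpy as np
from scipy.stats import norm

def em_mixture(x, n_iter=500, tol=1e-10):
    """
    Fit  g(x) = pi * N(0, 1) + (1 - pi) * N(mu, sigma^2)  by EM.
    The N(mu, sigma^2) piece is only a working model for the unspecified
    symmetric component; pi is the quantity of main interest.
    """
    pi, mu, sigma = 0.5, np.mean(x), np.std(x)
    for _ in range(n_iter):
        f0 = norm.pdf(x)                              # known parametric component
        f1 = norm.pdf(x, loc=mu, scale=sigma)         # working model component
        w = pi * f0 / (pi * f0 + (1 - pi) * f1)       # E-step: P(component 0 | x)
        pi_new = np.mean(w)
        mu = np.sum((1 - w) * x) / np.sum(1 - w)      # M-step for the working model
        sigma = np.sqrt(np.sum((1 - w) * (x - mu) ** 2) / np.sum(1 - w))
        if abs(pi_new - pi) < tol:
            pi = pi_new
            break
        pi = pi_new
    return pi, mu, sigma

# Toy usage: 70% "null" N(0,1) observations, 30% shifted symmetric (Laplace).
rng = np.random.default_rng(4)
x = np.concatenate([rng.standard_normal(700), 2.5 + rng.laplace(size=300)])
print(em_mixture(x))
```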

    A spline-assisted semiparametric approach to nonparametric measurement error models

    Nonparametric estimation of the probability density function of a random variable measured with error is considered a difficult problem, in the sense that, depending on the measurement error property, the estimation rate can be as slow as the logarithm of the sample size. Likewise, nonparametric estimation of the regression function with errors in the covariate suffers from the same possibly slow rate. The traditional methods for both problems are based on deconvolution, where the slow convergence rate is caused by the rapid convergence to zero of the Fourier transform of the measurement error density, which appears in the denominators during the construction of these methods. Using a completely different, spline-assisted semiparametric approach, we are able to construct nonparametric estimators of both density functions and regression mean functions that achieve the same nonparametric convergence rate as in the error-free case. Other than requiring the error-prone variable distribution to be compactly supported, our assumptions are not stronger than in the classical deconvolution literature. The performance of these methods is demonstrated through simulations and a data example.
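    For contrast with the spline-assisted approach, the sketch below implements the classical deconvolution kernel density estimator referred to above, assuming a known Gaussian measurement error and a kernel whose Fourier transform is (1 - t^2)^3 on [-1, 1]; the bandwidth and grids are illustrative. The division by the error characteristic function in the integrand is exactly where the slow rates come from.

```python
import numpy as np

def deconvolution_kde(x_grid, w, sigma_u, h, n_t=2001):
    """
    Classical deconvolution kernel density estimator of f_X from W = X + U,
    with U ~ N(0, sigma_u^2) known:
        f_hat(x) = (1/2pi) Int exp(-itx) phi_K(ht) phi_W_hat(t) / phi_U(t) dt,
    where phi_W_hat is the empirical characteristic function of the W's and
    phi_K(s) = (1 - s^2)^3 on [-1, 1], so the integral runs over |t| <= 1/h.
    The division by phi_U(t), which decays like exp(-sigma_u^2 t^2 / 2),
    is what drives the slow convergence rates.
    """
    t = np.linspace(-1.0 / h, 1.0 / h, n_t)
    phi_K = (1.0 - (h * t) ** 2) ** 3
    phi_W = np.mean(np.exp(1j * np.outer(t, w)), axis=1)
    phi_U = np.exp(-0.5 * (sigma_u * t) ** 2)
    integrand = phi_K * phi_W / phi_U             # error CF in the denominator
    f_hat = np.array([
        np.trapz(np.exp(-1j * t * x) * integrand, t).real / (2 * np.pi)
        for x in x_grid
    ])
    return np.maximum(f_hat, 0.0)                 # clip small negative wiggles

# Toy usage: X ~ N(2, 1) contaminated with N(0, 0.5^2) measurement error.
rng = np.random.default_rng(5)
x_true = 2 + rng.standard_normal(1000)
w = x_true + 0.5 * rng.standard_normal(1000)
grid = np.linspace(-2, 6, 81)
print(deconvolution_kde(grid, w, sigma_u=0.5, h=0.3).round(3))
```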