
    Variable selection in measurement error models

    Measurement error data or errors-in-variables data have been collected in many studies. Natural criterion functions are often unavailable for general functional measurement error models due to the lack of information on the distribution of the unobservable covariates. Typically, the parameters are estimated by solving estimating equations. In addition, the construction of such estimating equations routinely requires solving integral equations, hence the computation is often much more intensive than for ordinary regression models. Because of these difficulties, traditional best subset variable selection procedures are not applicable, and variable selection in the measurement error model context remains an unsolved issue. In this paper, we develop a framework for variable selection in measurement error models via penalized estimating equations. We first propose a class of selection procedures for general parametric and semiparametric measurement error models, and study the asymptotic properties of the proposed procedures. Then, under certain regularity conditions and with a properly chosen regularization parameter, we demonstrate that the proposed procedure performs as well as an oracle procedure. We assess the finite sample performance via Monte Carlo simulation studies and illustrate the proposed methodology through the empirical analysis of a familiar data set. Comment: Published at http://dx.doi.org/10.3150/09-BEJ205 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
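    For orientation, the penalized estimating equation idea can be sketched schematically as follows (notation assumed here, not necessarily the exact formulation of the paper): an unbiased estimating function U_n(β) for the measurement error model is augmented with the derivative of a sparsity-inducing penalty such as SCAD, and the selected model corresponds to the components set to zero at the root.

```latex
% Hedged sketch of a penalized estimating equation (notation assumed, not taken from the paper).
% U_n(\beta): an unbiased estimating function for the measurement error model;
% p_\lambda: a sparsity-inducing penalty (e.g. SCAD) with regularization parameter \lambda.
\[
  U_n^{P}(\beta) \;=\; U_n(\beta) \;-\; n \, p'_{\lambda}\!\bigl(|\beta|\bigr)\,\operatorname{sign}(\beta) \;=\; 0,
\]
\[
  \text{SCAD: }\;
  p'_{\lambda}(t) \;=\; \lambda \Bigl\{ \mathbf{1}(t \le \lambda)
    + \tfrac{(a\lambda - t)_{+}}{(a-1)\lambda}\, \mathbf{1}(t > \lambda) \Bigr\},
  \qquad a > 2, \; t > 0 .
\]
```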

    Some Recent Advances in Measurement Error Models and Methods

    A measurement error model is a regression model with (substantial) measurement errors in the variables. Disregarding these measurement errors in estimating the regression parameters results in asymptotically biased estimators. Several methods have been proposed to eliminate, or at least to reduce, this bias, and the relative efficiency and robustness of these methods have been compared. The paper gives an account of these endeavors. In another context, when data are of a categorical nature, classification errors play a role similar to that of measurement errors in continuous data. The paper also reviews some recent advances in this field.
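    As a quick numerical illustration of the asymptotic bias mentioned above (a sketch on synthetic data with assumed parameter values, not an example from the paper), the ordinary least squares slope in a simple linear model is attenuated towards zero by the factor σ_x²/(σ_x² + σ_u²) when the regressor is measured with additive error of variance σ_u²:

```python
# Hedged sketch (synthetic data): attenuation of the naive OLS slope when the
# regressor is observed with additive measurement error.
import numpy as np

rng = np.random.default_rng(0)
n, beta = 100_000, 2.0
sigma_x, sigma_u = 1.0, 0.5                 # sd of the true regressor and of the measurement error

x = rng.normal(0.0, sigma_x, n)             # true (unobserved) regressor
X = x + rng.normal(0.0, sigma_u, n)         # observed, error-contaminated regressor
y = beta * x + rng.normal(0.0, 1.0, n)

naive_slope = np.polyfit(X, y, 1)[0]        # OLS of y on X, ignoring the error
reliability = sigma_x**2 / (sigma_x**2 + sigma_u**2)
print(naive_slope)                          # close to beta * reliability = 1.6, not 2.0
print(beta * reliability)
```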

    Measurement error models for time series

    Estimation for multivariate linear measurement error models with serially correlated observations is addressed. The asymptotic properties of some standard linear errors-in-variables regression parameter estimators are developed under an ultrastructural model in which the random components of the model follow a linear process. Under the same assumptions, the asymptotic properties of weighted method-of-moments estimators are derived. The large-sample results rest on the asymptotic properties of the sum of a linear function and a quadratic function of a sequence of serially correlated random vectors. Maximum likelihood estimation for the normal structural and functional models is addressed. For each model, first- and second-derivative matrices of the log-likelihood functions are given and Newton-Raphson maximum likelihood estimation procedures are considered. For the structural model, the assumption that the random components follow a multivariate autoregressive moving average process is used to develop autoregressive moving average and state-space models for the observation sequence. The state-space representation of the structural model leads to innovation sequences and associated derivative sequences that provide the basis for a Newton-Raphson procedure for the estimation of regression parameters and autocovariance parameters of the structural model. A modified state-space approach leads to a similar procedure for estimation in the functional model. An extension of the state-space approach to maximum likelihood estimation for a structural model with combined time series and cross-sectional data is given.
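    For reference, the innovation-sequence route to maximum likelihood mentioned above can be summarized as follows (standard Kalman-filter notation assumed here, not taken from the thesis): the Gaussian log-likelihood of the observation sequence is written in terms of the one-step-ahead prediction errors and their covariances, and Newton-Raphson is applied to this function.

```latex
% Hedged sketch: Gaussian log-likelihood in innovations form (standard notation assumed).
% v_t: one-step-ahead prediction error (innovation) from the Kalman filter;
% F_t: its covariance; both depend on the regression and autocovariance parameters \theta.
\[
  \ell(\theta) \;=\; -\tfrac{1}{2} \sum_{t=1}^{n}
    \Bigl[ \log \det F_t(\theta) \;+\; v_t(\theta)' F_t(\theta)^{-1} v_t(\theta) \Bigr]
  \;+\; \text{const},
\]
with derivative sequences of $v_t$ and $F_t$ supplying the gradient and Hessian used in the
Newton-Raphson updates.
```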

    Robust estimation in measurement error models

    We consider the simple measurement error regression model $y_t = \beta_0 + \beta_1 x_t + q_t$, $(Y_t, X_t) = (y_t, x_t) + (w_t, u_t)$, where $\beta = (\beta_0, \beta_1)$ is the parameter of interest, $(Y_t, X_t)$, $t = 1, 2, \ldots, n$, are the observations, $(y_t, x_t)$ are the true vectors, $(w_t, u_t)$ are measurement errors, and $q_t$ is the equation error. We assume that the measurement errors $a_t = (w_t, u_t)$, $t = 1, 2, \ldots, n$, are independent of $(q_j, x_j)$ for all $t$ and $j$ and that $q_j$ is independent of $x_t$ for all $t$ and $j$. It is also assumed that the covariance matrix of $a_t$ is known. Extreme observations have an adverse effect on the usual estimators of the parameters. A class of estimators of $\beta$ is constructed in which the effect of extreme observations is reduced. Our estimation procedure is based on the robust regression of $Y$ on $X$ and the robust regression of $X$ on $Y$. The asymptotic joint distribution of the robust estimators of the regression coefficients and error mean squares is obtained when the observations are sampled from a bivariate normal distribution. The robust estimator of $\beta$ is a smooth function of the estimated regression coefficients and the estimated error mean squares from the two robust regressions. We show that for normal distributions, under certain regularity conditions, our estimator is consistent and normally distributed in the limit. The robust estimation procedure is developed to be robust against a single outlier and then extended to be robust against multiple outliers. A Monte Carlo study is presented to show that our estimator is insensitive to outliers and that the efficiency loss is modest when there is no outlier in the sample.
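    For context, the classical (non-robust) way to exploit a known measurement error covariance in this model is the method-of-moments correction sketched below on synthetic data; a robust version in the spirit of the thesis would replace the sample moments with quantities derived from robust regressions of $Y$ on $X$ and of $X$ on $Y$. This sketch is not the thesis's estimator.

```python
# Hedged sketch (synthetic data): the classical method-of-moments correction for a
# simple errors-in-variables regression when the measurement error variance is known.
import numpy as np

rng = np.random.default_rng(1)
n, beta0, beta1 = 5_000, 1.0, 2.0
sigma_uu, sigma_ww = 0.25, 0.25                       # known measurement error variances

x = rng.normal(0.0, 1.0, n)                           # true regressor x_t
y = beta0 + beta1 * x + rng.normal(0.0, 0.5, n)       # true response with equation error q_t
X = x + rng.normal(0.0, np.sqrt(sigma_uu), n)         # observed X_t = x_t + u_t
Y = y + rng.normal(0.0, np.sqrt(sigma_ww), n)         # observed Y_t = y_t + w_t

s_XX = np.var(X, ddof=1)
s_XY = np.cov(X, Y)[0, 1]

beta1_hat = s_XY / (s_XX - sigma_uu)                  # moment correction for the error in X
beta0_hat = Y.mean() - beta1_hat * X.mean()
print(beta1_hat, beta0_hat)                           # close to (2.0, 1.0)
```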

    Well-posedness of measurement error models for self-reported data

    It is widely admitted that the inverse problem of estimating the distribution of a latent variable X* from an observed sample of X, a contaminated measurement of X*, is ill-posed. This paper shows that measurement error models for self-reported data are well-posed, assuming the probability of reporting truthfully is nonzero, a property observed in validation studies. This optimistic result suggests that one should not ignore the point mass at zero in the error distribution when modeling measurement errors in self-reported data. We also illustrate that classical measurement error models may in fact be conditionally well-posed given prior information on the distribution of the latent variable X*. Through both a Monte Carlo study and an empirical application, we show that failing to account for this property can lead to significant bias in the estimation of the distribution of X*.
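    The reporting-error structure behind this result can be sketched as follows (notation assumed here): a point mass at zero in the error distribution, so that a positive fraction of respondents report exactly the truth.

```latex
% Hedged sketch of a self-reporting error with a point mass at zero (notation assumed).
\[
  X \;=\; X^{*} + \varepsilon, \qquad
  \varepsilon \;=\;
  \begin{cases}
    0 & \text{with probability } p > 0 \ (\text{truthful report}),\\
    U & \text{with probability } 1 - p,
  \end{cases}
\]
so the observed density is the mixture $f_X = p\,f_{X^{*}} + (1-p)\,(f_{X^{*}} \ast f_U)$ and the
characteristic function factorizes as $\phi_X(t) = \phi_{X^{*}}(t)\,[\,p + (1-p)\,\phi_U(t)\,]$.
Because the bracketed factor tends to $p > 0$ rather than to zero as $|t| \to \infty$, inverting it
does not blow up high-frequency noise the way classical deconvolution does, which is the intuition
behind well-posedness.
```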

    Well-Posedness of Measurement Error Models for Self-Reported Data

    It is widely admitted that the inverse problem of estimating the distribution of a latent variable X* from an observed sample of X, a contaminated measurement of X*, is ill-posed. This paper shows that a property of self-reporting errors observed in validation studies, namely that the probability of reporting the truth is nonzero conditional on the true values, implies that measurement error models for self-reported data are in fact well-posed. We also illustrate that classical measurement error models may in fact be conditionally well-posed given prior information on the distribution of the latent variable X*.

    Deconvolution Estimation in Measurement Error Models: The R Package decon

    Data from many scientific areas often come with measurement error. Density or distribution function estimation from contaminated data and nonparametric regression with errors in variables are two important topics in measurement error models. In this paper, we present a new software package decon for R, which contains a collection of functions that use deconvolution kernel methods to deal with measurement error problems. The functions allow the errors to be either homoscedastic or heteroscedastic. To make the deconvolution estimators computationally more efficient in R, we adapt the fast Fourier transform algorithm for density estimation with error-free data to deconvolution kernel estimation. We discuss the practical selection of the smoothing parameter in deconvolution methods and illustrate the use of the package through both simulated and real examples.
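    The package itself is written in R; the sketch below, on synthetic data, illustrates the underlying deconvoluting kernel density estimator in Python for the special case of a Gaussian kernel and homoscedastic Laplace measurement error, where the deconvoluting kernel has a closed form. It mirrors the idea, not the package's implementation or its FFT speed-up.

```python
# Hedged sketch (synthetic data): a deconvoluting kernel density estimator for a latent
# X* observed as X = X* + U with Laplace(0, b) error and a Gaussian kernel. For this
# error/kernel pair the deconvoluting kernel has the closed form
#   K_U(z) = phi(z) * (1 + (b^2 / h^2) * (1 - z^2)),
# where phi is the standard normal density.
import numpy as np

def deconv_density(x_grid, X, h, b):
    """Deconvoluting KDE of the latent density f_{X*} evaluated at x_grid."""
    z = (x_grid[:, None] - X[None, :]) / h              # shape (grid, sample)
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)    # standard normal kernel
    K_U = phi * (1.0 + (b**2 / h**2) * (1.0 - z**2))    # closed-form deconvoluting kernel
    return K_U.mean(axis=1) / h

rng = np.random.default_rng(2)
n, b, h = 2_000, 0.4, 0.35                              # b: Laplace error scale, h: bandwidth
x_true = rng.normal(0.0, 1.0, n)                        # latent variable X*
X_obs = x_true + rng.laplace(0.0, b, n)                 # contaminated observations X = X* + U

grid = np.linspace(-4.0, 4.0, 201)
f_hat = deconv_density(grid, X_obs, h, b)               # estimate of the density of X*
```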

    Comparing the efficiency of structural and functional methods in measurement error models

    The paper is a survey of recent investigations by the authors and others into the relative efficiencies of structural and functional estimators of the regression parameters in a measurement error model. While structural methods, in particular the quasi-score (QS) method, take advantage of knowledge of the regressor distribution (if available), functional methods, in particular the corrected score (CS) method, discard such knowledge and work even when it is not available. Among other results, it has been shown that QS is more efficient than CS as long as the regressor distribution is completely known. However, if nuisance parameters in the regressor distribution have to be estimated, this is no longer true in general; but by modifying the QS method, the adverse effect of the nuisance parameters can be overcome. For small measurement errors, the efficiencies of QS and CS become almost indistinguishable, whether or not nuisance parameters are present. QS is (asymptotically) biased if the regressor distribution has been misspecified, while CS is always consistent and thus more robust than QS.
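    The corrected score construction referred to above can be stated compactly (notation assumed here): one looks for a substitute score, built from the observed error-contaminated regressor, whose conditional expectation reproduces the score one would use if the true regressor were available.

```latex
% Hedged sketch of the corrected score principle (notation assumed).
% \psi(y, x; \beta): the score with the true regressor x; X = x + U: the observed regressor.
\[
  \text{CS: choose } \psi^{*} \text{ with } \;
  \mathrm{E}\bigl[\psi^{*}(y, X; \beta) \,\big|\, y, x\bigr] \;=\; \psi(y, x; \beta)
  \;\; \text{for all } \beta,
  \qquad \text{then solve } \sum_{t=1}^{n} \psi^{*}(y_t, X_t; \beta) = 0 .
\]
This needs no model for the distribution of $x$, whereas the quasi-score approach conditions on the
observed $X_t$ using an assumed regressor distribution, which is the source of both its efficiency
gain and its sensitivity to misspecification.
```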

    Sequential regression measurement error models with application

    Sequential regression approaches can be used to analyse processes in which covariates are revealed in stages. Such processes occur widely, with examples including medical intervention, sports contests, and political campaigns. The naïve sequential approach involves fitting regression models using the covariates revealed by the end of the current stage, but this is only practical if the number of covariates is not too large. An alternative approach is to incorporate the score (linear predictor) from the model developed at the previous stage as a covariate at the current stage. This score takes into account the history of the process prior to the stage under consideration. However, the score is a function of fitted parameter estimates and therefore contains measurement error. In this paper, we propose a novel technique to account for error in the score. The approach is demonstrated with an application to the sprint event in track cycling, and is shown to reduce bias in the estimated effect of the score and to avoid unrealistically extreme predictions.
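    A minimal sketch of the score-carrying idea on synthetic data (illustrative only; the correction for measurement error in the score is the paper's contribution and is not reproduced here):

```python
# Hedged sketch (synthetic data): carrying the stage-1 score (linear predictor) forward
# as a covariate in the stage-2 model. The score is built from *estimated* coefficients,
# so it enters stage 2 with measurement error.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 4_000
Z1 = rng.normal(size=(n, 3))                         # covariates revealed at stage 1
Z2 = rng.normal(size=(n, 2))                         # covariates revealed at stage 2
true_eta = Z1 @ np.array([0.8, -0.5, 0.3]) + Z2 @ np.array([0.6, -0.4])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_eta)))  # final binary outcome (e.g. win/lose)

# Stage 1: model using only the stage-1 covariates.
m1 = LogisticRegression().fit(Z1, y)
score1 = Z1 @ m1.coef_.ravel() + m1.intercept_[0]    # estimated linear predictor (the "score")

# Stage 2: reuse the stage-1 score as a single covariate alongside the new ones.
X2 = np.column_stack([score1, Z2])
m2 = LogisticRegression().fit(X2, y)
print(m2.coef_)   # coefficient on score1; its estimation error is the measurement error at issue
```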