Testing the suitability of polynomial models in errors-in-variables problems
A low-degree polynomial model for a response curve is used commonly in
practice. It generally incorporates a linear or quadratic function of the
covariate. In this paper we suggest methods for testing the goodness of fit of
a general polynomial model when there are errors in the covariates. There, the
true covariates are not directly observed, and conventional bootstrap methods
for testing are not applicable. We develop a new approach, in which
deconvolution methods are used to estimate the distribution of the covariates
under the null hypothesis, and a ``wild'' or moment-matching bootstrap argument
is employed to estimate the distribution of the experimental errors (distinct
from the distribution of the errors in covariates). Most of our attention is
directed at the case where the distribution of the errors in covariates is
known, although we also discuss methods for estimation and testing when the
covariate error distribution is estimated. No assumptions are made about the
distribution of experimental error, and, in particular, we depart substantially
from conventional parametric models for errors-in-variables problems.
Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics; http://dx.doi.org/10.1214/009053607000000361
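The "wild" or moment-matching bootstrap step mentioned in this abstract can be illustrated in isolation. A minimal sketch, not the authors' full deconvolution-based procedure: residuals are perturbed by weights drawn from Mammen's two-point distribution, a standard moment-matching choice whose first three moments are 0, 1, and 1 (function names here are illustrative):

```python
import math
import random

# Mammen's two-point distribution: a moment-matching choice of wild-bootstrap
# weights with E[v] = 0, E[v^2] = 1, E[v^3] = 1, so bootstrap errors r_i * v_i
# mimic the variance and skewness of the observed residuals.
A = -(math.sqrt(5) - 1) / 2                     # ~ -0.618
B = (math.sqrt(5) + 1) / 2                      # ~  1.618
P_A = (math.sqrt(5) + 1) / (2 * math.sqrt(5))   # P(v = A) ~ 0.724

def mammen_weight(rng):
    """Draw one wild-bootstrap weight from Mammen's two-point law."""
    return A if rng.random() < P_A else B

def wild_bootstrap_response(fitted, residuals, rng):
    """One wild-bootstrap resample: y*_i = yhat_i + r_i * v_i."""
    return [f + r * mammen_weight(rng) for f, r in zip(fitted, residuals)]
```

Because the weights match the first three moments, no parametric assumption on the experimental-error distribution is needed, which is the point the abstract emphasizes.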
Variable selection in measurement error models
Measurement error data or errors-in-variable data have been collected in many
studies. Natural criterion functions are often unavailable for general
functional measurement error models due to the lack of information on the
distribution of the unobservable covariates. Typically, the parameter
estimation is via solving estimating equations. In addition, the construction
of such estimating equations routinely requires solving integral equations,
hence the computation is often much more intensive compared with ordinary
regression models. Because of these difficulties, traditional best subset
variable selection procedures are not applicable, and in the measurement error
model context, variable selection remains an unsolved issue. In this paper, we
develop a framework for variable selection in measurement error models via
penalized estimating equations. We first propose a class of selection
procedures for general parametric measurement error models and for general
semi-parametric measurement error models, and study the asymptotic properties
of the proposed procedures. Then, under certain regularity conditions and with
a properly chosen regularization parameter, we demonstrate that the proposed
procedure performs as well as an oracle procedure. We assess the finite sample
performance via Monte Carlo simulation studies and illustrate the proposed
methodology through the empirical analysis of a familiar data set.
Published in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm); http://dx.doi.org/10.3150/09-BEJ205
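As an illustration of the penalty component in such penalized estimating equations, here is a sketch of the SCAD penalty of Fan and Li (2001), one standard choice. The paper's framework covers a general class of penalties, so treat this as an example rather than the proposed method:

```python
def scad_penalty(t, lam, a=3.7):
    """SCAD penalty p_lambda(|t|); a = 3.7 is the conventional default.
    Linear near zero, then blending to a constant, which is what yields
    sparse yet nearly unbiased estimates."""
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return -(t * t - 2 * a * lam * t + lam * lam) / (2 * (a - 1))
    return (a + 1) * lam * lam / 2

def scad_derivative(t, lam, a=3.7):
    """p'_lambda(t) for t >= 0; in a penalized estimating equation this
    enters as a shrinkage term added to the unpenalized score."""
    t = abs(t)
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1)
```

The derivative vanishing beyond a*lam is what leaves large coefficients essentially unpenalized, in contrast to the constant shrinkage of a lasso penalty.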
Optimal variance estimation without estimating the mean function
We study the least squares estimator in the residual variance estimation
context. We show that the mean squared differences of paired observations are
asymptotically normally distributed. We further establish that, by regressing
the mean squared differences of these paired observations on the squared
distances between paired covariates via a simple least squares procedure, the
resulting variance estimator is not only asymptotically normal and root-n
consistent, but also attains the optimal bound in terms of estimation variance.
We also demonstrate the advantage of the least squares estimator over existing
methods in terms of second-order asymptotic properties.
Published in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm); http://dx.doi.org/10.3150/12-BEJ432
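The core idea, regressing squared response differences on squared covariate distances, can be sketched in a simplified setting (scalar covariate, adjacent pairs after sorting; function and variable names are illustrative, and this is not the paper's full estimator):

```python
import random

def variance_by_differences(x, y):
    """Estimate the residual variance sigma^2 in y_i = f(x_i) + eps_i
    without estimating f: regress squared differences of neighbouring
    responses on squared covariate gaps; half the fitted intercept
    estimates sigma^2, since for close pairs
    E[(y_i - y_j)^2] = 2*sigma^2 + (f(x_i) - f(x_j))^2."""
    pairs = sorted(zip(x, y))
    d = [(pairs[i + 1][1] - pairs[i][1]) ** 2 for i in range(len(pairs) - 1)]
    s = [(pairs[i + 1][0] - pairs[i][0]) ** 2 for i in range(len(pairs) - 1)]
    # simple least squares of d on s: d ~ b0 + b1 * s
    n = len(d)
    sbar = sum(s) / n
    dbar = sum(d) / n
    b1 = sum((si - sbar) * (di - dbar) for si, di in zip(s, d)) / \
         sum((si - sbar) ** 2 for si in s)
    b0 = dbar - b1 * sbar
    return b0 / 2.0

# Simulated check: smooth mean function, known noise level.
rng = random.Random(1)
x = [rng.random() for _ in range(2000)]
y = [xi ** 2 + rng.gauss(0.0, 0.5) for xi in x]   # true sigma^2 = 0.25
sigma2_hat = variance_by_differences(x, y)
```

The smooth part of the mean function is absorbed by the slope term, so the intercept isolates the noise variance without any estimate of f itself.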
Fused kernel-spline smoothing for repeatedly measured outcomes in a generalized partially linear model with functional single index
We propose a generalized partially linear functional single index risk score
model for repeatedly measured outcomes where the index itself is a function of
time. We fuse the nonparametric kernel method and regression spline method, and
modify the generalized estimating equation to facilitate estimation and
inference. We use a local smoothing kernel to estimate the unspecified
coefficient functions of time, and use B-splines to estimate the unspecified
function of the single index component. The covariance structure is taken into
account via a working model, which provides a valid estimation and inference
procedure whether or not it captures the true covariance. The estimation method
is applicable to both continuous and discrete outcomes. We derive large sample
properties of the estimation procedure and show a different convergence rate
for each component of the model. The asymptotic properties when the kernel and
regression spline methods are combined in a nested fashion have not been studied
prior to this work, even in the independent data case.
Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics; http://dx.doi.org/10.1214/15-AOS1330
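For the B-spline component mentioned above, basis functions are typically evaluated with the Cox-de Boor recursion. A minimal illustrative sketch, not tied to the paper's estimator:

```python
def bspline_basis(i, k, x, knots):
    """Value of the i-th B-spline basis function of degree k at x,
    via the Cox-de Boor recursion (0/0 terms treated as 0)."""
    if k == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = 0.0
    if knots[i + k] != knots[i]:
        left = (x - knots[i]) / (knots[i + k] - knots[i]) * \
               bspline_basis(i, k - 1, x, knots)
    right = 0.0
    if knots[i + k + 1] != knots[i + 1]:
        right = (knots[i + k + 1] - x) / (knots[i + k + 1] - knots[i + 1]) * \
                bspline_basis(i + 1, k - 1, x, knots)
    return left + right

# Clamped cubic spline on [0, 3]: len(knots) - degree - 1 = 6 basis functions.
knots = [0, 0, 0, 0, 1, 2, 3, 3, 3, 3]
basis_at = lambda x: [bspline_basis(i, 3, x, knots) for i in range(6)]
```

An unknown smooth function is then approximated as a linear combination of these basis functions, reducing its estimation to a finite-dimensional regression.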
A spline-assisted semiparametric approach to nonparametric measurement error models
Nonparametric estimation of the probability density function of a random variable measured with error is considered a difficult problem, in the sense that, depending on the measurement error property, the estimation rate can be as slow as the logarithm of the sample size. Likewise, nonparametric estimation of the regression function with errors in the covariate suffers the same possibly slow rate. The traditional methods for both problems are based on deconvolution, where the slow convergence rate is caused by the rapid convergence to zero of the Fourier transform of the measurement error density, which, unfortunately, appears in the denominators during the construction of these methods. Using a completely different approach of spline-assisted semiparametric methods, we are able to construct nonparametric estimators of both density functions and regression mean functions that achieve the same nonparametric convergence rate as in the error-free case. Other than requiring the error-prone variable distribution to be compactly supported, our assumptions are not stronger than those in the classical deconvolution literature. The performance of these methods is demonstrated through simulations and a data example.
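For contrast with the deconvolution baseline described above: when the measurement error is Laplace(0, b) and a standard Gaussian kernel is used, the deconvoluting kernel has a well-known closed form, phi(x) * (1 + (b/h)^2 * (1 - x^2)), precisely because the error characteristic function 1/(1 + b^2 t^2) sits in the denominator. A sketch under these assumptions (illustrative, not the spline-assisted method of the paper):

```python
import math

def deconv_kernel(x, b, h):
    """Deconvoluting kernel for Laplace(0, b) measurement error and a
    standard Gaussian kernel: dividing the kernel's Fourier transform by
    the error characteristic function 1/(1 + b^2 t^2) yields the closed
    form phi(x) * (1 + (b/h)^2 * (1 - x^2))."""
    phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
    return phi * (1 + (b / h) ** 2 * (1 - x * x))

def deconv_density(x0, w, b, h):
    """Deconvolution density estimate at x0 from contaminated data w_i = x_i + u_i:
    fhat(x0) = (1 / (n h)) * sum_i K_U((x0 - w_i) / h)."""
    n = len(w)
    return sum(deconv_kernel((x0 - wi) / h, b, h) for wi in w) / (n * h)
```

The (b/h)^2 factor shows why the approach degrades as the bandwidth shrinks: correcting for the error inflates the kernel, and for smoother (e.g. Gaussian) error densities the inflation is far more severe, giving the logarithmic rates the abstract mentions.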
Testing for high-dimensional white noise
Testing for multi-dimensional white noise is an important subject in
statistical inference. Such tests in the high-dimensional case remain an open
problem, especially when the dimension of a time series is
comparable to or even greater than the sample size. To detect an arbitrary form
of departure from high-dimensional white noise, a few tests have been
developed. Some of these tests are based on max-type statistics, while others
are based on sum-type ones. Despite this progress, an urgent issue remains:
none of these tests is robust to the sparsity of the serial
correlation structure. Motivated by this, we propose a Fisher's combination
test by combining the max-type and the sum-type statistics, based on the
established asymptotic independence between them. This combination test can
achieve robustness to the sparsity of the serial correlation structure and
combine the advantages of the two types of tests. We demonstrate the advantages
of the proposed test over some existing tests through extensive numerical
results and an empirical analysis. (84 pages)
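The combination step itself can be sketched directly: given the asymptotic independence of the max-type and sum-type p-values, Fisher's method sums their logarithms, and the resulting chi-square statistic with 4 degrees of freedom has a closed-form survival function. Function names are illustrative:

```python
import math

def fisher_combination(p_max, p_sum):
    """Combine two (asymptotically) independent p-values by Fisher's method.
    Under H0, T = -2 * (log p1 + log p2) follows a chi-square distribution
    with 4 degrees of freedom, whose survival function is
    exp(-T/2) * (1 + T/2)."""
    t = -2.0 * (math.log(p_max) + math.log(p_sum))
    return math.exp(-t / 2) * (1 + t / 2)
```

When both component tests give moderate evidence, the combined p-value is smaller than either input, which is how the test inherits power against both sparse and dense alternatives.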