Bandwidth selection for smooth backfitting in additive models
The smooth backfitting introduced by Mammen, Linton and Nielsen [Ann.
Statist. 27 (1999) 1443-1490] is a promising technique to fit additive
regression models and is known to achieve the oracle efficiency bound. In this
paper, we propose and discuss three fully automated bandwidth selection methods
for smooth backfitting in additive models. The first one is a penalized least
squares approach which is based on higher-order stochastic expansions for the
residual sums of squares of the smooth backfitting estimates. The other two are
plug-in bandwidth selectors which rely on approximations of the average squared
errors and whose utility is restricted to local linear fitting. The large
sample properties of these bandwidth selection methods are given. Their finite
sample properties are also compared through simulation experiments.
Published at http://dx.doi.org/10.1214/009053605000000101 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
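The paper's penalized least squares and plug-in selectors are tailored to smooth backfitting; as a generic baseline, data-driven bandwidth choice for a single kernel smoother can be sketched with leave-one-out cross-validation. The function, grid, and simulated data below are illustrative, not the paper's criterion:

```python
import numpy as np

def nw_loo_cv(x, y, bandwidths):
    """Select a bandwidth for a Nadaraya-Watson smoother by
    leave-one-out cross-validation over a candidate grid."""
    scores = []
    for h in bandwidths:
        # Gaussian kernel weights between all pairs of design points
        w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
        np.fill_diagonal(w, 0.0)        # drop each point from its own fit
        fit = (w @ y) / w.sum(axis=1)   # leave-one-out NW estimates
        scores.append(np.mean((y - fit) ** 2))
    return bandwidths[int(np.argmin(scores))]

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 300)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(300)
h_cv = nw_loo_cv(x, y, np.linspace(0.02, 0.5, 25))
```

For a wiggly regression function like this, the cross-validated bandwidth lands near the small end of the grid; the paper's point is that such generic criteria need refinement before they work for the coupled components of smooth backfitting.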
A simple smooth backfitting method for additive models
In this paper a new smooth backfitting estimate is proposed for additive
regression models. The estimate has the simple structure of Nadaraya--Watson
smooth backfitting but at the same time achieves the oracle property of local
linear smooth backfitting. Each component is estimated with the same asymptotic
accuracy as if the other components were known.
Published at http://dx.doi.org/10.1214/009053606000000696 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
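As a hedged illustration of the backfitting idea behind these estimators, here is ordinary backfitting with Nadaraya-Watson smoothers, the simpler cousin of the smooth backfitting scheme analyzed in these papers (all function names, the bandwidth, and the simulated model are illustrative):

```python
import numpy as np

def nw_smooth(x_grid, x, r, h):
    # Nadaraya-Watson smoother of responses r with a Gaussian kernel
    w = np.exp(-0.5 * ((x_grid[:, None] - x[None, :]) / h) ** 2)
    return (w @ r) / w.sum(axis=1)

def backfit_additive(X, y, h, n_iter=50):
    """Ordinary backfitting for y = mu + m_1(x_1) + ... + m_d(x_d) + noise,
    evaluated at the sample points, with components centered to mean zero."""
    n, d = X.shape
    m = np.zeros((d, n))
    mu = y.mean()
    for _ in range(n_iter):
        for j in range(d):
            # partial residuals: remove all components except the j-th
            partial = y - mu - m.sum(axis=0) + m[j]
            m[j] = nw_smooth(X[:, j], X[:, j], partial, h)
            m[j] -= m[j].mean()   # identifiability: each component has mean zero
    return mu, m

rng = np.random.default_rng(0)
n = 400
X = rng.uniform(-1, 1, size=(n, 2))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 - 1 / 3 + 0.1 * rng.standard_normal(n)
mu, m = backfit_additive(X, y, h=0.15)
```

Smooth backfitting replaces these raw smoother updates with projections under a smoothed least squares criterion, which is what delivers the oracle property discussed above.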
Smooth backfitting in generalized additive models
Generalized additive models have been popular among statisticians and data
analysts in multivariate nonparametric regression with non-Gaussian responses
including binary and count data. In this paper, a new likelihood approach for
fitting generalized additive models is proposed. It aims to maximize a smoothed
likelihood. The additive functions are estimated by solving a system of
nonlinear integral equations. An iterative algorithm based on smooth
backfitting is developed from the Newton--Kantorovich theorem. Asymptotic
properties of the estimator and convergence of the algorithm are discussed. It
is shown that our proposal based on local linear fit achieves the same bias and
variance as the oracle estimator that uses knowledge of the other components.
Numerical comparison with the recently proposed two-stage estimator [Ann.
Statist. 32 (2004) 2412-2443] is also made.
Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/009053607000000596 by the Institute of Mathematical Statistics (http://www.imstat.org).
Limit Distribution of Convex-Hull Estimators of Boundaries
Given n independent and identically distributed observations in a set G whose boundary is an unknown function g, called a frontier, it is desired to estimate g from the observations. The problem has several important applications, including classification and cluster analysis, and is closely related to edge estimation in image reconstruction. It is particularly important in econometrics, where the convex-hull estimator of a boundary or frontier is very popular and is a cornerstone of a method known as 'data envelope analysis' or DEA. In this paper we give a large sample approximation of the distribution of the convex-hull estimator in the general case where the dimension p >= 1. We discuss ways of using the large sample approximation to correct the bias of the convex-hull and DEA estimators and to construct confidence intervals for the true function.
Keywords: convex hull, free disposal hull, frontier function, data envelope analysis, productivity analysis, rate of convergence
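The convex-hull estimator discussed here can be sketched as a small linear program: the frontier estimate at input level x0 is the largest output attainable by a convex combination of the observed firms using no more input than x0. This is an output-oriented, variable-returns-to-scale formulation; the simulated frontier and all names below are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def dea_vrs(x0, X, Y):
    """Convex-hull (VRS DEA) frontier estimate at input level x0: the
    largest output reachable by a convex combination of the observed
    (input, output) pairs whose combined input does not exceed x0."""
    n = len(Y)
    res = linprog(
        c=-Y,                               # maximize sum(lambda_i * y_i)
        A_ub=X.T, b_ub=np.atleast_1d(x0),   # sum(lambda_i * x_i) <= x0
        A_eq=np.ones((1, n)), b_eq=[1.0],   # convexity: sum(lambda) = 1
        bounds=[(0, None)] * n, method="highs",
    )
    return -res.fun

rng = np.random.default_rng(1)
x = rng.uniform(0.1, 1.0, 200)
g = x ** 0.5                        # true (concave) frontier, for illustration
y = g * rng.uniform(0.5, 1.0, 200)  # inefficient firms lie below the frontier
est = dea_vrs(0.8, x.reshape(-1, 1), y)   # close to, but below, 0.8 ** 0.5
```

Since every observation lies below the concave frontier, the hull estimate is biased downward, which is exactly the bias the paper's large sample approximation is used to correct.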
Semi-parametric regression: Efficiency gains from modeling the nonparametric part
It is widely acknowledged that structured nonparametric modeling that circumvents
the curse of dimensionality is important in nonparametric estimation. In this
paper we show that the same holds for semi-parametric estimation. We argue that
estimation of the parametric component of a semi-parametric model can be
substantially improved when more structure is put into the nonparametric part of
the model. We illustrate this for the partially linear model, and investigate
efficiency gains when the nonparametric part of the model has an additive
structure. We present the semi-parametric Fisher information bound for
estimating the parametric part of the partially linear additive model and
provide semi-parametric efficient estimators for which we use a smooth
backfitting technique to deal with the additive nonparametric part. We also
present the finite sample performances of the proposed estimators and analyze
Boston housing data as an illustration.
Published in Bernoulli (http://isi.cbs.nl/bernoulli/) at http://dx.doi.org/10.3150/10-BEJ296 by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
Backfitting and smooth backfitting for additive quantile models
In this paper, we study the ordinary backfitting and smooth backfitting as
methods of fitting additive quantile models. We show that these backfitting
quantile estimators are asymptotically equivalent to the corresponding
backfitting estimators of the additive components in a specially-designed
additive mean regression model. This implies that the theoretical properties of
the backfitting quantile estimators are not unlike those of backfitting mean
regression estimators. We also assess the finite sample properties of the two
backfitting quantile estimators.
Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/10-AOS808 by the Institute of Mathematical Statistics (http://www.imstat.org). With Correction.
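A hedged sketch of the ordinary-backfitting variant for an additive median model: each component update is a local constant minimizer of the kernel-weighted check loss, i.e. a weighted quantile. This is a deliberate simplification of the estimators studied in the paper, and all settings below are illustrative:

```python
import numpy as np

def wquantile(values, weights, tau):
    # weighted tau-quantile: minimizer of the weighted check loss
    order = np.argsort(values)
    cw = np.cumsum(weights[order])
    return values[order][np.searchsorted(cw, tau * cw[-1])]

def local_quantile_smooth(x_grid, x, r, h, tau):
    # local constant quantile smoother with Gaussian kernel weights
    out = np.empty(len(x_grid))
    for k, t in enumerate(x_grid):
        w = np.exp(-0.5 * ((x - t) / h) ** 2)
        out[k] = wquantile(r, w, tau)
    return out

def backfit_quantile(X, y, h, tau=0.5, n_iter=20):
    """Ordinary backfitting for an additive tau-quantile model,
    evaluated at the sample points."""
    n, d = X.shape
    m = np.zeros((d, n))
    mu = np.quantile(y, tau)
    for _ in range(n_iter):
        for j in range(d):
            partial = y - mu - m.sum(axis=0) + m[j]
            m[j] = local_quantile_smooth(X[:, j], X[:, j], partial, h, tau)
            m[j] -= m[j].mean()          # center components for identifiability
        mu = np.quantile(y - m.sum(axis=0), tau)   # refresh the intercept
    return mu, m

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(300, 2))
y = np.sin(np.pi * X[:, 0]) + 0.5 * X[:, 1] + 0.2 * rng.laplace(size=300)
mu, m = backfit_quantile(X, y, h=0.2)
```

The asymptotic equivalence result above says such quantile updates behave like their mean-regression counterparts applied to a specially-designed additive mean model.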
Asymptotic distribution of conical-hull estimators of directional edges
Nonparametric data envelopment analysis (DEA) estimators have been widely
applied in analysis of productive efficiency. Typically they are defined in
terms of convex hulls of the observed combinations of inputs and outputs in a
sample of enterprises. The shape of the convex hull relies on a hypothesis on
the shape of the technology, defined as the boundary of the set of technically
attainable points in the input-output space. So far, only the statistical
properties of the smallest convex polyhedron enveloping the data points have
been considered, which corresponds to a situation where the technology presents
variable returns-to-scale (VRS). This paper analyzes the case where the most
common constant returns-to-scale (CRS) hypothesis is assumed. Here the DEA is
defined as the smallest conical-hull with vertex at the origin enveloping the
cloud of observed points. In this paper we determine the asymptotic properties
of this estimator, showing that the rate of convergence is better than for the
VRS estimator. We also derive its asymptotic sampling distribution, along with
a practical way to simulate it. This allows us to define a bias-corrected
estimator and to build confidence intervals for the frontier. In a simulated
example we compare the bias-corrected estimator with the original conical-hull
estimator and show its superiority in terms of median squared error.
Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/09-AOS746 by the Institute of Mathematical Statistics (http://www.imstat.org).
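In the simplest one-input, one-output case, the conical-hull (CRS) estimator has a closed form: the smallest cone with vertex at the origin containing the data has upper boundary x -> x * max_i(y_i / x_i). A minimal sketch, with an illustrative constant-returns frontier:

```python
import numpy as np

def dea_crs_univariate(x0, x, y):
    """Conical-hull (CRS DEA) frontier estimate with one input and one
    output: the upper edge of the cone with vertex at the origin that
    envelops the observed (input, output) points."""
    return x0 * np.max(y / x)

rng = np.random.default_rng(1)
x = rng.uniform(0.1, 1.0, 500)
y = 2.0 * x * rng.uniform(0.5, 1.0, 500)   # true frontier g(x) = 2x
est = dea_crs_univariate(0.7, x, y)        # approaches 1.4 from below
```

Because every ray through an observed point constrains the whole cone at once, the estimate pools information across the full sample, which is one intuition for the faster convergence rate under CRS reported in the paper.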
Local Likelihood Estimation of Truncated Regression and Its Partial Derivatives: Theory and Application
In this paper we propose a very flexible estimator in the context of truncated regression that does not require parametric assumptions. To do this, we adapt the theory of local maximum likelihood estimation. We provide the asymptotic results and illustrate the performance of our estimator on simulated and real data sets. Our estimator performs as well as the fully parametric estimator when the assumptions for the latter hold, but, as expected, much better when they do not (provided that the curse of dimensionality is not an issue). Overall, our estimator exhibits a fair degree of robustness to various deviations from linearity in the regression equation and also to deviations from the specification of the error term. The approach should therefore prove very useful in practical applications, where the parametric form of the regression or of the distribution is rarely known.
Keywords: nonparametric truncated regression, local likelihood
Flexible generalized varying coefficient regression models
This paper studies a very flexible model that can be used widely to analyze
the relation between a response and multiple covariates. The model is
nonparametric, yet renders the effects of the covariates easy to interpret.
The model accommodates both continuous and discrete random variables for the
response and covariates. It is flexible enough to cover the generalized
varying coefficient models and the generalized additive models as special
cases. Under a weak condition we give a general theorem showing that the
problem of estimating the multivariate mean function is equivalent to that of
estimating its univariate component functions. We discuss implications of the
theorem for sieve and penalized least squares estimators, and then investigate
the outcomes in full detail for a kernel-type estimator. The kernel estimator
is given as a solution of a system of nonlinear integral equations. We provide
an iterative algorithm to solve the system of equations and discuss the
theoretical properties of the estimator and the algorithm. Finally, we give
simulation results.
Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/12-AOS1026 by the Institute of Mathematical Statistics (http://www.imstat.org).
New methods for bias correction at endpoints and boundaries
We suggest two new, translation-based methods for estimating and correcting for bias when estimating the edge of a distribution. The first uses an empirical translation applied to the argument of the kernel, in order to remove the main effects of the asymmetries that are inherent when constructing estimators at boundaries. Placing the translation inside the kernel is in marked contrast to traditional approaches, such as the use of high-order kernels, which are related to the jackknife and, in effect, apply the translation outside the kernel. Our approach has the advantage of producing bias estimators that, while enjoying a high order of accuracy, are guaranteed to respect the sign of bias. Our second method is a new bootstrap technique. It involves translating an initial boundary estimate toward the body of the dataset, constructing repeated boundary estimates from data that lie below the respective translations, and employing averages of the resulting empirical bias approximations to estimate the bias of the original estimator. The first of the two methods is most appropriate in univariate cases, and is studied there; the second approach may be used to bias-correct estimates of boundaries of multivariate distributions, and is explored in the bivariate case.
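The second, bootstrap-style method can be hedged into a one-dimensional toy version for the endpoint of a distribution: the sample maximum is biased inward, and translating the estimated boundary into the body of the data yields empirical copies of that bias. The shift grid and uniform example are illustrative simplifications of the multivariate procedure described above:

```python
import numpy as np

def endpoint_bias_corrected(x, shifts):
    """Bias-correct the sample maximum as an endpoint estimator via the
    translation idea: shift the estimated boundary into the body of the
    data, re-estimate the boundary from the points below each shifted
    level, and average the resulting empirical bias approximations."""
    m0 = x.max()
    gaps = []
    for t in shifts:
        below = x[x <= m0 - t]                 # data under the translated boundary
        gaps.append((m0 - t) - below.max())    # empirical bias at this level
    return m0 + np.mean(gaps)                  # push the estimate outward

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 2000)    # true endpoint is 1
naive = x.max()                # biased below 1
corrected = endpoint_bias_corrected(x, shifts=np.linspace(0.01, 0.05, 5))
```

Each gap mimics how far the maximum of a sample falls short of its own boundary, so averaging the gaps gives a data-driven estimate of the inward bias of the original maximum.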