thesis

Sequential procedures for nonparametric kernel regression

Abstract

In a nonparametric setting, the functional form of the relationship between the response variable and the associated predictor variables is unspecified; however it is assumed to be a smooth function. The main aim of nonparametric regression is to highlight an important structure in data without any assumptions about the shape of an underlying regression function. In regression, the random and fixed design models should be distinguished. Among the variety of nonparametric regression estimators currently in use, kernel type estimators are most popular. Kernel type estimators provide a flexible class of nonparametric procedures by estimating unknown function as a weighted average using a kernel function. The bandwidth which determines the influence of the kernel has to be adapted to any kernel type estimator. Our focus is on Nadaraya-Watson estimator and Local Linear estimator whic h belong to a class of kernel type regression estimators called local polynomial kernel estimators. A closely related problem is the determination of an appropriate sample size that would be required to achieve a desired confidence level of accuracy for the nonparametric regression estimators. Since sequential procedures allow an experimenter to make decisions based on the smallest number of observations without compromising accuracy, application of sequential procedures to a nonparametric regression model at a given point or series of points is considered. The motivation for using such procedures is: in many applications the quality of estimating an underlying regression function in a controlled experiment is paramount; thus, it is reasonable to invoke a sequential procedure of estimation that chooses a sample size based on recorded observations that guarantees a preassigned accuracy. We have employed sequential techniques to develop a procedure for constructing a fixed-width confidence interval for the predicted value at a specific point of the independent variable. These fixed-width confidence intervals are developed using asymptotic properties of both Nadaraya-Watson and local linear kernel estimators of nonparametric kernel regression with data-driven bandwidths and studied for both fixed and random design contexts. The sample sizes for a preset confidence coefficient are optimized using sequential procedures, namely two-stage procedure, modified two-stage procedure and purely sequential procedure. The proposed methodology is first tested by employing a large-scale simulation study. The performance of each kernel estimation method is assessed by comparing their coverage accuracy with corresponding preset confidence coefficients, proximity of computed sample sizes match up to optimal sample sizes and contrasting the estimated values obtained from the two nonparametric methods with act ual values at given series of design points of interest. We also employed the symmetric bootstrap method which is considered as an alternative method of estimating properties of unknown distributions. Resampling is done from a suitably estimated residual distribution and utilizes the percentiles of the approximate distribution to construct confidence intervals for the curve at a set of given design points. A methodology is developed for determining whether it is advantageous to use the symmetric bootstrap method to reduce the extent of oversampling that is normally known to plague Stein's two-stage sequential procedure. The procedure developed is validated using an extensive simulation study and we also explore the asymptotic properties of the relevant estimators. Finally, application of our proposed sequential nonparametric kernel regression methods are made to some problems in software reliability and finance

    Similar works