Semiparametric estimation of weighted average derivatives
by James L. Powell, James H. Stock, Thomas M. Stoker. Bibliography: p. 38-39. Financial support from National Science Foundation Grants.
[[alternative]]A Modified Cross Validation Method for Kernel Density Estimation.
[[abstract]]Grant number: NSC93-2118-M032-013. Research period: 2004/08-2005/07. Budget: 276,000. Given observations X_1, X_2, ..., X_n of a random variable X, the kernel estimator is a very simple and popular nonparametric method for estimating the density function f(x). The bandwidth of the kernel estimator governs both its smoothness and the accuracy of the estimated density. For bandwidth selection, Rudemo (1982) and Bowman (1984) proposed the conceptually simple and widely used least squares cross validation method. However, the cross-validation function under this method is influenced by the density itself: it emphasizes data-dense regions while neglecting sparse regions. The selected bandwidth is therefore driven by the distribution of the sample points rather than by the region over which the density is actually being estimated. As a result, when the estimation interval contains relatively sparse sampling regions, the selected bandwidth turns out to be too small for computing the kernel estimate in such regions; the estimated density curve there becomes rough, and the kernel estimate may even have to be clipped to 0. For these shortcomings of cross validation, see Scott and Terrell (1987) and Chiu (1991); for an introduction to and comparison of other cross-validation methods for kernel density estimators, see the monograph by Wand and Jones (1995). Motivated by these considerations, and under the goal of minimizing the integrated square error, this project combines integration with the k-th nearest neighbor (k-nn) rule to construct a new cross-validation function that corrects this drawback. Because this integral-form cross-validation function uses the k-nn rule, the influence of the density on bandwidth selection is reduced, and the selected bandwidth should yield smoother and more accurate kernel estimates. Theoretically, we will derive the approximate relationship between the bandwidth selected by this cross-validation function and the bandwidth h that minimizes the integrated square error. In the simulation study, we will use real-world data to exhibit the potential shortcomings of least squares cross-validation bandwidth selection, and will examine the performance of the bandwidth selected by the integral cross-validation method using the resulting kernel estimates. In addition, the project will use computer-simulated data, numerical methods, and the integrated square error criterion to study the practical and theoretical estimation performance of the bandwidth selected by the integral cross-validation method.[[sponsorship]]National Science Council, Executive Yuan
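The least squares cross-validation criterion that this project takes as its starting point can be sketched as follows. This is the standard Rudemo-Bowman LSCV for a Gaussian kernel, not the integral/k-nn variant the project proposes; the function and variable names are illustrative.

```python
import numpy as np

def lscv_score(h, x):
    """Least squares cross-validation score for a Gaussian kernel
    density estimate with bandwidth h (smaller is better):
    LSCV(h) = int f_hat^2 - (2/n) * sum_i f_hat_{-i}(x_i)."""
    n = len(x)
    d = x[:, None] - x[None, :]                      # pairwise differences
    # int f_hat^2: average of N(0, 2h^2) kernels over all pairs
    term1 = np.exp(-d**2 / (4 * h**2)).sum() / (n**2 * 2 * h * np.sqrt(np.pi))
    # leave-one-out term: Gaussian kernel matrix with the diagonal excluded
    k = np.exp(-d**2 / (2 * h**2)) / (h * np.sqrt(2 * np.pi))
    loo = (k.sum() - np.trace(k)) / (n * (n - 1))
    return term1 - 2 * loo

rng = np.random.default_rng(0)
x = rng.normal(size=200)
hs = np.linspace(0.05, 1.0, 60)
h_star = hs[np.argmin([lscv_score(h, x) for h in hs])]  # selected bandwidth
```

The abstract's point is visible in this construction: both terms are sums over sample points, so data-dense regions dominate the score regardless of where the density is to be estimated.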
The bootstrap - a review
The bootstrap, extensively studied during the last decade, has become a powerful tool in different areas of statistical inference. In this work, we present the main ideas of bootstrap methodology in several contexts, citing the most relevant contributions and illustrating some interesting aspects with examples and simulation studies.
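The core idea surveyed here, resampling with replacement to approximate the sampling distribution of a statistic, can be sketched minimally; the function name and parameters below are illustrative.

```python
import numpy as np

def bootstrap_se(x, stat, n_boot=2000, seed=0):
    """Bootstrap standard error of stat(x): draw n_boot resamples
    with replacement and take the standard deviation of the replicates."""
    rng = np.random.default_rng(seed)
    n = len(x)
    reps = np.array([stat(x[rng.integers(0, n, n)]) for _ in range(n_boot)])
    return reps.std(ddof=1)

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=100)
se = bootstrap_se(x, np.mean)  # should be close to 2/sqrt(100) = 0.2
```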
Three Sides of Smoothing: Categorical Data Smoothing, Nonparametric Regression, and Density Estimation
The past forty years have seen a great deal of research into the construction and properties of nonparametric
estimates of smooth functions. This research has focused primarily on two sides of the smoothing
problem: nonparametric regression and density estimation. Theoretical results for these two situations
are similar, and multivariate density estimation was an early justification for the Nadaraya-Watson
kernel regression estimator.
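For reference, the Nadaraya-Watson estimator mentioned above is the kernel-weighted local average of the responses; a minimal sketch with a Gaussian kernel (names are illustrative):

```python
import numpy as np

def nadaraya_watson(x0, x, y, h):
    """Nadaraya-Watson estimate at x0:
    m_hat(x0) = sum_i K_h(x0 - x_i) y_i / sum_i K_h(x0 - x_i),
    here with a Gaussian kernel of bandwidth h."""
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)   # unnormalized kernel weights
    return np.sum(w * y) / np.sum(w)

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 300)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=300)
m_hat = nadaraya_watson(0.25, x, y, h=0.05)  # near sin(pi/2) = 1
```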
A third, less well-explored, strand of applications of smoothing is to the estimation of probabilities in
categorical data. In this paper the position of categorical data smoothing as a bridge between nonparametric
regression and density estimation is explored. Nonparametric regression provides a paradigm
for the construction of effective categorical smoothing estimates, and use of an appropriate likelihood
function yields cell probability estimates with many desirable properties. Such estimates can be used
to construct regression estimates when one or more of the categorical variables are viewed as response
variables. They also lead naturally to the construction of well-behaved density estimates using local or
penalized likelihood estimation, which can then be used in a regression context. Several real data sets are
used to illustrate these points.
Statistics Working Papers Series
Two-Step Estimation and Inference with Possibly Many Included Covariates
We study the implications of including many covariates in a first-step
estimate entering a two-step estimation procedure. We find that a first order
bias emerges when the number of included covariates is "large"
relative to the square-root of sample size, rendering standard inference
procedures invalid. We show that the jackknife is able to estimate this "many
covariates" bias consistently, thereby delivering a new automatic
bias-corrected two-step point estimator. The jackknife also consistently
estimates the standard error of the original two-step point estimator. For
inference, we develop a valid post-bias-correction bootstrap approximation that
accounts for the additional variability introduced by the jackknife
bias-correction. We find that the jackknife bias-corrected point estimator and
the bootstrap post-bias-correction inference perform well in simulations,
offering important improvements over conventional two-step point estimators and
inference procedures, which are not robust to including many covariates. We
apply our results to an array of distinct treatment effect, policy evaluation,
and other applied microeconomics settings. In particular, we discuss production
function and marginal treatment effect estimation in detail.
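The generic leave-one-out jackknife bias correction underlying this approach can be sketched as follows; this is the textbook construction, not the paper's many-covariates-specific estimator, and the names are illustrative.

```python
import numpy as np

def jackknife_bias_correct(x, stat):
    """Leave-one-out jackknife: estimate the O(1/n) bias of stat(x)
    and return (bias-corrected estimate, jackknife standard error)."""
    n = len(x)
    full = stat(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])  # leave-one-out replicates
    bias = (n - 1) * (loo.mean() - full)
    se = np.sqrt((n - 1) / n * ((loo - loo.mean()) ** 2).sum())
    return full - bias, se

rng = np.random.default_rng(3)
x = rng.normal(size=50)
# Classic check: the biased variance estimator (divisor n) is corrected
# exactly to the unbiased estimator (divisor n-1) by the jackknife.
est, se = jackknife_bias_correct(x, np.var)
```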