Semiparametric and Additive Model Selection Using an Improved Akaike Information Criterion
An improved AIC-based criterion is derived for model selection in general smoothing-based
modeling, including semiparametric models and additive models. Examples are
provided of applications to goodness-of-fit, smoothing parameter and variable selection
in an additive model and semiparametric models, and variable selection in a model with
a nonlinear function of linear terms. Statistics Working Papers Series
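The abstract above does not state the criterion itself. As a minimal sketch, the following assumes the corrected AIC commonly used for linear smoothers of the form ŷ = Hy, namely AICc = log(σ̂²) + 1 + 2(tr(H) + 1)/(n − tr(H) − 2), and applies it to a Gaussian-kernel Nadaraya-Watson smoother. The function names and the choice of smoother are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def aicc_linear_smoother(y, hat_matrix):
    """Corrected AIC for a linear smoother y_hat = H y (assumed form)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    resid = y - hat_matrix @ y
    sigma2_hat = np.sum(resid ** 2) / n          # ML-type variance estimate
    tr_h = np.trace(hat_matrix)                  # effective number of parameters
    if n - tr_h - 2 <= 0:
        raise ValueError("smoother is too flexible for this sample size")
    return np.log(sigma2_hat) + 1.0 + 2.0 * (tr_h + 1.0) / (n - tr_h - 2.0)

def nadaraya_watson_hat(x, bandwidth):
    """Hat matrix of a Gaussian-kernel Nadaraya-Watson smoother (illustrative)."""
    x = np.asarray(x, dtype=float)
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bandwidth) ** 2)
    return w / w.sum(axis=1, keepdims=True)      # each row sums to one
```

A smoothing parameter could then be chosen by evaluating `aicc_linear_smoother` over a grid of bandwidths and keeping the minimizer.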
Penalized variable selection procedure for Cox models with semiparametric relative risk
We study the Cox models with semiparametric relative risk, which can be
partially linear with one nonparametric component, or multiple additive or
nonadditive nonparametric components. A penalized partial likelihood procedure
is proposed to simultaneously estimate the parameters and select variables for
both the parametric and the nonparametric parts. Two penalties are applied
sequentially. The first penalty, governing the smoothness of the multivariate
nonlinear covariate effect function, provides a smoothing spline ANOVA
framework that is exploited to derive an empirical model selection tool for the
nonparametric part. The second penalty, either the
smoothly-clipped-absolute-deviation (SCAD) penalty or the adaptive LASSO
penalty, achieves variable selection in the parametric part. We show that the
resulting estimator of the parametric part possesses the oracle property, and
that the estimator of the nonparametric part achieves the optimal rate of
convergence. The proposed procedures are shown to work well in simulation
experiments, and then applied to a real data example on sexually transmitted
diseases. Comment: Published at http://dx.doi.org/10.1214/09-AOS780 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
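For readers unfamiliar with the SCAD penalty named in the abstract above, here is a minimal sketch of its standard piecewise form (linear near zero, quadratic in a transition zone, constant beyond aλ). The function name and the default a = 3.7 are assumptions for illustration; the abstract does not say which value the authors use.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty evaluated elementwise (requires a > 2, lam > 0)."""
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t                                         # |theta| <= lam
    quadratic = (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1))
    constant = lam ** 2 * (a + 1) / 2                        # |theta| > a * lam
    return np.where(t <= lam, linear,
                    np.where(t <= a * lam, quadratic, constant))
```

Because the penalty is flat for large coefficients, large effects are not shrunk, which is the intuition behind the oracle property mentioned in the abstract.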
Nonparametric spectral analysis with applications to seizure characterization using EEG time series
Understanding the seizure initiation process and its propagation pattern(s)
is a critical task in epilepsy research. Characteristics of the pre-seizure
electroencephalograms (EEGs) such as oscillating powers and high-frequency
activities are believed to be indicative of the seizure onset and spread
patterns. In this article, we analyze epileptic EEG time series using
nonparametric spectral estimation methods to extract information on
seizure-specific power and characteristic frequency [or frequency band(s)].
Because the EEGs may become nonstationary before seizure events, we develop
methods for both stationary and locally stationary processes. Based on penalized
Whittle likelihood, we propose direct generalized maximum likelihood (GML)
and generalized approximate cross-validation (GACV) methods to estimate
smoothing parameters in both smoothing spline spectrum estimation of a
stationary process and smoothing spline ANOVA time-varying spectrum estimation
of a locally stationary process. We also propose permutation methods to test if
a locally stationary process is stationary. Extensive simulations indicate that
the proposed direct methods, especially the direct GML, are stable and perform
better than other existing methods. We apply the proposed methods to the
intracranial electroencephalograms (IEEGs) of an epileptic patient to gain
insights into the seizure generation process. Comment: Published at http://dx.doi.org/10.1214/08-AOAS185 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
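The Whittle likelihood that the abstract above builds on can be sketched as follows. This is only the unpenalized data-fit term, summed over Fourier frequencies; the roughness penalty and the GML/GACV smoothing-parameter selection from the paper are not reproduced. The function names and the periodogram scaling (|DFT|²/n, so that white noise with variance σ² has a flat spectrum equal to σ²) are assumptions made for this illustration.

```python
import numpy as np

def periodogram(x):
    """Periodogram at Fourier frequencies strictly between 0 and pi."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dft = np.fft.rfft(x - x.mean())
    k = np.arange(1, (n - 1) // 2 + 1)       # skip frequency 0 and Nyquist
    freqs = 2 * np.pi * k / n
    return freqs, np.abs(dft[k]) ** 2 / n

def whittle_neg_loglik(spec, pgram):
    """Negative Whittle log-likelihood (up to constants) of a candidate spectrum."""
    spec = np.asarray(spec, dtype=float)
    return np.sum(np.log(spec) + pgram / spec)
```

Among constant spectra, this objective is minimized at the mean of the periodogram ordinates, which gives a quick sanity check for any implementation.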
Bias in nearest-neighbor hazard estimation
In nonparametric curve estimation, the smoothing parameter is critical for performance. In order to estimate the hazard rate, we compare nearest neighbor selectors that minimize the quadratic, the Kullback-Leibler, and the uniform loss. These measures result in a rule of thumb, a cross-validation, and a plug-in selector. A Monte Carlo simulation within the three-parameter exponentiated Weibull distribution indicates that a counter-factual normal distribution, as an input to the selector, does provide a good rule of thumb. If bias is the main concern, minimizing the uniform loss yields the best results, but at the cost of very high variability. Cross-validation has a similar bias to the rule of thumb, but also with high variability. Keywords: hazard rate, kernel smoothing, bandwidth selection, nearest neighbor bandwidth, rule of thumb, plug-in, cross-validation, credit risk
The Minimum S-Divergence Estimator under Continuous Models: The Basu-Lindsay Approach
Robust inference based on the minimization of statistical divergences has
proved to be a useful alternative to the classical maximum likelihood based
techniques. Recently Ghosh et al. (2013) proposed a general class of divergence
measures for robust statistical inference, named the S-Divergence Family. Ghosh
(2014) discussed its asymptotic properties for the discrete model of densities.
In the present paper, we develop the asymptotic properties of the proposed
minimum S-Divergence estimators under continuous models. Here we use the
Basu-Lindsay approach (1994) of smoothing the model densities that, unlike
previous approaches, avoids much of the complications of the kernel bandwidth
selection. Illustrations are presented to support the performance of the
resulting estimators both in terms of efficiency and robustness through
extensive simulation studies and real data examples. Comment: Pre-Print, 34 pages
Spatial adaptation in heteroscedastic regression: Propagation approach
The paper concerns the problem of pointwise adaptive estimation in regression
when the noise is heteroscedastic and incorrectly known. The use of the local
approximation method, which includes the local polynomial smoothing as a
particular case, leads to a finite family of estimators corresponding to
different degrees of smoothing. Data-driven choice of localization degree in
this case can be understood as the problem of selection from this family. This
task can be performed by the FLL technique suggested in Katkovnik and Spokoiny
(2008), which is based on Lepski's method. An important issue with this type of
procedure, the choice of certain tuning parameters, was addressed in
Spokoiny and Vial (2009). The authors called their approach to the parameter
calibration "propagation". In the present paper the propagation approach is
developed and justified for the heteroscedastic case in the presence of noise
misspecification. Our analysis shows that the adaptive procedure allows a
misspecification of the covariance matrix with a relative error of order
1/log(n), where n is the sample size. Comment: 47 pages. This is the final version of the paper published at
http://dx.doi.org/10.1214/08-EJS180 in the Electronic Journal of Statistics
(http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics
(http://www.imstat.org)
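The selection step described in the abstract above follows the general pattern of Lepski's method: move from the least-smoothed estimator toward more smoothing for as long as each new estimate stays consistent with all earlier ones. The sketch below is a generic version of that rule, not the FLL procedure or the propagation calibration from the paper. The function name and the fixed threshold `z` are hypothetical; choosing such thresholds is exactly the tuning problem that the propagation approach addresses.

```python
import numpy as np

def lepski_select(estimates, std_errors, z=2.0):
    """Index of the most-smoothed estimate accepted by a Lepski-type rule.

    estimates and std_errors are ordered from least to most smoothing.
    """
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    chosen = 0
    for k in range(1, len(estimates)):
        diffs = np.abs(estimates[k] - estimates[:k])
        # Accept k only if it agrees with every less-smoothed estimate.
        if np.all(diffs <= z * std_errors[:k]):
            chosen = k
        else:
            break
    return chosen
```

For example, with estimates `[1.0, 1.1, 3.0, 5.0]` and standard errors `[0.5, 0.3, 0.2, 0.1]`, the third estimate differs from the first by more than `z` times its standard error, so the rule stops at index 1.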
A Test for Multimodality of Regression Derivatives with an Application to Nonparametric Growth Regressions
This paper presents a method to test for multimodality of an estimated kernel density of parameter estimates from a local-linear least-squares regression derivative. The procedure is laid out in seven simple steps and a suggestion for implementation is proposed. A Monte Carlo exercise is used to examine the finite sample properties of the test along with those from a calibrated version of it which corrects for the conservative nature of Silverman-type tests. The test is included in a study on nonparametric growth regressions. The results show that in the estimation of unconditional β-convergence, the distribution of the parameter estimates is multimodal with one mode in the negative region (primarily OECD economies) and possibly two modes in the positive region (primarily non-OECD economies) of the parameter estimates. The results for conditional β-convergence show that the density is predominantly negative and unimodal. Finally, the application attempts to determine why particular observations possess positive marginal effects on initial income in both the unconditional and conditional frameworks. Keywords: Nonparametric Kernel; Convergence; Modality Tests
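Silverman-type tests, as referenced in the abstract above, rest on counting the modes of a kernel density estimate as the bandwidth varies. The sketch below shows only that building block, a Gaussian kernel density evaluated on a grid plus a count of its strict local maxima. It does not reproduce the paper's seven steps, the bootstrap, or the calibration, and the function names are illustrative assumptions.

```python
import numpy as np

def gaussian_kde_on_grid(data, bandwidth, grid):
    """Gaussian kernel density estimate evaluated at each grid point."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    u = (grid[:, None] - data[None, :]) / bandwidth
    kernel_sum = np.exp(-0.5 * u ** 2).sum(axis=1)
    return kernel_sum / (data.size * bandwidth * np.sqrt(2 * np.pi))

def count_modes(density):
    """Number of strict interior local maxima of a density on a grid."""
    d = np.asarray(density, dtype=float)
    interior = d[1:-1]
    return int(np.sum((interior > d[:-2]) & (interior > d[2:])))
```

With two well-separated clusters of points, a small bandwidth yields two modes and a large bandwidth merges them into one; the smallest bandwidth at which the estimate has at most k modes is the critical bandwidth on which such tests are based.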
Nonparametric Bayesian estimation of a Hölder continuous diffusion coefficient
We consider a nonparametric Bayesian approach to estimate the diffusion
coefficient of a stochastic differential equation given discrete time
observations over a fixed time interval. As a prior on the diffusion
coefficient, we employ a histogram-type prior with piecewise constant
realisations on bins forming a partition of the time interval. Specifically,
these constants are realizations of independent inverse Gamma distributed
random variables. We justify our approach by deriving the rate at which the
corresponding posterior distribution asymptotically concentrates around the
data-generating diffusion coefficient. This posterior contraction rate turns
out to be optimal for estimation of a Hölder-continuous diffusion coefficient
with a given smoothness parameter. Our approach is straightforward to
implement, as the posterior distributions turn out to be inverse Gamma again,
and leads to good practical results in a wide range of simulation examples.
Finally, we apply our method to exchange rate data sets.
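The conjugacy mentioned in the abstract above can be sketched under simplifying assumptions: the drift is ignored, observations are equally spaced with step `dt`, and the inverse Gamma prior is placed on the squared diffusion coefficient within each bin. Under those assumptions each increment is approximately N(0, θ·dt), and a prior IG(α, β) on θ updates to IG(α + m/2, β + Σ increments²/(2·dt)) for a bin containing m increments. The function names, the equal-size binning, and the default hyperparameters are hypothetical.

```python
import numpy as np

def inverse_gamma_posterior(x, dt, n_bins, alpha=0.1, beta=0.1):
    """Per-bin inverse Gamma posterior (shape, scale) for the squared
    diffusion coefficient, under the simplifying assumptions stated above."""
    increments = np.diff(np.asarray(x, dtype=float))
    bins = np.array_split(increments, n_bins)    # consecutive time bins
    shape = np.array([alpha + b.size / 2 for b in bins])
    scale = np.array([beta + np.sum(b ** 2) / (2 * dt) for b in bins])
    return shape, scale

def posterior_mean(shape, scale):
    """Mean of an inverse Gamma distribution; only valid where shape > 1."""
    return scale / (shape - 1)
```

Because the update is available in closed form for every bin, no sampling scheme is needed to obtain posterior summaries in this simplified setting.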