    Semiparametric and Additive Model Selection Using an Improved Akaike Information Criterion

    An improved AIC-based criterion is derived for model selection in general smoothing-based modeling, including semiparametric models and additive models. Examples are provided of applications to goodness-of-fit, smoothing parameter and variable selection in an additive model and semiparametric models, and variable selection in a model with a nonlinear function of linear terms. Statistics Working Papers Series
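
The abstract does not state the criterion itself. As a hedged illustration, the sketch below uses the widely cited corrected AIC for linear smoothers from Hurvich, Simonoff and Tsai (1998), AICc = log(sigma^2) + 1 + 2(tr(H) + 1) / (n - tr(H) - 2), where H is the smoother ("hat") matrix. Whether the paper uses exactly this form is an assumption, and the function name `aicc_linear_smoother` is hypothetical.

```python
import numpy as np

def aicc_linear_smoother(y, H):
    """Corrected AIC for a linear smoother with fitted values H @ y.

    Assumed form: log(sigma2) + 1 + 2 * (tr(H) + 1) / (n - tr(H) - 2),
    where sigma2 is the mean squared residual.
    """
    y = np.asarray(y, dtype=float)
    H = np.asarray(H, dtype=float)
    n = y.size
    resid = y - H @ y
    sigma2 = np.mean(resid ** 2)      # residual variance estimate
    tr = np.trace(H)                  # effective number of parameters
    return np.log(sigma2) + 1.0 + 2.0 * (tr + 1.0) / (n - tr - 2.0)

# Usage: the "smoother" that fits only the sample mean has tr(H) = 1.
y = np.array([0.0, 2.0, 0.0, 2.0])
H_mean = np.full((4, 4), 0.25)
score = aicc_linear_smoother(y, H_mean)
```

Candidate models or smoothing parameters would be compared by computing this score for each hat matrix and preferring the smallest value.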

    Penalized variable selection procedure for Cox models with semiparametric relative risk

    We study the Cox models with semiparametric relative risk, which can be partially linear with one nonparametric component, or multiple additive or nonadditive nonparametric components. A penalized partial likelihood procedure is proposed to simultaneously estimate the parameters and select variables for both the parametric and the nonparametric parts. Two penalties are applied sequentially. The first penalty, governing the smoothness of the multivariate nonlinear covariate effect function, provides a smoothing spline ANOVA framework that is exploited to derive an empirical model selection tool for the nonparametric part. The second penalty, either the smoothly-clipped-absolute-deviation (SCAD) penalty or the adaptive LASSO penalty, achieves variable selection in the parametric part. We show that the resulting estimator of the parametric part possesses the oracle property, and that the estimator of the nonparametric part achieves the optimal rate of convergence. The proposed procedures are shown to work well in simulation experiments, and are then applied to a real data example on sexually transmitted diseases. Comment: Published at http://dx.doi.org/10.1214/09-AOS780 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
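
For readers unfamiliar with the SCAD penalty named in this abstract, here is a minimal sketch of the standard penalty function of Fan and Li (2001). It shows only the penalty, not the paper's penalized partial likelihood procedure. The function name `scad_penalty` is hypothetical, and the default a = 3.7 is the value commonly suggested in the literature, not something stated in the abstract.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty evaluated elementwise at theta, for lam > 0 and a > 2.

    Linear (LASSO-like) near zero, quadratic in between, and constant for
    large |theta|, so large coefficients are not shrunk further.
    """
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t
    quad = (2.0 * a * lam * t - t ** 2 - lam ** 2) / (2.0 * (a - 1.0))
    const = lam ** 2 * (a + 1.0) / 2.0
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))

# Usage: the penalty grows like lam * |theta| at first, then levels off.
values = scad_penalty([0.0, 0.5, 1.0, 3.7, 10.0], lam=1.0)
```

The flat region beyond a * lam is what separates SCAD from the LASSO, whose penalty keeps growing linearly with the coefficient size.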

    Nonparametric spectral analysis with applications to seizure characterization using EEG time series

    Understanding the seizure initiation process and its propagation pattern(s) is a critical task in epilepsy research. Characteristics of the pre-seizure electroencephalograms (EEGs) such as oscillating powers and high-frequency activities are believed to be indicative of the seizure onset and spread patterns. In this article, we analyze epileptic EEG time series using nonparametric spectral estimation methods to extract information on seizure-specific power and characteristic frequency [or frequency band(s)]. Because the EEGs may become nonstationary before seizure events, we develop methods for both stationary and locally stationary processes. Based on penalized Whittle likelihood, we propose direct generalized maximum likelihood (GML) and generalized approximate cross-validation (GACV) methods to estimate smoothing parameters in both smoothing spline spectrum estimation of a stationary process and smoothing spline ANOVA time-varying spectrum estimation of a locally stationary process. We also propose permutation methods to test whether a locally stationary process is stationary. Extensive simulations indicate that the proposed direct methods, especially the direct GML, are stable and perform better than other existing methods. We apply the proposed methods to the intracranial electroencephalograms (IEEGs) of an epileptic patient to gain insights into the seizure generation process. Comment: Published at http://dx.doi.org/10.1214/08-AOAS185 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
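
The Whittle likelihood that this abstract builds on can be illustrated with a short sketch. It covers only the unpenalized term, the sum over Fourier frequencies of log f(w_k) + I(w_k) / f(w_k), where I is the periodogram and f a candidate spectrum. The smoothing spline penalty and the GML and GACV selectors from the paper are not reproduced. The function names, the 1 / (2 * pi * n) normalization and the equal weighting of all positive frequencies are assumptions of this sketch.

```python
import numpy as np

def periodogram(x):
    """Periodogram at the positive Fourier frequencies (zero frequency dropped)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dft = np.fft.rfft(x - x.mean())
    freqs = 2.0 * np.pi * np.arange(dft.size) / n
    pgram = np.abs(dft) ** 2 / (2.0 * np.pi * n)
    return freqs[1:], pgram[1:]

def whittle_nll(spec, pgram):
    """Negative Whittle log-likelihood, up to an additive constant."""
    spec = np.asarray(spec, dtype=float)
    return float(np.sum(np.log(spec) + pgram / spec))

# Usage: for a flat (white-noise) spectrum the minimizer is the mean periodogram.
rng = np.random.default_rng(0)
x = rng.normal(size=256)
freqs, pgram = periodogram(x)
flat_level = pgram.mean()
best = whittle_nll(np.full_like(pgram, flat_level), pgram)
```

A penalized version would add a roughness penalty on the log spectrum to this objective, with the smoothing parameter chosen by a criterion such as those proposed in the paper.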

    Bias in nearest-neighbor hazard estimation

    In nonparametric curve estimation, the smoothing parameter is critical for performance. In order to estimate the hazard rate, we compare nearest neighbor selectors that minimize the quadratic, the Kullback-Leibler, and the uniform loss. These measures result in a rule of thumb, a cross-validation, and a plug-in selector. A Monte Carlo simulation within the three-parameter exponentiated Weibull distribution indicates that a counter-factual normal distribution, as an input to the selector, does provide a good rule of thumb. If bias is the main concern, minimizing the uniform loss yields the best results, but at the cost of very high variability. Cross-validation has a bias similar to that of the rule of thumb, but also high variability. Keywords: hazard rate, kernel smoothing, bandwidth selection, nearest neighbor bandwidth, rule of thumb, plug-in, cross-validation, credit risk

    The Minimum S-Divergence Estimator under Continuous Models: The Basu-Lindsay Approach

    Robust inference based on the minimization of statistical divergences has proved to be a useful alternative to the classical maximum likelihood based techniques. Recently Ghosh et al. (2013) proposed a general class of divergence measures for robust statistical inference, named the S-Divergence Family. Ghosh (2014) discussed its asymptotic properties for the discrete model of densities. In the present paper, we develop the asymptotic properties of the proposed minimum S-Divergence estimators under continuous models. Here we use the Basu-Lindsay approach (1994) of smoothing the model densities that, unlike previous approaches, avoids many of the complications of kernel bandwidth selection. Illustrations are presented to support the performance of the resulting estimators both in terms of efficiency and robustness through extensive simulation studies and real data examples. Comment: Pre-Print, 34 pages

    Spatial adaptation in heteroscedastic regression: Propagation approach

    The paper concerns the problem of pointwise adaptive estimation in regression when the noise is heteroscedastic and incorrectly known. The use of the local approximation method, which includes local polynomial smoothing as a particular case, leads to a finite family of estimators corresponding to different degrees of smoothing. Data-driven choice of the localization degree in this case can be understood as the problem of selection from this family. This task can be performed by the FLL technique suggested in Katkovnik and Spokoiny (2008), which is based on Lepski's method. An important issue with this type of procedure, the choice of certain tuning parameters, was addressed in Spokoiny and Vial (2009). The authors called their approach to the parameter calibration "propagation". In the present paper the propagation approach is developed and justified for the heteroscedastic case in the presence of noise misspecification. Our analysis shows that the adaptive procedure allows a misspecification of the covariance matrix with a relative error of order 1/log(n), where n is the sample size. Comment: 47 pages. This is the final version of the paper published at http://dx.doi.org/10.1214/08-EJS180 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org

    A Test for Multimodality of Regression Derivatives with an Application to Nonparametric Growth Regressions

    This paper presents a method to test for multimodality of an estimated kernel density of parameter estimates from a local-linear least-squares regression derivative. The procedure is laid out in seven simple steps and a suggestion for implementation is proposed. A Monte Carlo exercise is used to examine the finite sample properties of the test along with those from a calibrated version of it which corrects for the conservative nature of Silverman-type tests. The test is included in a study on nonparametric growth regressions. The results show that in the estimation of unconditional β-convergence, the distribution of the parameter estimates is multimodal with one mode in the negative region (primarily OECD economies) and possibly two modes in the positive region (primarily non-OECD economies) of the parameter estimates. The results for conditional β-convergence show that the density is predominantly negative and unimodal. Finally, the application attempts to determine why particular observations possess positive marginal effects on initial income in both the unconditional and conditional frameworks. Keywords: Nonparametric Kernel; Convergence; Modality Tests

    Nonparametric Bayesian estimation of a Hölder continuous diffusion coefficient

    We consider a nonparametric Bayesian approach to estimate the diffusion coefficient of a stochastic differential equation given discrete time observations over a fixed time interval. As a prior on the diffusion coefficient, we employ a histogram-type prior with piecewise constant realizations on bins forming a partition of the time interval. Specifically, these constants are realizations of independent inverse Gamma distributed random variables. We justify our approach by deriving the rate at which the corresponding posterior distribution asymptotically concentrates around the data-generating diffusion coefficient. This posterior contraction rate turns out to be optimal for estimation of a Hölder-continuous diffusion coefficient with smoothness parameter $0<\lambda\leq 1$. Our approach is straightforward to implement, as the posterior distributions turn out to be inverse Gamma again, and leads to good practical results in a wide range of simulation examples. Finally, we apply our method to exchange rate data sets
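
The conjugacy mentioned in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the drift is ignored, observations are equally spaced on [0, T], the squared diffusion coefficient is constant on each of `n_bins` equal-width bins, and each bin gets an independent inverse Gamma(alpha, beta) prior. All function and parameter names are hypothetical. Under these assumptions an increment in bin k is approximately N(0, theta_k * dt), so the posterior for theta_k is inverse Gamma(alpha + m_k / 2, beta + sum of squared increments / (2 * dt)), with m_k the number of increments in the bin.

```python
import numpy as np

def posterior_ig_params(x, T, n_bins, alpha, beta):
    """Per-bin inverse Gamma posterior parameters for the squared diffusion
    coefficient, given equally spaced observations x of the path on [0, T]."""
    x = np.asarray(x, dtype=float)
    n = x.size - 1                        # number of increments
    dt = T / n
    incr = np.diff(x)
    # Assign the i-th increment to one of n_bins equal-width time bins.
    bin_idx = np.minimum((np.arange(n) * n_bins) // n, n_bins - 1)
    counts = np.bincount(bin_idx, minlength=n_bins)
    sumsq = np.bincount(bin_idx, weights=incr ** 2, minlength=n_bins)
    return alpha + counts / 2.0, beta + sumsq / (2.0 * dt)

def posterior_mean(alpha_post, beta_post):
    """Mean of an inverse Gamma distribution (requires alpha_post > 1)."""
    return beta_post / (alpha_post - 1.0)

# Usage on a toy path with unit-size increments and unit time step.
a_post, b_post = posterior_ig_params([0.0, 1.0, 0.0, 1.0, 0.0], T=4.0,
                                     n_bins=2, alpha=2.0, beta=1.0)
theta_hat = posterior_mean(a_post, b_post)
```

Because the posterior stays in the inverse Gamma family, pointwise credible bands can be read off from inverse Gamma quantiles for each bin without any sampling.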