30,718 research outputs found

    Semiparametric Stepwise Regression to Estimate Sales Promotion Effects

    Kalyanam and Shively (1998) and van Heerde et al. (2001) have proposed semiparametric models to estimate the influence of price promotions on brand sales, and both obtained superior performance for their models compared to strictly parametric modeling. Following these researchers, we suggest another semiparametric framework, based on penalized B-splines, to analyze sales promotion effects flexibly. Unlike these researchers, we introduce a stepwise procedure with simultaneous smoothing parameter choice for variable selection. Applying this stepwise routine enables us to deal with product categories with many competitive items without imposing restrictions on the competitive market structure in advance. We illustrate the new methodology in an empirical application using weekly store-level scanner data.

    Variable selection for BART: An application to gene regulation

    We consider the task of discovering gene regulatory networks, which are defined as sets of genes and the corresponding transcription factors which regulate their expression levels. This can be viewed as a variable selection problem, potentially with high dimensionality. Variable selection is especially challenging in high-dimensional settings, where it is difficult to detect subtle individual effects and interactions between predictors. Bayesian Additive Regression Trees [BART, Ann. Appl. Stat. 4 (2010) 266-298] provides a novel nonparametric alternative to parametric regression approaches, such as the lasso or stepwise regression, especially when the number of relevant predictors is sparse relative to the total number of available predictors and the fundamental relationships are nonlinear. We develop a principled permutation-based inferential approach for determining when the effect of a selected predictor is likely to be real. Going further, we adapt the BART procedure to incorporate informed prior information about variable importance. We present simulations demonstrating that our method compares favorably to existing parametric and nonparametric procedures in a variety of data settings. To demonstrate the potential of our approach in a biological context, we apply it to the task of inferring the gene regulatory network in yeast (Saccharomyces cerevisiae). We find that our BART-based procedure is best able to recover the subset of covariates with the largest signal compared to other variable selection methods. The methods developed in this work are readily available in the R package bartMachine. Comment: Published at http://dx.doi.org/10.1214/14-AOAS755 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Determination of airplane model structure from flight data by using modified stepwise regression

    The linear and stepwise regressions are briefly introduced, then the problem of determining airplane model structure is addressed. The modified stepwise regression (MSR) was constructed to force a linear model for the aerodynamic coefficient first, then add significant nonlinear terms and delete nonsignificant terms from the model. In addition to the statistical criteria in the stepwise regression, the prediction sum of squares (PRESS) criterion and the analysis of residuals were examined for the selection of an adequate model. The procedure is used in examples with simulated and real flight data. It is shown that the MSR performs better than the ordinary stepwise regression and that the technique can also be applied to large-amplitude maneuvers.
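
    The PRESS criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes a one-predictor least-squares line and computes PRESS by brute-force leave-one-out refitting; the function names `fit_line` and `press` are hypothetical, and this is not the paper's MSR procedure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def press(xs, ys):
    """Prediction sum of squares: each point is predicted by a model
    fitted to all the other points, and the squared errors are summed."""
    total = 0.0
    for i in range(len(xs)):
        # Leave observation i out and refit on the remaining data.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        total += (ys[i] - (intercept + slope * xs[i])) ** 2
    return total
```

    A lower PRESS suggests that a candidate model predicts held-out observations better, which is why it can complement in-sample significance tests when choosing among model structures.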

    A methodology for airplane parameter estimation and confidence interval determination in nonlinear estimation problems

    An algorithm for maximum likelihood (ML) estimation is developed with an efficient method for approximating the sensitivities. The ML algorithm relies on a new optimization method referred to as a modified Newton-Raphson with estimated sensitivities (MNRES). MNRES determines sensitivities by using slope information from local surface approximations of each output variable in parameter space. With the fitted surface, sensitivity information can be updated at each iteration with less computational effort than that required by either a finite-difference method or integration of the analytically determined sensitivity equations. MNRES eliminates the need to derive sensitivity equations for each new model, and thus provides flexibility to use model equations in any convenient format. A random search technique for determining the confidence limits of ML parameter estimates is applied to nonlinear estimation problems for airplanes. The confidence intervals obtained by the search are compared with Cramér-Rao (CR) bounds at the same confidence level. The degree of nonlinearity in the estimation problem is an important factor in the relationship between CR bounds and the error bounds determined by the search technique. Beale's measure of nonlinearity is developed in this study for airplane identification problems; it is used to empirically correct confidence levels and to predict the degree of agreement between CR bounds and search estimates.

    Persistence of Regional Unemployment: Application of a Spatial Filtering Approach to Local Labour Markets in Germany

    The geographical distribution and persistence of regional/local unemployment rates in heterogeneous economies (such as Germany) have been, in recent years, the subject of various theoretical and empirical studies. Several researchers have shown an interest in analysing the dynamic adjustment processes of unemployment and the average degree of dependence of the current unemployment rates or gross domestic product on those observed in the past. In this paper, we present a new econometric approach to the study of regional unemployment persistence, in order to account for spatial heterogeneity and/or spatial autocorrelation in both the levels and the dynamics of unemployment. First, we propose an econometric procedure suggesting the use of spatial filtering techniques as a substitute for fixed effects in a panel estimation framework. The spatial filter computed here is a proxy for spatially distributed region-specific information (e.g., the endowment of natural resources, or the size of the ‘home market’) that is usually incorporated in the fixed effects parameters. The advantages of our proposed procedure are that the spatial filter, by incorporating region-specific information that generates spatial autocorrelation, frees up degrees of freedom, simultaneously corrects for time-stable spatial autocorrelation in the residuals, and provides insights about the spatial patterns in regional adjustment processes. We present several experiments in order to investigate the spatial pattern of the heterogeneous autoregressive parameters estimated for unemployment data for German NUTS-3 regions. We find widely heterogeneous but generally high persistence in regional unemployment rates.

    Improved model identification for non-linear systems using a random subsampling and multifold modelling (RSMM) approach

    In non-linear system identification, the available observed data are conventionally partitioned into two parts: the training data that are used for model identification and the test data that are used for model performance testing. This sort of 'hold-out' or 'split-sample' data partitioning method is convenient and the associated model identification procedure is in general easy to implement. The resultant model obtained from such a once-partitioned single training dataset, however, may occasionally lack robustness and generalisation to represent future unseen data, because the performance of the identified model may be highly dependent on how the data partition is made. To overcome the drawback of the hold-out data partitioning method, this study presents a new random subsampling and multifold modelling (RSMM) approach to produce less biased or preferably unbiased models. The basic idea and the associated procedure are as follows. First, generate K training datasets (and also K validation datasets), using a K-fold random subsampling method. Secondly, detect significant model terms and identify a common model structure that fits all the K datasets using a newly proposed common model selection approach, called the multiple orthogonal search algorithm. Finally, estimate and refine the model parameters for the identified common-structured model using a multifold parameter estimation method. The proposed method can produce robust models with better generalisation performance.
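
    The first step of the procedure described in this abstract, generating K training and K validation datasets by random subsampling, might look roughly like the sketch below. It assumes each of the K splits is an independent random partition of the sample indices; the function name `random_subsamples`, the `train_fraction` parameter, and the seed handling are hypothetical and not taken from the paper. The multiple orthogonal search and multifold parameter estimation steps are not shown.

```python
import random


def random_subsamples(n_samples, k, train_fraction=0.8, seed=0):
    """Return k (train_indices, validation_indices) pairs.

    Each pair is an independent random partition of range(n_samples),
    so every observation lands in exactly one of the two sets per split.
    """
    rng = random.Random(seed)  # local generator keeps results reproducible
    n_train = int(round(train_fraction * n_samples))
    splits = []
    for _ in range(k):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        train = sorted(indices[:n_train])
        validation = sorted(indices[n_train:])
        splits.append((train, validation))
    return splits
```

    A common model structure would then be sought across all K training sets rather than a single hold-out split, which is the motivation the abstract gives for the approach.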

    Variable Selection and Model Choice in Structured Survival Models

    In many situations, medical applications call for flexible survival models that extend the classical Cox model by including time-varying and nonparametric effects. These structured survival models are very flexible, but additional difficulties arise when model choice and variable selection are desired. In particular, it has to be decided which covariates should be assigned time-varying effects or whether parametric modeling is sufficient for a given covariate. Component-wise boosting provides a means of likelihood-based model fitting that enables simultaneous variable selection and model choice. We introduce a component-wise likelihood-based boosting algorithm for survival data that permits the inclusion of both parametric and nonparametric time-varying effects as well as nonparametric effects of continuous covariates, utilizing penalized splines as the main modeling technique. Its properties and performance are investigated in simulation studies. The new modeling approach is used to build a flexible survival model for intensive care patients suffering from severe sepsis. A software implementation is available to the interested reader.

    Persistence of regional unemployment : Application of a spatial filtering approach to local labour markets in Germany

    "The geographical distribution and persistence of regional/local unemployment rates in heterogeneous economies (such as Germany) have been, in recent years, the subject of various theoretical and empirical studies. Several researchers have shown an interest in analysing the dynamic adjustment processes of unemployment and the average degree of dependence of the current unemployment rates or gross domestic product on those observed in the past. In this paper, we present a new econometric approach to the study of regional unemployment persistence, in order to account for spatial heterogeneity and/or spatial autocorrelation in both the levels and the dynamics of unemployment. First, we propose an econometric procedure suggesting the use of spatial filtering techniques as a substitute for fixed effects in a panel estimation framework. The spatial filter computed here is a proxy for spatially distributed region-specific information (e.g., the endowment of natural resources, or the size of the 'home market') that is usually incorporated in the fixed effects parameters. The same argument applies to the spatial filter modelling of the heterogeneous dynamics. The advantages of our proposed procedure are that the spatial filter, by incorporating region-specific information that generates spatial autocorrelation, frees up degrees of freedom, simultaneously corrects for time-stable spatial autocorrelation in the residuals, and provides insights about the spatial patterns in regional adjustment processes. We present several experiments in order to investigate the spatial pattern of the heterogeneous autoregressive parameters estimated for unemployment data for German NUTS-3 regions. We find widely heterogeneous but generally high persistence in regional unemployment rates." (Author's abstract, IAB-Doku) Keywords: unemployment rate, persistence, estimation, regional disparity

    Modelling evapotranspiration of soilless cut roses "Red Naomi" based on climatic and crop predictors

    This study aimed to estimate the daily crop evapotranspiration (ETc) of soilless cut ‘Red Naomi’ roses, cultivated in a commercial glass greenhouse, using climatic and crop predictors. A multiple stepwise regression technique was applied for estimating ETc using the daily relative humidity, stem leaf area and number of leaves of the bent stems. The model explained 90% of the daily ETc variability (R² = 0.90, n = 33, P < 0.0001) measured by weighing lysimeters. The mean relative difference between the observed and the estimated daily ETc was 9.1%. The methodology revealed a high accuracy and precision in the estimation of daily ETc.