
    Flexible Modelling of Discrete Failure Time Including Time-Varying Smooth Effects

    Discrete survival models have been extended in several ways. More flexible models are obtained by including time-varying coefficients and covariates which determine the hazard rate in an additive but not further specified form. In this paper a general model is considered which comprises both types of covariate effects. An additional extension is the incorporation of smooth interactions between time and covariates; thus the linear predictor allows smooth effects of covariates which may vary across time. It is shown how simple duration models produce artefacts which may be avoided by flexible models. For the general model, which includes parametric terms, time-varying coefficients in parametric terms and time-varying smooth effects, estimation procedures are derived which are based on the regularized expansion of smooth effects in basis functions.
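The discrete-time hazard setup that these extensions build on can be sketched in a few lines. This is a minimal illustration with a logistic link; the names `gamma` (period-specific baseline effects) and `beta` (covariate effects) are assumptions for this sketch, not the paper's notation.

```python
import numpy as np

def discrete_hazard(t, x, gamma, beta):
    # hazard of failure in period t, given survival up to t (logistic link)
    eta = gamma[t] + x @ beta  # time-varying intercept plus linear predictor
    return 1.0 / (1.0 + np.exp(-eta))

def survival_curve(x, gamma, beta):
    # S(t) = prod_{s <= t} (1 - lambda(s | x)) for a discrete failure time
    hazards = np.array([discrete_hazard(t, x, gamma, beta)
                        for t in range(len(gamma))])
    return np.cumprod(1.0 - hazards)
```

A time-varying coefficient would replace the constant `beta` with a function of `t`, and the paper's smooth effects would replace the linear term `x @ beta` by unspecified additive functions expanded in basis functions.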

    Localized Regression

    The main problem with localized discriminant techniques is the curse of dimensionality, which seems to restrict their use to the case of few variables. This restriction does not hold if localization is combined with a reduction of dimension. In particular it is shown that localization yields powerful classifiers even in higher dimensions if it is combined with locally adaptive selection of predictors. A robust localized logistic regression (LLR) method is developed for which all tuning parameters are chosen data-adaptively. In an extended simulation study we evaluate the potential of the proposed procedure for various types of data and compare it to other classification procedures. In addition we demonstrate that automatic choice of localization, predictor selection and penalty parameters based on cross-validation works well. Finally the method is applied to real data sets and its real-world performance is compared to alternative procedures.
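The localization idea itself can be sketched compactly: observations are weighted by a kernel around the query point, and a logistic model is fitted to the weighted sample. The function name, the Gaussian kernel, the fixed `bandwidth`, and the small ridge stabilizer are illustrative assumptions here; the paper chooses all tuning parameters data-adaptively and adds predictor selection on top.

```python
import numpy as np

def localized_logistic_predict(X, y, x0, bandwidth=1.0, n_newton=15):
    # Gaussian kernel weights: observations near the query point x0 count more
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2 * bandwidth ** 2))
    Xb = np.hstack([np.ones((len(y), 1)), X])      # add intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_newton):                      # weighted Newton (IRLS) steps
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ beta, -30, 30)))
        grad = Xb.T @ (w * (y - p))
        hess = (Xb * (w * p * (1 - p))[:, None]).T @ Xb
        beta += np.linalg.solve(hess + 1e-6 * np.eye(len(beta)), grad)
    eta0 = np.concatenate(([1.0], x0)) @ beta
    return 1.0 / (1.0 + np.exp(-np.clip(eta0, -30, 30)))
```

Because the fit is redone for every query point, the cost grows with the number of predictions, which is one reason the combination with dimension reduction matters in higher dimensions.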

    Generalized additive modelling with implicit variable selection by likelihood based boosting

    The use of generalized additive models in statistical data analysis suffers from the restriction to few explanatory variables and from the problem of selecting smoothing parameters. Generalized additive model boosting circumvents these problems by means of stagewise fitting of weak learners. A fitting procedure is derived which works for all simple exponential family distributions, including binomial, Poisson and normal response variables. The procedure combines the selection of variables with the determination of the appropriate amount of smoothing. As weak learners, penalized regression splines and the newly introduced penalized stumps are considered. Estimates of standard deviations and stopping criteria, which are notorious problems in iterative procedures, are based on an approximate hat matrix. The method is shown to outperform common procedures for the fitting of generalized additive models. In particular, in high-dimensional settings it is the only method that works properly.
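The stagewise logic can be illustrated for a normal response with stump-style weak learners: each step fits a stump to the current residuals for every variable, keeps only the best one, and adds a damped version of it. This is a bare componentwise sketch under assumed defaults (unpenalized stumps, fixed step length `nu`, fixed number of steps), not the paper's procedure with likelihood-based fitting and hat-matrix-based stopping.

```python
import numpy as np

def fit_stump(x, r):
    # best single split of x minimizing squared error of residuals r
    best = None
    for s in np.unique(x)[:-1]:
        left, right = r[x <= s], r[x > s]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, s, left.mean(), right.mean())
    return best

def boost_gam(X, y, n_steps=50, nu=0.1):
    f = np.full(len(y), y.mean())          # start from the constant fit
    for _ in range(n_steps):
        r = y - f
        # componentwise: pick the variable whose stump reduces error most
        fits = [fit_stump(X[:, j], r) for j in range(X.shape[1])]
        j = int(np.argmin([fit[0] for fit in fits]))
        _, s, cl, cr = fits[j]
        f = f + nu * np.where(X[:, j] <= s, cl, cr)  # damped update
    return f
```

Because only one variable is updated per step and the procedure is stopped early, variables that never get selected drop out implicitly, which is the variable-selection effect the abstract refers to.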

    Boosting Ridge Regression

    Ridge regression is a well-established method that shrinks regression parameters towards zero, thereby securing the existence of estimates. The present paper investigates several approaches to combining ridge regression with boosting techniques. In the direct approach the ridge estimator is used to fit the current residuals iteratively, yielding an alternative to the usual ridge estimator. In partial boosting only part of the regression parameters are re-estimated within one step of the iterative procedure. The technique makes it possible to distinguish between variables that are always included in the analysis and variables that are chosen only if relevant. The resulting procedure selects variables in a similar way as the Lasso, yielding a reduced set of influential variables. The suggested procedures are investigated within the classical framework of continuous response variables as well as in the case of generalized linear models. In a simulation study, boosting procedures for different stopping criteria are investigated, and the performance in terms of prediction and the identification of relevant variables is compared to several competitors such as the Lasso and the more recently proposed elastic net.
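The direct approach of refitting the ridge estimator to the current residuals can be sketched as follows. The fixed step length `nu` and fixed iteration count are illustrative assumptions; in the paper the stopping criterion governs the effective amount of shrinkage.

```python
import numpy as np

def boost_ridge(X, y, lam=10.0, n_steps=100, nu=0.5):
    # direct ridge boosting: repeatedly fit the ridge estimator to residuals
    p = X.shape[1]
    A = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)  # ridge estimator map
    beta = np.zeros(p)
    for _ in range(n_steps):
        beta += nu * (A @ (y - X @ beta))  # ridge fit of current residuals
    return beta
```

Run long enough, the iteration converges to the unpenalized least-squares fit; early stopping leaves the coefficients between the single ridge estimate and the least-squares solution, which is where the bias reduction relative to plain ridge regression comes from.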

    Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models

    Background When predictive survival models are built from high-dimensional data, there are often additional covariates, such as clinical scores, that by all means have to be included into the final model. While there are several techniques for the fitting of sparse high-dimensional survival models by penalized parameter estimation, none allows for explicit consideration of such mandatory covariates. Results We introduce a new boosting algorithm for censored time-to-event data that shares the favorable properties of existing approaches, i.e., it results in sparse models with good prediction performance, but uses an offset-based update mechanism. The latter allows for tailored penalization of the covariates under consideration. Specifically, unpenalized mandatory covariates can be introduced. Microarray survival data from patients with diffuse large B-cell lymphoma, in combination with the recent, bootstrap-based prediction error curve technique, is used to illustrate the advantages of the new procedure. Conclusion It is demonstrated that it can be highly beneficial in terms of prediction performance to use an estimation procedure that incorporates mandatory covariates into high-dimensional survival models. The new approach also makes it possible to answer the question of whether improved predictions are obtained by including microarray features in addition to classical clinical criteria.
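The offset idea can be sketched for a Gaussian response; the paper works with censored time-to-event data, so this is an analogy under simplified assumptions, not the proposed algorithm. Mandatory covariates are refitted without penalty in every step (forming the offset), while the remaining covariates enter only through slow componentwise updates and therefore stay sparse.

```python
import numpy as np

def boost_with_mandatory(X_mand, X_opt, y, n_steps=50, nu=0.1):
    beta_m = np.zeros(X_mand.shape[1])
    beta_o = np.zeros(X_opt.shape[1])
    for _ in range(n_steps):
        # mandatory covariates: unpenalized least-squares refit (the offset)
        r = y - X_opt @ beta_o
        beta_m = np.linalg.lstsq(X_mand, r, rcond=None)[0]
        # optional covariates: componentwise update damped by the step nu
        r = y - X_mand @ beta_m - X_opt @ beta_o
        scores = X_opt.T @ r
        norms = (X_opt ** 2).sum(axis=0)
        j = int(np.argmax(np.abs(scores) / np.sqrt(norms)))
        beta_o[j] += nu * scores[j] / norms[j]
    return beta_m, beta_o
```

Because the mandatory block is refitted exactly at every step, its coefficients never suffer the shrinkage that the slow updates impose on the optional covariates, which is the tailored penalization the abstract describes.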

    Patterns of Change in Industrial Countries

    This article examines the development of heavily polluting industries since 1970, analysing 11 basic-materials industries, electricity generation and road freight transport in 32 industrial countries. It argues that a green industrial policy is needed, demonstrating that all other attempts to reduce environmental pressure in industrial countries – end-of-pipe treatment, relocation to Third World countries, intersectoral structural change and environmentally oriented modernisation without state intervention – have so far achieved only partial successes and ultimately could not solve the problems of “dirty industries”.

    A graphical tool for locating inconsistency in network meta-analyses

    BACKGROUND: In network meta-analyses, several treatments can be compared by connecting evidence from clinical trials that have investigated two or more of them. The resulting trial network allows the relative effects of all pairs of treatments to be estimated, taking indirect evidence into account. A valid analysis of the network assumes consistent information from different pathways. Consistency can be checked by contrasting effect estimates from direct comparisons with the evidence of the remaining network. Unfortunately, one deviating direct comparison may have side effects on the network estimates of others, thus producing hot spots of inconsistency. METHODS: We provide a tool, the net heat plot, to make transparent which direct comparisons drive each network estimate and to display hot spots of inconsistency; this permits singling out which of the suspicious direct comparisons are sufficient to explain the presence of inconsistency. We base our methods on fixed-effects models. To disclose potential drivers, the plot comprises the contribution of each direct estimate to the network estimates, derived from regression diagnostics. In combination with this, heat colors show the change in agreement between direct and indirect estimates when the assumption of consistency is relaxed for one direct comparison. A clustering procedure is applied to the heat matrix in order to find hot spots of inconsistency. RESULTS: The method is shown to work with several examples, which are constructed by perturbing the effect of single study designs, and with two published network meta-analyses. Once possible sources of inconsistency are identified, our method also reveals which network estimates they affect. CONCLUSION: Our proposal is useful for identifying sources of inconsistency in the network together with the interrelatedness of effect estimates. It opens the way for further analysis based on subject-matter considerations.
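The contribution side of the method can be illustrated for the smallest possible network, a triangle of treatments A, B and C fitted as a fixed-effect weighted least-squares model. The numbers below are made up, and the hat-matrix row is only the first ingredient of the net heat plot; the heat colors and the clustering step are not reproduced here.

```python
import numpy as np

# Triangle network: each row is one direct comparison, expressed in the two
# basic parameters d_AB, d_AC (effects of B and C versus reference A)
X = np.array([[1.0, 0.0],    # B vs A estimates d_AB directly
              [0.0, 1.0],    # C vs A estimates d_AC directly
              [-1.0, 1.0]])  # C vs B estimates d_AC - d_AB
d_direct = np.array([0.5, 0.8, 0.1])  # hypothetical direct effect estimates
v = np.array([0.04, 0.04, 0.04])      # hypothetical variances
W = np.diag(1.0 / v)                  # inverse-variance weights

XtWX = X.T @ W @ X
beta = np.linalg.solve(XtWX, X.T @ W @ d_direct)  # network estimates d_AB, d_AC
# hat matrix: row i shows how much each direct estimate drives network fit i
H = X @ np.linalg.solve(XtWX, X.T @ W)
```

With equal variances the first row of `H` is (2/3, 1/3, -1/3): the network estimate of d_AB takes two thirds of its information from the direct B-vs-A comparison and one third from the indirect path through C, which is the per-cell contribution information the net heat plot displays.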