114 research outputs found

    Extending the inferential capabilities of model-based gradient boosting algorithms

    Get PDF
    Background and aims: The rapid development of computer technology in recent decades has not only enabled the practical application of previously merely theoretical ideas in statistical data analysis but has also led to a multitude of new, increasingly computationally intensive analysis strategies emerging from the field of machine learning. Further developments of the successful boosting algorithms revealed their relationship to well-known statistical concepts and made them usable for the estimation of regularized regression parameters of additive models. This thesis focuses on the resulting model properties as well as their improvement and extension with regard to inferential validity and interpretability.

    Methods: All presented approaches address various forms of model-based boosting algorithms. The algorithm is initialized with an empty model, which is sequentially updated in the following iterations by repeated application of small regression functions to build a final additive model. This thesis examines the resulting estimators and model properties in comparison with other regularization methods such as L1-penalization. In addition, alternative strategies for improving the variable selection properties of the overall model are proposed, and strategies for testing individual effects are developed. For this purpose, variants of variable permutation and bootstrapping methods are developed.

    Results: Regularization of linear effect estimators by means of model-based gradient boosting behaves asymptotically like L1-penalization for decreasing learning rate ν if and only if the inverse covariance matrix of the predictor variables is diagonally dominant. Differences between the methods can be traced back to the sequential aggregation of the boosting model, which stabilizes the regularization paths but makes the models tend to include more variables. To avoid a large number of false positive selections, the focus can therefore be shifted from prediction accuracy to variable selection accuracy by extending the dataset with permuted versions of the predictor variables. Residual permutation and the parametric bootstrap allow the computation of p-values with test power on par with Wald tests for maximum likelihood estimators in low-dimensional scenarios.

    Practical conclusions: The results of this work provide a guideline for the choice between boosting and L1-penalization as regularization method for statistical models. In addition, the applicability of model-based gradient boosting algorithms is improved in situations where a more detailed interpretation of the selected variables is of central interest. The reliability of the informative value of the selected variables is increased by alternative tuning via permuted variables. Moreover, the parametric bootstrap allows for the first time the calculation of p-values for single effect estimators of gradient boosting algorithms in high-dimensional scenarios with correlated predictor variables.
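
    To make the abstract's two main ingredients concrete, the following is a minimal numpy sketch of componentwise L2-boosting for linear effects with a small learning rate ν, combined with a crude parametric-bootstrap p-value for a single coefficient. It is an illustration of the general idea under simplifying assumptions (centered data, simple linear base-learners, a rough null model), not the thesis's exact algorithm; all names and settings are hypothetical.

        import numpy as np

        def boost_linear(X, y, nu=0.01, n_iter=500):
            """Componentwise L2-boosting: start from the empty model and
            repeatedly update only the coefficient whose base-learner best
            fits the current residuals, shrunken by the learning rate nu."""
            n, p = X.shape
            beta = np.zeros(p)
            for _ in range(n_iter):
                resid = y - X @ beta
                # least-squares slope of each column against the residuals
                fits = (X.T @ resid) / (X ** 2).sum(axis=0)
                sse = ((resid[:, None] - X * fits) ** 2).sum(axis=0)
                j = np.argmin(sse)              # best-fitting component
                beta[j] += nu * fits[j]         # small step in that direction
            return beta

        rng = np.random.default_rng(1)
        n, p = 100, 10
        X = rng.standard_normal((n, p))
        y = X @ np.r_[1.0, np.zeros(p - 1)] + rng.standard_normal(n)

        beta_hat = boost_linear(X, y)

        # Parametric bootstrap for H0: beta_1 = 0 -- simulate from the null
        # model, refit, and compare refitted coefficients with the observed
        # estimate (two-sided).
        B, j = 200, 0
        beta_null = beta_hat.copy(); beta_null[j] = 0.0
        sigma = np.std(y - X @ beta_hat)
        stats = np.empty(B)
        for b in range(B):
            y_b = X @ beta_null + rng.normal(0.0, sigma, size=n)
            stats[b] = boost_linear(X, y_b)[j]
        p_value = np.mean(np.abs(stats) >= np.abs(beta_hat[j]))
        print(beta_hat[j], p_value)

    For small ν, the coefficient path traced by repeated calls with increasing n_iter is the object the thesis compares with the L1 regularization path.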

    An update on statistical boosting in biomedicine

    Get PDF
    Statistical boosting algorithms have triggered a lot of research during the last decade. They combine a powerful machine-learning approach with classical statistical modelling, offering various practical advantages like automated variable selection and implicit regularization of effect estimates. They are extremely flexible, as the underlying base-learners (regression functions defining the type of effect for the explanatory variables) can be combined with any kind of loss function (the target function to be optimized, defining the type of regression setting). In this review article, we highlight the most recent methodological developments in statistical boosting regarding variable selection, functional regression and advanced time-to-event modelling. Additionally, we provide a short overview of relevant applications of statistical boosting in biomedicine.
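
    The modularity described above, i.e. that the loss enters the algorithm only through its negative gradient while the base-learner only provides the fit, can be sketched in a few lines. This is a hedged, generic illustration in Python, not the API of any particular boosting package; the function names are assumptions.

        import numpy as np

        def neg_grad_l2(y, f):
            # Gaussian regression: pseudo-residuals are ordinary residuals
            return y - f

        def neg_grad_logistic(y, f):
            # binomial loss for y in {0, 1}: y - sigmoid(f)
            return y - 1.0 / (1.0 + np.exp(-f))

        def boost(X, y, neg_grad, nu=0.1, n_iter=100):
            """Componentwise gradient boosting with pluggable loss: only the
            negative-gradient callable changes between regression settings."""
            n, p = X.shape
            beta, f = np.zeros(p), np.zeros(n)
            for _ in range(n_iter):
                u = neg_grad(y, f)                       # pseudo-residuals
                slopes = (X.T @ u) / (X ** 2).sum(axis=0)
                sse = ((u[:, None] - X * slopes) ** 2).sum(axis=0)
                j = np.argmin(sse)
                beta[j] += nu * slopes[j]
                f = X @ beta
            return beta

    The same loop then fits a Gaussian model via boost(X, y, neg_grad_l2) or a logistic classifier via boost(X, y01, neg_grad_logistic).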

    MedGAN: Medical Image Translation using GANs

    Full text link
    Image-to-image translation is considered a new frontier in the field of medical image analysis, with numerous potential applications. However, a large portion of recent approaches offers individualized solutions based on specialized task-specific architectures or requires refinement through non-end-to-end training. In this paper, we propose a new framework, named MedGAN, for medical image-to-image translation which operates on the image level in an end-to-end manner. MedGAN builds upon recent advances in the field of generative adversarial networks (GANs) by merging the adversarial framework with a new combination of non-adversarial losses. We utilize a discriminator network as a trainable feature extractor which penalizes the discrepancy between the translated medical images and the desired modalities. Moreover, style-transfer losses are utilized to match the textures and fine structures of the desired target images to the translated images. Additionally, we present a new generator architecture, titled CasNet, which enhances the sharpness of the translated medical outputs through progressive refinement via encoder-decoder pairs. Without any application-specific modifications, we apply MedGAN to three different tasks: PET-CT translation, correction of MR motion artefacts and PET image denoising. Perceptual analysis by radiologists and quantitative evaluations illustrate that MedGAN outperforms other existing translation approaches.
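
    The loss structure described in the abstract, an adversarial term plus non-adversarial terms built from discriminator features (a perceptual term and a Gram-matrix style term), can be sketched as follows in PyTorch. The weighting factors and layer choices are illustrative assumptions, not the paper's values.

        import torch
        import torch.nn.functional as F

        def gram(feat):
            # feat: (B, C, H, W) -> (B, C, C) Gram matrix capturing textures
            b, c, h, w = feat.shape
            f = feat.reshape(b, c, h * w)
            return f @ f.transpose(1, 2) / (c * h * w)

        def generator_loss(disc_feats_fake, disc_feats_real, logit_fake,
                           lambda_percep=1.0, lambda_style=1.0):
            """disc_feats_*: lists of feature maps taken from the
            discriminator, used as a trainable feature extractor;
            logit_fake: discriminator output for the translated image."""
            adv = F.binary_cross_entropy_with_logits(
                logit_fake, torch.ones_like(logit_fake))  # fool the critic
            percep = sum(F.l1_loss(ff, fr.detach())
                         for ff, fr in zip(disc_feats_fake, disc_feats_real))
            style = sum(F.l1_loss(gram(ff), gram(fr.detach()))
                        for ff, fr in zip(disc_feats_fake, disc_feats_real))
            return adv + lambda_percep * percep + lambda_style * style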

    ipA-MedGAN: Inpainting of Arbitrary Regions in Medical Imaging

    Full text link
    Local deformations in medical modalities are common phenomena due to a multitude of factors such as metallic implants or limited fields of view in magnetic resonance imaging (MRI). Completion of the missing or distorted regions is of special interest for automatic image analysis frameworks to enhance post-processing tasks such as segmentation or classification. In this work, we propose a new generative framework for medical image inpainting, titled ipA-MedGAN. It bypasses the limitations of previous frameworks by enabling inpainting of arbitrarily shaped regions without prior localization of the regions of interest. Thorough qualitative and quantitative comparisons with other inpainting and translation approaches illustrate the superior performance of the proposed framework for the task of brain MR inpainting.
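
    One way to make the "arbitrary regions without prior localization" idea concrete is in how training pairs are built: an arbitrarily shaped region is corrupted and the generator is trained on the full image, so no mask channel or ROI coordinates are supplied. This is a hypothetical numpy sketch of such a pipeline, not the authors' actual data preparation.

        import numpy as np

        def random_blob_mask(h, w, n_steps=400, rng=None):
            """Random-walk 'blob' mask of arbitrary shape (1 = missing)."""
            rng = rng or np.random.default_rng()
            mask = np.zeros((h, w), dtype=np.float32)
            r, c = int(rng.integers(h)), int(rng.integers(w))
            for _ in range(n_steps):
                mask[max(r - 2, 0):r + 3, max(c - 2, 0):c + 3] = 1.0  # brush
                r = int(np.clip(r + rng.integers(-3, 4), 0, h - 1))
                c = int(np.clip(c + rng.integers(-3, 4), 0, w - 1))
            return mask

        def make_pair(img, rng=None):
            mask = random_blob_mask(*img.shape, rng=rng)
            corrupted = img * (1.0 - mask)   # zero out the distorted region
            return corrupted, img            # network input, full-image target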

    Adaptive step-length selection in gradient boosting for Gaussian location and scale models

    Get PDF
    Tuning of model-based boosting algorithms relies mainly on the number of iterations, while the step-length is fixed at a predefined value. For complex models with several predictors, such as generalized additive models for location, scale and shape (GAMLSS), imbalanced updates of the predictors, where some distribution parameters are updated more frequently than others, can be a problem that prevents some submodels from being appropriately fitted within a limited number of boosting iterations. We propose an approach using adaptive step-length (ASL) determination within a non-cyclical boosting algorithm for Gaussian location and scale models, as an important special case of the wider class of GAMLSS, to prevent such imbalance. Moreover, we discuss properties of the ASL and derive a semi-analytical form that avoids manual selection of the search interval and numerical optimization to find the optimal step-length, and consequently improves computational efficiency. We show competitive behavior of the proposed approaches compared to penalized maximum likelihood and boosting with a fixed step-length for Gaussian location and scale models in two simulations and two applications, in particular for cases of large variance and/or more variables than observations. In addition, the underlying concept of the ASL is also applicable to the whole GAMLSS framework and to other models with more than one predictor, like zero-inflated count models, and provides insights into the choice of reasonable defaults for the step-length in the simpler special case of (Gaussian) additive models.
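
    A minimal sketch of the concept follows: non-cyclical boosting for a Gaussian location and scale model, where in each iteration both distribution parameters propose an update, the step-length is found by a line search, and only the better update is applied. The semi-analytical ASL derived in the paper is replaced here by a numerical search, and all settings are illustrative assumptions.

        import numpy as np
        from scipy.optimize import minimize_scalar

        def nll(y, mu, theta):   # theta = log(sigma)
            return np.sum(theta + (y - mu) ** 2 / (2.0 * np.exp(2 * theta)))

        def best_baselearner(X, u):
            """Best simple linear base-learner for pseudo-residuals u."""
            slopes = (X.T @ u) / (X ** 2).sum(axis=0)
            sse = ((u[:, None] - X * slopes) ** 2).sum(axis=0)
            j = int(np.argmin(sse))
            return X[:, j] * slopes[j]

        def boost_gauss_ls(X, y, n_iter=200, shrink=0.1):
            n = len(y)
            mu = np.full(n, y.mean())
            theta = np.full(n, np.log(y.std()))
            for _ in range(n_iter):
                u_mu = (y - mu) / np.exp(2 * theta)            # -d(nll)/d(mu)
                u_th = (y - mu) ** 2 / np.exp(2 * theta) - 1   # -d(nll)/d(theta)
                h_mu = best_baselearner(X, u_mu)
                h_th = best_baselearner(X, u_th)
                # line search for the optimal step-length of each candidate
                s_mu = minimize_scalar(lambda s: nll(y, mu + s * h_mu, theta),
                                       bounds=(0, 10), method="bounded")
                s_th = minimize_scalar(lambda s: nll(y, mu, theta + s * h_th),
                                       bounds=(0, 10), method="bounded")
                # non-cyclical: update only the distribution parameter whose
                # line-searched update reduces the loss the most
                if s_mu.fun <= s_th.fun:
                    mu = mu + shrink * s_mu.x * h_mu
                else:
                    theta = theta + shrink * s_th.x * h_th
            return mu, np.exp(theta)

    Making the step-length adaptive per parameter is what counteracts the imbalanced updates between the location and scale submodels.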

    Probing for Sparse and Fast Variable Selection with Model-Based Boosting

    Get PDF
    We present a new variable selection method based on model-based gradient boosting and randomly permuted variables. Model-based boosting is a tool to fit a statistical model while performing variable selection at the same time. A drawback of this fitting procedure is the need for multiple model fits on slightly altered data (e.g., cross-validation or bootstrap) to find the optimal number of boosting iterations and prevent overfitting. In our proposed approach, we augment the data set with randomly permuted versions of the true variables, so-called shadow variables, and stop the stepwise fitting as soon as such a variable would be added to the model. This allows variable selection in a single fit of the model without requiring further parameter tuning. We show that our probing approach can compete with state-of-the-art selection methods like stability selection in a high-dimensional classification benchmark and apply it to three gene expression data sets.
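
    The probing idea translates almost directly into code. Below is a minimal numpy sketch for the L2-loss case: augment the design matrix with permuted shadow copies of each variable and stop componentwise boosting the moment a shadow would be selected. It illustrates the stopping rule only and is not the authors' implementation.

        import numpy as np

        def probing_selection(X, y, nu=0.1, max_iter=1000, rng=None):
            rng = rng or np.random.default_rng()
            n, p = X.shape
            shadows = rng.permuted(X, axis=0)    # permute each column
            Z = np.hstack([X, shadows])          # augmented design matrix
            beta = np.zeros(2 * p)
            for _ in range(max_iter):
                resid = y - Z @ beta
                slopes = (Z.T @ resid) / (Z ** 2).sum(axis=0)
                sse = ((resid[:, None] - Z * slopes) ** 2).sum(axis=0)
                j = int(np.argmin(sse))
                if j >= p:                       # a shadow variable won: stop
                    break
                beta[j] += nu * slopes[j]
            return np.flatnonzero(beta[:p])      # indices of selected variables

    Because the shadow variables are uninformative by construction, the first time one of them fits the residuals best is the point where further selections would likely be false positives, so no cross-validation over the number of iterations is needed.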

    Mixture density networks for the indirect estimation of reference intervals

    Get PDF
    Background: Reference intervals represent the expected range of physiological test results in a healthy population and are essential to support medical decision making. Particularly in the context of pediatric reference intervals, where recruitment regulations make prospective studies challenging to conduct, indirect estimation strategies are becoming increasingly important. Established indirect methods enable robust identification of the distribution of "healthy" samples from laboratory databases, which include unlabeled pathologic cases, but are currently severely limited when adjusting for essential patient characteristics such as age. Here, we propose the use of mixture density networks (MDN) to overcome this problem and model all parameters of the mixture distribution in a single step.

    Results: Estimated reference intervals from varying settings with simulated data demonstrate the ability to accurately estimate latent distributions from unlabeled data using different implementations of MDNs. Comparing the performance with alternative estimation approaches further highlights the importance of modeling the mixture component weights as a function of the input in order to avoid biased estimates for all other parameters and the resulting reference intervals. We also provide a strategy to generate partially customized starting weights to improve proper identification of the latent components. Finally, the application to real-world hemoglobin samples provides results in line with current gold-standard approaches, but also suggests further investigation of adequate regularization strategies in order to prevent overfitting the data.

    Conclusions: Mixture density networks provide a promising approach capable of extracting the distribution of healthy samples from unlabeled laboratory databases while simultaneously and explicitly estimating all parameters and component weights as non-linear functions of the covariate(s), thereby allowing the estimation of age-dependent reference intervals in a single step. Further studies on model regularization and asymmetric component distributions are warranted to consolidate our findings and expand the scope of applications.
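
    A hedged PyTorch sketch of the core model follows: a small network maps the covariate (e.g. age) to the weights, means and standard deviations of a two-component Gaussian mixture ("healthy" and "pathologic") and is trained by maximizing the mixture log-likelihood. Layer sizes and names are illustrative assumptions, not the paper's architecture.

        import torch
        import torch.nn as nn

        class MDN(nn.Module):
            def __init__(self, n_components=2, hidden=32):
                super().__init__()
                self.body = nn.Sequential(nn.Linear(1, hidden), nn.Tanh())
                self.logits = nn.Linear(hidden, n_components)    # mixture weights
                self.mu = nn.Linear(hidden, n_components)        # component means
                self.log_sigma = nn.Linear(hidden, n_components) # component scales

            def forward(self, x):
                h = self.body(x)
                return self.logits(h), self.mu(h), self.log_sigma(h)

        def mdn_nll(logits, mu, log_sigma, y):
            # negative log-likelihood of y under the covariate-dependent mixture
            log_w = torch.log_softmax(logits, dim=-1)
            comp = torch.distributions.Normal(mu, log_sigma.exp())
            log_p = torch.logsumexp(log_w + comp.log_prob(y.unsqueeze(-1)), dim=-1)
            return -log_p.mean()

        # usage sketch: x = age with shape (n, 1), y = measured value (n,)
        # model = MDN(); opt = torch.optim.Adam(model.parameters(), lr=1e-2)
        # loss = mdn_nll(*model(x), y); loss.backward(); opt.step()

    An age-dependent reference interval would then be read off as quantiles of the fitted "healthy" component at the covariate value of interest.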