    Invariant and Metric Free Proximities for Data Matching: An R Package

    Data matching is a typical statistical problem in nonexperimental and/or observational studies or, more generally, in cross-sectional studies in which one or more data sets are to be compared. Several methods are available in the literature, most of which are based on a particular metric or on statistical models, either parametric or nonparametric. In this paper we present two methods to calculate a proximity, both of which have the property of being invariant under monotonic transformations. These methods require at most the notion of ordering. Open-source software in the form of an R package is also presented.

    Multiple imputation of large scale complex surveys

    Imputation with the R Package VIM

    The package VIM is developed to explore and analyze the structure of missing values in data using visualization methods, to impute these missing values with the built-in imputation methods, and to verify the imputation process using visualization tools, as well as to produce high-quality graphics for publications. This article focuses on the different imputation techniques available in the package. Four different imputation methods are currently implemented in VIM, namely hot-deck imputation, k-nearest neighbor imputation, regression imputation, and iterative robust model-based imputation. All of these methods are implemented in a flexible manner with many options for customization. Furthermore, practical examples are provided in this article to highlight the use of the implemented methods in real-world applications. In addition, the graphical user interface of VIM has been re-implemented from scratch, resulting in the package VIMGUI, to enable users without extensive R skills to access these imputation and visualization methods.

    Baseline predictors of treatment outcome in Internet-based alcohol interventions: a recursive partitioning analysis alongside a randomized trial

    BACKGROUND: Internet-based interventions are seen as attractive for harmful users of alcohol and lead to desirable clinical outcomes. Some participants will, however, not achieve the desired results. In this study, harmful users of alcohol have been partitioned into subgroups with low, intermediate or high probability of positive treatment outcome, using recursive partitioning classification tree analysis. METHODS: Data were obtained from a randomized controlled trial assessing the effectiveness of two Internet-based alcohol interventions. The main outcome variable was treatment response, a dichotomous outcome measure for treatment success. Candidate predictors for the classification analysis were first selected using univariate regression. Next, a decision tree model to classify participants into categories with a low, medium and high probability of treatment response was constructed using recursive partitioning software. RESULTS: Based on literature review, 46 potentially relevant baseline predictors were identified. Five variables were selected using univariate regression as candidate predictors for the classification analysis. Two variables were found most relevant for classification and selected for the decision tree model: ‘living alone’ and ‘interpersonal sensitivity’. Using sensitivity analysis, the robustness of the decision tree model was supported. CONCLUSIONS: Harmful alcohol users in a shared living situation, with high interpersonal sensitivity, have a significantly higher probability of positive treatment outcome. The resulting decision tree model may be used as part of a decision support system but is on its own insufficient as a screening algorithm with satisfactory clinical utility. TRIAL REGISTRATION: Netherlands Trial Register (Cochrane Collaboration): NTR-TC1155