1,022 research outputs found

    Robust and Heterogenous Odds Ratio: Estimating Price Sensitivity for Unbought Items

    Problem definition: Mining for heterogeneous responses to an intervention is a crucial step for data-driven operations, for instance to personalize treatment or pricing. We investigate how to estimate price sensitivity from transaction-level data. In causal inference terms, we estimate heterogeneous treatment effects when (a) the response to treatment (here, whether a customer buys a product) is binary, and (b) treatment assignments are partially observed (here, full information is only available for purchased items). Methodology/Results: We propose a recursive partitioning procedure to estimate heterogeneous odds ratios, a widely used measure of treatment effect in medicine and the social sciences. We integrate an adversarial imputation step to allow for robust inference even in the presence of partially observed treatment assignments. We validate our methodology on synthetic data and apply it to three case studies from political science, medicine, and revenue management. Managerial Implications: Our robust heterogeneous odds ratio estimation method is a simple and intuitive tool to quantify heterogeneity in patients or customers and personalize interventions, while lifting a central limitation in many revenue management data
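
    As a reminder of the quantity this abstract estimates, the sketch below computes a plain odds ratio per customer segment from 2x2 purchase counts. It illustrates only the textbook definition, not the paper's recursive partitioning or adversarial imputation steps; the function name, the segment names and all counts are made up for illustration.

```python
import math


def odds_ratio(bought_exposed, not_bought_exposed,
               bought_unexposed, not_bought_unexposed):
    """Odds of buying when exposed (e.g. to a higher price) divided by
    the odds of buying when unexposed. Raises ZeroDivisionError if a
    denominator cell is empty; no continuity correction is applied."""
    odds_exposed = bought_exposed / not_bought_exposed
    odds_unexposed = bought_unexposed / not_bought_unexposed
    return odds_exposed / odds_unexposed


# Hypothetical counts per segment:
# (bought & exposed, not bought & exposed, bought & unexposed, not bought & unexposed)
segments = {
    "segment_a": (30, 70, 60, 40),
    "segment_b": (45, 55, 50, 50),
}

# One odds ratio per segment; differing values indicate heterogeneity.
ratios = {name: odds_ratio(*counts) for name, counts in segments.items()}

for name, value in ratios.items():
    print(f"{name}: odds ratio = {value:.3f}, log odds ratio = {math.log(value):.3f}")
```

    An odds ratio below 1 means the exposed group has lower odds of buying; comparing the ratios across segments is the simplest view of the heterogeneity the abstract refers to.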

    Young Swiss men's risky single-occasion drinking: Identifying those who do not respond to stricter alcohol policy environments

    BACKGROUND Previous research has demonstrated a preventive effect of the alcohol policy environment on alcohol consumption. However, little is known about the heterogeneity of this effect. Our aim was to examine the extent of heterogeneity in the relationship between the strictness of alcohol policy environments and heavy drinking and to identify potential moderators of the relationship. METHODS Cross-sectional data from 5986 young Swiss men participating in the Cohort Study on Substance Use Risk Factors (C-SURF) were analysed. The primary outcome was self-reported risky single-occasion drinking in the past 12 months (RSOD, defined as 6 standard drinks or more on a single occasion at least monthly). A previously used index of alcohol policy environment strictness across Swiss cantons was analysed in conjunction with 21 potential moderator variables. Random forest machine learning captured high-dimensional interaction effects, while individual conditional expectations captured the heterogeneity induced by the interaction effects and identified moderators. RESULTS Predicted subject-specific absolute risk reductions in RSOD risk ranged from 16.8% to -4.2%, indicating considerable heterogeneity. Sensation seeking and antisocial personality disorder (ASPD) were major moderators that reduced the preventive relationship between stricter alcohol policy environments and RSOD risk. They also were associated with the paradoxical observation that some individuals displayed increased RSOD risk in stricter alcohol policy environments. CONCLUSION Whereas stricter alcohol policy environments were associated with reduced average RSOD risk, additionally addressing the risk conveyed by sensation seeking and ASPD would deliver an interlocking prevention mix against young Swiss men's RSOD.
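
    The idea of an individual conditional expectation (ICE) curve and a subject-specific absolute risk reduction can be sketched as follows. This is a minimal illustration under loud assumptions: `toy_risk_model` is a hypothetical hand-written logistic function standing in for the study's fitted random forest, and its coefficients, the 0-to-1 strictness scale and the single moderator are invented.

```python
import math


def toy_risk_model(policy_strictness, sensation_seeking):
    """Hypothetical stand-in for a fitted model (NOT the study's random
    forest). Returns a predicted risk between 0 and 1. The interaction
    term lets the moderator weaken or reverse the policy effect."""
    logit = 0.5 - 0.8 * policy_strictness + 1.0 * policy_strictness * sensation_seeking
    return 1.0 / (1.0 + math.exp(-logit))


def ice_curve(model, subject, grid):
    """Predicted risk for one subject as the policy variable is moved
    over `grid` while the subject's other covariates stay fixed."""
    return [model(value, **subject) for value in grid]


def absolute_risk_reduction(model, subject, least_strict, most_strict):
    """Risk under the least strict setting minus risk under the most
    strict setting; a negative value means risk rises with strictness."""
    return model(least_strict, **subject) - model(most_strict, **subject)


grid = [0.0, 0.25, 0.5, 0.75, 1.0]
low_seeker = {"sensation_seeking": 0.0}
high_seeker = {"sensation_seeking": 1.0}

curve_low = ice_curve(toy_risk_model, low_seeker, grid)
arr_low = absolute_risk_reduction(toy_risk_model, low_seeker, 0.0, 1.0)
arr_high = absolute_risk_reduction(toy_risk_model, high_seeker, 0.0, 1.0)
print(f"ARR, low sensation seeking:  {arr_low:+.3f}")
print(f"ARR, high sensation seeking: {arr_high:+.3f}")
```

    With these invented coefficients the first subject's risk falls as strictness rises while the second subject's risk increases, which mirrors in miniature the sign change in risk reductions that the abstract reports.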

    An Introduction to Recursive Partitioning: Rationale, Application and Characteristics of Classification and Regression Trees, Bagging and Random Forests

    Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Random forests in particular, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine and bioinformatics within the past few years. High-dimensional problems are common not only in genetics, but also in some areas of psychological research, where only a few subjects can be measured due to time or cost constraints, yet a large amount of data is generated for each subject. Random forests have been shown to achieve high prediction accuracy in such applications, and provide descriptive variable importance measures reflecting the impact of each variable in both main effects and interactions. The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their usage for low- and high-dimensional data exploration, but also to point out limitations of the methods and potential pitfalls in their practical application. Application of the methods is illustrated using freely available implementations in the R system for statistical computing.
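
    The core step that recursive partitioning repeats at every node, choosing the split that most reduces impurity, can be sketched as below. This is a minimal illustration for one numeric predictor and a binary outcome using Gini impurity; it is written in Python rather than the R implementations the paper uses, and the function names and data are assumptions made for this example.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Scan the midpoints between consecutive distinct values and return
    (threshold, weighted_impurity) for the split with the lowest weighted
    Gini impurity. Returns (None, parent_impurity) if no split improves
    on the unsplit node."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_impurity = None, gini(labels)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best_impurity:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best_impurity = impurity
    return best_threshold, best_impurity


# Made-up data: the outcome switches from 0 to 1 between 3 and 10.
threshold, impurity = best_split([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
print(threshold, impurity)
```

    A full tree applies this search to every predictor, splits on the winner, and recurses on the two child nodes until a stopping rule is met; bagging and random forests then average many such trees grown on resampled data.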

    Development and validation of a prognostic model for the early identification of COVID-19 patients at risk of developing common long COVID symptoms

    Background: The coronavirus disease 2019 (COVID-19) pandemic demands reliable prognostic models for estimating the risk of long COVID. We developed and validated a prediction model to estimate the probability of known common long COVID symptoms at least 60 days after acute COVID-19. Methods: The prognostic model was built based on data from a multicentre prospective Swiss cohort study. Included were adult patients diagnosed with COVID-19 between February and December 2020 and treated as outpatients, on a ward or in an intensive/intermediate care unit. Perceived long-term health impairments, including reduced exercise tolerance/reduced resilience, shortness of breath and/or tiredness (REST), were assessed after a follow-up time between 60 and 425 days. The data set was split into a derivation and a geographical validation cohort. Predictors were selected out of twelve candidate predictors based on three methods, namely the augmented backward elimination (ABE) method, the adaptive best-subset selection (ABESS) method and the model-based recursive partitioning (MBRP) approach. Model performance was assessed with the scaled Brier score, concordance c statistic and calibration plot. The final prognostic model was determined based on best model performance. Results: In total, 2799 patients were included in the analysis, of which 1588 patients were in the derivation cohort and 1211 patients in the validation cohort. The REST prevalence was similar between the cohorts with 21.6% (n = 343) in the derivation cohort and 22.1% (n = 268) in the validation cohort. The same predictors were selected with the ABE and ABESS approach. The final prognostic model was based on the ABE and ABESS selected predictors. The corresponding scaled Brier score in the validation cohort was 18.74%, model discrimination was 0.78 (95% CI: 0.75 to 0.81), calibration slope was 0.92 (95% CI: 0.78 to 1.06) and calibration intercept was -0.06 (95% CI: -0.22 to 0.09). Conclusion: The proposed model was validated to identify COVID-19-infected patients at high risk for REST symptoms. Before implementing the prognostic model in daily clinical practice, the conduct of an impact study is recommended. Keywords: Clinical prediction model; Long COVID; Prognostic factors; Stratified medicine
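
    For readers unfamiliar with the scaled Brier score reported above, the sketch below shows the commonly used definition: one minus the ratio of the model's Brier score to that of a reference model that predicts the observed prevalence for everyone. It is assumed, not confirmed by the abstract, that the study used this exact variant; the outcomes and predicted probabilities are made up.

```python
def brier_score(outcomes, probabilities):
    """Mean squared difference between predicted probability and the
    observed 0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, probabilities)) / len(outcomes)


def scaled_brier_score(outcomes, probabilities):
    """1 - Brier / Brier_reference, where the reference model predicts
    the observed prevalence for every patient. 0 means no better than
    the reference, 1 means perfect predictions."""
    prevalence = sum(outcomes) / len(outcomes)
    reference = brier_score(outcomes, [prevalence] * len(outcomes))
    return 1.0 - brier_score(outcomes, probabilities) / reference


# Made-up outcomes and predicted probabilities for four patients.
outcomes = [1, 0, 0, 1]
probabilities = [0.8, 0.2, 0.3, 0.6]
score = scaled_brier_score(outcomes, probabilities)
print(f"scaled Brier score: {score:.2%}")
```

    Read this way, a scaled Brier score of 18.74% says the model's mean squared prediction error is about 19% lower than that of a prevalence-only reference.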