16 research outputs found

    How balance and sample size impact bias in the estimation of causal treatment effects: a simulation study

    Observational studies are often used to understand relationships between exposures and outcomes. They do not, however, allow conclusions about causal relationships to be drawn unless statistical techniques are used to account for the imbalance of confounders across exposure groups. Propensity score and balance weighting (PSBW) are useful techniques that aim to reduce the imbalances between exposure groups by weighting the groups to look alike on the observed confounders. Despite the plethora of available methods to estimate PSBW, there is little guidance on what constitutes adequate balance, and unbiased and robust estimation of the causal treatment effect is not guaranteed unless several conditions hold. Accurate inference requires that (1) the treatment allocation mechanism is known, (2) the relationship between the baseline covariates and the outcome is known, (3) adequate balance of baseline covariates is achieved post-weighting, (4) a proper set of covariates to control for confounding bias is known, and (5) a large enough sample size is available. In this article, we use simulated data of various sizes to investigate the influence of these five factors on statistical inference. Our findings provide evidence that the maximum Kolmogorov-Smirnov statistic is the proper statistical measure to assess balance on the baseline covariates, in contrast to the mean standardised mean difference used in many applications, and that 0.1 is a suitable threshold for acceptable balance. Finally, we recommend that 60-80 observations per confounder per treatment group are required to obtain a reliable and unbiased estimation of the causal treatment effect.
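
The two balance measures contrasted in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the choice to pool unweighted group variances in the standardised mean difference, and the way the 0.1 threshold is applied to the maximum Kolmogorov-Smirnov statistic are assumptions made for illustration.

```python
import numpy as np

BALANCE_THRESHOLD = 0.1  # threshold quoted in the abstract; how it is applied here is an assumption


def weighted_smd(x, treat, w):
    """Absolute standardised mean difference of one covariate between weighted groups."""
    x, treat, w = (np.asarray(a) for a in (x, treat, w))
    t, c = treat == 1, treat == 0
    mean_t = np.average(x[t], weights=w[t])
    mean_c = np.average(x[c], weights=w[c])
    # Assumption: standardise by the pooled *unweighted* group variances.
    pooled_sd = np.sqrt((x[t].var(ddof=1) + x[c].var(ddof=1)) / 2.0)
    return abs(mean_t - mean_c) / pooled_sd


def weighted_ks(x, treat, w):
    """Largest gap between the weighted empirical CDFs of the two groups."""
    x, treat, w = (np.asarray(a) for a in (x, treat, w))
    grid = np.unique(x)

    def ecdf(mask):
        # Weighted share of the group at or below each grid point.
        xg, wg = x[mask], w[mask]
        return np.array([wg[xg <= g].sum() for g in grid]) / wg.sum()

    return float(np.max(np.abs(ecdf(treat == 1) - ecdf(treat == 0))))


def max_ks(X, treat, w):
    """Maximum weighted KS statistic over the columns of covariate matrix X."""
    X = np.asarray(X)
    return max(weighted_ks(X[:, j], treat, w) for j in range(X.shape[1]))


def is_balanced(X, treat, w, threshold=BALANCE_THRESHOLD):
    """True when every covariate has a weighted KS statistic at or below the threshold."""
    return max_ks(X, treat, w) <= threshold
```

Unlike an average of standardised mean differences, the maximum KS statistic compares whole distributions and reports the worst covariate, so one badly balanced confounder cannot be masked by several well-balanced ones.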

    CoBWeb: a user-friendly web application to estimate causal treatment effects from observational data using multiple algorithms

    Background/aims: While randomized controlled trials are the gold standard for measuring causal effects, robust conclusions about causal relationships can be obtained using data from observational studies if proper statistical techniques are used to account for the imbalance of pretreatment confounders across groups. Propensity score (PS) and balance weighting are useful techniques that aim to reduce the observed imbalances between treatment groups by weighting the groups to be as similar as possible with respect to observed confounders. Methods: We have created CoBWeb, a free and easy-to-use web application for the estimation of causal treatment effects from observational data, using PS and balancing weights to control for confounding bias. CoBWeb uses multiple algorithms to estimate the PS and balancing weights, to allow for more flexible relations between the treatment indicator and the observed confounders (as different algorithms make different (or no) assumptions about the structural relationship between the treatment covariate and the confounders). The optimal algorithm can be chosen by selecting the one that achieves the best trade-off between balance and effective sample size. Results: CoBWeb follows all the key steps required for robust estimation of the causal treatment effect from observational study data and includes sensitivity analysis of the potential impact of unobserved confounders. We illustrate the practical use of the app using a dataset derived from a study of an intervention for adolescents with substance use disorder, which is available for users within the app environment. Conclusion: CoBWeb is intended to enable non-specialists to understand and apply all the key steps required to perform robust estimation of causal treatment effects using observational data
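
The effective sample size that is traded off against balance can be sketched as follows. This assumes Kish's approximation, (sum of weights)^2 / (sum of squared weights); the abstract does not say which definition CoBWeb uses, and the function names are hypothetical.

```python
import numpy as np


def effective_sample_size(w):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    w = np.asarray(w, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)


def ess_by_group(treat, w):
    """Effective sample size computed separately for the treated and control groups."""
    treat = np.asarray(treat)
    w = np.asarray(w, dtype=float)
    return {
        "treated": effective_sample_size(w[treat == 1]),
        "control": effective_sample_size(w[treat == 0]),
    }
```

Equal weights leave the effective sample size equal to the group size, while a few very large weights shrink it sharply. This is why a method that achieves excellent balance through extreme weights may still be a poor choice.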

    A tutorial comparing different covariate balancing methods with an application evaluating the causal effect of exercise on the progression of Huntington’s disease

    Randomized controlled trials are the gold standard for measuring the causal effects of treatments on clinical outcomes. However, randomized trials are not always feasible, and causal treatment effects must, therefore, often be inferred from observational data. Observational study designs do not allow conclusions about causal relationships to be drawn unless statistical techniques are used to account for the imbalance of confounders across groups and key assumptions hold. Propensity score (PS) and balance weighting are two useful techniques that aim to reduce the imbalances between treatment groups by weighting the groups to look alike on the observed confounders. There are many methods available to estimate PS and balancing weights. However, it is unclear a priori which will achieve the best trade-off between covariate balance and effective sample size. Weighted analyses are further complicated by limited sample sizes, which are common when studying rare diseases. To address these issues, we present a step-by-step guide to covariate balancing strategies, including how to evaluate overlap, obtain estimates of PS and balancing weights, check for covariate balance, and assess sensitivity to unobserved confounding. We compare the performance of a number of commonly used estimation methods on a synthetic data set based on the Physical Activity and Exercise Outcomes in Huntington Disease (PACE-HD) study, which explored whether enhanced physical activity affects the progression and severity of the disease. We provide general guidelines for the choice of method for estimation of PS and balancing weights, interpretation, and sensitivity analysis of results. We also present R code for implementing the different methods and assessing balance.

    A tutorial comparing different covariate balancing methods with an application evaluating the causal effects of substance use treatment programs for adolescents

    Randomized controlled trials are the gold standard for measuring causal effects. However, they are not always feasible, and causal treatment effects must then be estimated from observational data. Observational studies do not allow robust conclusions about causal relationships unless statistical techniques account for the imbalance of pretreatment confounders across groups and key assumptions hold. Propensity score and balance weighting (PSBW) are useful techniques that aim to reduce the observed imbalances between treatment groups by weighting the groups to look alike on the observed confounders. Notably, there are many methods available to estimate PSBW. However, it is unclear a priori which will achieve the best trade-off between covariate balance and effective sample size for a given application. Moreover, it is critical to assess the validity of key assumptions required for robust estimation of the needed treatment effects, including the overlap and no unmeasured confounding assumptions. We present a step-by-step guide to the use of PSBW for estimation of causal treatment effects that includes steps on how to evaluate overlap before the analysis, obtain estimates of PSBW using multiple methods and select the optimal one, check for covariate balance on multiple metrics, and assess sensitivity of findings (both the estimated treatment effect and statistical significance) to unobserved confounding. We illustrate the key steps using a case study examining the relative effectiveness of substance use treatment programs and provide a user-friendly Shiny application that can implement the proposed steps for any application with binary treatments.
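
Two of the steps listed above, obtaining weights and estimating the treatment effect, can be sketched for a binary treatment as follows. The sketch assumes the propensity scores have already been estimated by some method and uses plain inverse probability weights for the average treatment effect. The tutorial compares several estimation methods, so this is one illustrative option rather than the authors' procedure, and all function names are hypothetical.

```python
import numpy as np


def ate_weights(treat, ps):
    """Inverse probability weights for the average treatment effect.

    Treated units get 1 / ps and controls get 1 / (1 - ps), where ps is the
    estimated probability of treatment (assumed strictly between 0 and 1).
    """
    treat = np.asarray(treat)
    ps = np.asarray(ps, dtype=float)
    return np.where(treat == 1, 1.0 / ps, 1.0 / (1.0 - ps))


def weighted_mean_difference(y, treat, w):
    """Difference between the weighted outcome means of treated and control units."""
    y = np.asarray(y, dtype=float)
    treat = np.asarray(treat)
    w = np.asarray(w, dtype=float)
    t = treat == 1
    return np.average(y[t], weights=w[t]) - np.average(y[~t], weights=w[~t])


def ps_range_by_group(treat, ps):
    """Crude overlap summary: (min, max) propensity score within each group."""
    treat = np.asarray(treat)
    ps = np.asarray(ps, dtype=float)
    return {
        "treated": (ps[treat == 1].min(), ps[treat == 1].max()),
        "control": (ps[treat == 0].min(), ps[treat == 0].max()),
    }
```

Propensity scores close to 0 or 1 produce very large weights, which is one reason overlap is checked before the analysis.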

    Statistical methods for the identification and modelling of lifestyle factors related to Huntington’s Disease severity and progression

    Randomised controlled trials (RCTs) are considered the gold standard to estimate the effect of interventions on the progression of diseases. Observational studies are gaining traction to study the effect of modifiable factors on the progression and severity of rare diseases, as they are often easier and cheaper to implement than RCTs. Observational studies do not require tailored interventions; rather, they involve observing the behaviour of the participants and monitoring their progression. However, making inferences using data from observational studies can be challenging for several reasons, including imbalances in confounders between the control and treatment groups. In this thesis, I use observational data in an attempt to identify risk factors for Huntington’s disease (HD) severity and progression. HD is an inherited disorder that results in the death of brain cells and typically leads to death 15-20 years after clinical diagnosis. It is a rare disease (prevalence 4-8 per 100,000). I began by proposing a set of steps to make inferences using data from observational studies, controlling for confounding bias using Propensity Score and Balancing Weights (PSBW), and developing a web application (CoBWeb), which implements these steps in a user-friendly environment. Next, I used a well-established simulation study to understand how a limited sample size (typical for rare diseases) and the choice of baseline covariates for balancing could affect the ability to control for confounding bias. Finally, I applied these methods to the Enroll-HD dataset to investigate the effect of several modifiable lifestyle factors (e.g., use of antidepressant medication) on the progression of HD. Analysis of Enroll-HD data, using the proposed methodology, identified a potentially harmful effect of antidepressant medication on the progression of HD among those in the early stage of the disease.

    Ordinary and Bayesian LASSO for Regression Models

    This dissertation is concerned with the properties of the ordinary and Bayesian LASSO in regression models. The Least Absolute Shrinkage and Selection Operator (LASSO) is a method that simultaneously shrinks the coefficients of a model and selects the important predictors from a large set of covariates. In the context of normal linear regression, it relies on convex programming algorithms with a norm penalty, which performs selection by shrinking the coefficients of unimportant covariates exactly to zero. In the context of Cox models, it relies on maximizing the model's likelihood, again under a norm penalty, with the same effect as in linear regression. In the framework of Bayesian inference, the LASSO estimate can be derived as the Bayes posterior mode under independent double-exponential priors for the regression coefficients. The LASSO was originally introduced by Robert Tibshirani (1996) and was further developed by Efron et al. (2004), who proposed LARS, an efficient algorithm to compute the entire LASSO path. The ordinary LASSO for Cox models was proposed by Tibshirani (1997), and its solution was obtained by a modification of the Newton-Raphson algorithm. For the Bayesian LASSO in normal linear regression, Tibshirani (1996) proposed a prior proportional to minus the log-density of the double-exponential distribution, and a plethora of model structures have since been proposed by Park and Casella (2008), Ntzoufras and Lykou (2012), and many others, as the Bayesian LASSO is an active field in Bayesian statistics. Chapter 1 describes the multiple linear model, the least squares and maximum likelihood methods, and methods for goodness-of-fit testing and variable selection. We then describe the LASSO for the linear model (Chapter 2) and for the Cox model (Chapter 3). Chapter 4 sets out the principles of Bayesian theory and MCMC algorithms, and Chapter 5 covers the Bayesian LASSO in the context of the linear model. Finally, Chapters 6 and 7 present and analyse the results of applying these methods to simulated (Chapter 6) and real (Chapter 7) data.
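
The shrink-exactly-to-zero behaviour described in this abstract can be illustrated with a short Python sketch. The thesis discusses LARS and a Newton-Raphson variant; the sketch below instead uses cyclic coordinate descent with soft-thresholding, a different and simpler algorithm chosen only for illustration. The objective scaling (1/(2n) times the residual sum of squares plus `lam` times the L1 norm) and the function names are assumptions.

```python
import numpy as np


def soft_threshold(z, lam):
    """Shrink z towards zero by lam and set it exactly to zero when |z| <= lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Assumes every column of X is non-zero; no intercept is fitted.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta = np.zeros(p)
    col_scale = (X ** 2).sum(axis=0) / n  # (1/n) * x_j' x_j for each column
    for _ in range(n_iter):
        for j in range(p):
            # Residual with the contribution of coordinate j added back in.
            partial_resid = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial_resid / n
            beta[j] = soft_threshold(rho, lam) / col_scale[j]
    return beta
```

With an orthogonal design each coefficient is the soft-thresholded least squares estimate, so small coefficients become exactly zero while large ones are shrunk by `lam`.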

    Covariate balancing & weighting web app (COBWEB): an online tool simplifying robust causal inference in observational studies

    Background: Observational studies pose challenges for drawing conclusions about causal relationships, requiring the use of statistical techniques to account for imbalance of confounders between treatment groups. Propensity score and balance weighting (PSBW) are useful techniques that aim to reduce these imbalances by weighting the groups to be as similar as possible on the observed confounders. Aims: Although there are many methods available to perform PSBW, there is little guidance on their implementation with small sample sizes, which are a common limiting factor in Huntington’s disease (HD) research. Motivated by the Physical Activity and Exercise Outcomes in Huntington’s Disease (PACE-HD) study, which evaluated the impact of enhanced physical activity on the progression and severity of the disease, we explored the challenges of performing PSBW analysis with small sample sizes. Methods: We have designed a user-friendly online tool, called the Covariate Balancing & Weighting Web App (CoBWeb), to enable non-specialist researchers to estimate the causal effect of treatment from observational data while minimising confounding bias using PSBW. Outcome: The app implements the following five key steps: 1) evaluate overlap of the treatment groups, 2) obtain estimates of PSBW using multiple methods, 3) check for covariate balance and select the best performing method, 4) estimate the causal treatment effect, and 5) assess sensitivity to unobserved confounding. It comes with a tutorial using simulated data based on PACE-HD.