5 research outputs found
Data Integration Approaches to Estimate Heterogeneous Treatment Effects
Clinicians and practitioners are often motivated to determine which treatment would work best for a given individual based on their observed characteristics, but doing so can be challenging because sample sizes are typically not large enough, and the variables involved in the true treatment effect heterogeneity are often unknown. To better understand treatment effect heterogeneity, researchers can rely on combining information from multiple sources, e.g., multiple randomized controlled trials (RCTs), or RCTs in conjunction with observational datasets. However, combining data requires taking into account that the data come from heterogeneous sources, and different sources might have different settings, potential biases, and site-level characteristics that can impact treatment effects. This dissertation discusses approaches for integrating multiple datasets to estimate heterogeneous treatment effects. Previous approaches are outlined, and new methods are developed and introduced to estimate the conditional average treatment effect function across multiple trials and in a target population. The methods used are primarily non-parametric but are compared with parametric meta-analysis. Methods are applied to real data comparing treatments for major depression to investigate potential heterogeneity of the treatment effect in this setting.
Comparing Machine Learning Methods for Estimating Heterogeneous Treatment Effects by Combining Data from Multiple Randomized Controlled Trials
Individualized treatment decisions can improve health outcomes, but using
data to make these decisions in a reliable, precise, and generalizable way is
challenging with a single dataset. Leveraging multiple randomized controlled
trials allows for the combination of datasets with unconfounded treatment
assignment to improve the power to estimate heterogeneous treatment effects.
This paper discusses several non-parametric approaches for estimating
heterogeneous treatment effects using data from multiple trials. We extend
single-study methods to a scenario with multiple trials and explore their
performance through a simulation study, with data generation scenarios that
have differing levels of cross-trial heterogeneity. The simulations demonstrate
that methods that directly allow for heterogeneity of the treatment effect
across trials perform better than methods that do not, and that the choice of
single-study method matters based on the functional form of the treatment
effect. Finally, we discuss which methods perform well in each setting and then
apply them to four randomized controlled trials to examine effect heterogeneity
of treatments for major depressive disorder.
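To make the contrast between pooling trials and allowing the treatment effect to vary across trials concrete, here is a minimal sketch. It is not the paper's method or simulation design: it uses a simple linear T-learner (separate outcome regressions per arm) on simulated data, and every name (`simulate_trials`, `t_learner_slope`, the slope values) is an assumption made for illustration.

```python
import numpy as np

def simulate_trials(n_per_trial=500, n_trials=4, seed=0):
    """Simulate hypothetical RCTs whose CATE slope in x differs by trial."""
    rng = np.random.default_rng(seed)
    rows = []
    for s in range(n_trials):
        x = rng.normal(size=n_per_trial)
        a = rng.integers(0, 2, size=n_per_trial)   # randomized treatment
        tau = (1.0 + 0.5 * s) * x                  # assumed trial-specific CATE
        y = x + a * tau + rng.normal(scale=0.5, size=n_per_trial)
        rows.append(np.column_stack([np.full(n_per_trial, s), x, a, y]))
    return np.vstack(rows)

def fit_linear(x, y):
    """Least-squares fit of y on an intercept and x; returns [intercept, slope]."""
    design = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def t_learner_slope(x, a, y):
    """Linear T-learner: the CATE slope is the difference in arm-specific slopes."""
    treated = fit_linear(x[a == 1], y[a == 1])
    control = fit_linear(x[a == 0], y[a == 0])
    return treated[1] - control[1]

data = simulate_trials()
trial, x, a, y = data.T

# Pooled fit: ignores trial membership, so it recovers an averaged slope.
pooled_slope = t_learner_slope(x, a, y)

# Trial-specific fits: allow the treatment effect to differ by trial.
per_trial_slopes = {
    int(s): t_learner_slope(x[trial == s], a[trial == s], y[trial == s])
    for s in np.unique(trial)
}

print("pooled CATE slope:", round(pooled_slope, 2))
print("per-trial CATE slopes:", {s: round(v, 2) for s, v in per_trial_slopes.items()})
```

In this toy setup the pooled estimate lands near the average of the trial-specific slopes, which describes no individual trial well when cross-trial heterogeneity is large.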
Methods for Integrating Trials and Non-Experimental Data to Examine Treatment Effect Heterogeneity
Estimating treatment effects conditional on observed covariates can improve
the ability to tailor treatments to particular individuals. Doing so
effectively requires dealing with potential confounding and having enough data
to adequately estimate effect moderation. A recent influx of work has looked
into estimating treatment effect heterogeneity using data from multiple
randomized controlled trials and/or observational datasets. With many new
methods available for assessing treatment effect heterogeneity using multiple
studies, it is important to understand which methods are best used in which
setting, how the methods compare to one another, and what needs to be done to
continue progress in this field. This paper reviews these methods broken down
by data setting: aggregate-level data, federated learning, and individual
participant-level data. We define the conditional average treatment effect and
discuss differences between parametric and nonparametric estimators, and we
list key assumptions, both those that are required within a single study and
those that are necessary for data combination. After describing existing
approaches, we compare and contrast them and reveal open areas for future
research. This review demonstrates that there are many possible approaches for
estimating treatment effect heterogeneity through the combination of datasets,
but that there is substantial work to be done to compare these methods through
case studies and simulations, extend them to different settings, and refine
them to account for various challenges present in real data.
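For reference, the conditional average treatment effect mentioned above is usually written in potential-outcomes notation as follows. The symbols are assumptions of this note rather than the paper's own notation: $Y(1)$ and $Y(0)$ are the potential outcomes under treatment and control, $X$ the observed covariates, and $S$ a study indicator.

```latex
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right],
\qquad
\tau_s(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x,\ S = s \,\right].
```

The second expression is one common way to let the effect vary by study; when $\tau_s(x)$ does not depend on $s$, the studies share a single CATE function.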
Estimation of place-based vulnerability scores for HIV viral non-suppression: an application leveraging data from a cohort of people with histories of using drugs
The relationships between place (e.g., neighborhood) and HIV are commonly investigated. As measurements of place are multivariate, most studies apply some dimension reduction, resulting in one variable (or a small number of variables), which is then used to characterize place. Typical dimension reduction methods seek to capture the most variance of the raw items, resulting in a type of summary variable we call a “disadvantage score.” We propose to add a different type of summary variable, the “vulnerability score,” to the toolbox of researchers studying place and HIV. The vulnerability score measures how place, as known through the raw measurements, is predictive of an outcome. It captures the variation in place characteristics that matters most for the particular outcome. We demonstrate the estimation and utility of place-based vulnerability scores for HIV viral non-suppression, using data with complicated clustering from a cohort of people with histories of injecting drugs.
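The distinction between the two kinds of summary score can be sketched on simulated data. This is only an illustration under stated assumptions, not the authors' estimation procedure: the three place items, the continuous outcome (the study's outcome is viral non-suppression), and the use of a first principal component and an in-sample linear regression are all hypothetical choices, and the clustering handled in the study is ignored.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Three hypothetical place-level items. Item 0 has the largest variance,
# but by construction only item 2 is related to the outcome.
items = rng.normal(size=(n, 3)) * np.array([3.0, 1.0, 1.0])
outcome = 0.8 * items[:, 2] + rng.normal(scale=0.5, size=n)

centered = items - items.mean(axis=0)

# "Disadvantage"-style score: first principal component, i.e. the
# direction capturing the most variance of the raw items.
_, _, vt = np.linalg.svd(centered, full_matrices=False)
disadvantage = centered @ vt[0]

# "Vulnerability"-style score: fitted values from regressing the outcome
# on the items, i.e. the combination of items most predictive of it.
design = np.column_stack([np.ones(n), centered])
coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
vulnerability = design @ coef

corr_disadvantage = abs(np.corrcoef(disadvantage, outcome)[0, 1])
corr_vulnerability = abs(np.corrcoef(vulnerability, outcome)[0, 1])

print("|corr(disadvantage, outcome)|: ", round(corr_disadvantage, 3))
print("|corr(vulnerability, outcome)|:", round(corr_vulnerability, 3))
```

Because the variance-maximizing direction here is dominated by an item unrelated to the outcome, the variance-based score carries little outcome information, while the outcome-based score tracks it closely. In practice the second score would be evaluated out of sample rather than on the data used to fit it.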