Fair Adaptive Experiments
Randomized experiments have been the gold standard for assessing the
effectiveness of a treatment or policy. The classical complete randomization
approach assigns treatments based on a prespecified probability and may lead to
inefficient use of data. Adaptive experiments improve upon complete
randomization by sequentially learning and updating treatment assignment
probabilities. However, their application can also raise fairness and equity
concerns, as assignment probabilities may vary drastically across groups of
participants. Furthermore, when treatment is expected to be extremely
beneficial to certain groups of participants, it is more appropriate to expose
many of these participants to favorable treatment. In response to these
challenges, we propose a fair adaptive experiment strategy that simultaneously
enhances data use efficiency, achieves an envy-free treatment assignment
guarantee, and improves the overall welfare of participants. An important
feature of our proposed strategy is that we do not impose parametric modeling
assumptions on the outcome variables, making it more versatile and applicable
to a wider array of applications. Through our theoretical investigation, we
characterize the convergence rate of the estimated treatment effects and the
associated standard deviations at the group level and further prove that our
adaptive treatment assignment algorithm, despite not having a closed-form
expression, approaches the optimal allocation rule asymptotically. Our proof
strategy takes into account the fact that the allocation decisions in our
design depend on sequentially accumulated data, which poses a significant
challenge in characterizing the properties and conducting statistical inference
of our method. We further provide simulation evidence to showcase the
performance of our fair adaptive experiment strategy.
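The idea of sequentially updating assignment probabilities can be illustrated with a minimal sketch. This is not the authors' fair design: it is a textbook-style Neyman allocation for a single group, in which the treatment probability is set proportional to the estimated outcome standard deviation of each arm and clipped away from 0 and 1. The outcome model, the `floor` parameter, and all function names are assumptions made for illustration.

```python
import numpy as np


def neyman_probability(sd_treated, sd_control, floor=0.1):
    """Treatment probability proportional to the treated-arm SD, clipped to [floor, 1 - floor]."""
    total = sd_treated + sd_control
    p = 0.5 if total == 0 else sd_treated / total
    return float(np.clip(p, floor, 1.0 - floor))


def run_adaptive_experiment(n_batches=20, batch_size=100, seed=0):
    """Assign treatment in batches, re-estimating the assignment probability after each batch."""
    rng = np.random.default_rng(seed)
    # Hypothetical outcome model: the treated arm is noisier than the control arm.
    true_mean = {1: 1.0, 0: 0.0}
    true_sd = {1: 3.0, 0: 1.0}
    outcomes = {1: [], 0: []}
    p = 0.5  # start from complete randomization
    history = []
    for _ in range(n_batches):
        treated = rng.random(batch_size) < p
        for arm in (1, 0):
            n_arm = int(np.sum(treated == bool(arm)))
            outcomes[arm].extend(rng.normal(true_mean[arm], true_sd[arm], n_arm))
        # Update the assignment probability from all data accumulated so far.
        if min(len(outcomes[1]), len(outcomes[0])) >= 2:
            p = neyman_probability(np.std(outcomes[1], ddof=1),
                                   np.std(outcomes[0], ddof=1))
        history.append(p)
    effect = float(np.mean(outcomes[1]) - np.mean(outcomes[0]))
    return history, effect


history, effect = run_adaptive_experiment()
print(f"final assignment probability: {history[-1]:.2f}, estimated effect: {effect:.2f}")
```

Under this hypothetical outcome model the assignment probability drifts from 0.5 toward the noisier arm. Because each batch's probability depends on earlier data, the difference in means here is only a point estimate; valid inference under such dependence is the challenge the abstract refers to.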
Assessing Heterogeneous Risk of Type II Diabetes Associated with Statin Usage: Evidence from Electronic Health Record Data
There have been increased concerns that the use of statins, one of the most
commonly prescribed drugs for treating coronary artery disease, is potentially
associated with the increased risk of new-onset type II diabetes (T2D).
However, because existing clinical studies with limited sample sizes often
suffer from selection bias, there is no robust evidence as to whether, and
which, populations are indeed vulnerable to developing T2D
after taking statins. In this case study, building on the biobank and
electronic health record data in the Partner Health System, we introduce a new
data analysis pipeline from a biological perspective and a novel statistical
methodology that address the limitations in existing studies to: (i)
systematically examine heterogeneous treatment effects of statin use on T2D
risk, (ii) uncover which patient subgroup is most vulnerable to T2D after
taking statins, and (iii) assess the replicability and statistical significance
of the most vulnerable subgroup via bootstrap calibration. Our proposed
bootstrap calibration approach delivers asymptotically sharp confidence
intervals and debiased estimates for the treatment effect of the most
vulnerable subgroup in the presence of possibly high-dimensional covariates. By
implementing our proposed approach, we find that females with high T2D genetic
risk at baseline are indeed at high risk of developing T2D due to statin use,
which provides evidence to support future clinical decisions with respect to
statin use.
Comment: 31 pages, 2 figures, 6 tables
Efficient targeted learning of heterogeneous treatment effects for multiple subgroups.
In biomedical science, analyzing treatment effect heterogeneity plays an essential role in assisting personalized medicine. The main goals of analyzing treatment effect heterogeneity include estimating treatment effects in clinically relevant subgroups and predicting whether a patient subpopulation might benefit from a particular treatment. Conventional approaches often evaluate the subgroup treatment effects via parametric modeling and can thus be susceptible to model misspecification. In this paper, we take a model-free semiparametric perspective and aim to efficiently evaluate the heterogeneous treatment effects of multiple subgroups simultaneously under the one-step targeted maximum-likelihood estimation (TMLE) framework. When the number of subgroups is large, we further expand this path of research by looking at a variation of the one-step TMLE that is robust to the presence of small estimated propensity scores in finite samples. In our simulations, our method demonstrates substantial finite-sample improvements compared to conventional methods. In a case study, our method unveils the potential treatment effect heterogeneity of the rs12916-T allele (a proxy for statin usage) in decreasing Alzheimer's disease risk.
The Structural Violence Trap: Disparities in Homicide, Chronic Disease Death, and Social Factors Across San Francisco Neighborhoods.
BACKGROUND: On average, a person living in San Francisco can expect to live 83 years. This number conceals significant variation by sex, race, and place of residence. We examined deaths and area-based social factors by San Francisco neighborhood, hypothesizing that socially disadvantaged neighborhoods shoulder a disproportionate mortality burden across generations, especially deaths attributable to violence and chronic disease. These data will inform targeted interventions and guide further research into effective solutions for San Francisco's marginalized communities. STUDY DESIGN: The San Francisco Department of Public Health provided data for the 2010-2014 top 20 causes of premature death by San Francisco neighborhood. Population-level demographic data were obtained from the US American Community Survey 2015 5-year estimate (2011-2015). The primary outcome was the association between years of life lost (YLL) and adjusted years of life lost (AYLL) for the top 20 causes of death in San Francisco and select social factors by neighborhood via linear regression analysis and heatmaps. RESULTS: The top 20 causes accounted for N = 15,687 San Francisco resident deaths from 2010-2014. Eight neighborhoods (21.0%) accounted for 47.9% of city-wide YLLs, with 6 falling below the city-wide median household income and many having a higher percentage of Black residents, lower education levels, and higher unemployment levels. For chronic diseases and homicides, AYLLs increased as a neighborhood's percentages of residents who were Black, below the poverty level, unemployed, and without a high school education increased. CONCLUSIONS: Our study highlights the mortality inequity burdening socially disadvantaged San Francisco neighborhoods, which align with areas subjected to historical discriminatory policies like redlining. These data emphasize the need to address past injustices and move toward equal access to wealth and health for all San Franciscans.
Assessing the Most Vulnerable Subgroup to Type II Diabetes Associated with Statin Usage: Evidence from Electronic Health Record Data
There have been increased concerns that the use of statins, one of the most commonly prescribed drugs for treating coronary artery disease, is potentially associated with the increased risk of new-onset type II diabetes (T2D). Nevertheless, to date, there is no robust evidence as to whether, and which, populations are indeed vulnerable to developing T2D after taking statins. In this case study, leveraging the biobank and electronic health record data in the Partner Health System, we introduce a new data analysis pipeline and a novel statistical methodology that address existing limitations by (i) designing a rigorous causal framework that systematically examines the causal effects of statin usage on T2D risk in observational data, (ii) uncovering which patient subgroup is most vulnerable to developing T2D after taking statins, and (iii) assessing the replicability and statistical significance of the most vulnerable subgroup via a bootstrap calibration procedure. Our proposed approach delivers asymptotically sharp confidence intervals and debiased estimates for the treatment effect of the most vulnerable subgroup in the presence of high-dimensional covariates. With our proposed approach, we find that females with high T2D genetic risk are at the highest risk of developing T2D due to statin usage.
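Why the estimate for a data-selected "most vulnerable" subgroup needs debiasing can be seen in a small simulation. The sketch below is not the authors' bootstrap calibration procedure. It assumes hypothetical subgroups whose true effects are all zero, reports the largest subgroup mean, and applies a plain (uncalibrated) bootstrap bias correction for comparison. All names, sample sizes, and the data-generating model are assumptions made for illustration.

```python
import numpy as np


def max_subgroup_estimates(samples, n_boot=500, seed=0):
    """Return (naive max of subgroup means, plain bootstrap bias-corrected max)."""
    rng = np.random.default_rng(seed)
    naive = max(s.mean() for s in samples)
    # Resample each subgroup with replacement and recompute its mean n_boot times.
    boot_means = np.column_stack([
        s[rng.integers(0, s.size, size=(n_boot, s.size))].mean(axis=1)
        for s in samples
    ])
    # Bootstrap estimate of the selection bias of the maximum.
    bias_hat = boot_means.max(axis=1).mean() - naive
    return float(naive), float(naive - bias_hat)


def simulate(n_sims=200, n_groups=5, n_per_group=50, seed=1):
    """Average both estimates over repeated draws; every true subgroup effect is zero."""
    rng = np.random.default_rng(seed)
    results = []
    for i in range(n_sims):
        samples = [rng.normal(0.0, 1.0, n_per_group) for _ in range(n_groups)]
        results.append(max_subgroup_estimates(samples, seed=i))
    naive_avg, corrected_avg = np.mean(results, axis=0)
    return float(naive_avg), float(corrected_avg)


naive_avg, corrected_avg = simulate()
print(f"true max effect: 0.00, naive: {naive_avg:.3f}, bootstrap-corrected: {corrected_avg:.3f}")
```

In this setup the naive maximum is biased upward even though no subgroup has any effect, and the plain bootstrap correction removes only part of that bias. Closing this kind of gap is the stated purpose of a calibrated procedure such as the one the abstract describes.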