Causal machine learning for reliable real-world evidence generation in healthcare
Real-world evidence (RWE) plays a crucial role in understanding the impact of medical interventions and uncovering disparities in clinical practice. However, confounding bias, especially unmeasured confounding, makes it difficult to infer causal relationships, such as treatment effects and treatment responses, from observational data. Various methods have been developed to reduce confounding bias, including methods designed specifically to detect and adjust for unmeasured confounding. However, these methods typically rely on assumptions that are either untestable or too strong to hold in practice. Some methods also require domain knowledge that is rarely available in medicine. Despite recent advances in method development, the challenge of unmeasured confounding in observational studies persists.
This dissertation provides insights into adjusting for unmeasured confounding by exploiting correlations within electronic health records (EHRs). In Aim 1, we demonstrate a novel use of a probabilistic model for inferring unmeasured confounders from drug co-prescription patterns. In Aim 2, we provide theoretical justifications and empirical evidence that adjusting for all (pre-treatment) covariates without explicitly selecting for confounders, as implemented in the large-scale propensity score (LSPS) method, offers a more robust approach to mitigating unmeasured confounding.
In Aim 3, we shift focus to the problem of evaluating fairness of treatment allocation in clinical practice from a causal perspective. We develop a causal fairness algorithm for assessing treatment allocation. By applying this fairness analysis method to a cohort of patients with coronary artery disease from EHR data, we uncover disparities in treatment allocation based on gender and race, highlighting the importance of addressing fairness concerns in clinical practice. Furthermore, we demonstrate that social determinants of health, variables that are often unavailable in EHR databases and are potential unmeasured confounders, do not significantly impact the estimation of treatment responses when conditioned on clinical features from EHR data, shedding light on the intricate relationship between EHR features and social determinants of health.
Collectively, this dissertation contributes valuable insights into addressing unmeasured confounding in the context of evidence generation from EHRs. These findings have significant implications for improving the reliability of observational studies and promoting equitable healthcare practices.
Adjusting for indirectly measured confounding using large-scale propensity scores
Confounding remains one of the major challenges to causal inference with
observational data. This problem is paramount in medicine, where we would like
to answer causal questions from large observational datasets like electronic
health records (EHRs). Modern medical data (such as EHRs) typically contain
tens of thousands of covariates. Such a large set carries hope that many of the
confounders are directly measured, and further hope that others are indirectly
measured through their correlation with measured covariates. How can we exploit
these large sets of covariates for causal inference? To help answer this
question, this paper examines the performance of the large-scale propensity
score (LSPS) approach on causal analysis of medical data. We demonstrate that
LSPS may adjust for indirectly measured confounders by including tens of
thousands of covariates that may be correlated with them. We present conditions
under which LSPS removes bias due to indirectly measured confounders, and we
show that LSPS may avoid bias when inadvertently adjusting for variables (like
colliders) that otherwise can induce bias. We demonstrate the performance of
LSPS with both simulated medical data and real medical data.
Comment: 12 pages, 6 figures
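The idea of adjusting for an indirectly measured confounder through correlated covariates can be illustrated with a toy simulation. The sketch below is not the LSPS method or the authors' code: it assumes a single hidden binary confounder, ten noisy proxy covariates, a plain gradient-descent logistic regression as the propensity model, and inverse probability weighting as the adjustment. All names, sample sizes, and effect sizes are hypothetical choices made for illustration.

```python
import math
import random

TRUE_EFFECT = 1.0  # assumed treatment effect used by the simulation


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def simulate(n, k, rng):
    """Simulate n patients with a hidden confounder u and k noisy proxies."""
    rows = []
    for _ in range(n):
        u = 1 if rng.random() < 0.5 else 0
        # Each proxy agrees with u 80% of the time, coded as -1/+1.
        x = [1 if (rng.random() < 0.8) == (u == 1) else -1 for _ in range(k)]
        # Treatment and outcome both depend on u, so u confounds them.
        t = 1 if rng.random() < (0.8 if u else 0.2) else 0
        y = TRUE_EFFECT * t + 3.0 * u + rng.gauss(0.0, 1.0)
        rows.append((x, t, y))  # u itself is never recorded
    return rows


def propensity(w, x):
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_propensity(rows, steps=150, lr=0.3):
    """Logistic regression of treatment on the proxies via gradient descent."""
    k = len(rows[0][0])
    w = [0.0] * (k + 1)  # w[0] is the intercept
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, t, _y in rows:
            err = propensity(w, x) - t
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def naive_effect(rows):
    """Unadjusted difference in mean outcome between treated and untreated."""
    treated = [y for _x, t, y in rows if t == 1]
    control = [y for _x, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_effect(rows, w):
    """Normalized inverse-probability-weighted difference in means."""
    num1 = den1 = num0 = den0 = 0.0
    for x, t, y in rows:
        e = min(max(propensity(w, x), 0.05), 0.95)  # clip extreme weights
        if t == 1:
            num1 += y / e
            den1 += 1.0 / e
        else:
            num0 += y / (1.0 - e)
            den0 += 1.0 / (1.0 - e)
    return num1 / den1 - num0 / den0


rng = random.Random(0)
rows = simulate(1500, 10, rng)
w = fit_propensity(rows)
naive = naive_effect(rows)
adjusted = ipw_effect(rows, w)
print(f"true={TRUE_EFFECT:.2f} naive={naive:.2f} adjusted={adjusted:.2f}")
```

In this setup the hidden confounder is never given to the model, yet weighting on a propensity score fit to its proxies should pull the estimate toward the assumed true effect. Some residual bias remains because the proxies are noisy.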
Evaluation of translocation impacts on genetic patterns in farmed and naturalized populations of Mytilus galloprovincialis along the China coast: clues from mitochondrial cytochrome c oxidase I sequences
As an introduced species, Mytilus galloprovincialis has developed into self-sustaining naturalized populations and has been widely cultivated in northern China. The M. galloprovincialis aquaculture industry depends wholly on the movement of naturalized juveniles onto farms. It is, therefore, necessary to understand the genetic effect of continuous spat translocation. This study divided 12 localities of M. galloprovincialis along the China coast into three types of populations (farmed, naturalized adjacent to farms, and isolated) to investigate genetic variation and differentiation. Genetic variability, measured by haplotype diversity, nucleotide diversity, and the mean number of pairwise differences, ranked as farmed populations > naturalized populations adjacent to farms > isolated populations. Hierarchical analyses and the Mantel test indicated slight divergence between farmed and naturalized populations and between northern and southern populations. The farmed and naturalized populations clustered into two separate categories in the neighbor-joining tree, except for two anthropogenically intervened localities. The present results suggest that the translocation practice positively affected genetic variability and played a vital role in shaping genetic composition. The information obtained in this study provides new insights into the impacts of the translocation culture model of marine mollusks.
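The three variability measures named in this abstract can be sketched in a few lines. The abstract does not say which estimators or software the study used, so this assumes the standard definitions: Nei's haplotype diversity h = n/(n-1) * (1 - sum of squared haplotype frequencies), the mean number of pairwise differences as the average count of differing sites over all sequence pairs, and nucleotide diversity as that mean divided by sequence length. The four short sequences are made up for illustration and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity for equal-length aligned sequences."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Hypothetical toy alignment: four sequences, three distinct haplotypes.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
h = haplotype_diversity(toy)
k = mean_pairwise_differences(toy)
pi = nucleotide_diversity(toy)
print(f"h={h:.4f} k={k:.4f} pi={pi:.4f}")
```

Comparing these three quantities across population types is how a ranking such as the one reported above would be obtained.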
A Bayesian Causal Inference Approach for Assessing Fairness in Clinical Decision-Making
Fairness in clinical decision-making is a critical element of health equity,
but assessing fairness of clinical decisions from observational data is
challenging. Recently, many fairness notions have been proposed to quantify
fairness in decision-making, among which causality-based fairness notions have
gained increasing attention due to their potential in adjusting for confounding
and reasoning about bias. However, causal fairness notions remain
under-explored in the context of clinical decision-making with large-scale
healthcare data. In this work, we propose a Bayesian causal inference approach
for assessing a causal fairness notion called principal fairness in clinical
settings. We demonstrate our approach using both simulated data and electronic
health records (EHR) data.
3-Cinnamoyl-4-hydroxy-6-methyl-2H-pyran-2-one ameliorates diabetic peripheral neuropathy in type 2 diabetes mellitus rats via PI3K/Akt signaling pathway
Purpose: To investigate the curative effects of 3-cinnamoyl-4-hydroxy-6-methyl-2H-pyran-2-one (CHMP) in a streptozotocin (STZ)-induced diabetic SD rat model, and the underlying mechanism.
Method: Diabetes was induced in rats using a single intraperitoneal injection of STZ. Subsequently, diabetic and non-diabetic rats were randomly assigned to five experimental groups. Six weeks after the STZ injection, the diabetic animals were orally administered the test compound (CHMP) at doses of 10 and 20 mg/kg body weight for 6 weeks. Thereafter, the rats were anesthetised, and body weight, blood sugar, and motor nerve conduction velocity (MNCV) were determined. Moreover, real-time polymerase chain reaction (RT-PCR) and western blot analysis were used to assay the expression levels of genes in the PI3K/Akt pathway and Glut4.
Results: Treatment of diabetic rats with CHMP significantly reduced fasting blood glucose levels and increased mean body weight, relative to diabetic control (p < 0.05). MNCV was markedly higher in CHMP-treated rats (54.2 ± 2.2) than in diabetic control rats (46 ± 4.1, p < 0.01). Results from RT-PCR and western blot indicated increased expression of PI3K, Akt and IRS-1, and downregulation of GSK-3B expression in skeletal muscle. The CHMP treatment also upregulated Glut4 expression in skeletal muscle.
Conclusion: These findings show that CHMP may be beneficial in the management of diabetic neuropathy.
CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines
Synthetic Electronic Health Records (EHR) have emerged as a pivotal tool in
advancing healthcare applications and machine learning models, particularly for
researchers without direct access to healthcare data. Although existing
methods, like rule-based approaches and generative adversarial networks (GANs),
generate synthetic data that resembles real-world EHR data, these methods often
use a tabular format, disregarding temporal dependencies in patient histories
and limiting data replication. Recently, there has been a growing interest in
leveraging Generative Pre-trained Transformers (GPT) for EHR data. This enables
applications like disease progression analysis, population estimation,
counterfactual reasoning, and synthetic data generation. In this work, we focus
on synthetic data generation and demonstrate the capability of training a GPT
model using a particular patient representation derived from CEHR-BERT,
enabling us to generate patient sequences that can be seamlessly converted to
the Observational Medical Outcomes Partnership (OMOP) data format.