Causal Rule Learning: Enhancing the Understanding of Heterogeneous Treatment Effect via Weighted Causal Rules
Interpretability is a key concern in estimating heterogeneous treatment
effects using machine learning methods, especially for healthcare applications
where high-stake decisions are often made. Inspired by the Predictive,
Descriptive, Relevant framework of interpretability, we propose causal rule
learning which finds a refined set of causal rules characterizing potential
subgroups to estimate and enhance our understanding of heterogeneous treatment
effects. Causal rule learning involves three phases: rule discovery, rule
selection, and rule analysis. In the rule discovery phase, we utilize a causal
forest to generate a pool of causal rules with corresponding subgroup average
treatment effects. The selection phase then employs a D-learning method to
select a subset of these rules to deconstruct individual-level treatment
effects as a linear combination of the subgroup-level effects. This helps to
answer a question ignored by previous literature: what if an individual
simultaneously belongs to multiple groups with different average treatment
effects? The rule analysis phase outlines a detailed procedure to further
analyze each rule in the subset from multiple perspectives, revealing the most
promising rules for further validation. The rules themselves, their
corresponding subgroup treatment effects, and their weights in the linear
combination give us more insights into heterogeneous treatment effects.
Simulation and real-world data analysis demonstrate the superior performance of
causal rule learning on the interpretable estimation of heterogeneous treatment
effect when the ground truth is complex and the sample size is sufficient.
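The selection phase's decomposition can be sketched on synthetic data (a simplified stand-in that uses ordinary least squares where the paper uses D-learning; all rules, effects, and weights below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 200 individuals, 4 candidate causal rules (subgroups).
# R[i, j] = 1 if individual i satisfies rule j.
n, k = 200, 4
R = rng.integers(0, 2, size=(n, k)).astype(float)

# Hypothetical subgroup-level average treatment effects for each rule.
subgroup_ate = np.array([1.5, -0.5, 0.8, 0.4])

# Simulated individual-level effects: a weighted mix of the subgroup
# effects each person belongs to, plus noise.
true_w = np.array([0.6, 0.3, 0.9, 0.1])
tau = R @ (true_w * subgroup_ate) + rng.normal(0, 0.1, size=n)

# Simplified selection step: recover rule weights by least squares,
# so that tau_i ~ sum_j w_j * ATE_j * R[i, j].
X = R * subgroup_ate          # each column scaled by its subgroup ATE
w_hat, *_ = np.linalg.lstsq(X, tau, rcond=None)

# Rules with near-zero weight could be dropped; the rest "explain"
# the heterogeneity in individual-level effects.
print(np.round(w_hat, 2))
```

The weights, together with the rules and their subgroup effects, are what make the decomposition interpretable: an individual in several subgroups gets a treatment effect that is an explicit weighted combination of those subgroups' effects.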
Online Modeling and Monitoring of Dependent Processes under Resource Constraints
Adaptive monitoring of a large population of dynamic processes is critical
for the timely detection of abnormal events under limited resources in many
healthcare and engineering systems. Examples include risk-based disease
screening and condition-based process monitoring. However, existing adaptive
monitoring models either ignore the dependency among processes or overlook the
uncertainty in process modeling. To design an optimal monitoring strategy that
accurately monitors the processes with poor health conditions and actively
collects information for uncertainty reduction, a novel online collaborative
learning method is proposed in this study. The proposed method designs a
collaborative learning-based upper confidence bound (CL-UCB) algorithm to
optimally balance the exploitation and exploration of dependent processes under
limited resources. The efficiency of the proposed method is demonstrated
through theoretical analysis, simulation studies, and an empirical study of
adaptive cognitive monitoring in Alzheimer's disease.
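The exploitation-exploration trade-off under a per-round budget can be sketched with a plain UCB1 allocation (a simplified stand-in: the collaborative part of CL-UCB, which shares information across dependent processes, is omitted, and all quantities here are synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setting: 10 processes, resources to observe only 3 per round.
# Each process has an unknown mean "risk"; observations are noisy.
n_proc, budget, rounds = 10, 3, 500
true_risk = rng.uniform(0, 1, n_proc)

counts = np.zeros(n_proc)
means = np.zeros(n_proc)

for t in range(1, rounds + 1):
    # UCB score: estimated risk plus an exploration bonus (the plain
    # UCB1 bonus; CL-UCB would also borrow strength across dependent
    # processes, which this sketch omits).
    bonus = np.sqrt(2 * np.log(t) / np.maximum(counts, 1))
    ucb = np.where(counts == 0, np.inf, means + bonus)
    chosen = np.argsort(ucb)[-budget:]            # monitor the top-`budget`
    for i in chosen:
        obs = true_risk[i] + rng.normal(0, 0.1)
        counts[i] += 1
        means[i] += (obs - means[i]) / counts[i]  # running average

# The highest-risk processes should receive the most monitoring effort.
print(np.argsort(counts)[-budget:])
```

The design choice mirrored here is that monitoring resources concentrate on processes estimated to be unhealthy, while the bonus term keeps occasionally revisiting uncertain ones.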
A Tree-based Federated Learning Approach for Personalized Treatment Effect Estimation from Heterogeneous Data Sources
Federated learning is an appealing framework for analyzing sensitive data
from distributed health data networks due to its protection of data privacy.
Under this framework, data partners at local sites collaboratively build an
analytical model under the orchestration of a coordinating site, while keeping
the data decentralized. However, existing federated learning methods mainly
assume data across sites are homogeneous samples of the global population,
hence failing to properly account for the extra variability across sites in
estimation and inference. Drawing on a multi-hospital electronic health records
network, we develop an efficient and interpretable tree-based ensemble of
personalized treatment effect estimators to join results across hospital sites,
while actively modeling for the heterogeneity in data sources through site
partitioning. The efficiency of our method is demonstrated by a study of causal
effects of oxygen saturation on hospital mortality and backed up by
comprehensive numerical results.
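The aggregation step of joining site-level results without pooling subject-level data can be illustrated with a minimal precision-weighted average (a hypothetical sketch: the paper's method instead learns data-driven weights via tree-based site partitioning, and all numbers below are invented):

```python
import numpy as np

# Toy stand-in for the federated setting: each of 4 hospital sites
# reports only its local treatment-effect estimate and a standard
# error; no subject-level data leaves the sites.
site_est = np.array([0.9, 1.2, 1.0, 2.5])   # site 4 is heterogeneous
site_se = np.array([0.1, 0.15, 0.12, 0.2])

# Simple precision-weighted average. The paper's ensemble would
# instead partition sites to down-weight heterogeneous ones; this
# sketch shows only the fixed-weight aggregation step.
w = 1 / site_se**2
w /= w.sum()
pooled = w @ site_est
print(round(pooled, 3))
```

Note how the outlying site still pulls the pooled estimate upward under fixed precision weights; modeling cross-site heterogeneity, as the paper does, is what guards against exactly this.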
Causal Inference under Data Restrictions
This dissertation focuses on modern causal inference under uncertainty and
data restrictions, with applications to neoadjuvant clinical trials,
distributed data networks, and robust individualized decision making.
In the first project, we propose a method under the principal stratification
framework to identify and estimate the average treatment effects on a binary
outcome, conditional on the counterfactual status of a post-treatment
intermediate response. Under mild assumptions, the treatment effect of interest
can be identified. We extend the approach to address censored outcome data. The
proposed method is applied to a neoadjuvant clinical trial and its performance
is evaluated via simulation studies.
In the second project, we propose a tree-based model averaging approach to
improve the estimation accuracy of conditional average treatment effects at a
target site by leveraging models derived from other potentially heterogeneous
sites, without them sharing subject-level data. The performance of this
approach is demonstrated by a study of the causal effects of oxygen therapy on
hospital survival rates and backed up by comprehensive simulations.
In the third project, we propose a robust individualized decision learning
framework with sensitive variables to improve the worst-case outcomes of
individuals caused by sensitive variables that are unavailable at the time of
decision. Unlike most existing work that uses mean-optimal objectives, we
propose a robust learning framework by finding a newly defined quantile- or
infimum-optimal decision rule. From a causal perspective, we also generalize
the classic notion of (average) fairness to conditional fairness for individual
subjects. The reliable performance of the proposed method is demonstrated
through synthetic experiments and three real-data applications.
Comment: PhD dissertation, University of Pittsburgh. The contents are mostly
based on arXiv:2211.06569, arXiv:2103.06261 and arXiv:2103.04175, with
extended discussion.
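The third project's quantile-optimal criterion can be illustrated with a toy comparison (a hypothetical sketch, not the dissertation's actual method: two fixed candidate rules are compared rather than learned, and all distributions are made up):

```python
import numpy as np

rng = np.random.default_rng(3)

# Rule A has the higher mean outcome but a heavy left tail, driven by
# a sensitive variable unavailable at decision time; rule B is safer
# for the worst-off individuals.
outcomes_a = np.concatenate([rng.normal(2.5, 0.5, 900),
                             rng.normal(-3.0, 0.5, 100)])
outcomes_b = rng.normal(1.5, 0.5, 1000)

def value(outcomes, q=None):
    # Mean-optimal criterion when q is None; quantile-optimal otherwise.
    return outcomes.mean() if q is None else np.quantile(outcomes, q)

# A mean-optimal objective prefers rule A...
print(value(outcomes_a), value(outcomes_b))
# ...while a 0.05-quantile criterion, protecting the worst-off 5%,
# prefers rule B.
print(value(outcomes_a, 0.05), value(outcomes_b, 0.05))
```

This is the essence of the robustness argument: optimizing a lower quantile (or infimum) of the outcome distribution, instead of its mean, protects individuals harmed through the unobserved sensitive variable.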
Deep Causal Learning for Robotic Intelligence
This invited review discusses causal learning in the context of robotic
intelligence. The paper first introduces psychological findings on causal
learning in human cognition, then surveys traditional statistical approaches
to causal discovery and causal inference. It reviews recent deep causal
learning algorithms with a focus on their architectures and the benefits of
using deep nets, and discusses the gap between deep causal learning and the
needs of robotic intelligence.
The Role of the Management Sciences in Research on Personalization
We present a review of research studies that deal with personalization. We synthesize current knowledge about these areas, and identify issues that we envision will be of interest to researchers working in the management sciences. We take an interdisciplinary approach that spans the areas of economics, marketing, information technology, and operations. We present an overarching framework for personalization that allows us to identify key players in the personalization process, as well as the key stages of personalization. The framework enables us to examine the strategic role of personalization in the interactions between a firm and other key players in the firm's value system. We review extant literature on the strategic behavior of firms, and discuss opportunities for analytical and empirical research in this regard. Next, we examine how a firm can learn a customer's preferences, which is one of the key components of the personalization process. We use a utility-based approach to formalize such preference functions, and to understand how these preference functions could be learnt based on a customer's interactions with a firm. We identify well-established techniques in management sciences that can be gainfully employed in future research on personalization.
Keywords: CRM, Personalization, Marketing, e-commerce
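The utility-based view of preference learning can be sketched as follows (a hypothetical illustration using a generic linear-utility logit choice model fit by gradient ascent; this is not a specific method from the reviewed literature, and all attributes and weights are invented):

```python
import numpy as np

rng = np.random.default_rng(4)

# Assume a customer's utility for a product is linear in its
# attributes, u(x) = w . x, and the firm observes which of two
# offered products the customer picks in each interaction.
true_w = np.array([1.0, -0.5, 2.0])      # hypothetical preferences
pairs = rng.normal(size=(500, 2, 3))     # 500 choice sets of 2 items

# Logit choice: pick item 0 with probability sigmoid(w . (x0 - x1)).
diff = pairs[:, 0] - pairs[:, 1]
p = 1 / (1 + np.exp(-diff @ true_w))
choice = (rng.random(500) < p).astype(float)   # 1 = chose item 0

# Learn w by gradient ascent on the logistic likelihood of choices.
w = np.zeros(3)
for _ in range(2000):
    pred = 1 / (1 + np.exp(-diff @ w))
    w += 0.1 * diff.T @ (choice - pred) / 500

print(np.round(w, 1))
```

The recovered weights approximate the customer's latent preference function, which is the object the personalization framework assumes the firm can learn from interaction data.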
Synthetic Observational Health Data with GANs: from slow adoption to a boom in medical research and ultimately digital twins?
After being collected for patient care, Observational Health Data (OHD) can
further benefit patient well-being by sustaining the development of health
informatics and medical research. Vast potential is unexploited because of the
fiercely private nature of patient-related data and regulations to protect it.
Generative Adversarial Networks (GANs) have recently emerged as a
groundbreaking way to learn generative models that produce realistic synthetic
data. They have revolutionized practices in multiple domains such as
self-driving cars, fraud detection, digital twin simulations in industrial
sectors, and medical imaging.
The digital twin concept could readily apply to modelling and quantifying
disease progression. In addition, GANs possess many capabilities relevant to
common problems in healthcare: lack of data, class imbalance, rare diseases,
and preserving privacy. Unlocking open access to privacy-preserving OHD could
be transformative for scientific research. In the midst of COVID-19, the
healthcare system is facing unprecedented challenges, many of which are
data-related for the reasons stated above.
Considering these facts, publications concerning GANs applied to OHD seemed to
be severely lacking. To uncover the reasons for this slow adoption, we broadly
reviewed the published literature on the subject. Our findings show that the
properties of OHD were initially challenging for the existing GAN algorithms
(unlike medical imaging, for which state-of-the-art models were directly
transferable) and the evaluation of synthetic data lacked clear metrics.
We find more publications on the subject than expected, starting slowly in
2017, and since then at an increasing rate. The difficulties of OHD remain, and
we discuss issues relating to evaluation, consistency, benchmarking, data
modelling, and reproducibility.
Comment: 31 pages (10 in previous version), not including references and
glossary, 51 in total. Inclusion of a large number of recent publications and
expansion of the discussion accordingly.
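As background for readers new to the framework, the adversarial objective that the surveyed GAN variants build on can be stated compactly (this is the original minimax formulation from the GAN literature, not anything specific to OHD):

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\!\left[\log\!\left(1 - D(G(z))\right)\right]
```

Here the generator G maps noise z to synthetic records and the discriminator D tries to tell real records from generated ones; the properties of OHD that the review highlights (mixed discrete/continuous fields, class imbalance, privacy constraints) mainly complicate how G, D, and the evaluation of the resulting synthetic data are designed, not the objective itself.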