Welfare and Fairness in Multi-objective Reinforcement Learning
We study fair multi-objective reinforcement learning in which an agent must
learn a policy that simultaneously achieves high reward on multiple dimensions
of a vector-valued reward. Motivated by the fair resource allocation
literature, we model this as an expected welfare maximization problem, for some
non-linear fair welfare function of the vector of long-term cumulative rewards.
One canonical example of such a function is the Nash Social Welfare, or
geometric mean, the log transform of which is also known as the Proportional
Fairness objective. We show that even approximately optimizing the
expected Nash Social Welfare is computationally intractable, even in the tabular
case. Nevertheless, we provide a novel adaptation of Q-learning that combines
non-linear scalarized learning updates and non-stationary action selection to
learn effective policies for optimizing nonlinear welfare functions. We show
that our algorithm is provably convergent, and we demonstrate experimentally
that our approach outperforms techniques based on linear scalarization,
mixtures of optimal linear scalarizations, or stationary action selection for
the Nash Social Welfare objective.
Comment: 9 pages
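To make the welfare functions named in this abstract concrete, here is a minimal sketch of the Nash Social Welfare (geometric mean) and its log transform, compared against a linear scalarization (arithmetic mean). The function names and the two reward vectors are illustrative assumptions, not taken from the paper, and the sketch covers only the objective, not the authors' Q-learning algorithm.

```python
import math


def nash_social_welfare(rewards):
    """Geometric mean of a vector of strictly positive cumulative rewards."""
    if any(r <= 0 for r in rewards):
        raise ValueError("rewards must be strictly positive")
    return math.exp(sum(math.log(r) for r in rewards) / len(rewards))


def proportional_fairness(rewards):
    """Sum of log rewards, i.e. n * log(Nash Social Welfare)."""
    return sum(math.log(r) for r in rewards)


def linear_scalarization(rewards):
    """Arithmetic mean, shown only as a contrast to the welfare functions."""
    return sum(rewards) / len(rewards)


# Two hypothetical vectors of long-term cumulative rewards.
balanced = [4.0, 4.0]
skewed = [7.5, 1.0]

# The arithmetic mean prefers the skewed vector (4.25 vs 4.0), while the
# geometric mean prefers the balanced one (4.0 vs sqrt(7.5), about 2.74).
for name, vec in (("balanced", balanced), ("skewed", skewed)):
    print(name, linear_scalarization(vec), nash_social_welfare(vec))
```

Because the logarithm is monotone, ranking deterministic reward vectors by `proportional_fairness` gives the same order as ranking them by `nash_social_welfare`.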
Reliable Generation of EHR Time Series via Diffusion Models
Electronic Health Records (EHRs) are rich sources of patient-level data,
including laboratory tests, medications, and diagnoses, offering valuable
resources for medical data analysis. However, concerns about privacy often
restrict access to EHRs, hindering downstream analysis. Researchers have
explored various methods for generating privacy-preserving EHR data. In this
study, we introduce a new method for generating diverse and realistic synthetic
EHR time series data using Denoising Diffusion Probabilistic Models (DDPM). We
conducted experiments on six datasets, comparing our proposed method with eight
existing methods. Our results demonstrate that our approach significantly
outperforms all existing methods in terms of data utility while requiring less
training effort. Our approach also enhances downstream medical data analysis by
providing diverse and realistic synthetic EHR data.
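The abstract does not describe the model architecture, so the following is only a sketch of the generic DDPM forward (noising) process applied to a single time series, not the authors' method. The linear schedule endpoints (1e-4 to 0.02 over 1000 steps) follow the original DDPM paper; the toy sine series and all function names are assumptions made for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I).

    Returns the noised series and the noise that was added; a denoising
    network would be trained to predict that noise from (x_t, t).
    """
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + sigma * e for x, e in zip(x0, noise)]
    return x_t, noise


rng = random.Random(0)
alpha_bar = cumulative_alphas(linear_beta_schedule(1000))

# Hypothetical stand-in for one lab-value trajectory with 48 time points.
x0 = [math.sin(2 * math.pi * i / 48) for i in range(48)]

x_early, _ = q_sample(x0, 10, alpha_bar, rng)   # still close to x0
x_late, _ = q_sample(x0, 999, alpha_bar, rng)   # nearly pure Gaussian noise
```

Generation runs this process in reverse: starting from Gaussian noise, a learned denoiser removes noise step by step until a synthetic series remains.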
Fast and Interpretable Mortality Risk Scores for Critical Care Patients
Prediction of mortality in intensive care unit (ICU) patients is an important
task in critical care medicine. Prior work in creating mortality risk models
falls into two major categories: domain-expert-created scoring systems, and
black box machine learning (ML) models. Both of these have disadvantages: black
box models are unacceptable for use in hospitals, whereas manual creation of
models (including hand-tuning of logistic regression parameters) relies on
humans to perform high-dimensional constrained optimization, which leads to a
loss in performance. In this work, we bridge the gap between accurate black box
models and hand-tuned interpretable models. We build on modern interpretable ML
techniques to design accurate and interpretable mortality risk scores. We
leverage the largest existing public ICU monitoring datasets, namely the MIMIC
III and eICU datasets. By evaluating risk across medical centers, we are able
to study generalization across domains. In order to customize our risk score
models, we develop a new algorithm, GroupFasterRisk, which has several
important benefits: (1) it uses a hard sparsity constraint, allowing users to
directly control the number of features; (2) it incorporates group sparsity to
allow more cohesive models; (3) it allows for monotonicity correction on models
to incorporate domain knowledge; (4) it produces many equally good models at
once, which allows domain experts to choose among them. GroupFasterRisk creates
its risk scores within hours, even on the large datasets we study here.
GroupFasterRisk's risk scores perform better than risk scores currently used in
hospitals, and have similar prediction performance to black box ML models
(despite being much sparser). Because GroupFasterRisk produces a variety of
risk scores and handles constraints, it allows design flexibility, which is the
key enabler of practical and trustworthy model creation.
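As an illustration of what a sparse risk score looks like once it has been learned, the sketch below evaluates a small scorecard: integer points are added for each threshold condition a patient meets, and the total is mapped to a probability with a logistic function. This is an assumed, generic scorecard form; the features, thresholds, points, and intercept are hypothetical and clinically meaningless, and the code does not implement the GroupFasterRisk training algorithm.

```python
import math

# Hypothetical scorecard: (feature name, threshold, integer points).
SCORECARD = [
    ("age", 75, 2),
    ("heart_rate", 120, 1),
    ("lactate", 4.0, 3),
]
INTERCEPT = -4      # hypothetical offset applied before the logistic link
MAX_FEATURES = 3    # hard sparsity budget chosen by the user (assumed value)

# A hard sparsity constraint caps how many features the score may use.
assert len({name for name, _, _ in SCORECARD}) <= MAX_FEATURES


def total_points(patient, scorecard=SCORECARD):
    """Sum the points of every condition the patient meets."""
    return sum(pts for name, thr, pts in scorecard if patient[name] >= thr)


def predicted_risk(patient, scorecard=SCORECARD, intercept=INTERCEPT):
    """Map the integer score to a probability with a logistic function."""
    score = total_points(patient, scorecard)
    return 1.0 / (1.0 + math.exp(-(intercept + score)))


# Hypothetical patients, used only to exercise the scorecard.
high = {"age": 80, "heart_rate": 100, "lactate": 5.0}   # 2 + 0 + 3 = 5 points
low = {"age": 50, "heart_rate": 90, "lactate": 1.2}     # 0 points

print(total_points(high), round(predicted_risk(high), 3))
print(total_points(low), round(predicted_risk(low), 3))
```

Because every term is a small integer tied to one visible condition, a clinician can compute the score by hand and see exactly which measurements drive the predicted risk.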