Deep Reinforcement Learning for Sepsis Treatment
Sepsis is a leading cause of mortality in intensive care units and costs
hospitals billions annually. Treating a septic patient is highly challenging,
because individual patients respond very differently to medical interventions
and there is no universally agreed-upon treatment for sepsis. In this work, we
propose an approach to deduce treatment policies for septic patients by using
continuous state-space models and deep reinforcement learning. Our model learns
clinically interpretable treatment policies, similar in important aspects to
the treatment policies of physicians. The learned policies could be used to aid
intensive care clinicians in medical decision making and improve the likelihood
of patient survival.
Comment: Extensions on earlier work (arXiv:1705.08422). Accepted at workshop
on Machine Learning For Health at the conference on Neural Information
Processing Systems, 201
Improving Sepsis Treatment Strategies by Combining Deep and Kernel-Based Reinforcement Learning
Sepsis is the leading cause of mortality in the ICU. It is challenging to
manage because individual patients respond differently to treatment. Thus,
tailoring treatment to the individual patient is essential for the best
outcomes. In this paper, we take steps toward this goal by applying a
mixture-of-experts framework to personalize sepsis treatment. The mixture model
selectively alternates between neighbor-based (kernel) and deep reinforcement
learning (DRL) experts depending on the patient's current history. On a large
retrospective cohort, this mixture-based approach outperforms physician,
kernel-only, and DRL-only experts.
Comment: AMIA 2018 Annual Symposium
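A minimal sketch of how a gate might switch between a neighbor-based expert and a DRL expert. The abstract does not describe the gating rule, so everything here is an assumption for illustration: the Gaussian kernel, the `min_support` threshold, and the names `kernel_expert` and `mixture_policy` are hypothetical and are not the authors' method.

```python
import math

def kernel_weight(state, neighbor, bandwidth=1.0):
    """Gaussian kernel similarity between two state vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(state, neighbor))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def kernel_expert(state, history, bandwidth=1.0):
    """Return (action, support) from a kernel-weighted vote over past cases.

    history is a list of (state, action) pairs. support is the total kernel
    weight, used here as a rough measure of how well the data cover this state.
    """
    votes = {}
    for past_state, action in history:
        w = kernel_weight(state, past_state, bandwidth)
        votes[action] = votes.get(action, 0.0) + w
    best = max(votes, key=votes.get)
    return best, sum(votes.values())

def mixture_policy(state, history, drl_expert, min_support=1.0):
    """Hypothetical gate: use the kernel expert only where neighbors are dense."""
    if not history:
        return drl_expert(state), "drl"
    action, support = kernel_expert(state, history)
    if support >= min_support:
        return action, "kernel"
    return drl_expert(state), "drl"

# Toy data (invented): two nearby low-dose cases and one distant high-dose case.
history = [((0.0, 0.0), "low_dose"), ((0.1, 0.0), "low_dose"),
           ((5.0, 5.0), "high_dose")]
drl = lambda s: "medium_dose"  # stand-in for a trained DRL policy

print(mixture_policy((0.05, 0.0), history, drl))   # ('low_dose', 'kernel')
print(mixture_policy((20.0, 20.0), history, drl))  # ('medium_dose', 'drl')
```

The design choice illustrated is only the general idea of deferring to neighbors where similar patients exist and to a parametric model elsewhere.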
Continuous State-Space Models for Optimal Sepsis Treatment - a Deep Reinforcement Learning Approach
Sepsis is a leading cause of mortality in intensive care units (ICUs) and
costs hospitals billions annually. Treating a septic patient is highly
challenging, because individual patients respond very differently to medical
interventions and there is no universally agreed-upon treatment for sepsis.
Understanding more about a patient's physiological state at a given time could
hold the key to effective treatment policies. In this work, we propose a new
approach to deduce optimal treatment policies for septic patients by using
continuous state-space models and deep reinforcement learning. Learning
treatment policies over continuous spaces is important, because we retain more
of the patient's physiological information. Our model is able to learn
clinically interpretable treatment policies, similar in important aspects to
the treatment policies of physicians. Evaluating our algorithm on past ICU
patient data, we find that our model could reduce patient mortality in the
hospital by up to 3.6% over observed clinical policies, from a baseline
mortality of 13.7%. The learned treatment policies could be used to aid
intensive care clinicians in medical decision making and improve the likelihood
of patient survival.
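The abstract does not say whether the 3.6% figure is an absolute reduction (percentage points) or a relative one. As a hedged worked example, the two readings give quite different mortality rates; the function names below are illustrative only.

```python
def absolute_reduction(baseline_pct, reduction_points):
    """Mortality after subtracting percentage points from the baseline."""
    return baseline_pct - reduction_points

def relative_reduction(baseline_pct, reduction_pct):
    """Mortality after reducing the baseline by a fraction of itself."""
    return baseline_pct * (1.0 - reduction_pct / 100.0)

baseline = 13.7     # baseline mortality (%), from the abstract
improvement = 3.6   # reported reduction (%), interpretation assumed below

print(round(absolute_reduction(baseline, improvement), 2))  # 10.1
print(round(relative_reduction(baseline, improvement), 2))  # 13.21
```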
Precision medicine as a control problem: Using simulation and deep reinforcement learning to discover adaptive, personalized multi-cytokine therapy for sepsis
Sepsis is a life-threatening condition affecting one million people per year
in the US in which dysregulation of the body's own immune system causes damage
to its tissues, resulting in a 28-50% mortality rate. Clinical trials for
sepsis treatment over the last 20 years have failed to produce a single
currently FDA-approved drug treatment. In this study, we attempt to discover an
effective cytokine mediation treatment strategy for sepsis using a previously
developed agent-based model that simulates the innate immune response to
infection: the Innate Immune Response agent-based model (IIRABM). Previous
attempts at reducing mortality with multi-cytokine mediation using the IIRABM
have failed to reduce mortality across all patient parameterizations and
motivated us to investigate whether adaptive, personalized multi-cytokine
mediation can control the trajectory of sepsis and lower patient mortality. We
used the IIRABM to compute a treatment policy in which systemic patient
measurements are used in a feedback loop to inform future treatment. Using deep
reinforcement learning, we identified a policy that achieves 0% mortality on
the patient parameterization on which it was trained. More importantly, this
policy also achieves 0.8% mortality over 500 randomly selected patient
parameterizations with baseline mortalities ranging from 1-99% (with an
average of 49%) spanning the entire clinically plausible parameter space of the
IIRABM. These results suggest that adaptive, personalized multi-cytokine
mediation therapy could be a promising approach for treating sepsis. We hope
that this work motivates researchers to consider such an approach as part of
future clinical trials. To the best of our knowledge, this work is the first to
consider adaptive, personalized multi-cytokine mediation therapy for sepsis,
and is the first to exploit deep reinforcement learning on a biological
simulation.
Truly Batch Apprenticeship Learning with Deep Successor Features
We introduce a novel apprenticeship learning algorithm to learn an expert's
underlying reward structure in off-policy model-free batch settings.
Unlike existing methods that require a dynamics model or additional data
acquisition for on-policy evaluation, our algorithm requires only the batch
data of observed expert behavior. Such settings are common in real-world
tasks (health care, finance, or industrial processes) where accurate
simulators do not exist or data acquisition is costly. To address challenges in
batch settings, we introduce Deep Successor Feature Networks (DSFN) that
estimate feature expectations in an off-policy setting and a
transition-regularized imitation network that produces a near-expert initial
policy and an efficient feature representation. Our algorithm achieves superior
results in batch settings on both control benchmarks and a vital clinical task
of sepsis management in the Intensive Care Unit.
Comment: 10 pages, 3 figures, Under Conference Review
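For orientation, the quantity that apprenticeship learning matches is the discounted feature expectation of a policy. The sketch below computes its plain Monte Carlo estimate from the expert's own trajectories. It is not the DSFN described in the abstract, which estimates this quantity off-policy with a network; the function name and the toy feature vectors are assumptions.

```python
def discounted_feature_expectations(trajectories, gamma=0.99):
    """Average over trajectories of sum_t gamma^t * phi(s_t).

    Each trajectory is a list of feature vectors phi(s_t) of equal length.
    """
    if not trajectories:
        raise ValueError("need at least one trajectory")
    dim = len(trajectories[0][0])
    total = [0.0] * dim
    for trajectory in trajectories:
        discount = 1.0
        for phi in trajectory:
            for i, value in enumerate(phi):
                total[i] += discount * value
            discount *= gamma
    return [x / len(trajectories) for x in total]

# Two invented expert trajectories with 2-dimensional features.
expert_batch = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [1.0, 0.0]],
]
print(discounted_feature_expectations(expert_batch, gamma=0.5))  # [1.25, 0.25]
```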
Model-Based Reinforcement Learning for Sepsis Treatment
Sepsis is a dangerous condition that is a leading cause of patient mortality.
Treating sepsis is highly challenging, because individual patients respond very
differently to medical interventions and there is no universally agreed-upon
treatment for sepsis. In this work, we explore the use of continuous
state-space model-based reinforcement learning (RL) to discover high-quality
treatment policies for sepsis patients. Our quantitative evaluation reveals
that by blending the treatment strategy discovered with RL with what clinicians
follow, we can obtain improved policies, potentially allowing for better
medical treatment for sepsis.
Comment: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018
arXiv:1811.0721
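The abstract does not specify how the RL strategy is blended with clinician behavior. One simple possibility, shown here purely as an assumption, is a convex combination of the two action distributions with a mixing weight `alpha`; the function name and the example probabilities are hypothetical.

```python
def blend_policies(clinician_probs, rl_probs, alpha):
    """Convex combination of two action distributions.

    alpha = 0 reproduces the clinician policy; alpha = 1 reproduces the RL policy.
    Both inputs map action names to probabilities.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    actions = set(clinician_probs) | set(rl_probs)
    blended = {
        a: (1.0 - alpha) * clinician_probs.get(a, 0.0) + alpha * rl_probs.get(a, 0.0)
        for a in actions
    }
    total = sum(blended.values())
    return {a: p / total for a, p in blended.items()}  # renormalize defensively

clinician = {"fluids": 0.8, "vasopressor": 0.2}  # invented numbers
rl_policy = {"fluids": 0.2, "vasopressor": 0.8}  # invented numbers
mixed = blend_policies(clinician, rl_policy, alpha=0.5)
```

Keeping `alpha` small keeps the blended policy close to observed practice, which is one reason such a blend may be easier to evaluate on retrospective data.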
Representation and Reinforcement Learning for Personalized Glycemic Control in Septic Patients
Glycemic control is essential for critical care. However, it is a challenging
task because there has been no study on personalized optimal strategies for
glycemic control. This work aims to learn personalized optimal glycemic
trajectories for severely ill septic patients by learning data-driven policies
to identify optimal targeted blood glucose levels as a reference for
clinicians. We encoded patient states using a sparse autoencoder and adopted a
reinforcement learning paradigm using policy iteration to learn the optimal
policy from data. We also estimated the expected return following the policy
learned from the recorded glycemic trajectories, which yielded a function
indicating the relationship between real blood glucose values and 90-day
mortality rates. This suggests that the learned optimal policy could reduce the
patients' estimated 90-day mortality rate by 6.3%, from 31% to 24.7%. The
result demonstrates that reinforcement learning with appropriate patient state
encoding can potentially provide optimal glycemic trajectories and allow
clinicians to design a personalized strategy for glycemic control in septic
patients.
Comment: Accepted by the 31st Annual Conference on Neural Information
Processing Systems (NIPS 2017) Workshop on Machine Learning for Health (ML4H)
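Policy iteration, the method named in the abstract, alternates policy evaluation with greedy policy improvement. The sketch below is the textbook tabular version on an invented two-state problem; the transition table, rewards, and discount are assumptions and have nothing to do with the encoded patient states used in the paper.

```python
def policy_iteration(P, gamma=0.9, tol=1e-8):
    """Tabular policy iteration.

    P[s][a] is a list of (probability, next_state, reward) triples.
    Returns (policy, V), where policy[s] is the chosen action index.
    """
    n_states = len(P)
    policy = [0] * n_states
    V = [0.0] * n_states

    def q_value(s, a):
        return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])

    while True:
        # Policy evaluation: sweep until the value estimates stop changing.
        while True:
            delta = 0.0
            for s in range(n_states):
                v_new = q_value(s, policy[s])
                delta = max(delta, abs(v_new - V[s]))
                V[s] = v_new
            if delta < tol:
                break
        # Policy improvement: switch to a strictly better action where one exists.
        stable = True
        for s in range(n_states):
            best = max(range(len(P[s])), key=lambda a: q_value(s, a))
            if q_value(s, best) > q_value(s, policy[s]) + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, V

# Invented toy problem: state 0 = out of range, state 1 = in target range.
toy_mdp = [
    [[(1.0, 0, 0.0)], [(1.0, 1, 1.0)]],  # state 0: stay (r=0) or move to 1 (r=1)
    [[(1.0, 1, 2.0)], [(1.0, 0, 0.0)]],  # state 1: stay (r=2) or drop to 0 (r=0)
]
policy, values = policy_iteration(toy_mdp, gamma=0.5)
print(policy)  # [1, 0]
```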
Optimizing Sequential Medical Treatments with Auto-Encoding Heuristic Search in POMDPs
Health-related data are noisy and only stochastically reflect the true
physiological states of patients, which limits the information contained in
single-moment observations for sequential clinical decision making. We model
patient-clinician interactions as partially observable Markov decision
processes (POMDPs) and optimize sequential treatment based on belief states
inferred from the history sequence. To facilitate inference, we build a variational
generative model and boost state representation with a recurrent neural network
(RNN), incorporating an auxiliary loss from sequence auto-encoding. Meanwhile,
we optimize a continuous policy of drug levels with an actor-critic method
where policy gradients are obtained from a stabilized off-policy estimate of
the advantage function, with the value of the belief state backed up by parallel
best-first suffix trees. We apply our methodology to optimizing dosages of
vasopressor and intravenous fluid for sepsis patients using a retrospective
intensive care dataset and evaluate the learned policy with off-policy policy
evaluation (OPPE). The results demonstrate that modelling as POMDPs yields
better performance than MDPs, and that incorporating heuristic search improves
sample efficiency.
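To make the notion of a belief state concrete, here is the textbook Bayes-filter update for a POMDP with a small discrete state space. The paper instead infers beliefs with a variational generative model and an RNN, so this is only a simplified illustration; the table layout (`T[action][s][s_next]`, `O[s_next][observation]`) and the numbers are assumptions.

```python
def belief_update(belief, action, observation, T, O):
    """Discrete Bayes filter.

    New belief over s_next is proportional to
    O[s_next][observation] * sum_s T[action][s][s_next] * belief[s].
    """
    n = len(belief)
    # Prediction step: push the belief through the transition model.
    predicted = [
        sum(T[action][s][s_next] * belief[s] for s in range(n))
        for s_next in range(n)
    ]
    # Correction step: reweight by the likelihood of the observation.
    unnormalized = [O[s_next][observation] * predicted[s_next] for s_next in range(n)]
    z = sum(unnormalized)
    if z == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / z for u in unnormalized]

# Invented two-state example: action 0 leaves the hidden state unchanged.
T = [[[1.0, 0.0], [0.0, 1.0]]]
O = [[0.9, 0.1],   # state 0 mostly emits observation 0
     [0.2, 0.8]]   # state 1 mostly emits observation 1
new_belief = belief_update([0.5, 0.5], action=0, observation=1, T=T, O=O)
```

Starting from a uniform belief, seeing observation 1 shifts most of the probability mass to state 1 (here 8/9), which is the sense in which a single noisy observation only partially reveals the hidden state.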
Deep Reinforcement Learning for Optimal Critical Care Pain Management with Morphine using Dueling Double-Deep Q Networks
Opioids are the preferred medications for the treatment of pain in the
intensive care unit. While undertreatment leads to unrelieved pain and poor
clinical outcomes, excessive use of opioids puts patients at risk of
experiencing multiple adverse effects. In this work, we present a sequential
decision making framework for opioid dosing based on deep reinforcement
learning. It provides real-time clinically interpretable dosing
recommendations, personalized according to each patient's evolving pain and
physiological condition. We focus on morphine, one of the most commonly
prescribed opioids. To train and evaluate the model, we used retrospective data
from the publicly available MIMIC-3 database. Our results demonstrate that
reinforcement learning may be used to aid decision making in the intensive care
setting by providing personalized pain management interventions.
Comment: 2019 41st Annual International Conference of the IEEE Engineering in
Medicine & Biology Society (EMBC)
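The two ideas in the title can be sketched in a few lines. A dueling head combines a state value with mean-centered advantages, and a Double DQN target selects the next action with the online network but evaluates it with the target network. The function names and numbers below are illustrative assumptions; a real implementation would use neural networks rather than plain lists.

```python
def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]

def double_dqn_target(reward, done, gamma, online_q_next, target_q_next):
    """Double DQN bootstrap target for one transition.

    The online network's Q-values choose the next action; the target
    network's Q-values score that choice.
    """
    if done:
        return reward
    best_action = max(range(len(online_q_next)), key=online_q_next.__getitem__)
    return reward + gamma * target_q_next[best_action]

# Invented numbers for three dosing actions.
q = dueling_q_values(2.0, [1.0, -1.0, 0.0])
y = double_dqn_target(1.0, False, 0.9, [0.1, 0.5, 0.3], [2.0, 1.0, 4.0])
print(q)  # [3.0, 1.0, 2.0]
```

In the example the online network prefers action 1, so the target uses the target network's value for action 1 (1.0) rather than its own maximum (4.0), which is how Double DQN reduces overestimation.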
Inverse Reinforcement Learning in Contextual MDPs
We consider the task of Inverse Reinforcement Learning in Contextual Markov
Decision Processes (MDPs). In this setting, contexts, which define the reward
and transition kernel, are sampled from a distribution. In addition, although
the reward is a function of the context, it is not provided to the agent.
Instead, the agent observes demonstrations from an optimal policy. The goal is
to learn the reward mapping, such that the agent will act optimally even when
encountering previously unseen contexts, also known as zero-shot transfer. We
formulate this problem as a non-differentiable convex optimization problem and
propose a novel algorithm to compute its subgradients. Based on this scheme, we
analyze several methods both theoretically, where we compare the sample
complexity and scalability, and empirically. Most importantly, we show both
theoretically and empirically that our algorithms perform zero-shot transfer
(generalize to new and unseen contexts). Specifically, we present empirical
experiments in a dynamic treatment regime, where the goal is to learn a reward
function which explains the behavior of expert physicians based on recorded
data of their treatment of patients diagnosed with sepsis.