Search CORE

2,566 research outputs found

Deep Reinforcement Learning for Sepsis Treatment

Author: Ahmed Imran
Celi Leo
Ghassemi Marzyeh
Komorowski Matthieu
Raghu Aniruddh
Szolovits Peter
Publication venue
Publication date: 27/11/2017
Field of study

Sepsis is a leading cause of mortality in intensive care units and costs hospitals billions annually. Treating a septic patient is highly challenging, because individual patients respond very differently to medical interventions and there is no universally agreed-upon treatment for sepsis. In this work, we propose an approach to deduce treatment policies for septic patients by using continuous state-space models and deep reinforcement learning. Our model learns clinically interpretable treatment policies, similar in important aspects to the treatment policies of physicians. The learned policies could be used to aid intensive care clinicians in medical decision making and improve the likelihood of patient survival.Comment: Extensions on earlier work (arXiv:1705.08422). Accepted at workshop on Machine Learning For Health at the conference on Neural Information Processing Systems, 201

arXiv.org e-Print Archive

Improving Sepsis Treatment Strategies by Combining Deep and Kernel-Based Reinforcement Learning

Author: Ding Yi
Doshi-Velez Finale
Faisal Aldo
Gottesman Omer
Komorowski Matthieu
Lehman Li-wei H.
Peng Xuefeng
Ross Andrew
Wihl David
Publication venue
Publication date: 15/01/2019
Field of study

Sepsis is the leading cause of mortality in the ICU. It is challenging to manage because individual patients respond differently to treatment. Thus, tailoring treatment to the individual patient is essential for the best outcomes. In this paper, we take steps toward this goal by applying a mixture-of-experts framework to personalize sepsis treatment. The mixture model selectively alternates between neighbor-based (kernel) and deep reinforcement learning (DRL) experts depending on patient's current history. On a large retrospective cohort, this mixture-based approach outperforms physician, kernel only, and DRL-only experts.Comment: AMIA 2018 Annual Symposiu

arXiv.org e-Print Archive

Continuous State-Space Models for Optimal Sepsis Treatment - a Deep Reinforcement Learning Approach

Author: Celi Leo Anthony
Ghassemi Marzyeh
Komorowski Matthieu
Raghu Aniruddh
Szolovits Peter
Publication venue
Publication date: 23/05/2017
Field of study

Sepsis is a leading cause of mortality in intensive care units (ICUs) and costs hospitals billions annually. Treating a septic patient is highly challenging, because individual patients respond very differently to medical interventions and there is no universally agreed-upon treatment for sepsis. Understanding more about a patient's physiological state at a given time could hold the key to effective treatment policies. In this work, we propose a new approach to deduce optimal treatment policies for septic patients by using continuous state-space models and deep reinforcement learning. Learning treatment policies over continuous spaces is important, because we retain more of the patient's physiological information. Our model is able to learn clinically interpretable treatment policies, similar in important aspects to the treatment policies of physicians. Evaluating our algorithm on past ICU patient data, we find that our model could reduce patient mortality in the hospital by up to 3.6% over observed clinical policies, from a baseline mortality of 13.7%. The learned treatment policies could be used to aid intensive care clinicians in medical decision making and improve the likelihood of patient survival

arXiv.org e-Print Archive

Precision medicine as a control problem: Using simulation and deep reinforcement learning to discover adaptive, personalized multi-cytokine therapy for sepsis

Author: An Gary
Cockrell Chase
Faissol Daniel M.
Grathwohl Will S.
Petersen Brenden K.
Santiago Claudio
Yang Jiachen
Publication venue
Publication date: 08/02/2018
Field of study

Sepsis is a life-threatening condition affecting one million people per year in the US in which dysregulation of the body's own immune system causes damage to its tissues, resulting in a 28 - 50% mortality rate. Clinical trials for sepsis treatment over the last 20 years have failed to produce a single currently FDA approved drug treatment. In this study, we attempt to discover an effective cytokine mediation treatment strategy for sepsis using a previously developed agent-based model that simulates the innate immune response to infection: the Innate Immune Response agent-based model (IIRABM). Previous attempts at reducing mortality with multi-cytokine mediation using the IIRABM have failed to reduce mortality across all patient parameterizations and motivated us to investigate whether adaptive, personalized multi-cytokine mediation can control the trajectory of sepsis and lower patient mortality. We used the IIRABM to compute a treatment policy in which systemic patient measurements are used in a feedback loop to inform future treatment. Using deep reinforcement learning, we identified a policy that achieves 0% mortality on the patient parameterization on which it was trained. More importantly, this policy also achieves 0.8% mortality over 500 randomly selected patient parameterizations with baseline mortalities ranging from 1 - 99% (with an average of 49%) spanning the entire clinically plausible parameter space of the IIRABM. These results suggest that adaptive, personalized multi-cytokine mediation therapy could be a promising approach for treating sepsis. We hope that this work motivates researchers to consider such an approach as part of future clinical trials. To the best of our knowledge, this work is the first to consider adaptive, personalized multi-cytokine mediation therapy for sepsis, and is the first to exploit deep reinforcement learning on a biological simulation

arXiv.org e-Print Archive

Truly Batch Apprenticeship Learning with Deep Successor Features

Author: Doshi-Velez Finale
Lee Donghun
Srinivasan Srivatsan
Publication venue
Publication date: 24/03/2019
Field of study

We introduce a novel apprenticeship learning algorithm to learn an expert's underlying reward structure in off-policy model-free \emph{batch} settings. Unlike existing methods that require a dynamics model or additional data acquisition for on-policy evaluation, our algorithm requires only the batch data of observed expert behavior. Such settings are common in real-world tasks---health care, finance or industrial processes ---where accurate simulators do not exist or data acquisition is costly. To address challenges in batch settings, we introduce Deep Successor Feature Networks(DSFN) that estimate feature expectations in an off-policy setting and a transition-regularized imitation network that produces a near-expert initial policy and an efficient feature representation. Our algorithm achieves superior results in batch settings on both control benchmarks and a vital clinical task of sepsis management in the Intensive Care Unit.Comment: 10 pages, 3 figures, Under Conference Revie

arXiv.org e-Print Archive

Model-Based Reinforcement Learning for Sepsis Treatment

Author: Komorowski Matthieu
Raghu Aniruddh
Singh Sumeetpal
Publication venue
Publication date: 23/11/2018
Field of study

Sepsis is a dangerous condition that is a leading cause of patient mortality. Treating sepsis is highly challenging, because individual patients respond very differently to medical interventions and there is no universally agreed-upon treatment for sepsis. In this work, we explore the use of continuous state-space model-based reinforcement learning (RL) to discover high-quality treatment policies for sepsis patients. Our quantitative evaluation reveals that by blending the treatment strategy discovered with RL with what clinicians follow, we can obtain improved policies, potentially allowing for better medical treatment for sepsis.Comment: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.0721

arXiv.org e-Print Archive

Representation and Reinforcement Learning for Personalized Glycemic Control in Septic Patients

Author: Gao Mingwu
He Ze
Szolovits Peter
Weng Wei-Hung
Yan Susu
Publication venue
Publication date: 02/12/2017
Field of study

Glycemic control is essential for critical care. However, it is a challenging task because there has been no study on personalized optimal strategies for glycemic control. This work aims to learn personalized optimal glycemic trajectories for severely ill septic patients by learning data-driven policies to identify optimal targeted blood glucose levels as a reference for clinicians. We encoded patient states using a sparse autoencoder and adopted a reinforcement learning paradigm using policy iteration to learn the optimal policy from data. We also estimated the expected return following the policy learned from the recorded glycemic trajectories, which yielded a function indicating the relationship between real blood glucose values and 90-day mortality rates. This suggests that the learned optimal policy could reduce the patients' estimated 90-day mortality rate by 6.3%, from 31% to 24.7%. The result demonstrates that reinforcement learning with appropriate patient state encoding can potentially provide optimal glycemic trajectories and allow clinicians to design a personalized strategy for glycemic control in septic patients.Comment: Accepted by the 31st Annual Conference on Neural Information Processing Systems (NIPS 2017) Workshop on Machine Learning for Health (ML4H

arXiv.org e-Print Archive

Optimizing Sequential Medical Treatments with Auto-Encoding Heuristic Search in POMDPs

Author: Faisal Aldo A.
Komorowski Matthieu
Li Luchen
Publication venue
Publication date: 17/05/2019
Field of study

Health-related data is noisy and stochastic in implying the true physiological states of patients, limiting information contained in single-moment observations for sequential clinical decision making. We model patient-clinician interactions as partially observable Markov decision processes (POMDPs) and optimize sequential treatment based on belief states inferred from history sequence. To facilitate inference, we build a variational generative model and boost state representation with a recurrent neural network (RNN), incorporating an auxiliary loss from sequence auto-encoding. Meanwhile, we optimize a continuous policy of drug levels with an actor-critic method where policy gradients are obtained from a stablized off-policy estimate of advantage function, with the value of belief state backed up by parallel best-first suffix trees. We exploit our methodology in optimizing dosages of vasopressor and intravenous fluid for sepsis patients using a retrospective intensive care dataset and evaluate the learned policy with off-policy policy evaluation (OPPE). The results demonstrate that modelling as POMDPs yields better performance than MDPs, and that incorporating heuristic search improves sample efficiency

arXiv.org e-Print Archive

Deep Reinforcement Learning for Optimal Critical Care Pain Management with Morphine using Dueling Double-Deep Q Networks

Author: Eschenfeldt Patrick
Hur Chin
Ingram Myles
Lopez-Martinez Daniel
Ostvar Sassan
Picard Rosalind
Publication venue
Publication date: 24/04/2019
Field of study

Opioids are the preferred medications for the treatment of pain in the intensive care unit. While undertreatment leads to unrelieved pain and poor clinical outcomes, excessive use of opioids puts patients at risk of experiencing multiple adverse effects. In this work, we present a sequential decision making framework for opioid dosing based on deep reinforcement learning. It provides real-time clinically interpretable dosing recommendations, personalized according to each patient's evolving pain and physiological condition. We focus on morphine, one of the most commonly prescribed opioids. To train and evaluate the model, we used retrospective data from the publicly available MIMIC-3 database. Our results demonstrate that reinforcement learning may be used to aid decision making in the intensive care setting by providing personalized pain management interventions.Comment: 2019 41st Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC

arXiv.org e-Print Archive

Inverse Reinforcement Learning in Contextual MDPs

Author: Belogolovsky Stav
Korsunsky Philip
Mannor Shie
Tessler Chen
Zahavy Tom
Publication venue
Publication date: 30/12/2020
Field of study

We consider the task of Inverse Reinforcement Learning in Contextual Markov Decision Processes (MDPs). In this setting, contexts, which define the reward and transition kernel, are sampled from a distribution. In addition, although the reward is a function of the context, it is not provided to the agent. Instead, the agent observes demonstrations from an optimal policy. The goal is to learn the reward mapping, such that the agent will act optimally even when encountering previously unseen contexts, also known as zero-shot transfer. We formulate this problem as a non-differential convex optimization problem and propose a novel algorithm to compute its subgradients. Based on this scheme, we analyze several methods both theoretically, where we compare the sample complexity and scalability, and empirically. Most importantly, we show both theoretically and empirically that our algorithms perform zero-shot transfer (generalize to new and unseen contexts). Specifically, we present empirical experiments in a dynamic treatment regime, where the goal is to learn a reward function which explains the behavior of expert physicians based on recorded data of them treating patients diagnosed with sepsis

arXiv.org e-Print Archive