
    Q-Learning Based Closed-Loop Control of Anesthesia Administration by Accounting for Hemodynamic Parameter Variations

    Critically ill patients in intensive care units (ICUs) are often in an acutely disturbed state of mind characterized by restlessness, illusions, and nervousness. Such patients, for instance those who are mechanically ventilated, may incur difficulties during treatment procedures such as endotracheal tube intubation/extubation. Apart from critical illness, treatment-induced delirium may cause them to dislodge themselves from life-saving equipment and thus hinder cooperative and safe treatment in the ICU. Hence, it is often recommended to moderately sedate such patients for several days to reduce anxiety, facilitate sleep, aid treatment, and thus ensure patient safety. However, most anesthetics affect cardiac and respiratory functions, so it is important to monitor and control the infusion of anesthetics to meet sedation requirements while keeping patient vital parameters within safe limits. The critical task of anesthesia administration also necessitates that drug dosing be optimal, patient-specific, and robust. Towards this end, we propose a reinforcement learning based approach to develop a closed-loop anesthesia controller that accounts for hemodynamic parameter variations. The main advantages of the proposed approach are that it does not require a patient model, it involves optimization, and it is robust to interpatient variability. We formulate the problem of deriving control laws that track a desired trajectory as a sequential decision-making problem represented by a finite Markov decision process (MDP) and then use a reinforcement learning based approach to solve the MDP for goal-oriented decision making. Specifically, we use reinforcement learning approaches, such as Q-learning, to develop a closed-loop anesthesia controller using the bispectral index (BIS) as a control variable while concurrently accounting for the mean arterial pressure (MAP).
    Moreover, the proposed method monitors and controls the infusion of anesthetics by minimizing a weighted combination of the errors of the BIS and MAP signals. Accounting for both variables through a single combined error term reduces the computational complexity of the reinforcement learning algorithm and consequently the controller processing time. We present simulation results and statistical results for 30 simulated patients. For our simulations, the pharmacokinetic and pharmacodynamic values of the simulated patients are chosen randomly from a predefined range. To quantify the performance of the trained agent in closed-loop anesthesia control, we use the median performance error (MDPE), median absolute performance error (MDAPE), root mean square error (RMSE), and interquartile range (IQR). To further investigate the effect of simultaneous regulation of the BIS and MAP parameters on the sedation level (BIS) of a patient, we also conducted three different in silico case studies. In the first case study, a hemodynamic disturbance is considered in which the MAP is altered by d units; this case study models the effect of other factors, such as hemorrhage, on MAP as an exogenous disturbance. In the second case study, the MAP is set to a constant value irrespective of propofol infusion, which corresponds to patients who remain intubated in the ICU after aortic aneurysm repair or septic patients with respiratory failure. In the third case study, a disturbance due to administration of a synergistic drug such as remifentanil is considered during the administration of propofol; this case study considers the effect of drug interaction on the closed-loop control of hypnotic agent administration.
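    The weighted-error Q-learning scheme described above can be sketched as follows. The state bins, action set, weights, and the toy transition function are illustrative stand-ins, not the paper's pharmacokinetic/pharmacodynamic patient model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discretization: BIS-error bins as states, infusion-rate levels as actions.
N_STATES, N_ACTIONS = 10, 5
Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, eps = 0.1, 0.95, 0.1   # learning rate, discount, exploration (assumed)
w_bis, w_map = 1.0, 0.5              # weights of the combined error (assumed)

def toy_step(state, action):
    """Toy surrogate for patient dynamics: stronger infusion moves the BIS-error
    bin toward zero but perturbs MAP. NOT a pharmacokinetic model."""
    drift = int(rng.integers(-1, 2))                      # random physiological drift
    next_state = int(np.clip(state - (action - 2) + drift, 0, N_STATES - 1))
    bis_err = abs(next_state)                             # target bin is 0
    map_err = 0.3 * action                                # dosing side effect on MAP
    reward = -(w_bis * bis_err + w_map * map_err)         # weighted combined error
    return next_state, reward

state = N_STATES - 1
for _ in range(5000):
    # epsilon-greedy action selection
    action = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(np.argmax(Q[state]))
    next_state, reward = toy_step(state, action)
    # Standard Q-learning update on the weighted-error reward.
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state
```

    Because both BIS and MAP enter through a single scalar reward, the Q-table stays one-dimensional in the error, which is the computational saving the abstract refers to.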

    Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding

    A prominent challenge of offline reinforcement learning (RL) is the issue of hidden confounding: unobserved variables may influence both the actions taken by the agent and the observed outcomes. Hidden confounding can compromise the validity of any causal conclusion drawn from data and presents a major obstacle to effective offline RL. In the present paper, we tackle the problem of hidden confounding in the nonidentifiable setting. We propose a definition of uncertainty due to hidden confounding bias, termed delphic uncertainty, which uses variation over world models compatible with the observations, and differentiate it from the well-known epistemic and aleatoric uncertainties. We derive a practical method for estimating the three types of uncertainty and construct a pessimistic offline RL algorithm to account for them. Our method does not assume identifiability of the unobserved confounders and attempts to reduce the amount of confounding bias. We demonstrate through extensive experiments and ablations the efficacy of our approach on a sepsis management benchmark, as well as on electronic health records. Our results suggest that nonidentifiable hidden confounding bias can be mitigated to improve offline RL solutions in practice.
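    The pessimistic construction the abstract describes can be sketched generically: an ensemble of value estimates stands in for the set of world models compatible with the data, and disagreement across the ensemble plays the role of the uncertainty penalty. The ensemble values, sizes, and the penalty coefficient below are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# K hypothetical world models, each yielding Q-value estimates for the same
# (state, action) pairs. Ensemble disagreement is used here as a generic
# stand-in for uncertainty over models compatible with the observations.
K, S, A = 5, 8, 3
q_ensemble = rng.normal(loc=1.0, scale=0.3, size=(K, S, A))

beta = 1.0                              # pessimism coefficient (assumed)
q_mean = q_ensemble.mean(axis=0)
q_std = q_ensemble.std(axis=0)          # disagreement = uncertainty proxy
q_pessimistic = q_mean - beta * q_std   # lower-confidence-bound value

policy = q_pessimistic.argmax(axis=1)   # act greedily on penalized values
```

    Acting on the penalized values steers the policy away from (state, action) regions where the compatible models disagree, which is the intended effect of accounting for confounding-induced uncertainty.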

    Committed to Safety: Ten Case Studies on Reducing Harm to Patients

    Presents case studies of healthcare organizations, clinical teams, and learning collaborations to illustrate successful innovations for improving patient safety nationwide. Includes actions taken, results achieved, lessons learned, and recommendations.

    Machine learning approaches to optimise the management of patients with sepsis

    The goal of this PhD was to generate novel tools to improve the management of patients with sepsis, by applying machine learning techniques to routinely collected electronic health records. Machine learning is an application of artificial intelligence (AI) in which a machine analyses data and becomes able to execute complex tasks without being explicitly programmed. Sepsis is the third leading cause of death worldwide and the main cause of mortality in hospitals, but the best treatment strategy remains uncertain. In particular, evidence suggests that current practices in the administration of intravenous fluids and vasopressors are suboptimal and likely induce harm in a proportion of patients. This represents a key clinical challenge and a top research priority. The main contribution of the research has been the development of a reinforcement learning framework and algorithms to tackle this sequential decision-making problem. The model was built and then validated on three large non-overlapping intensive care databases containing data collected from adult patients in the USA and the UK. Our agent extracted implicit knowledge from an amount of patient data that exceeds many-fold the lifetime experience of human clinicians and learned optimal treatment by analysing myriads of (mostly suboptimal) treatment decisions. We used state-of-the-art evaluation techniques (called high-confidence off-policy evaluation) and demonstrated that the value of the treatment strategy of the AI agent was on average reliably higher than that of the human clinicians. In two large validation cohorts independent of the training data, mortality was lowest in patients whose clinicians' actual doses matched the AI policy. We also gained insight into the model representations and confirmed that the AI agent relied on clinically and biologically meaningful parameters when making its suggestions.
    We conducted extensive testing and exploration of the behaviour of the AI agent down to the level of individual patient trajectories, identified potential sources of inappropriate behaviour, and offered suggestions for future model refinements. If validated, our model could provide individualized and clinically interpretable treatment decisions for sepsis that may improve patient outcomes.
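    A core building block of high-confidence off-policy evaluation is an importance-sampling estimate of the evaluation policy's value from trajectories collected under the behaviour (clinician) policy. A minimal sketch of the weighted importance sampling (WIS) estimator, with made-up action probabilities and returns rather than real ICU data:

```python
import numpy as np

def wis_estimate(trajectories):
    """Weighted importance sampling value estimate.

    trajectories: list of (pi_e_probs, pi_b_probs, G) per episode, where
    pi_e_probs / pi_b_probs are the per-step probabilities the evaluation /
    behaviour policies assign to the actions actually taken, and G is the
    observed return of that episode.
    """
    weights, returns = [], []
    for pi_e, pi_b, G in trajectories:
        rho = np.prod(np.asarray(pi_e) / np.asarray(pi_b))  # likelihood ratio
        weights.append(rho)
        returns.append(G)
    weights = np.asarray(weights)
    # Normalizing by the summed weights (rather than the episode count, as in
    # ordinary importance sampling) trades a small bias for much lower variance.
    return float(np.sum(weights * np.asarray(returns)) / np.sum(weights))

# Toy example: two episodes where the evaluation policy is more likely than the
# behaviour policy to take the actions seen in the surviving trajectory.
trajs = [([0.8, 0.9], [0.4, 0.45], 1.0),   # return 1.0 (e.g. survival)
         ([0.2, 0.1], [0.4, 0.45], 0.0)]   # return 0.0
```

    High-confidence variants then wrap a concentration bound around such estimates to give a probabilistic lower bound on the policy's value.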