
    Development and validation of a reinforcement learning algorithm to dynamically optimize mechanical ventilation in critical care.

    The aim of this work was to develop and evaluate the reinforcement learning algorithm VentAI, which suggests a dynamically optimized mechanical ventilation regime for critically ill patients. We built, validated and tested its performance on 11,943 events of volume-controlled mechanical ventilation derived from 61,532 distinct ICU admissions, and tested it on an independent, secondary dataset (200,859 ICU stays; 25,086 mechanical ventilation events). A patient "data fingerprint" of 44 features was extracted as a multidimensional time series in 4-hour time steps. We used a Markov decision process, including a reward system and a Q-learning approach, to find optimized settings for positive end-expiratory pressure (PEEP), fraction of inspired oxygen (FiO2) and ideal body weight-adjusted tidal volume (Vt). The observed outcome was in-hospital or 90-day mortality. VentAI reached a significantly increased estimated performance return of 83.3 (primary dataset) and 84.1 (secondary dataset) compared with physicians' standard clinical care (51.1). The number of recommended action changes per mechanically ventilated patient consistently exceeded that of the clinicians. VentAI chose ventilation regimes with lower Vt (5-7.5 mL/kg) 202.9% more frequently, and regimes with higher Vt (7.5-10 mL/kg) 50.8% less frequently. It recommended PEEP levels of 5-7 cm H2O 29.3% more frequently and PEEP levels of 7-9 cm H2O 53.6% more frequently. VentAI avoided high (>55%) FiO2 values (59.8% decrease) while preferring the 50-55% range (140.3% increase). In conclusion, VentAI provides reproducible high performance by dynamically choosing an optimized, individualized ventilation strategy and thus might benefit critically ill patients.
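The abstract describes a Markov decision process with discretized states and actions, a terminal reward tied to mortality, and Q-learning over 4-hour time steps. A minimal sketch of that kind of setup is below; the state/action counts, cluster interpretation and reward magnitudes are illustrative assumptions, not the VentAI specification.

```python
import numpy as np

# Hypothetical tabular Q-learning setup in the spirit of the abstract:
# patient states are clustered into discrete groups, and the action is a
# discretized (PEEP, FiO2, Vt) combination. All sizes and reward values
# below are assumptions for illustration.
N_STATES = 750          # e.g. clusters of the 44-feature "data fingerprint"
N_ACTIONS = 3 * 3 * 3   # 3 bins each for PEEP, FiO2 and tidal volume
GAMMA = 0.99            # discount factor
ALPHA = 0.1             # learning rate

Q = np.zeros((N_STATES, N_ACTIONS))

def q_update(s, a, r, s_next, terminal):
    """One Q-learning backup: Q(s,a) += alpha * (target - Q(s,a))."""
    target = r if terminal else r + GAMMA * Q[s_next].max()
    Q[s, a] += ALPHA * (target - Q[s, a])

# Example 4-hour transitions: intermediate steps carry zero reward; the
# terminal step is rewarded for survival and penalized for death
# (in-hospital or 90-day mortality), per the reward scheme sketched here.
q_update(s=12, a=5, r=0.0, s_next=40, terminal=False)
q_update(s=40, a=5, r=100.0, s_next=-1, terminal=True)  # patient survived
```

After many such backups over the observed ventilation events, the greedy action `Q[s].argmax()` in each state yields the recommended ventilation regime.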

    Reinforcement learning in large, structured action spaces: A simulation study of decision support for spinal cord injury rehabilitation

    Reinforcement learning (RL) has helped improve decision-making in several applications. However, applying traditional RL is challenging in some domains, such as rehabilitation of people with a spinal cord injury (SCI). Among other factors, RL is difficult to use in this domain because there are many possible treatments (i.e., a large action space) and few patients (i.e., limited training data). Treatments for SCIs have natural groupings, so we propose two approaches to grouping treatments that let an RL agent learn effectively from limited data. One relies on domain knowledge of SCI rehabilitation; the other learns similarities among treatments using an embedding technique. We then use Fitted Q Iteration to train an agent that learns optimal treatments. Through a simulation study designed to reflect the properties of SCI rehabilitation, we find that both methods can help improve the treatment decisions of physiotherapists, but the approach based on domain knowledge offers better performance.
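Fitted Q Iteration, named in the abstract, learns from a fixed batch of transitions by repeatedly regressing Bootstrapped Q-targets. A sketch under stated assumptions follows: the treatment groups, linear regressor, feature map and toy data are all illustrative (the paper's simulator and function approximator are not specified here).

```python
import numpy as np

# Illustrative Fitted Q Iteration (FQI) on a batch of transitions
# (s, a, r, s'), with treatments collapsed into a few groups as the
# abstract describes. Any batch regressor could replace the linear fit.
rng = np.random.default_rng(0)
N_GROUPS = 4   # hypothetical number of treatment groups (domain knowledge)
GAMMA = 0.95

# Toy batch: 200 transitions with a 3-dimensional patient state.
S = rng.normal(size=(200, 3))
A = rng.integers(0, N_GROUPS, size=200)
R = rng.normal(size=200)
S_next = rng.normal(size=(200, 3))

def features(s, a):
    """Per-action block of (state features + bias): one block per group."""
    d = s.shape[1] + 1
    phi = np.zeros((len(s), N_GROUPS * d))
    for g in range(N_GROUPS):
        m = a == g
        phi[m, g * d:(g + 1) * d] = np.hstack([s[m], np.ones((m.sum(), 1))])
    return phi

w = np.zeros(N_GROUPS * (S.shape[1] + 1))
for _ in range(20):  # FQI iterations
    # Bootstrapped target: r + gamma * max_a' Q(s', a')
    q_next = np.stack([features(S_next, np.full(200, g)) @ w
                       for g in range(N_GROUPS)], axis=1)
    y = R + GAMMA * q_next.max(axis=1)
    w, *_ = np.linalg.lstsq(features(S, A), y, rcond=None)

# Greedy treatment group recommended for each state in the batch.
greedy = np.argmax(
    np.stack([features(S, np.full(200, g)) @ w for g in range(N_GROUPS)],
             axis=1), axis=1)
```

Grouping actions shrinks the effective action space, which is exactly what makes the max over actions in the target tractable with few patients.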

    Machine learning approaches to optimise the management of patients with sepsis

    The goal of this PhD was to generate novel tools to improve the management of patients with sepsis, by applying machine learning techniques to routinely collected electronic health records. Machine learning is an application of artificial intelligence (AI) in which a machine analyses data and becomes able to execute complex tasks without being explicitly programmed. Sepsis is the third leading cause of death worldwide and the main cause of mortality in hospitals, but the best treatment strategy remains uncertain. In particular, evidence suggests that current practices in the administration of intravenous fluids and vasopressors are suboptimal and likely induce harm in a proportion of patients. This represents a key clinical challenge and a top research priority. The main contribution of the research has been the development of a reinforcement learning framework and algorithms to tackle this sequential decision-making problem. The model was built and then validated on three large non-overlapping intensive care databases, containing data collected from adult patients in the USA and the UK. Our agent extracted implicit knowledge from an amount of patient data that exceeds many-fold the lifetime experience of human clinicians, and learned optimal treatment by analysing myriad (mostly suboptimal) treatment decisions. We used state-of-the-art evaluation techniques (called high confidence off-policy evaluation) and demonstrated that the value of the AI agent's treatment strategy was on average reliably higher than that of the human clinicians. In two large validation cohorts independent of the training data, mortality was lowest in patients whose clinicians' actual doses matched the AI policy. We also gained insight into the model representations and confirmed that the AI agent relied on clinically and biologically meaningful parameters when making its suggestions.
We conducted extensive testing and exploration of the behaviour of the AI agent down to the level of individual patient trajectories, identified potential sources of inappropriate behaviour and offered suggestions for future model refinements. If validated, our model could provide individualized and clinically interpretable treatment decisions for sepsis that may improve patient outcomes.
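The abstract evaluates the learned policy with high confidence off-policy evaluation, i.e. estimating the value of the AI policy from data generated by clinicians, with a statistical lower bound. A hedged sketch using weighted importance sampling with a bootstrap lower confidence bound is shown below; the trajectory data, policy probabilities and bound construction are illustrative assumptions, not the thesis's exact method.

```python
import numpy as np

# Off-policy evaluation by weighted importance sampling (WIS): re-weight
# each clinician-generated trajectory by how likely the AI (evaluation)
# policy would have been to take the same actions.
rng = np.random.default_rng(1)

def wis_value(returns, behav_probs, eval_probs):
    """WIS estimate of the evaluation policy's value.

    returns[i]     -- total return of trajectory i
    behav_probs[i] -- product of clinician action probabilities on i
    eval_probs[i]  -- product of AI-policy action probabilities on i
    """
    w = eval_probs / behav_probs        # per-trajectory importance ratio
    return np.sum(w * returns) / np.sum(w)

# Toy cohort of 500 trajectories (all values simulated for illustration).
returns = rng.normal(loc=50.0, scale=10.0, size=500)
behav = rng.uniform(0.2, 0.8, size=500)
evalp = rng.uniform(0.2, 0.8, size=500)

point = wis_value(returns, behav, evalp)

# "High confidence" flavour: bootstrap a 95% lower bound on the estimate,
# so the policy is only preferred if even the pessimistic bound is high.
boots = []
for _ in range(1000):
    idx = rng.integers(0, 500, size=500)
    boots.append(wis_value(returns[idx], behav[idx], evalp[idx]))
lower_bound = np.percentile(boots, 5)
```

Comparing this lower bound against the clinicians' empirical value is what justifies the claim that the AI policy's value is "reliably" higher, rather than higher only at the point estimate.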