The management of invasive mechanical ventilation, and the regulation of
sedation and analgesia during ventilation, constitute a major part of the care
of patients admitted to intensive care units. Both prolonged dependence on
mechanical ventilation and premature extubation are associated with increased
risk of complications and higher hospital costs, but clinical opinion on the
best protocol for weaning patients off a ventilator varies. This work aims
to develop a decision support tool that uses available patient information to
predict time-to-extubation readiness and to recommend a personalized regimen of
sedation dosage and ventilator support. To this end, we use off-policy
reinforcement learning algorithms to determine the best action at a given
patient state from sub-optimal historical ICU data. We compare treatment
policies from fitted Q-iteration with extremely randomized trees and with
feedforward neural networks, and demonstrate that the policies learnt show
promise in recommending weaning protocols with improved outcomes, in terms of
minimizing rates of reintubation and regulating physiological stability.
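
To make the off-policy setting concrete, the following is a minimal sketch of fitted Q-iteration on a fixed batch of logged transitions. It is an illustration only and not the implementation evaluated in this work: the two-state toy problem, the action names "wean" and "hold", the rewards, and the function names are all hypothetical, and a simple per-(state, action) averaging regressor stands in for the extremely randomized trees and feedforward neural networks named above.

```python
from collections import defaultdict


def fitted_q_iteration(transitions, actions, gamma=0.9, n_iters=50):
    """Batch fitted Q-iteration over logged (s, a, r, s_next, done) tuples.

    Assumption: the regressor is a per-(state, action) mean, used here as a
    stand-in for a tree ensemble or neural network.
    """
    q = defaultdict(float)  # Q_0 is zero everywhere
    for _ in range(n_iters):
        sums = defaultdict(float)
        counts = defaultdict(int)
        # Build one-step bootstrapped targets from the current Q estimate.
        for s, a, r, s_next, done in transitions:
            if done:
                target = r
            else:
                target = r + gamma * max(q[(s_next, a2)] for a2 in actions)
            sums[(s, a)] += target
            counts[(s, a)] += 1
        # "Fit" step: regress targets on (state, action) by averaging.
        new_q = defaultdict(float)
        for key, total in sums.items():
            new_q[key] = total / counts[key]
        q = new_q
    return q


def greedy_policy(q, actions, state):
    """Return the action with the highest estimated Q-value in `state`."""
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical toy data: state 0 = not ready, state 1 = ready, 2 = terminal.
ACTIONS = ["wean", "hold"]
BATCH = [
    (0, "hold", 0.0, 1, False),   # waiting lets the patient become ready
    (0, "wean", -1.0, 0, False),  # weaning too early is penalized
    (1, "wean", 1.0, 2, True),    # weaning when ready ends the episode
    (1, "hold", 0.0, 1, False),   # holding when ready delays the reward
]

q = fitted_q_iteration(BATCH, ACTIONS)
recommended = {s: greedy_policy(q, ACTIONS, s) for s in (0, 1)}
```

On this toy batch the greedy policy holds in state 0 and weans in state 1. The policy is learnt purely from logged transitions, with no interaction with the environment, which is the property that makes the approach applicable to historical ICU records.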