Objective: The artificial pancreas (AP) has shown promising potential in
achieving closed-loop glucose control for individuals with type 1 diabetes
mellitus (T1DM). However, designing an effective control policy for the AP
remains challenging due to the complex physiological processes, delayed insulin
response, and inaccurate glucose measurements. While model predictive control
(MPC) offers safety and stability through the dynamic model and safety
constraints, it lacks individualization and is adversely affected by
unannounced meals. Conversely, deep reinforcement learning (DRL) provides
personalized and adaptive strategies but faces challenges with distribution
shifts and substantial data requirements. Methods: We propose a hybrid control
policy for the artificial pancreas (HyCPAP) to address the above challenges.
HyCPAP combines an MPC policy with an ensemble DRL policy, leveraging the
strengths of both policies while compensating for their respective limitations.
To facilitate faster deployment of AP systems in real-world settings, we
further incorporate meta-learning techniques into HyCPAP, leveraging previous
experience and patient-shared knowledge to enable fast adaptation to new
patients with limited available data. Results: We conduct extensive experiments
using the FDA-accepted UVA/Padova T1DM simulator across three scenarios. Our
approaches achieve the highest percentage of time spent in the desired
euglycemic range and the lowest occurrences of hypoglycemia. Conclusion: The
results clearly demonstrate the superiority of our methods for closed-loop
glucose management in individuals with T1DM. Significance: The study presents
novel control policies for AP systems, affirming the great potential of
proposed methods for efficient closed-loop glucose control.Comment: 12 page