Q-learning with censored data

Goldberg, Yair; Kosorok, Michael R.

Authors: Yair Goldberg, Michael R. Kosorok
Publication date: 1 January 2012
Publisher: Institute of Mathematical Statistics
DOI: 10.1214/12-AOS968

Abstract

We develop methodology for a multistage decision problem with a flexible number of stages in which the rewards are survival times that are subject to censoring. We present a novel Q-learning algorithm that is adjusted for censored data and allows a flexible number of stages. We provide finite-sample bounds on the generalization error of the policy learned by the algorithm, and show that when the optimal Q-function belongs to the approximation space, the expected survival time for policies obtained by the algorithm converges to that of the optimal policy. We simulate a multistage clinical trial with a flexible number of stages and apply the proposed censored-Q-learning algorithm to find individualized treatment regimens. The methodology presented in this paper has implications in the design of personalized medicine trials in cancer and in other life-threatening diseases.

Comment: Published at http://dx.doi.org/10.1214/12-AOS968 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
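The abstract does not spell out how the Q-learning step is adjusted for censoring. As a rough illustration only, one common way to handle censored survival rewards is inverse-probability-of-censoring weighting: each uncensored observation is up-weighted by the inverse of a Kaplan-Meier estimate of the probability of remaining uncensored, and the Q-function is then fit by weighted least squares. The sketch below covers a single stage with a linear Q-function and assumes no tied observation times. All function names are hypothetical, and it should not be read as the authors' algorithm, which also handles a flexible number of stages.

```python
import numpy as np


def censoring_survival_left(y, delta):
    """Kaplan-Meier estimate of G(t-) = P(C >= t) at each observed time.

    y     : observed times, min(survival time, censoring time)
    delta : 1 if the survival time was observed, 0 if censored
    Assumes no tied times (an assumption of this sketch).
    """
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta, dtype=int)
    g = np.ones_like(y)
    for i, t in enumerate(y):
        # Multiply over censoring times strictly before t.
        for u in np.unique(y[(delta == 0) & (y < t)]):
            at_risk = np.sum(y >= u)
            censored = np.sum((y == u) & (delta == 0))
            g[i] *= 1.0 - censored / at_risk
    return g


def ipcw_weights(y, delta):
    """Weight delta_i / G(y_i-): zero for censored rows, >= 1 otherwise."""
    delta = np.asarray(delta, dtype=int)
    g = censoring_survival_left(y, delta)
    return np.where(delta == 1, 1.0 / np.maximum(g, 1e-12), 0.0)


def fit_weighted_q(features, targets, weights):
    """Weighted least-squares fit of a linear Q-function."""
    features = np.asarray(features, dtype=float)
    targets = np.asarray(targets, dtype=float)
    sw = np.sqrt(np.asarray(weights, dtype=float))
    coef, *_ = np.linalg.lstsq(features * sw[:, None], targets * sw, rcond=None)
    return coef


# Toy usage: four patients, the second one censored at time 2.
y = np.array([1.0, 2.0, 3.0, 4.0])
delta = np.array([1, 0, 1, 1])
w = ipcw_weights(y, delta)

# Hypothetical features: intercept and a binary treatment indicator.
phi = np.column_stack([np.ones(4), np.array([0.0, 1.0, 0.0, 1.0])])
coef = fit_weighted_q(phi, y, w)
```

In a multistage setting the same weighted fit would be repeated backward over stages, with each earlier-stage target built from the later-stage fitted Q-function; how the paper does this with a variable number of stages is not described in the abstract.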

Available Versions

Crossref

info:doi/10.1214%2F12-aos968

Last time updated on 01/04/2019

Carolina Digital Repository

cdr.lib.unc.edu:pz50h3124

Last time updated on 24/11/2020