We present a dynamic programming-based solution to the problem of maximizing
the probability of attaining a target set before hitting a cemetery set for a
discrete-time Markov control process. Under mild hypotheses we establish that
there exists a deterministic stationary policy that achieves the maximum value
of this probability. We show that this maximal probability can be computed by
maximizing an expected total reward accumulated until the first hitting time of
either the target or the cemetery set. Martingale
characterizations of thrifty, equalizing, and optimal policies in the context
of our problem are also established.

Comment: 22 pages, 1 figure. Revise
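
As a rough illustration of the dynamic programming idea summarized in the abstract, the sketch below runs value iteration on a small finite-state toy model. It is not the paper's construction: the finite state space, the five-state chain, the two actions, and the names `reach_avoid_value_iteration` and `biased_walk` are all assumptions introduced here. The value is fixed at 1 on the target set and 0 on the cemetery set, and on the remaining states it is updated by maximizing the expected next-step value over actions; starting from zero, the iterates increase toward the maximal probability of reaching the target before the cemetery.

```python
def biased_walk(n_states, p_right):
    """Hypothetical toy kernel: a random walk on 0..n_states-1 that steps
    right with probability p_right and left otherwise; both end states
    are absorbing."""
    P = [[0.0] * n_states for _ in range(n_states)]
    P[0][0] = 1.0
    P[n_states - 1][n_states - 1] = 1.0
    for s in range(1, n_states - 1):
        P[s][s + 1] = p_right
        P[s][s - 1] = 1.0 - p_right
    return P


def reach_avoid_value_iteration(P, target, cemetery, tol=1e-12, max_iter=100_000):
    """P[a][s][t] is the probability of moving from s to t under action a.
    Returns (V, policy), where V[s] approximates the maximal probability of
    hitting `target` before `cemetery` from s, and policy[s] is a maximizing
    action (a deterministic stationary policy on this toy model)."""
    n_actions, n_states = len(P), len(P[0])
    V = [1.0 if s in target else 0.0 for s in range(n_states)]
    policy = [0] * n_states
    for _ in range(max_iter):
        delta = 0.0
        # In-place (Gauss-Seidel) sweep over the states outside both sets.
        for s in range(n_states):
            if s in target or s in cemetery:
                continue
            q = [sum(P[a][s][t] * V[t] for t in range(n_states))
                 for a in range(n_actions)]
            best = max(range(n_actions), key=lambda a: q[a])
            delta = max(delta, abs(q[best] - V[s]))
            V[s], policy[s] = q[best], best
        if delta < tol:
            break
    return V, policy


# Action 0 drifts toward the cemetery {0}; action 1 drifts toward the target {4}.
P = [biased_walk(5, 0.4), biased_walk(5, 0.6)]
V, policy = reach_avoid_value_iteration(P, target={4}, cemetery={0})
print(V, policy)
```

On this toy chain the maximizing action at every interior state is the right-biased one, and the computed value at the middle state should agree with the classical gambler's-ruin probability 9/13 for a walk with up-probability 0.6 on five states.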