Approximate Dynamic Programming based on Projection onto the (min,+) subsemimodule
We develop a new Approximate Dynamic Programming (ADP) method for infinite
horizon discounted reward Markov Decision Processes (MDP) based on projection
onto a (min,+) subsemimodule. We approximate the value function as a
(min,+) linear combination of a set of basis functions whose
(min,+) linear span constitutes a subsemimodule. The projection operator is closely
related to the Fenchel transform. Our approximate solution obeys the
(min,+) Projected Bellman Equation (MPPBE), which differs from the conventional
Projected Bellman Equation (PBE). We show that the approximation error is
bounded in the L∞-norm. We develop a Min-Plus Approximate Dynamic
Programming (MPADP) algorithm to compute the solution to the MPPBE. We also
present the proof of convergence of the MPADP algorithm and apply it to two
problems: a grid-world problem in the discrete domain and mountain car in the
continuous domain.
Comment: 20 pages, 6 figures (including tables), 1 algorithm, a convergence proof
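
The ideas in the abstract can be illustrated with a minimal NumPy sketch. It rests on several assumptions that are not stated in the abstract: the basis functions are stored as columns of a matrix Phi, the projection of u is taken to be the smallest element of the (min,+) span that dominates u pointwise, and the fixed point is approached by a plain projected value iteration, which is not necessarily the MPADP algorithm of the paper. The function names and the toy MDP (3 states, 2 actions) are hypothetical.

```python
import numpy as np

def minplus_combine(Phi, r):
    """(min,+) linear combination: v(s) = min_i (Phi[s, i] + r[i])."""
    return np.min(Phi + r[None, :], axis=1)

def minplus_project_coeffs(Phi, u):
    """Smallest coefficients r such that minplus_combine(Phi, r) >= u pointwise.

    Assumed projection: r[i] = max_s (u[s] - Phi[s, i]), a Fenchel-type transform.
    """
    return np.max(u[:, None] - Phi, axis=0)

def bellman(P, g, alpha, v):
    """Bellman operator for rewards: (Tv)(s) = max_a [g[a, s] + alpha * (P[a] @ v)(s)]."""
    return np.max(g + alpha * (P @ v), axis=0)

def projected_value_iteration(Phi, P, g, alpha, iters=200):
    """Iterate r <- coefficients of the projection of T applied to the current estimate."""
    r = np.zeros(Phi.shape[1])
    for _ in range(iters):
        v = minplus_combine(Phi, r)
        r = minplus_project_coeffs(Phi, bellman(P, g, alpha, v))
    return r

# Hypothetical toy problem: 3 states, 2 basis functions, 2 actions.
Phi = np.array([[0.0, 2.0],
                [1.0, 1.0],
                [2.0, 0.0]])
P = np.stack([
    np.eye(3),                      # action 0: stay in the current state
    np.roll(np.eye(3), 1, axis=1),  # action 1: move to the next state (cyclic)
])
g = np.array([[0.0, 0.0, 1.0],      # reward for staying
              [0.0, 0.0, 0.0]])     # reward for moving
alpha = 0.9

r_star = projected_value_iteration(Phi, P, g, alpha)
v_star = minplus_combine(Phi, r_star)
print("coefficients:", r_star)
print("approximate value function:", v_star)
```

Under these assumptions the projection is monotone and non-expansive in the max-norm, so composing it with the discounted Bellman operator gives a contraction, and the loop settles on a fixed point of the projected equation for this toy problem.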