Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming
This note shows that the number of arithmetic operations required by any member of a broad class of optimistic policy iteration algorithms to solve a deterministic discounted dynamic programming problem with three states and four actions may grow arbitrarily. Therefore, no such algorithm is strongly polynomial. In particular, the modified policy iteration and λ-policy iteration algorithms are not strongly polynomial.
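For readers unfamiliar with the algorithm class, the following is a minimal sketch of modified policy iteration on a deterministic discounted problem. It is not the three-state, four-action construction from the note: the example problem, the function name `modified_policy_iteration`, the parameter `m` (number of applications of the policy operator per iteration), and the stopping tolerance are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical deterministic problem: transitions[s] lists one
# (next_state, reward) pair per action available in state s.
Transitions = Dict[int, List[Tuple[int, float]]]


def modified_policy_iteration(
    transitions: Transitions,
    gamma: float,
    m: int,
    tol: float = 1e-10,
    max_iters: int = 10_000,
) -> Tuple[Dict[int, int], Dict[int, float], int]:
    """Return (policy, values, iterations used).

    Each iteration picks a greedy policy for the current values and then
    applies that policy's one-step operator m times in total (m >= 1).
    m = 1 reduces to value iteration; large m approaches policy iteration.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    v = {s: 0.0 for s in transitions}
    policy = {s: 0 for s in transitions}
    for iteration in range(1, max_iters + 1):
        # Greedy improvement; this is also the first operator application.
        new_v = {}
        for s, actions in transitions.items():
            best_a, best_q = max(
                ((a, r + gamma * v[nxt]) for a, (nxt, r) in enumerate(actions)),
                key=lambda pair: pair[1],
            )
            policy[s] = best_a
            new_v[s] = best_q
        residual = max(abs(new_v[s] - v[s]) for s in v)
        v = new_v
        if residual < tol:
            return policy, v, iteration
        # Partial evaluation: the remaining m - 1 applications.
        for _ in range(m - 1):
            v = {
                s: transitions[s][policy[s]][1]
                + gamma * v[transitions[s][policy[s]][0]]
                for s in transitions
            }
    raise RuntimeError("no convergence within max_iters")


# Illustrative problem (an assumption, not taken from the note).
example: Transitions = {
    0: [(1, 1.0), (0, 0.0)],
    1: [(2, 0.0), (1, 0.2)],
    2: [(2, 1.0)],
}
```

Calling `modified_policy_iteration(example, gamma=0.5, m=3)` returns the greedy policy, the value estimates, and the number of iterations used. The note's result concerns how the operation count of such schemes can grow with the problem data, which this small example does not attempt to reproduce.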