Ranking Policy Decisions
Policies trained via Reinforcement Learning (RL) are often needlessly
complex, making them difficult to analyse and interpret. In a run with n time
steps, a policy will make n decisions on actions to take; we conjecture that
only a small subset of these decisions delivers value over selecting a simple
default action. Given a trained policy, we propose a novel black-box method
based on statistical fault localisation that ranks the states of the
environment according to the importance of decisions made in those states. We
argue that, among other things, the ranked list of states can help explain and
understand the policy. As the ranking method is statistical, a direct
evaluation of its quality is hard. As a proxy for quality, we use the ranking
to create new, simpler policies from the original ones by pruning decisions
identified as unimportant (that is, replacing them by default actions) and
measuring the impact on performance. Our experiments on a diverse set of
standard benchmarks demonstrate that pruned policies can perform on a level
comparable to the original policies. Conversely, we show that naive approaches
for ranking policy decisions, e.g., ranking based on the frequency of visiting
a state, do not result in high-performing pruned policies.
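
The ranking-and-pruning idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Ochiai measure is just one common statistical fault localisation score chosen here for illustration, and the names rank_states, prune, toy_run_episode, and CRITICAL, the mutation rate, and the toy environment are all assumptions. A run counts as failed when the toy episode does not succeed; a state is "mutated" when the default action replaces the policy's decision there.

```python
import math
import random


def ochiai(a_ef, a_ep, a_nf):
    """Ochiai suspiciousness score from fault-localisation counts."""
    denom = math.sqrt((a_ef + a_nf) * (a_ef + a_ep))
    return a_ef / denom if denom > 0 else 0.0


def rank_states(states, run_episode, n_runs=500, mutation_rate=0.3, seed=0):
    """Rank states by how strongly mutating them is associated with failure.

    run_episode(mutated) is a hypothetical black-box callback: it runs the
    policy with the default action in every state in `mutated` and returns
    True if the run passes (e.g. reward above some threshold).
    """
    rng = random.Random(seed)
    # ef/ep: failing/passing runs where the state was mutated;
    # nf: failing runs where the state was left to the original policy.
    counts = {s: {"ef": 0, "ep": 0, "nf": 0} for s in states}
    for _ in range(n_runs):
        mutated = {s for s in states if rng.random() < mutation_rate}
        passed = run_episode(mutated)
        for s in states:
            if s in mutated:
                counts[s]["ep" if passed else "ef"] += 1
            elif not passed:
                counts[s]["nf"] += 1
    scores = {s: ochiai(c["ef"], c["ep"], c["nf"]) for s, c in counts.items()}
    return sorted(states, key=lambda s: scores[s], reverse=True)


def prune(ranking, keep):
    """Return the states where the pruned policy takes the default action."""
    return set(ranking[keep:])


# Toy stand-in for an environment (assumption): only the decisions in
# states 2 and 7 matter; using the default action in either one fails the run.
CRITICAL = {2, 7}


def toy_run_episode(mutated):
    return not (mutated & CRITICAL)


states = list(range(10))
ranking = rank_states(states, toy_run_episode)
pruned_defaults = prune(ranking, keep=2)
print("ranking:", ranking)
print("pruned policy passes:", toy_run_episode(pruned_defaults))
```

In this toy setting, keeping the policy's decisions only in the top-ranked states and using the default action everywhere else mirrors the proxy evaluation described in the abstract: the pruned policy is judged by whether it still performs comparably to the original.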