8 research outputs found
A User Study on Explainable Online Reinforcement Learning for Adaptive Systems
Online reinforcement learning (RL) is increasingly used for realizing
adaptive systems in the presence of design time uncertainty. Online RL
facilitates learning from actual operational data and thereby leverages
feedback only available at runtime. However, Online RL requires the definition
of an effective and correct reward function, which quantifies the feedback to
the RL algorithm and thereby guides learning. With Deep RL gaining interest,
the learned knowledge is no longer explicitly represented, but is represented
as a neural network. For a human, it becomes practically impossible to relate
the parametrization of the neural network to concrete RL decisions. Deep RL
thus essentially appears as a black box, which severely limits the debugging of
adaptive systems. We previously introduced the explainable RL technique
XRL-DINE, which provides visual insights into why certain decisions were made
at important time points. Here, we introduce an empirical user study involving
54 software engineers from academia and industry to assess (1) the performance
of software engineers when performing different tasks using XRL-DINE and (2)
the perceived usefulness and ease of use of XRL-DINE.Comment: arXiv admin note: substantial text overlap with arXiv:2210.0593