FoMo Rewards: Can we cast foundation models as reward functions?
We explore the viability of casting foundation models as generic reward
functions for reinforcement learning. To this end, we propose a simple pipeline
that interfaces an off-the-shelf vision model with a large language model.
Specifically, given a trajectory of observations, we infer the likelihood of an
instruction describing the task that the user wants an agent to perform. We
show that this generic likelihood function exhibits the characteristics ideally expected of a reward function: it assigns high values to the desired behaviour and lower values to several similar but incorrect policies.
Overall, our work opens the possibility of designing open-ended agents for interactive tasks via foundation models.
Comment: Accepted to NeurIPS FMDM workshop
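The pipeline lends itself to a short sketch. Below is a minimal illustration of the core idea, scoring the log-likelihood of an instruction given text descriptions of a trajectory under an off-the-shelf language model. The model choice ("gpt2"), the prompt format, and the assumption that each observation has already been captioned by a vision model are illustrative guesses, not details from the paper.

```python
# Minimal sketch, assuming each observation has already been turned into a
# caption by a vision model. "gpt2" and the prompt format are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def instruction_log_likelihood(captions: list[str], instruction: str) -> float:
    """Score log p(instruction | captioned trajectory) under the LLM."""
    context = "Trajectory: " + " ".join(captions) + "\nInstruction:"
    # Assumes the context tokenisation is a prefix of the full tokenisation,
    # which holds for GPT-2's BPE with this prompt format.
    context_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(context + " " + instruction,
                         return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    token_lp = log_probs.gather(-1, full_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    # Sum log-probabilities over the instruction tokens only.
    return token_lp[:, context_len - 1:].sum().item()

# A trajectory matching the instruction should score higher than a mismatched
# one, which is exactly the property a reward function needs.
reward = instruction_log_likelihood(
    ["the robot arm picks up the red block"], "pick up the red block")
```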
State2Explanation: Concept-Based Explanations to Benefit Agent Learning and User Understanding
As increasingly complex AI systems are used by non-AI experts to complete daily tasks, there is a growing effort to develop methods that produce explanations of AI decision making understandable by non-AI experts. Toward this goal, leveraging higher-level concepts to produce concept-based explanations has become a popular approach. Most concept-based explanation methods have been developed
for classification techniques, and we posit that the few existing methods for
sequential decision making are limited in scope. In this work, we first
contribute a set of desiderata for defining "concepts" in sequential decision-making settings. Additionally, inspired by the Protégé Effect, which holds that explaining knowledge to others reinforces one's own learning, we explore the utility of concept-based explanations that provide a dual benefit: to the RL agent, by improving its learning rate, and to the end user, by improving their understanding of the agent's decision making. To this end, we contribute a unified
framework, State2Explanation (S2E), that involves learning a joint embedding
model between state-action pairs and concept-based explanations, and leveraging this learned model both to (1) inform reward shaping during an agent's training and (2) provide explanations to end users at deployment for improved
task performance. Our experimental validation in Connect 4 and Lunar Lander demonstrates that S2E provides this dual benefit, successfully informing reward shaping and improving the agent's learning rate, as well as significantly improving end-user task performance at deployment time.
Comment: Accepted to NeurIPS 2023
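As a rough illustration of the joint-embedding idea, the sketch below encodes state-action pairs and tokenised concept-based explanations into a shared space and uses their cosine similarity as a shaping reward. The network sizes, the bag-of-embeddings text encoder, and the shaping rule are assumptions made for illustration, not the paper's exact design.

```python
# Minimal sketch, assuming states are flat feature vectors and explanations
# are pre-tokenised integer sequences. All dimensions are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbedding(nn.Module):
    def __init__(self, state_dim: int, n_actions: int,
                 vocab_size: int, dim: int = 64):
        super().__init__()
        # Encoder for a state concatenated with a one-hot action.
        self.sa_encoder = nn.Sequential(
            nn.Linear(state_dim + n_actions, 128),
            nn.ReLU(),
            nn.Linear(128, dim),
        )
        # Bag-of-embeddings encoder for the explanation tokens.
        self.text_encoder = nn.EmbeddingBag(vocab_size, dim)

    def forward(self, state_action, explanation_tokens):
        z_sa = F.normalize(self.sa_encoder(state_action), dim=-1)
        z_ex = F.normalize(self.text_encoder(explanation_tokens), dim=-1)
        return z_sa, z_ex

def shaping_reward(model, state_action, explanation_tokens):
    """Cosine similarity between a state-action pair and a desirable
    concept-based explanation, usable as an auxiliary shaped reward."""
    z_sa, z_ex = model(state_action, explanation_tokens)
    return (z_sa * z_ex).sum(-1)

model = JointEmbedding(state_dim=8, n_actions=4, vocab_size=1000)
sa = torch.randn(1, 12)               # state (8 dims) + one-hot action (4)
tokens = torch.tensor([[3, 17, 42]])  # a tokenised explanation
print(shaping_reward(model, sa, tokens))
```

In this reading, the same similarity score serves both roles the abstract describes: during training it shapes the reward toward state-action pairs that match desirable explanations, and at deployment the nearest explanation in the shared space can be surfaced to the end user.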