Effect of Adapting to Human Preferences on Trust in Human-Robot Teaming
We present the effect of adapting to human preferences on trust in a
human-robot teaming task. The team performs a task in which the robot acts as
an action recommender to the human. We assume that both the human and the robot behave so as to optimize some reward function. We use a new human trust-behavior model that enables the robot to learn the human's preferences in real time during the interaction, using Bayesian Inverse Reinforcement Learning, and to adapt to them. We present three strategies for the robot to interact with a human: a non-learner strategy, in which the robot assumes that the human's reward function is the same as its own; a non-adaptive-learner strategy, which learns the human's reward function for performance estimation but still optimizes the robot's own reward function; and an adaptive-learner strategy, which learns the human's reward function for performance estimation and also optimizes this learned reward function. Results show that adapting to the human's reward function results in the highest trust in the robot.
Comment: 6 pages, 6 figures, AAAI Fall Symposium on Agent Teaming in Mixed-Motive Situations
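The operational difference between the three strategies is which reward function drives action selection. Below is a minimal sketch of that distinction; the linear reward model w · φ(s, a) and all names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def recommend_action(strategy, robot_w, learned_human_w, phi, actions, state):
    """Recommend the action maximizing w . phi(state, action).

    strategy: 'non_learner' | 'non_adaptive_learner' | 'adaptive_learner'
    robot_w:         the robot's own reward weights
    learned_human_w: weights estimated via Bayesian IRL (never estimated by
                     the non-learner; used only for performance estimation,
                     not action selection, by the non-adaptive learner)
    """
    if strategy == "adaptive_learner":
        w = learned_human_w   # optimize the reward learned from the human
    else:
        w = robot_w           # both other strategies optimize the robot's own reward
    scores = [w @ phi(state, a) for a in actions]
    return actions[int(np.argmax(scores))]

# Example: two candidate actions described by 2-D (speed, safety) features.
phi = lambda s, a: np.array([a["speed"], a["safety"]])
actions = [{"speed": 1.0, "safety": 0.2}, {"speed": 0.3, "safety": 0.9}]
print(recommend_action("adaptive_learner",
                       robot_w=np.array([1.0, 0.0]),
                       learned_human_w=np.array([0.0, 1.0]),
                       phi=phi, actions=actions, state=None))
```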
Evaluating the Impact of Personalized Value Alignment in Human-Robot Interaction: Insights into Trust and Team Performance Outcomes
This paper examines the effect of real-time, personalized alignment of a
robot's reward function to the human's values on trust and team performance. We
present and compare three distinct robot interaction strategies: a non-learner strategy, in which the robot presumes that the human's reward function mirrors its own; a non-adaptive-learner strategy, in which the robot learns the human's reward function for trust estimation and human behavior modeling but still optimizes its own reward function; and an adaptive-learner strategy, in which the robot learns the human's reward function and adopts it as its own. Two human-subject experiments with a total of 54 participants were conducted. In both
experiments, the human-robot team searches a town for potential threats, moving sequentially through a set of search sites. We model the
interaction between the human and the robot as a trust-aware Markov Decision
Process (trust-aware MDP) and use Bayesian Inverse Reinforcement Learning (IRL)
to estimate the reward weights of the human as they interact with the robot. In
Experiment 1, we start our learning algorithm with an informed prior of the
human's values/goals. In Experiment 2, we start the learning algorithm with an
uninformed prior. Results indicate that when starting with a well-informed prior, personalized value alignment does not seem to benefit trust or team performance. On the other hand, when an informed prior is unavailable, alignment to the human's values leads to higher trust and higher perceived performance while maintaining the same objective team performance.
Comment: 10 pages, 9 figures, to be published in the ACM/IEEE International Conference on Human-Robot Interaction. arXiv admin note: text overlap with arXiv:2309.0517
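As a concrete illustration of the estimation step both papers rely on, here is a minimal sketch of Bayesian IRL over a particle set of reward weights, assuming a Boltzmann-rational model of the human's choices. This is one common formulation; the papers' exact likelihood and its coupling to the trust-aware MDP may differ:

```python
import numpy as np

rng = np.random.default_rng(0)

def bayesian_irl_update(particles, weights, features, chosen, beta=2.0):
    """One Bayesian IRL update over reward-weight particles.

    particles: (N, d) candidate reward-weight vectors
    weights:   (N,) current posterior weights over particles
    features:  (A, d) feature vector of each available action
    chosen:    index of the action the human actually took
    beta:      rationality (inverse temperature) of the human choice model
    """
    utilities = particles @ features.T              # (N, A) utility per particle/action
    logits = beta * utilities
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)       # Boltzmann choice probabilities
    posterior = weights * probs[:, chosen]          # Bayes rule: prior x likelihood
    return posterior / posterior.sum()

# The informed vs. uninformed priors of Experiments 1 and 2 correspond to how
# these particles and initial weights are chosen.
particles = rng.normal(size=(500, 2))
weights = np.full(500, 1 / 500)
features = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
weights = bayesian_irl_update(particles, weights, features, chosen=2)
w_hat = weights @ particles                         # posterior-mean weight estimate
```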