Socially-Aware Robot Planning via Bandit Human Feedback
In this paper, we consider the problem of designing collision-free,
dynamically feasible, and socially-aware trajectories for robots operating in
environments populated by humans. We define trajectories to be socially-aware if
they do not interfere with humans in any way that causes discomfort. In this
paper, discomfort is defined broadly and, depending on specific individuals, it
can result from the robot being too close to a human or from interfering with
human sight or tasks. Moreover, we assume that human feedback is bandit
feedback indicating a complaint or no complaint about the part of the robot
trajectory that interferes with the humans, and it does not reveal any
contextual information about the locations of the humans or the reason for a
complaint. Finally, we assume that humans can move in the obstacle-free space
and, as a result, human utility can change. We formulate this planning problem
as an online optimization problem that minimizes the social value of the
time-varying robot trajectory, defined by the total number of incurred human
complaints. As the human utility is unknown, we employ zeroth order, or
derivative-free, optimization methods to solve this problem, which we combine
with off-the-shelf motion planners to satisfy the dynamic feasibility and
collision-free specifications of the resulting trajectories. To the best of our
knowledge, this is a new framework for socially-aware robot planning that is
not restricted to avoiding collisions with humans but, instead, focuses on
increasing the social value of the robot trajectories using only bandit human
feedback.
Comment: 10 pages, 3 figures
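
The zeroth-order idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the complaint model (`count_complaints`, a human complains if any waypoint comes within a fixed radius), the one-point gradient estimator, and all function and parameter names are assumptions chosen for illustration. The sketch also omits the motion-planner step that the abstract uses to keep trajectories collision-free and dynamically feasible.

```python
import math
import random


def count_complaints(waypoints, humans, radius):
    """Hypothetical bandit feedback: the number of humans that have at
    least one waypoint closer than `radius`. The planner sees only this
    count, never the human positions or the reason for a complaint."""
    complaints = 0
    for hx, hy in humans:
        if any(math.hypot(hx - wx, hy - wy) < radius for wx, wy in waypoints):
            complaints += 1
    return complaints


def sample_unit_direction(rng, dim):
    """Draw a direction uniformly from the unit sphere in `dim` dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def one_point_gradient(feedback, x, delta, rng):
    """One-point zeroth-order estimate (dim / delta) * f(x + delta * u) * u.
    It uses a single function evaluation and no derivatives."""
    dim = len(x)
    u = sample_unit_direction(rng, dim)
    value = feedback([xi + delta * ui for xi, ui in zip(x, u)])
    return [(dim / delta) * value * ui for ui in u]


def zeroth_order_step(feedback, x, delta, step_size, rng):
    """One descent step on the flattened trajectory parameters `x`."""
    g = one_point_gradient(feedback, x, delta, rng)
    return [xi - step_size * gi for xi, gi in zip(x, g)]


if __name__ == "__main__":
    rng = random.Random(0)
    humans = [(1.0, 0.0)]          # hidden from the planner
    x = [0.0, 0.0, 1.0, 0.2, 2.0, 0.0]  # three 2-D waypoints, flattened

    def feedback(flat):
        pts = list(zip(flat[0::2], flat[1::2]))
        return count_complaints(pts, humans, radius=0.5)

    x = zeroth_order_step(feedback, x, delta=0.3, step_size=0.01, rng=rng)
    print(x)
```

A single one-point estimate is very noisy, so in practice such estimates are averaged over many steps or combined with small step sizes. In the setting of the abstract, each updated trajectory would additionally be passed to a motion planner before execution.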