Stochastic Inverse Reinforcement Learning
The goal of the inverse reinforcement learning (IRL) problem is to recover
the reward functions from expert demonstrations. However, the IRL problem, like
any ill-posed inverse problem, suffers from the congenital defect that a policy
may be optimal for many reward functions, and expert demonstrations may be
optimal for many policies. In this work, we generalize the IRL problem to a
well-posed expectation optimization problem, stochastic inverse reinforcement
learning (SIRL), which recovers a probability distribution over reward
functions. We adopt
the Monte Carlo expectation-maximization (MCEM) method to estimate the
parameter of the probability distribution as the first solution to the SIRL
problem. The solution is succinct, robust, and transferable for a learning task
and can generate alternative solutions to the IRL problem. Through our
formulation, it is possible to observe the intrinsic properties of the IRL
problem from a global viewpoint, and our approach achieves considerable
performance on the objectworld.
Comment: 8+2 pages, 5 figures, Under Review
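The idea of estimating a distribution over reward functions with Monte Carlo EM can be illustrated with a minimal sketch. This is not the authors' SIRL algorithm: it assumes a toy single-state setting, a linear reward `features @ w`, a softmax expert, and a diagonal Gaussian over the weights `w`. The names `log_likelihood` and `mcem` are hypothetical and exist only for this illustration.

```python
import numpy as np

def log_likelihood(w, features, actions):
    """Log-probability of the expert's choices under a softmax policy.

    Assumption: the reward of each action is linear, features @ w.
    """
    logits = features @ w
    log_probs = logits - np.logaddexp.reduce(logits)
    return log_probs[actions].sum()

def mcem(features, actions, n_iters=30, n_samples=500, seed=0):
    """Toy Monte Carlo EM for a diagonal Gaussian over reward weights."""
    rng = np.random.default_rng(seed)
    dim = features.shape[1]
    mu, sigma = np.zeros(dim), np.ones(dim)
    for _ in range(n_iters):
        # E-step (Monte Carlo): sample weights from the current Gaussian and
        # importance-weight each sample by the demonstration likelihood.
        samples = mu + sigma * rng.standard_normal((n_samples, dim))
        log_w = np.array([log_likelihood(w, features, actions) for w in samples])
        weights = np.exp(log_w - log_w.max())
        weights /= weights.sum()
        # M-step: weighted maximum-likelihood update of the Gaussian parameters.
        mu = weights @ samples
        # Small floor keeps the distribution from collapsing to a point.
        sigma = np.sqrt(weights @ (samples - mu) ** 2) + 1e-3
    return mu, sigma

# Hypothetical demonstrations: three actions with one-hot features,
# and an expert that mostly picks action 2.
features = np.eye(3)
actions = np.array([2] * 8 + [0, 1])
mu, sigma = mcem(features, actions)
print("mean reward weights:", mu)
print("std of reward weights:", sigma)
```

Sampling reward weights from the fitted Gaussian then yields alternative reward functions that are all consistent with the same demonstrations, which is the sense in which a distribution sidesteps the ambiguity of a single point estimate.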
Social Attention: Modeling Attention in Human Crowds
Robots that navigate through human crowds need to be able to plan safe,
efficient, and human-predictable trajectories. This is a particularly
challenging problem as it requires the robot to predict future human
trajectories within a crowd where everyone implicitly cooperates with each
other to avoid collisions. Previous approaches to human trajectory prediction
have modeled the interactions between humans as a function of proximity.
However, proximity does not necessarily determine importance: people in our
immediate vicinity moving in the same direction may matter less than people who
are farther away but might collide with us in the future. In this work, we
propose Social Attention, a novel trajectory prediction model that captures the
relative importance of each person when navigating in the crowd, irrespective
of their proximity. We demonstrate the performance of our method against a
state-of-the-art approach on two publicly available crowd datasets and analyze
the trained attention model to gain a better understanding of which surrounding
agents humans attend to when navigating in a crowd.
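Weighting neighbors by learned relevance rather than by distance can be sketched with generic scaled dot-product soft attention. This is an illustrative assumption, not the architecture from the paper: the function name `soft_attention` and the hand-picked state vectors are hypothetical, and in a trained model the states would come from learned encoders.

```python
import numpy as np

def soft_attention(query, neighbor_states):
    """Return attention weights over neighbors and the weighted context vector.

    query:           (d,) hidden state of the agent being predicted.
    neighbor_states: (n, d) hidden states of the surrounding agents.
    """
    d = query.shape[0]
    # Compatibility score of each neighbor with the query (no distance term).
    scores = neighbor_states @ query / np.sqrt(d)
    # Numerically stable softmax over neighbors.
    exp_scores = np.exp(scores - scores.max())
    weights = exp_scores / exp_scores.sum()
    # Context vector: attention-weighted sum of neighbor states.
    context = weights @ neighbor_states
    return weights, context

# Hypothetical hidden states for one agent and three neighbors.
query = np.array([1.0, 0.0])
neighbors = np.array([[0.1, 0.9], [2.0, 0.0], [-1.0, 0.0]])
weights, context = soft_attention(query, neighbors)
print("attention weights:", weights)
print("context vector:", context)
```

Because the scores depend only on the state vectors, a distant agent on a collision course can receive more weight than a nearby agent walking alongside, provided the learned states encode that difference.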
- …