Meta Inverse Reinforcement Learning via Maximum Reward Sharing for Human Motion Analysis
This work handles the inverse reinforcement learning (IRL) problem where only
a small number of demonstrations are available from a demonstrator for each
high-dimensional task, insufficient to estimate an accurate reward function.
Observing that each demonstrator has an inherent reward for each state and the
task-specific behaviors mainly depend on a small number of key states, we
propose a meta IRL algorithm that first models the reward function for each
task as a distribution conditioned on a baseline reward function shared by all
tasks and dependent only on the demonstrator, and then finds the most likely
reward function in the distribution that explains the task-specific behaviors.
We test the method in a simulated environment on path planning tasks with
limited demonstrations, and show that the accuracy of the learned reward
function is significantly improved. We also apply the method to analyze the
motion of a patient under rehabilitation.
Comment: arXiv admin note: text overlap with arXiv:1707.0939
Inverse Reinforcement Learning in Large State Spaces via Function Approximation
This paper introduces a new method for inverse reinforcement learning in
large-scale and high-dimensional state spaces. To avoid solving the
computationally expensive reinforcement learning problems in reward learning,
we propose a function approximation method to ensure that the Bellman
Optimality Equation always holds, and then estimate a function to maximize the
likelihood of the observed motion. The time complexity of the proposed method
is linear in the cardinality of the action set, so it can
handle large state spaces efficiently. We test the proposed method in a
simulated environment and show that it is more accurate than existing methods
and scales significantly better. We also show that the proposed method
can extend many existing methods to high-dimensional state spaces. We then
apply the method to evaluating the effect of rehabilitative stimulations on
patients with spinal cord injuries based on the observed patient motions.
Comment: Experiment update
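
The idea in this abstract, parameterizing a function so that the Bellman Optimality Equation holds by construction and then fitting it by maximum likelihood, can be illustrated with a minimal sketch. This is not the paper's implementation: the tabular array `Q` stands in for the function approximator, the Boltzmann (softmax) action model and its temperature `beta` are assumptions, and the names `reward_from_q` and `log_likelihood` are hypothetical.

```python
import numpy as np


def reward_from_q(Q, P, gamma):
    """Reward implied by Q through the Bellman Optimality Equation.

    Q:     (S, A) array of state-action values (stand-in for an approximator).
    P:     (S, A, S) array, P[s, a, s2] = probability of s -> s2 under action a.
    gamma: discount factor in [0, 1).

    Rearranging Q(s,a) = r(s,a) + gamma * sum_s2 P(s2|s,a) * max_a2 Q(s2,a2)
    gives r directly, so no reinforcement learning problem has to be solved.
    """
    v = Q.max(axis=1)                 # (S,) greedy state values
    return Q - gamma * (P @ v)        # (S, A, S) @ (S,) -> (S, A)


def log_likelihood(Q, states, actions, beta=1.0):
    """Log-likelihood of demonstrated (state, action) pairs.

    Assumes a softmax policy pi(a|s) proportional to exp(beta * Q(s, a)).
    The normalizer sums over actions, so the cost per sample is linear
    in the number of actions.
    """
    logits = beta * Q[states]                               # (N, A)
    logits = logits - logits.max(axis=1, keepdims=True)     # numerical stability
    log_pi = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return log_pi[np.arange(len(states)), actions].sum()
```

In this sketch one would adjust the parameters of `Q` to increase `log_likelihood` on the demonstrations and then read the reward off with `reward_from_q`. Any gradient-based optimizer could play that role; the choice is left open here.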