    Using Inverse Reinforcement Learning with Real Trajectories to Get More Trustworthy Pedestrian Simulation

    Reinforcement learning is one of the most promising machine learning techniques for obtaining intelligent behaviors for embodied agents in simulations. The output of the classic Temporal Difference family of Reinforcement Learning algorithms takes the form of a value function expressed as a numeric table or a function approximator. The learned behavior is then derived using a greedy policy with respect to this value function. Nevertheless, the learned policy sometimes does not meet expectations, and authoring it is difficult and unsafe because modifying one value or parameter in the learned value function has unpredictable consequences in the space of policies it represents. This rules out direct manipulation of the learned value function as a method for modifying the derived behaviors. In this paper, we propose the use of Inverse Reinforcement Learning to incorporate real behavior traces in the learning process to shape the learned behaviors, thus increasing their trustworthiness (in terms of conformance to reality). To do so, we adapt the Inverse Reinforcement Learning framework to the navigation problem domain. Specifically, we use Soft Q-learning, an algorithm based on the maximum causal entropy principle, with MARL-Ped (a Reinforcement Learning-based pedestrian simulator) to include information from trajectories of real pedestrians in the process of learning how to navigate inside a virtual 3D space that represents the real environment. A comparison with the behaviors learned using a classic Reinforcement Learning algorithm (Sarsa(λ)) shows that the Inverse Reinforcement Learning behaviors fit the real trajectories significantly better.
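    The contrast between a greedy policy over a value table and the soft (maximum-entropy) policy that underlies Soft Q-learning can be illustrated with a minimal sketch. It is not taken from the paper or from MARL-Ped: the function names, the `temperature` parameter, and the toy action values are all assumptions made for illustration.

```python
import math

def greedy_action(q_row):
    """Index of the largest action value (ties go to the lowest index)."""
    return max(range(len(q_row)), key=lambda a: q_row[a])

def soft_value(q_row, temperature=1.0):
    """Soft state value: temperature * log(sum(exp(Q / temperature))).

    The maximum is subtracted before exponentiating for numerical stability.
    """
    m = max(q_row)
    total = sum(math.exp((q - m) / temperature) for q in q_row)
    return m + temperature * math.log(total)

def soft_policy(q_row, temperature=1.0):
    """Maximum-entropy policy: pi(a) = exp((Q(a) - V_soft) / temperature)."""
    v = soft_value(q_row, temperature)
    return [math.exp((q - v) / temperature) for q in q_row]

# Hypothetical action values for a single state (illustrative numbers only).
q_before = [1.0, 0.9, 0.2]
q_after = [1.0, 1.1, 0.2]  # one entry edited by hand

print(greedy_action(q_before), greedy_action(q_after))
print(soft_policy(q_before))
```

    In this toy case, editing a single entry flips the greedy action from index 0 to index 1, which is the kind of abrupt change that makes hand-editing a learned value function risky. The soft policy instead assigns graded probabilities to all actions.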

    Motor control and strategy discovery for physically simulated characters

    In physics-based character animation, motions are realized through control of simulated characters along with their interactions with the virtual environment. In this thesis, we study the problem of character control on two levels: joint-level motor control, which transforms control signals to joint torques, and high-level motion control, which outputs joint-level control signals given the current state of the character and the environment and the task objective. We propose a Modified Articulated-Body Algorithm (MABA) which achieves stable proportional-derivative (PD) low-level motor control with better theoretical time complexity, practical efficiency, and stability than prior implementations. We further propose a high-level motion control framework based on deep reinforcement learning (DRL) which enables the discovery of appropriate motion strategies for completing a task objective without human demonstrations. To facilitate the learning of realistic human motions, we propose a Pose Variational Autoencoder (P-VAE) to constrain the DRL actions to a subspace of natural poses. Our learning framework can be further combined with a sample-efficient Bayesian Diversity Search (BDS) algorithm and novel policy seeking to discover diverse strategies for tasks with multiple modes, such as various athletic jumping tasks.
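    As a point of reference for the joint-level control described above, the following is a minimal sketch of plain proportional-derivative (PD) control on a single joint. It is not the thesis's MABA or its stable PD formulation: the one-degree-of-freedom joint model, the gains, the time step, and all names are assumptions chosen for illustration.

```python
def pd_torque(q_target, q, q_dot, kp, kd):
    """Plain PD law: pull the joint toward q_target and damp its velocity."""
    return kp * (q_target - q) - kd * q_dot

def simulate_joint(q_target, steps, dt=0.01, kp=100.0, kd=20.0, inertia=1.0):
    """Integrate one joint with semi-implicit Euler and return the final angle.

    Gravity, contacts, and joint limits are ignored in this toy model.
    """
    q, q_dot = 0.0, 0.0
    for _ in range(steps):
        tau = pd_torque(q_target, q, q_dot, kp, kd)
        q_dot += (tau / inertia) * dt  # update velocity from the torque
        q += q_dot * dt                # then update position with the new velocity
    return q

final_angle = simulate_joint(q_target=1.0, steps=2000)
print(final_angle)
```

    With these assumed gains the joint is critically damped (kd = 2 * sqrt(kp * inertia)), so it settles at the target without overshoot. Much stiffer gains or larger time steps can make an explicit scheme like this one unstable, which is the practical motivation for stable PD formulations.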