
    Meta-Reinforcement Learning via Language Instructions

    Although deep reinforcement learning has recently been very successful at learning complex behaviors, it requires a tremendous amount of data to learn a task. A fundamental reason for this limitation lies in the trial-and-error paradigm of reinforcement learning, in which the agent interacts with the environment and progresses in learning relying only on the reward signal, which is implicit and insufficient for learning a task well. Humans, by contrast, are usually taught new skills via natural language instructions. Using language instructions for robotic motion control to improve adaptability is an emerging and challenging topic. In this paper, we present a meta-RL algorithm that addresses the challenge of learning skills with language instructions across multiple manipulation tasks. On the one hand, our algorithm uses the language instructions to shape its interpretation of the task; on the other hand, it still learns to solve the task through trial and error. We evaluate our algorithm on the robotic manipulation benchmark Meta-World, where it significantly outperforms state-of-the-art methods in training and testing task success rates. Code is available at \url{https://tumi6robot.wixsite.com/million}.
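    As a rough illustration of the idea (not the authors' implementation), the sketch below conditions a policy on an instruction embedding, so language shapes the task interpretation while actions are still learned by trial and error. The module choices, sizes, and tokenization are all assumptions; the observation and action dimensions match Meta-World's.

        # A minimal sketch of a language-conditioned policy, assuming tokenized
        # instructions and a recurrent encoder; not the authors' MILLION code.
        import torch
        import torch.nn as nn

        class LanguageConditionedPolicy(nn.Module):
            def __init__(self, obs_dim, act_dim, vocab_size, embed_dim=64, hidden=128):
                super().__init__()
                # Encode the tokenized instruction into a fixed-size task embedding.
                self.token_embed = nn.Embedding(vocab_size, embed_dim)
                self.instr_encoder = nn.GRU(embed_dim, hidden, batch_first=True)
                # The policy head sees both the observation and the task embedding.
                self.policy = nn.Sequential(
                    nn.Linear(obs_dim + hidden, hidden), nn.ReLU(),
                    nn.Linear(hidden, act_dim), nn.Tanh(),
                )

            def forward(self, obs, instr_tokens):
                emb = self.token_embed(instr_tokens)   # (B, T, embed_dim)
                _, h = self.instr_encoder(emb)         # h: (1, B, hidden)
                task = h.squeeze(0)                    # (B, hidden)
                return self.policy(torch.cat([obs, task], dim=-1))

        # Usage: actions for a batch of 2 observations and instructions.
        policy = LanguageConditionedPolicy(obs_dim=39, act_dim=4, vocab_size=1000)
        obs = torch.randn(2, 39)
        tokens = torch.randint(0, 1000, (2, 8))
        actions = policy(obs, tokens)                  # shape (2, 4)

    The task embedding is fixed per episode, so the same policy network can act differently across tasks while the reward signal still drives the trial-and-error updates.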

    Meta-Reinforcement Learning for Adaptive Control of Second Order Systems

    Meta-learning is a branch of machine learning that aims to synthesize data from a distribution of related tasks to efficiently solve new ones. In process control, many systems have similar and well-understood dynamics, which suggests that a generalizable controller can be created through meta-learning. In this work, we formulate a meta-reinforcement learning (meta-RL) control strategy that takes advantage of known, offline information for training, such as a model structure. The meta-RL agent is trained over a distribution of model parameters, rather than a single model, enabling it to adapt automatically to changes in the process dynamics while maintaining performance. A key design element is the ability to leverage model-based information offline during training while maintaining a model-free policy structure for interacting with new environments. Our previous work demonstrated how this approach can be applied to the industrially relevant problem of tuning proportional-integral controllers to control first-order processes. Here, we briefly reintroduce our methodology and demonstrate how it extends to proportional-integral-derivative controllers and second-order systems.

    Comment: AdCONIP 2022.
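    As a rough sketch of the setting (not the paper's method), the code below simulates a standard second-order process G(s) = K / (tau^2 s^2 + 2*zeta*tau*s + 1) whose model parameters are resampled each episode, which is the "distribution of model parameters" an agent must adapt to. The parameter ranges, discretization, and tracking cost are illustrative assumptions.

        # A minimal sketch: a second-order process with randomized parameters
        # per episode, rolled out under a fixed PID controller.
        import numpy as np

        class SecondOrderProcess:
            def __init__(self, rng, dt=0.1):
                self.dt = dt
                # Sample model parameters once per episode (the task distribution).
                self.K = rng.uniform(0.5, 2.0)
                self.tau = rng.uniform(1.0, 5.0)
                self.zeta = rng.uniform(0.3, 1.5)
                self.y = 0.0   # process output
                self.dy = 0.0  # output derivative

            def step(self, u):
                # tau^2 y'' + 2*zeta*tau y' + y = K u  ->  explicit Euler update
                ddy = (self.K * u - 2 * self.zeta * self.tau * self.dy - self.y) / self.tau**2
                self.dy += self.dt * ddy
                self.y += self.dt * self.dy
                return self.y

        def pid_rollout(kp, ki, kd, setpoint=1.0, steps=300, seed=0):
            """Roll out one episode with fixed PID gains; return tracking cost."""
            rng = np.random.default_rng(seed)
            proc = SecondOrderProcess(rng)
            integral, prev_err, cost = 0.0, setpoint, 0.0
            for _ in range(steps):
                err = setpoint - proc.y
                integral += err * proc.dt
                deriv = (err - prev_err) / proc.dt
                prev_err = err
                proc.step(kp * err + ki * integral + kd * deriv)
                cost += abs(err) * proc.dt
            return cost

        print(pid_rollout(kp=1.0, ki=0.5, kd=0.2))

    A meta-RL agent trained over many such episodes would, in effect, learn a single adaptive tuning strategy rather than gains for one fixed (K, tau, zeta).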

    Learning from Symmetry: Meta-Reinforcement Learning with Symmetric Data and Language Instructions

    Meta-reinforcement learning (meta-RL) is a promising approach that enables an agent to learn new tasks quickly. However, most meta-RL algorithms generalize poorly in multi-task scenarios because rewards alone provide insufficient task information. Language-conditioned meta-RL improves generalization by matching language instructions to the agent's behaviors. Learning from symmetry is an important form of human learning; combining symmetry and language instructions in meta-RL can therefore improve the algorithm's generalization and learning efficiency. We thus propose a dual-MDP meta-reinforcement learning method that learns new tasks efficiently from symmetric data and language instructions. We evaluate our method on multiple challenging manipulation tasks, and experimental results show that it greatly improves the generalization and efficiency of meta-reinforcement learning.
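    As a rough illustration of the symmetry idea (not the paper's dual-MDP construction), the sketch below mirrors recorded manipulation trajectories across a plane to synthesize data for a symmetric task. The layout of 3-D positions inside the observation and action vectors, and the invariance of rewards under the reflection, are assumptions.

        # A minimal sketch: reflect (obs, action, reward) tuples across the
        # x = 0 plane to double the meta-training data with symmetric episodes.
        import numpy as np

        def mirror_x(traj):
            """Assumes every 3-float group in obs/action is an (x, y, z)
            position, so negating index 0 of each group mirrors the scene;
            rewards are assumed invariant under the reflection."""
            mirrored = []
            for obs, act, rew in traj:
                obs_m, act_m = obs.copy(), act.copy()
                obs_m[0::3] *= -1.0  # negate every x-coordinate in the observation
                act_m[0::3] *= -1.0  # and in the action
                mirrored.append((obs_m, act_m, rew))
            return mirrored

        # Usage: augment a toy trajectory with its symmetric counterpart.
        traj = [(np.random.randn(9), np.random.randn(3), 1.0) for _ in range(5)]
        augmented = traj + mirror_x(traj)
        print(len(augmented))  # 10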