Meta-Reinforcement Learning via Language Instructions
Although deep reinforcement learning has recently been very successful at
learning complex behaviors, it requires a tremendous amount of data to learn a
task. A fundamental reason for this limitation lies in the trial-and-error
paradigm of reinforcement learning, in which the agent interacts with the
environment and progresses only by relying on the reward signal, which is
implicit and often insufficient for learning a task well. In contrast, humans
are usually taught new skills via natural language instructions. Using language
instructions for robotic motion control to improve adaptability is a recently
emerged and still challenging topic. In this paper, we present a meta-RL
algorithm that addresses the challenge of learning skills with language
instructions in multiple manipulation tasks. On the one hand, our algorithm
uses the language instructions to shape its interpretation of the task; on the
other hand, it still learns to solve the task in a trial-and-error process. We
evaluate our algorithm on the robotic manipulation benchmark Meta-World, where
it significantly outperforms state-of-the-art methods in terms of training and
testing task success rates. Code is available at
\url{https://tumi6robot.wixsite.com/million}
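The core idea, conditioning the policy on both the state and an instruction embedding while still learning via trial and error, can be sketched roughly as follows (the encoder, dimensions, and network below are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the paper does not specify this toy architecture.
STATE_DIM, LANG_DIM, HIDDEN, ACTION_DIM = 8, 16, 32, 4

# A toy instruction encoder: fixed random word embeddings, averaged.
vocab = {"push": 0, "the": 1, "red": 2, "block": 3, "left": 4}
embed = rng.normal(size=(len(vocab), LANG_DIM))

def encode_instruction(words):
    """Average word embeddings into one instruction vector."""
    return np.mean([embed[vocab[w]] for w in words], axis=0)

# Policy weights (in the real method these are trained by the meta-RL objective).
W1 = rng.normal(scale=0.1, size=(STATE_DIM + LANG_DIM, HIDDEN))
W2 = rng.normal(scale=0.1, size=(HIDDEN, ACTION_DIM))

def policy(state, instruction_vec):
    """Condition the action on both the state and the instruction."""
    x = np.concatenate([state, instruction_vec])
    h = np.tanh(x @ W1)
    return np.tanh(h @ W2)

state = rng.normal(size=STATE_DIM)
action = policy(state, encode_instruction(["push", "the", "red", "block"]))
print(action.shape)  # actions lie in [-1, 1]^ACTION_DIM
```

The instruction embedding acts as an extra observation channel, so the reward signal no longer has to carry all of the task information by itself.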
Editorial: Neuromorphic engineering for robotics
Neuromorphic engineering aims to apply insights from neurobiology to develop next-generation artificial intelligence for computation, sensing, and the control of robotic systems. There has been a rapid expansion of neuromorphic engineering technologies for robotics due to several developments. First, the successes and limitations of deep neural networks have greatly increased the belief that biological intelligence can further boost the computing performance of artificial intelligence in terms of data, power, and computing efficiency. Second, the emergence of novel neuromorphic hardware and sensors has shown greater application-level performance compared with conventional CPUs and GPUs. Third, the pace of progress in neuroscience has accelerated dramatically in recent years, providing a wealth of new understanding and insights regarding the functioning of brains at the neuron level. Therefore, neuromorphic engineering can represent a fundamental revolution for robotics in many ways. We have published this Research Topic to collect theoretical and experimental results regarding neuromorphic engineering technologies for the design, control, and real-world applications of robotic systems. After carefully and professionally reviewing all submissions, four high-quality manuscripts were accepted. These articles are reviewed below.

Feldotto et al. propose a novel framework to examine the control of biomechanics using physics simulations informed by electromyography (EMG) data. These signals drive a virtual musculoskeletal model in the Neurorobotics Platform (NRP), which is then used to evaluate the resulting joint torques. They use their framework to analyze raw EMG data collected during an isometric knee extension study to identify synergies that drive a musculoskeletal lower limb model. The NRP forms a highly modular integrated simulation platform that enables these in silico experiments. Their framework allows research on the neurobiomechanical control of muscles during tasks that would otherwise not be possible.

Gu et al. propose a novel American Sign Language (ASL) translation method based on wearable sensors. By leveraging inertial sensors to capture signs and surface electromyography (EMG) sensors to detect facial expressions, they can extract features from the input signals. The encouraging results indicate that the proposed models are suitable for highly accurate sign language translation. With complete motion capture sensors and facial expression recognition methods, the sign language translation system has the potential to recognize more sentences.

Ehrlich et al. demonstrate neuromorphic adaptive control of a wheelchair-mounted robotic arm deployed on Intel's Loihi chip. The proposed controller provides the robotic arm with adaptive signals, guiding its motion while accounting for kinematic changes in real time. They further demonstrate the capacity of the controller to compensate for unexpected inertia-generating payloads using online learning.

Akl et al. show how SNNs can be applied to different DRL algorithms, such as the deep Q-network (DQN) and the twin-delayed deep deterministic policy gradient (TD3), for discrete and continuous action space environments, respectively. They show that randomizing the membrane parameters, instead of selecting uniform values for all neurons, has a stabilizing effect on training. They conclude that SNNs can be used for learning complex continuous control problems with state-of-the-art DRL algorithms.

Overall, we hope that this Research Topic can provide useful references and novel ideas for the study of neuromorphic robotics.
Learning from Symmetry: Meta-Reinforcement Learning with Symmetric Data and Language Instructions
Meta-reinforcement learning (meta-RL) is a promising approach that enables
the agent to learn new tasks quickly. However, most meta-RL algorithms show
poor generalization in multi-task scenarios due to the insufficient task
information provided by rewards alone. Language-conditioned meta-RL improves
generalization by matching language instructions to the agent's behaviors.
Learning from symmetry is an important form of human learning; therefore,
combining symmetry and language instructions in meta-RL can help improve the
algorithm's generalization and learning efficiency. We thus propose a dual-MDP
meta-reinforcement learning method that enables learning new tasks efficiently
from symmetric data and language instructions. We evaluate our method on
multiple challenging manipulation tasks, and experimental results show that our
method can greatly improve the generalization and efficiency of
meta-reinforcement learning.
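The use of symmetric data can be illustrated with a toy augmentation step: mirror each stored transition to obtain a second, symmetric one for free (the mirroring rule below is an illustrative assumption, not the paper's dual-MDP construction):

```python
import numpy as np

# Exploit a left-right task symmetry: flip the x component of 2-D
# positions and actions, so one real rollout yields a mirrored twin.
MIRROR = np.array([-1.0, 1.0])

def mirror_transition(state, action, reward, next_state):
    """Return the symmetric counterpart of one (s, a, r, s') transition."""
    return state * MIRROR, action * MIRROR, reward, next_state * MIRROR

def augment(buffer):
    """Double the replay buffer with symmetric counterparts."""
    return buffer + [mirror_transition(*t) for t in buffer]

buffer = [(np.array([0.5, 0.2]), np.array([0.1, 0.0]), 1.0,
           np.array([0.6, 0.2]))]
aug = augment(buffer)
print(len(aug))  # 2: the original transition plus its mirror
```

Because the mirrored transitions are consistent with the (assumed) symmetric dynamics, they add task information without extra environment interaction.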
DIVA: A Dirichlet Process Based Incremental Deep Clustering Algorithm via Variational Auto-Encoder
Generative model-based deep clustering frameworks excel at classifying
complex data but are limited in handling dynamic and complex features because
they require prior knowledge of the number of clusters. In this paper, we
propose a nonparametric deep clustering framework that employs an infinite
mixture of Gaussians as a prior. Our framework utilizes a memoized online
variational inference method that enables "birth" and "merge" moves of
clusters, allowing our framework to cluster data in a "dynamic-adaptive"
manner without requiring prior knowledge of the number of clusters. We name
the framework DIVA, a Dirichlet-Process-based Incremental deep clustering
framework via Variational Auto-Encoder. Our framework outperforms
state-of-the-art baselines, exhibiting superior performance in classifying
complex data with dynamically changing features, particularly in the case of
incremental features. Our source code implementation is available at:
https://github.com/Ghiara/diva
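The nonparametric prior behind such frameworks lets the number of clusters grow with the data. A minimal illustration is the Chinese restaurant process view of the Dirichlet process (DIVA itself uses memoized online variational inference with birth/merge moves, not this sampling scheme):

```python
import numpy as np

def crp_assignments(n_points, alpha=1.0, seed=0):
    """Chinese restaurant process: each point joins an existing cluster
    with probability proportional to its size, or opens a new cluster
    with probability proportional to alpha. The cluster count is never
    fixed in advance."""
    rng = np.random.default_rng(seed)
    counts = []   # points per existing cluster
    labels = []
    for _ in range(n_points):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)   # "birth" of a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels

labels = crp_assignments(100)
print(len(set(labels)))  # number of clusters inferred, not preset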
Safety Guaranteed Manipulation Based on Reinforcement Learning Planner and Model Predictive Control Actor
Deep reinforcement learning (RL) has been endowed with high expectations for
tackling challenging manipulation tasks in an autonomous and self-directed
fashion. Despite the significant strides made in the development of
reinforcement learning, the practical deployment of this paradigm is hindered
by at least two barriers: engineering a reward function and ensuring the
safety of learning-based controllers. In this paper, we address these
limitations by proposing a framework that merges a reinforcement learning
planner, trained using sparse rewards, with a model predictive control (MPC)
actor, thereby offering a safe policy. On the one hand, the RL planner learns
from sparse rewards by selecting intermediate goals that are easy to achieve
in the short term and promising to lead to the target goals in the long term.
On the other hand, the MPC actor takes the suggested intermediate goals from
the RL planner as input and predicts how the robot's actions will enable it to
reach each goal while avoiding obstacles over a short time horizon. We
evaluated our method on four challenging manipulation tasks with dynamic
obstacles, and the results demonstrate that, by leveraging the complementary
strengths of these two components, the agent can safely solve manipulation
tasks in complex, dynamic environments with a high success rate. Videos are
available at \url{https://videoviewsite.wixsite.com/mpc-hgg}
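The planner/actor division of labor can be sketched with a point-mass toy: a stand-in planner proposes an intermediate goal, and a random-shooting MPC picks the first action of the lowest-cost, obstacle-penalized rollout (the dynamics, costs, and parameters below are all illustrative assumptions, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
DT, HORIZON, N_SAMPLES = 0.1, 5, 256
OBSTACLE, RADIUS = np.array([0.5, 0.5]), 0.2

def planner(state, target):
    """Stand-in for the RL planner: propose an intermediate goal one short
    step toward the target (the real planner is trained with sparse rewards)."""
    return state + 0.3 * (target - state)

def mpc_actor(state, subgoal):
    """Random-shooting MPC: sample action sequences, roll out point-mass
    dynamics, penalize distance to the subgoal and obstacle violations,
    and return the first action of the cheapest rollout."""
    candidates = rng.uniform(-1, 1, size=(N_SAMPLES, HORIZON, 2))
    best_cost, best_action = np.inf, None
    for seq in candidates:
        pos, cost = state.copy(), 0.0
        for a in seq:
            pos = pos + DT * a
            cost += np.linalg.norm(pos - subgoal)
            if np.linalg.norm(pos - OBSTACLE) < RADIUS:
                cost += 100.0  # hard penalty keeps rollouts away from the obstacle
        if cost < best_cost:
            best_cost, best_action = cost, seq[0]
    return best_action

state, target = np.array([0.0, 0.0]), np.array([1.0, 1.0])
action = mpc_actor(state, planner(state, target))
print(action)  # first action of the best sampled sequence
```

The split mirrors the abstract: the planner handles the long-horizon, sparse-reward credit assignment, while the short-horizon MPC enforces safety at every step.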
Retina-Based Pipe-Like Object Tracking Implemented Through Spiking Neural Network on a Snake Robot
Vision-based target tracking is crucial for bio-inspired snake robots exploring unknown environments. However, it is difficult for the traditional vision modules of snake robots to overcome the image blur resulting from their periodic swings. A promising approach is to use a neuromorphic vision sensor (NVS), which mimics the biological retina to detect a target at a higher temporal frequency and over a wider dynamic range. In this study, an NVS and a spiking neural network (SNN) were deployed on a snake robot for the first time to achieve pipe-like object tracking. An SNN based on the Hough transform was designed to detect a target from the asynchronous event stream produced by the NVS. Combining this detection with the state of snake motion estimated from the joint position sensors, a tracking framework was proposed. Experimental results obtained from the simulator demonstrate the validity of our framework and the autonomous locomotion ability of our snake robot. Comparing the performance of the SNN model on CPUs and GPUs, the model performed best on a GPU under a simplified, synchronous update rule, while it achieved higher precision on a CPU when updated asynchronously.
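The Hough-transform voting that the SNN implements with spiking neurons can be illustrated with a plain accumulator: each event votes for every candidate circle center at a known radius, and votes pile up at the true center (the image size, radius, and synthetic events below are illustrative assumptions):

```python
import numpy as np

W, H, R = 64, 64, 10  # toy image size and known pipe radius

def hough_vote(events, radius=R):
    """Each event (x, y) votes for all centers at distance `radius`.
    The paper realizes this voting with spiking neurons; here a plain
    accumulator array illustrates the same idea."""
    acc = np.zeros((W, H))
    thetas = np.linspace(0, 2 * np.pi, 72, endpoint=False)
    for x, y in events:
        cx = np.round(x - radius * np.cos(thetas)).astype(int)
        cy = np.round(y - radius * np.sin(thetas)).astype(int)
        ok = (0 <= cx) & (cx < W) & (0 <= cy) & (cy < H)
        np.add.at(acc, (cx[ok], cy[ok]), 1)  # accumulate duplicate votes
    return acc

# Synthesize events lying on a circle centered at (32, 32).
angles = np.linspace(0, 2 * np.pi, 100, endpoint=False)
events = np.stack([32 + R * np.cos(angles), 32 + R * np.sin(angles)], axis=1)
acc = hough_vote(events)
center = np.unravel_index(acc.argmax(), acc.shape)
print(center)  # peak of the accumulator, near (32, 32)
```

Because the voting is per-event, it maps naturally onto the asynchronous event stream of an NVS: each incoming event simply increments its ring of candidate centers.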
Language-Conditioned Imitation Learning with Base Skill Priors under Unstructured Data
The growing interest in language-conditioned robot manipulation aims to
develop robots capable of understanding and executing complex tasks, with the
objective of enabling robots to interpret language commands and manipulate
objects accordingly. While language-conditioned approaches demonstrate
impressive capabilities for addressing tasks in familiar environments, they
encounter limitations in adapting to unfamiliar environment settings. In this
study, we propose a general-purpose, language-conditioned approach that
combines base skill priors and imitation learning under unstructured data to
enhance the algorithm's generalization in adapting to unfamiliar environments.
We assess our model's performance in both simulated and real-world environments
in a zero-shot setting. In the simulated environment, the proposed approach
surpasses previously reported scores for the CALVIN benchmark, especially in
the challenging Zero-Shot Multi-Environment setting. The average completed task
length, indicating the average number of tasks the agent can complete
consecutively, improves by a factor of more than 2.5 compared to the
state-of-the-art method HULC. In addition, we conduct a zero-shot evaluation of
our policy in a real-world setting, after training exclusively in simulated
environments and without additional task-specific adaptation. In this
evaluation, we set up ten tasks, and our approach achieved an average
improvement of 30% over the current state-of-the-art approach, demonstrating
high generalization capability in both simulated environments and the real
world. For further details, including access to our code and videos, please
refer to our supplementary materials.
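One way to picture the role of base skill priors is a retrieval step that matches a language command to the closest skill embedding before handing off to low-level control (the skill names, embedding scheme, and dimensions below are illustrative assumptions, not the paper's learned model):

```python
import zlib
import numpy as np

DIM = 8  # toy embedding dimension

def embed(text):
    """Toy deterministic embedding: seed a generator per word via CRC32
    and average the resulting word vectors."""
    words = text.split()
    vecs = [np.random.default_rng(zlib.crc32(w.encode())).normal(size=DIM)
            for w in words]
    return np.mean(vecs, axis=0)

# A library of base skill priors, here simply the embeddings of skill names.
SKILLS = {name: embed(name) for name in ("pick", "place", "push")}

def select_skill(command):
    """Return the skill whose prior embedding best matches the command."""
    z = embed(command)
    return max(SKILLS, key=lambda name: float(z @ SKILLS[name]))

print(select_skill("push the drawer closed"))  # name of the selected skill prior
```

In the actual method the skill priors are learned latents rather than name embeddings, but the sketch shows how a discrete skill space can anchor a language-conditioned policy.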