Towards Informed Exploration for Deep Reinforcement Learning
In this thesis, we discuss various techniques for improving exploration in deep reinforcement learning. We begin with a brief review of reinforcement learning (RL) and the fundamental exploration vs. exploitation trade-off. Then we review how deep RL has improved upon classical RL and summarize six categories of the latest exploration methods for deep RL, in order of increasing usage of prior information. We then explore representative works in three categories and discuss their strengths and weaknesses. The first category, represented by Soft Q-learning, uses regularization to encourage exploration. The second category, represented by count-based exploration via hashing, maps states to hash codes for counting and assigns higher exploration bonuses to less-encountered states. The third category utilizes hierarchy and is represented by a modular architecture for RL agents to play StarCraft II. Finally, we conclude that exploration by prior knowledge is a promising research direction and suggest topics of potentially impact
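The count-based idea described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the class name `HashCounter`, the SimHash-style random projection, the bit width, and the bonus form `beta / sqrt(count)` are all assumptions chosen for the example.

```python
import math
import random
from collections import defaultdict


class HashCounter:
    """Hypothetical helper: counts visits per hash code and returns a
    bonus that shrinks as a code is seen more often."""

    def __init__(self, state_dim, n_bits=8, beta=1.0, seed=0):
        rng = random.Random(seed)
        # Assumed hashing scheme: a fixed random Gaussian projection,
        # where each row contributes one bit of the hash code.
        self.projection = [
            [rng.gauss(0.0, 1.0) for _ in range(state_dim)]
            for _ in range(n_bits)
        ]
        self.beta = beta
        self.counts = defaultdict(int)

    def hash_code(self, state):
        # The sign of each projected coordinate becomes one bit.
        return tuple(
            1 if sum(w * x for w, x in zip(row, state)) >= 0 else 0
            for row in self.projection
        )

    def bonus(self, state):
        # Increment the visit count for this code, then return the bonus.
        code = self.hash_code(state)
        self.counts[code] += 1
        return self.beta / math.sqrt(self.counts[code])


counter = HashCounter(state_dim=3)
first = counter.bonus([1.0, 2.0, 3.0])   # first visit to this code
second = counter.bonus([1.0, 2.0, 3.0])  # same state, so a smaller bonus
```

In a training loop, the bonus would typically be added to the environment reward, so that states whose hash codes have been seen rarely look more attractive to the agent.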
Proximal Policy Optimization with Relative Pearson Divergence
The recent remarkable progress of deep reinforcement learning (DRL) stands on
regularization of policy for stable and efficient learning. A popular method,
named proximal policy optimization (PPO), has been introduced for this purpose.
PPO clips the density ratio of the latest and baseline policies with a threshold,
while its minimization target is unclear. As another problem of PPO, the
threshold is given symmetrically in numerical terms while the density ratio itself
lies in an asymmetric domain, thereby causing unbalanced regularization of the policy.
This paper therefore proposes a new variant of PPO by considering a
regularization problem of relative Pearson (RPE) divergence, so-called PPO-RPE.
This regularization yields a clear minimization target, which constrains the
latest policy to the baseline one. Through its analysis, an intuitive
threshold-based design consistent with the asymmetry of the threshold and the
domain of the density ratio can be derived. On four benchmark tasks, PPO-RPE
performed as well as or better than the conventional methods in terms of the
task performance by the learned policy. Comment: 6 pages, 5 figures (accepted for ICRA2021)
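For readers unfamiliar with the clipping that the abstract above criticizes, here is a minimal sketch of the standard PPO clipped surrogate for a single sample. It illustrates conventional PPO only, not the proposed PPO-RPE; the function names and the default threshold `eps=0.2` are assumptions for the example.

```python
import math


def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample (to be maximized)."""
    # Clip the density ratio into the interval [1 - eps, 1 + eps].
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped * advantage)


def log_ratio_bounds(eps=0.2):
    """Clipping bounds expressed in log-ratio space."""
    return math.log(1.0 - eps), math.log(1.0 + eps)


lower, upper = log_ratio_bounds()
# The density ratio lives in (0, inf), so an interval that is symmetric
# around 1 is not symmetric around 0 in log space: |lower| exceeds upper.
```

The last comment makes the abstract's point concrete: a numerically symmetric threshold permits a larger move in one direction than the other when measured on the log scale of the ratio.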
A Review on Robot Manipulation Methods in Human-Robot Interactions
Robot manipulation is an important part of human-robot interaction
technology. However, traditional pre-programmed methods can only accomplish
simple and repetitive tasks. To enable effective communication between robots
and humans, and to predict and adapt to uncertain environments, this paper
reviews recent autonomous and adaptive learning algorithms for robotic
manipulation. It covers typical applications and challenges of human-robot
interaction, fundamental tasks of robot manipulation, and one of the most widely
used formulations of robot manipulation, the Markov Decision Process. Recent
research focusing on robot manipulation is mainly based on Reinforcement
Learning and Imitation Learning. This review paper shows the importance of Deep
Reinforcement Learning, which plays an important role in manipulating robots to
complete complex tasks in disturbed and unfamiliar environments. With the
introduction of Imitation Learning, it is possible for robot manipulation to
get rid of reward function design and achieve a simple, stable and supervised
learning process. This paper reviews and compares the main features and popular
algorithms for both Reinforcement Learning and Imitation Learning
Advances in Reinforcement Learning
Reinforcement Learning (RL) is a very dynamic area in terms of theory and application. This book brings together many different aspects of current research in several fields associated with RL, which has been growing rapidly and producing a wide variety of learning algorithms for different applications. Comprising 24 chapters, it covers a very broad variety of topics in RL and their application in autonomous systems. A set of chapters in this book provides a general overview of RL, while other chapters focus mostly on the applications of RL paradigms: Game Theory, Multi-Agent Theory, Robotics, Networking Technologies, Vehicular Navigation, Medicine and Industrial Logistics
Student Behavior Simulation in English Online Education Based on Reinforcement Learning
In class, no two students act in the same way. Now that most courses are taken online, tracking and identifying students’ behavior is a significant challenge, especially in language classes (English). In this study, a Student Behavior Simulation Based on Reinforcement Learning Framework (SBS–BRLF) is proposed to track and identify students’ online class behavior. The simulation model is generated from various trained sets of behavior that are categorized as positive or negative with Reinforcement Learning (RL). RL is a field of machine learning dealing with how intelligent agents act in an environment to maximize cumulative reward. With a web camera and microphone, the students are tracked in the simulation model, and the collected data is processed with the aid of RL. If an action is assessed as good, the pupil is praised; otherwise the pupil is given up to three warnings and, if the behavior is repeated, suspended for a day. Hence, the pupil is monitored easily without complications. The research and comparative analysis of the proposed and the current framework show that SBS–BRLF works efficiently and accurately, with a behavioral rate of 93.2%, a performance rate of 96%, a supervision rate of 92%, a reliability rate of 89.7% for students, and a higher action and reward acceptance rate of 89.9%