142,439 research outputs found
Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation
In reinforcement learning, domain randomisation is an increasingly popular
technique for learning more general policies that are robust to domain-shifts
at deployment. However, naively aggregating information from randomised domains
may lead to high variance in gradient estimation and unstable learning process.
To address this issue, we present a peer-to-peer online distillation strategy
for RL termed P2PDRL, where multiple workers are each assigned to a different
environment, and exchange knowledge through mutual regularisation based on
Kullback-Leibler divergence. Our experiments on continuous control tasks show
that P2PDRL enables robust learning across a wider randomisation distribution
than baselines, and more robust generalisation to new environments at testing
An architecture for designing fuzzy logic controllers using neural networks
Described here is an architecture for designing fuzzy controllers through a hierarchical process of control rule acquisition and by using special classes of neural network learning techniques. A new method for learning to refine a fuzzy logic controller is introduced. A reinforcement learning technique is used in conjunction with a multi-layer neural network model of a fuzzy controller. The model learns by updating its prediction of the plant's behavior and is related to the Sutton's Temporal Difference (TD) method. The method proposed here has the advantage of using the control knowledge of an experienced operator and fine-tuning it through the process of learning. The approach is applied to a cart-pole balancing system
- …