Action selection in modular reinforcement learning
Modular reinforcement learning is an approach to resolving the curse of dimensionality in traditional reinforcement learning. We design and implement a modular reinforcement learning algorithm based on three major components: Markov decision process decomposition, module training, and global action selection. We define and formalize the concepts of module class and module instance in the decomposition step. Under our decomposition framework, we train each module efficiently using the SARSA(λ) algorithm. We then design, implement, test, and compare three action selection algorithms based on different heuristics: Module Combination, Module Selection, and Module Voting. For the last two algorithms, we propose a method to calculate module weights efficiently using the standard deviation of each module's Q-values. We show that the Module Combination and Module Voting algorithms produce satisfactory performance in our test domain.
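The weighting idea in the abstract above can be sketched as follows. This is our own minimal illustration, not the paper's implementation: each module reports a row of Q-values for the current state, modules whose Q-values vary more (higher standard deviation) receive larger weights, and the weights drive either a weighted sum of Q-values (Module Combination) or a weighted vote for each module's greedy action (Module Voting). All function names are hypothetical.

```python
import numpy as np

def module_weights(q_values_per_module):
    # Weight each module by the standard deviation of its Q-values:
    # a module with nearly flat Q-values is indifferent to the choice,
    # so it gets a small weight (illustrative heuristic from the abstract).
    stds = np.array([np.std(q) for q in q_values_per_module])
    total = stds.sum()
    if total == 0.0:
        return np.ones(len(stds)) / len(stds)  # all indifferent: uniform
    return stds / total

def module_combination(q_values_per_module):
    # Module Combination: maximize the weighted sum of module Q-values.
    w = module_weights(q_values_per_module)
    combined = sum(wi * np.asarray(q) for wi, q in zip(w, q_values_per_module))
    return int(np.argmax(combined))

def module_voting(q_values_per_module):
    # Module Voting: each module casts its weight as a vote for its
    # own greedy action; the action with the most weighted votes wins.
    w = module_weights(q_values_per_module)
    votes = np.zeros(len(q_values_per_module[0]))
    for wi, q in zip(w, q_values_per_module):
        votes[int(np.argmax(q))] += wi
    return int(np.argmax(votes))
```

With two modules and two actions, a module that strongly prefers one action dominates an indifferent one under both rules.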
SMART (Stochastic Model Acquisition with ReinforcemenT) learning agents: A preliminary report
We present a framework for building agents that learn using SMART, a system that combines stochastic model acquisition with reinforcement learning to enable an agent to model its environment through experience and subsequently form action selection policies using the acquired model. We extend an existing algorithm for the automatic creation of stochastic STRIPS operators [9] as a preliminary method of environment modelling. We then define the process of generating future states using these operators and an initial state, and finally show how the agent can use the generated states to form a policy with a standard reinforcement learning algorithm. The potential of SMART is exemplified using the well-known predator-prey scenario. Results of applying SMART to this environment and directions for future work are discussed.
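The model-acquisition half of the loop described above can be sketched in a few lines. This is a hedged illustration of the general idea (count observed transitions, then sample the counts to generate plausible future states for a standard RL learner), not the paper's actual STRIPS-operator representation; the class and method names are our own.

```python
import random
from collections import defaultdict

class StochasticModel:
    """Frequency-based stochastic transition model learned from experience."""

    def __init__(self):
        # (state, action) -> {next_state: observation count}
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, state, action, next_state):
        # Model acquisition: record one real transition from experience.
        self.counts[(state, action)][next_state] += 1

    def sample(self, state, action):
        # State generation: draw a next state in proportion to how
        # often it was observed, yielding simulated experience that a
        # standard RL algorithm can then learn a policy from.
        outcomes = self.counts[(state, action)]
        states, weights = zip(*outcomes.items())
        return random.choices(states, weights=weights)[0]
```

Sampling from the learned model lets the agent generate future states without further real-world interaction, which is the core of the SMART loop as the abstract describes it.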
Reinforcement Learning for Ramp Control: An Analysis of Learning Parameters
Reinforcement Learning (RL) has been proposed to deal with ramp control problems under dynamic traffic conditions; however, there is a lack of sufficient research on the behaviour and impacts of different learning parameters. This paper describes a ramp control agent based on the RL mechanism and thoroughly analyzes the influence of three learning parameters, namely the learning rate, the discount rate, and the action selection parameter, on algorithm performance. Two indices, for learning speed and convergence stability, were used to measure algorithm performance, based on which a series of simulation-based experiments were designed and conducted using a macroscopic traffic flow model. Simulation results showed that, compared with the discount rate, the learning rate and the action selection parameter had more remarkable impacts on algorithm performance. Based on the analysis, some suggestions about how to select parameter values that achieve superior performance are provided.
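The three parameters the paper analyzes can be located in a plain tabular Q-learning agent. This is a generic sketch under our own assumptions (the paper does not specify its update rule here, and its action selection parameter may not be the epsilon of an epsilon-greedy rule):

```python
import random

def epsilon_greedy(q_row, epsilon):
    # Action selection parameter: with probability epsilon explore a
    # random action (e.g. a ramp metering rate); otherwise exploit
    # the current Q estimates for this traffic state.
    if random.random() < epsilon:
        return random.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])

def q_update(q, state, action, reward, next_state, alpha, gamma):
    # Learning rate alpha scales how fast new experience overwrites
    # old estimates (drives learning speed and stability); discount
    # rate gamma weighs delayed rewards against immediate ones.
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

For example, with alpha = 0.5 and gamma = 0.9, a reward of 1.0 followed by a state whose best Q-value is 1.0 moves a zero-initialized entry to 0.5 * (1.0 + 0.9 * 1.0) = 0.95, illustrating how the two rates jointly set the step size of each update.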
Solving Inverse Problems with Reinforcement Learning
In this paper, we formally introduce, with rigorous derivations, the use of reinforcement learning in the field of inverse problems by designing an iterative algorithm, called REINFORCE-IP, for solving a general type of non-linear inverse problem. By choosing specific probability models for the action-selection rule, we connect our approach to the conventional regularization methods of Tikhonov regularization and iterative regularization. For the numerical implementation of our approach, we parameterize the solution-searching rule with the help of neural networks and iteratively improve the parameters using a reinforcement learning algorithm, REINFORCE. Under standard assumptions, we prove the almost sure convergence of the parameters to a locally optimal value. Our work provides two typical examples (non-linear integral equations and parameter-identification problems in partial differential equations) of how reinforcement learning can be applied to solving non-linear inverse problems. Our numerical experiments show that REINFORCE-IP is an efficient algorithm that can escape from local minima and identify multiple solutions for inverse problems with non-uniqueness.
Comment: 33 pages, 10 figures
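The underlying REINFORCE update that the abstract refers to can be sketched for the simplest case: a softmax policy over discrete actions with linear logits. This is a hedged, generic illustration of the policy-gradient step, not the paper's neural-network parameterization of the solution-searching rule; all names are our own.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over action logits.
    z = np.exp(logits - logits.max())
    return z / z.sum()

def reinforce_update(theta, episode, lr=0.1):
    # theta: (n_actions, n_features) logit weights of the policy.
    # episode: list of (features, action, return-to-go) tuples.
    # REINFORCE ascends the expected return by weighting the
    # log-likelihood gradient of each taken action by its return.
    grad = np.zeros_like(theta)
    for x, a, ret in episode:
        p = softmax(theta @ x)
        one_hot = np.zeros_like(p)
        one_hot[a] = 1.0
        # Gradient of log pi(a|x) for a softmax-linear policy.
        grad += ret * np.outer(one_hot - p, x)
    return theta + lr * grad
```

After one update on an episode where action 0 earned a positive return, the policy's logit for action 0 rises relative to the alternatives, which is the mechanism REINFORCE-IP iterates to steer the solution search.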