63 research outputs found
Neural Q-learning for solving PDEs
Solving high-dimensional partial differential equations (PDEs) is a major challenge in scientific computing. We develop a new numerical method for solving elliptic-type PDEs by adapting the Q-learning algorithm in reinforcement learning. To solve PDEs with Dirichlet boundary conditions, our "Q-PDE" algorithm is mesh-free and therefore has the potential to overcome the curse of dimensionality. Using a neural tangent kernel (NTK) approach, we prove that the neural network approximator for the PDE solution, trained with the Q-PDE algorithm, converges to the trajectory of an infinite-dimensional ordinary differential equation (ODE) as the number of hidden units → ∞. For monotone PDEs (i.e. those given by monotone operators, which may be nonlinear), despite the lack of a spectral gap in the NTK, we then prove that the limit neural network, which satisfies the infinite-dimensional ODE, strongly converges in L² to the PDE solution as the training time → ∞. More generally, we can prove that any fixed point of the wide-network limit for the Q-PDE algorithm is a solution of the PDE (not necessarily under the monotone condition). The numerical performance of the Q-PDE algorithm is studied for several elliptic PDEs.
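The mesh-free idea in the abstract — train a neural network on randomly sampled points rather than on a grid — can be illustrated on a toy 1D elliptic problem. The sketch below is not the Q-PDE algorithm itself (its update rule is not given in the abstract); it fits only the output weights of a fixed random-feature tanh network, loosely in the spirit of the wide-network/NTK regime, to the Poisson problem −u″ = π² sin(πx) with Dirichlet conditions u(0) = u(1) = 0, whose exact solution is sin(πx).

```python
# Illustrative mesh-free elliptic solve: u(x) = sum_i a_i tanh(w_i x + b_i),
# with random fixed hidden weights and only the output weights a_i trained,
# applied to  -u''(x) = pi^2 sin(pi x)  on (0, 1),  u(0) = u(1) = 0.
import numpy as np

rng = np.random.default_rng(0)
n_feat, n_col = 200, 400
w = rng.uniform(-8.0, 8.0, n_feat)          # fixed hidden weights (assumption)
b = rng.uniform(-8.0, 8.0, n_feat)          # fixed hidden biases (assumption)

def feats(x):
    """phi_i(x) = tanh(w_i x + b_i); shape (len(x), n_feat)."""
    return np.tanh(np.outer(x, w) + b)

def feats_dd(x):
    """Second derivative of each feature: phi_i'' = -2 w_i^2 t (1 - t^2)."""
    t = feats(x)
    return -2.0 * (w**2) * t * (1.0 - t**2)

x_col = rng.uniform(0.0, 1.0, n_col)        # random collocation points (mesh-free)
f = np.pi**2 * np.sin(np.pi * x_col)        # right-hand side of -u'' = f

# Linear least squares:  -Phi'' a = f  plus heavily weighted Dirichlet rows.
lam = 1e3
A = np.vstack([-feats_dd(x_col), lam * feats(np.array([0.0, 1.0]))])
y = np.concatenate([f, [0.0, 0.0]])
a, *_ = np.linalg.lstsq(A, y, rcond=None)

x_test = np.linspace(0.0, 1.0, 101)
u = feats(x_test) @ a
err = np.max(np.abs(u - np.sin(np.pi * x_test)))
print(f"max abs error vs sin(pi x): {err:.2e}")
```

Because only the output layer is trained, the problem is linear in the parameters and a single least-squares solve replaces gradient descent; no mesh is ever built, which is the property the abstract credits for scaling to high dimension.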
An Improved Neural Q-learning Approach for Dynamic Path Planning of Mobile Robots
Dynamic path planning is an important task for mobile robots in complex and uncertain environments. This paper proposes an improved neural Q-learning (INQL) approach for dynamic path planning of mobile robots. In the proposed INQL approach, the reward function is designed based on a bio-inspired neural network so that the rate of convergence of INQL is improved compared with the Q-learning algorithm. In addition, by combining the INQL algorithm with cubic B-spline curves, the robot can move to the target position successfully via a feasible planned path in different dynamic environments. Simulation and experimental results illustrate the superiority of the INQL-based path planning method compared with existing popular path planning methods.
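As a point of reference for what INQL improves on, plain tabular Q-learning already solves path planning on a toy static grid. The sketch below is that baseline only; the bio-inspired reward shaping and cubic B-spline smoothing described in the abstract are not reproduced, and the grid, rewards, and hyperparameters are illustrative assumptions.

```python
# Baseline tabular Q-learning for path planning on a toy 4x4 grid with
# one obstacle (not the paper's INQL method; parameters are illustrative).
import random

random.seed(0)
N = 4
START, GOAL, OBSTACLE = (0, 0), (3, 3), (1, 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

def step(s, a):
    r, c = s[0] + a[0], s[1] + a[1]
    if not (0 <= r < N and 0 <= c < N) or (r, c) == OBSTACLE:
        return s, -1.0, False                  # blocked move: stay, penalty
    if (r, c) == GOAL:
        return (r, c), 10.0, True
    return (r, c), -0.1, False                 # step cost favours short paths

Q = {((r, c), i): 0.0 for r in range(N) for c in range(N) for i in range(4)}
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(2000):                          # training episodes
    s, done, steps = START, False, 0
    while not done and steps < 50:
        i = random.randrange(4) if random.random() < eps else \
            max(range(4), key=lambda k: Q[(s, k)])
        s2, r, done = step(s, ACTIONS[i])
        target = r + (0.0 if done else gamma * max(Q[(s2, k)] for k in range(4)))
        Q[(s, i)] += alpha * (target - Q[(s, i)])
        s, steps = s2, steps + 1

# Greedy rollout of the learned policy.
s, path = START, [START]
for _ in range(20):
    i = max(range(4), key=lambda k: Q[(s, k)])
    s, _, done = step(s, ACTIONS[i])
    path.append(s)
    if done:
        break
print("planned path:", path)
```

The greedy rollout yields a piecewise grid path; the B-spline step in the paper would then smooth such waypoints into a curve the robot can follow.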
EMBEDDED LEARNING ROBOT WITH FUZZY Q-LEARNING FOR OBSTACLE AVOIDANCE BEHAVIOR
Fuzzy Q-learning is an extension of the Q-learning algorithm that uses a fuzzy inference system to enable Q-learning to handle continuous actions and states. This learning method has been implemented in various robot learning applications such as obstacle avoidance and target searching. However, most of them have not been realized in embedded robots. This paper presents an implementation of fuzzy Q-learning for obstacle avoidance navigation in an embedded mobile robot. The experimental result demonstrates that fuzzy Q-learning enables the robot to learn the right policy, i.e. to avoid obstacles.
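The core mechanism — a fuzzy inference system interpolating discrete per-rule Q-values over a continuous state — can be sketched in a few lines. The fuzzy sets, actions, and toy training targets below are illustrative assumptions, not the paper's actual controller design.

```python
# Sketch of fuzzy Q-learning's continuous-state trick: per-rule q-vectors
# blended by triangular membership degrees over a normalised obstacle
# distance in [0, 1].  (Rule centres and actions are hypothetical.)
import numpy as np

centres = np.array([0.0, 0.5, 1.0])       # fuzzy sets: NEAR, MID, FAR
n_actions = 2                              # 0 = turn away, 1 = go straight
q = np.zeros((len(centres), n_actions))    # one q-vector per fuzzy rule

def memberships(s):
    """Triangular membership degrees, normalised to sum to 1."""
    mu = np.maximum(0.0, 1.0 - np.abs(s - centres) / 0.5)
    return mu / mu.sum()

def Q(s, a):
    """Continuous-state Q as a membership-weighted blend of rule q-values."""
    return memberships(s) @ q[:, a]

def update(s, a, target, alpha=0.5):
    """Distribute the TD error over rules in proportion to their firing."""
    q[:, a] += alpha * memberships(s) * (target - Q(s, a))

# Toy targets: turning is worth 1 when near an obstacle, going straight
# is worth 1 when far (a stand-in for environment rewards).
for _ in range(200):
    for s in np.random.default_rng(0).uniform(0, 1, 50):
        update(s, 0, 1.0 if s < 0.5 else 0.0)
        update(s, 1, 0.0 if s < 0.5 else 1.0)

print(Q(0.1, 0), Q(0.1, 1))   # near an obstacle, turning scores higher
```

The same blend also works for continuous actions by mixing each rule's best action instead of its q-values, which is what lets fuzzy Q-learning drive smooth motor commands on an embedded robot.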
BEHAVIOR BASED CONTROL AND FUZZY Q-LEARNING FOR AUTONOMOUS FIVE LEGS ROBOT NAVIGATION
This paper presents a collaboration of behavior-based control and fuzzy Q-learning for a five-legged robot navigation system. Many fuzzy Q-learning algorithms have been proposed to yield individual behaviors such as obstacle avoidance and target finding. However, for complicated tasks, all behaviors need to be combined in one control schema using behavior-based control. Based on this fact, this paper proposes a control schema that incorporates fuzzy Q-learning in a behavior-based schema to overcome complicated tasks in the navigation system of an autonomous five-legged robot. In the proposed schema, two behaviors are learned by fuzzy Q-learning; the other behaviors are constructed in the design step. All behaviors are coordinated by a hierarchical hybrid coordination node. Simulation results demonstrate that the robot with the proposed schema is able to learn the right policy, to avoid obstacles, and to find the target. However, fuzzy Q-learning failed to give the right policy for the robot to avoid collisions in corner locations. Keywords: behavior based control, fuzzy q-learning
How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies
Using deep neural nets as function approximators for reinforcement learning tasks has recently been shown to be very powerful for solving problems approaching real-world complexity. Using these results as a benchmark, we discuss the role that the discount factor may play in the quality of the learning process of a deep Q-network (DQN). When the discount factor progressively increases up to its final value, we empirically show that it is possible to significantly reduce the number of learning steps. When used in conjunction with a varying learning rate, we empirically show that it outperforms the original DQN on several experiments. We relate this phenomenon with the instabilities of neural networks when they are used in an approximate Dynamic Programming setting. We also describe the possibility of falling within a local optimum during the learning process, thus connecting our discussion with the exploration/exploitation dilemma. Comment: NIPS 2015 Deep Reinforcement Learning Workshop
A Comparison of the Performance of Neural Q-learning and Soar-RL on a Derivative of the Block Design (BD)/Block Design Multiple Choice (BDMC) Subtests on the WISC-IV Intelligence Test
Teaching an autonomous agent to perform tasks that are simple to humans can be complex, especially when the task requires successive steps, has a low likelihood of successful completion with a brute force approach, and when the solution space is too large or too complex to be explicitly encoded. Reinforcement learning algorithms are particularly suited to such situations, and are based on rewards that help the agent to find the optimal action to execute given a certain state. The task investigated in this thesis is a modified form of the Block Design (BD) and Block Design Multiple Choice (BDMC) subtests, used by the Fourth Edition of the Wechsler Intelligence Scale for Children (WISC-IV) to partially assess children's learning abilities. This thesis investigates the implementation, training, and performance of two reinforcement learning architectures for this problem: Soar-RL, a production system capable of reinforcement learning, and a Q-learning neural network. The objective is to help define the advantages and disadvantages of solving problems using these architectures. This thesis will show that Soar is intuitive for implementation and is able to find an optimal policy, although it is limited by its execution of exploratory actions. The neural network is also able to find an optimal policy and outperforms Soar, but the convergence of the solution is highly dependent on the architecture of the neural network.
- …