Combining Subgoal Graphs with Reinforcement Learning to Build a Rational Pathfinder
In this paper, we present a hierarchical path planning framework called SG-RL
(subgoal graphs-reinforcement learning), to plan rational paths for agents
maneuvering in continuous and uncertain environments. By "rational", we mean
that (1) planning is efficient enough to eliminate first-move lags, and (2) the
resulting paths are collision-free, smooth, and satisfy the agents' kinematic
constraints. SG-RL works in a
two-level manner. At the first level, SG-RL uses a geometric path-planning
method, i.e., Simple Subgoal Graphs (SSG), to efficiently find optimal abstract
paths, also called subgoal sequences. At the second level, SG-RL uses an RL
method, i.e., Least-Squares Policy Iteration (LSPI), to learn near-optimal
motion-planning policies which can generate kinematically feasible and
collision-free trajectories between adjacent subgoals. The first advantage of
the proposed method is that SSG mitigates the sparse-reward and local-minimum
problems faced by RL agents; thus, LSPI can be used to generate paths in
complex environments. The second advantage is that, when the environment
changes slightly (e.g., when unexpected obstacles appear), SG-RL does not need to
reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI
can deal with uncertainties by exploiting its generalization ability to handle
changes in environments. Simulation experiments in representative scenarios
demonstrate that, compared with existing methods, SG-RL can work well on
large-scale maps with relatively low action-switching frequencies and shorter
path lengths, and SG-RL can deal with small changes in environments. We further
demonstrate that the design of reward functions and the types of training
environments are important factors for learning feasible policies.
Comment: 20 page
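The two-level scheme described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the subgoal graph, its coordinates, and its edge costs are hypothetical, Dijkstra's algorithm stands in for SSG, and `local_policy` is a hand-written straight-line controller standing in for a policy that LSPI would learn.

```python
import heapq
import math


def shortest_subgoal_sequence(graph, start, goal):
    """High level: Dijkstra over a subgoal graph {node: {neighbor: cost}}."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, math.inf):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def local_policy(pos, target, step=0.5):
    """Low level: placeholder for a learned policy (assumption).

    Moves a fixed step straight toward the target and snaps onto it
    once it is within one step.
    """
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    d = math.hypot(dx, dy)
    if d <= step:
        return target
    return (pos[0] + step * dx / d, pos[1] + step * dy / d)


def plan(graph, coords, start, goal):
    """Plan a subgoal sequence, then roll out the local policy between subgoals."""
    seq = shortest_subgoal_sequence(graph, start, goal)
    if seq is None:
        return None, []
    pos = coords[start]
    traj = [pos]
    for subgoal in seq[1:]:
        while pos != coords[subgoal]:
            pos = local_policy(pos, coords[subgoal])
            traj.append(pos)
    return seq, traj


# Hypothetical four-subgoal map; edge costs are made up for illustration.
coords = {"A": (0.0, 0.0), "B": (2.0, 0.0), "C": (2.0, 2.0), "D": (0.0, 2.0)}
graph = {
    "A": {"B": 2.0, "D": 2.0},
    "B": {"A": 2.0, "C": 2.0},
    "C": {"B": 2.0, "D": 3.0},
    "D": {"A": 2.0, "C": 3.0},
}
seq, traj = plan(graph, coords, "A", "C")
```

In this sketch the high-level search runs only once; a local obstacle would be handled inside `local_policy` without touching the graph, which is the division of labor the abstract describes.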
Exploring the Free Energy Landscape: From Dynamics to Networks and Back
Knowledge of the topology of the Free Energy Landscape is essential to
understanding many biochemical processes. Determining the conformers of a
protein and their basins of attraction plays a central role in studying
molecular isomerization reactions. In this work, we present a novel framework
to unveil the features of a Free Energy Landscape, answering questions such as
how many metastable conformers there are, what the hierarchical relationship
among them is, or what the structure and kinetics of the transition paths are.
The landscape is explored by molecular dynamics simulations, and the microscopic
data of the trajectory are encoded into a Conformational Markov Network. The
structure of this graph reveals the regions of the conformational space
corresponding to the basins of attraction. In addition, analysis of the
Conformational Markov Network yields relevant kinetic quantities, such as dwell
times and rate constants, and the hierarchical relationship among basins, which
together complete the global picture of the landscape. We show the power of the
analysis by studying a toy model of a funnel-like potential and by efficiently
computing the conformers of a short peptide, dialanine, paving the way to a
systematic study of the Free Energy Landscape in large peptides.
Comment: PLoS Computational Biology (in press
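The encoding step described in the abstract above (trajectory data turned into a weighted network) can be sketched as follows. This is only an illustration under stated assumptions: the trajectory is taken to be already discretized into state labels, the labels and the sample trajectory are made up, and the basin-detection step of the paper is not reproduced. Dwell times are reported in units of trajectory frames.

```python
from collections import Counter, defaultdict
from itertools import groupby


def build_markov_network(traj):
    """Nodes weighted by occupancy, edges by transition probability."""
    occupancy = Counter(traj)
    counts = defaultdict(Counter)
    for a, b in zip(traj, traj[1:]):
        counts[a][b] += 1
    # Row-normalize transition counts into probabilities.
    probs = {
        a: {b: n / sum(row.values()) for b, n in row.items()}
        for a, row in counts.items()
    }
    return occupancy, probs


def mean_dwell_times(traj):
    """Average length (in frames) of uninterrupted visits to each state."""
    runs = defaultdict(list)
    for state, group in groupby(traj):
        runs[state].append(sum(1 for _ in group))
    return {state: sum(r) / len(r) for state, r in runs.items()}


# Hypothetical discretized trajectory over three conformational states.
traj = ["A", "A", "A", "B", "B", "A", "A", "C", "C", "C", "C", "A"]
occupancy, probs = build_markov_network(traj)
dwell = mean_dwell_times(traj)
```

Here `probs["A"]` holds the outgoing edge weights of node `A`, and `dwell["C"]` is the mean number of consecutive frames spent in state `C`.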
Variational Monte Carlo with the Multi-Scale Entanglement Renormalization Ansatz
Monte Carlo sampling techniques have been proposed as a strategy to reduce
the computational cost of contractions in tensor network approaches to solving
many-body systems. Here we put forward a variational Monte Carlo approach for
the multi-scale entanglement renormalization ansatz (MERA), which is a unitary
tensor network. Two major adjustments are required compared to previous
proposals with non-unitary tensor networks. First, instead of sampling over
configurations of the original lattice, made of L sites, we sample over
configurations of an effective lattice, which is made of just log(L) sites.
Second, the optimization of unitary tensors must account for their unitary
character while being robust to statistical noise, which we accomplish with a
modified steepest descent method within the set of unitary tensors. We
demonstrate the performance of the variational Monte Carlo MERA approach in the
relatively simple context of a finite quantum spin chain at criticality, and
discuss future, more challenging applications, including two-dimensional
systems.
Comment: 11 pages, 12 figures, a variety of minor clarifications and
correction
Two neural network algorithms for designing optimal terminal controllers with open final time
Multilayer neural networks, trained by the backpropagation through time algorithm (BPTT), have been used successfully as state-feedback controllers for nonlinear terminal control problems. Current BPTT techniques, however, are not able to deal systematically with open final-time situations such as minimum-time problems. Two approaches which extend BPTT to open final-time problems are presented. In the first, a neural network learns a mapping from initial-state to time-to-go. In the second, the optimal number of steps for each trial run is found using a line-search. Both methods are derived using Lagrange multiplier techniques. This theoretical framework is used to demonstrate that the derived algorithms are direct extensions of forward/backward sweep methods used in N-stage optimal control. The two algorithms are tested on a Zermelo problem and the resulting trajectories compare favorably to optimal control results
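The idea of treating the number of steps as a decision variable, as in the second approach of the abstract above, can be illustrated with a toy problem. This sketch is not the paper's algorithm: there is no neural network or BPTT, the system is an assumed one-dimensional integrator x[k+1] = x[k] + u*dt with a bounded constant control, and the search is a plain scan over candidate horizons rather than the authors' line search. All parameter names and values are hypothetical.

```python
def best_constant_control(x0, n_steps, dt, u_max):
    """Constant control that drives x toward 0 in n_steps, clipped to the bound."""
    u = -x0 / (n_steps * dt)
    return max(-u_max, min(u_max, u))


def cost(x0, n_steps, dt, u_max, miss_weight):
    """Elapsed time plus a quadratic penalty on the terminal miss distance."""
    u = best_constant_control(x0, n_steps, dt, u_max)
    miss = x0 + n_steps * dt * u
    return n_steps * dt + miss_weight * miss ** 2


def search_horizon(x0, dt, u_max, miss_weight, n_max):
    """Scan horizons 1..n_max and return the one with the lowest cost."""
    return min(
        range(1, n_max + 1),
        key=lambda n: cost(x0, n, dt, u_max, miss_weight),
    )


# Starting 3 units from the target with |u| <= 1 and dt = 0.5,
# at least 6 steps are needed to reach the target exactly.
best_n = search_horizon(x0=3.0, dt=0.5, u_max=1.0, miss_weight=100.0, n_max=20)
```

Shorter horizons are penalized for missing the target and longer ones for the extra time, so the scan settles on the shortest horizon that can reach the terminal state.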