Free energy reconstruction from steered dynamics without post-processing
Various methods that perform importance sampling over ensembles of nonequilibrium
trajectories make it possible to estimate free energy differences and, by
maximum-likelihood post-processing, to reconstruct free energy landscapes.
Here, based on Bayes' theorem, we propose a more direct method in which a
posterior likelihood function is used both to construct the steered dynamics
and to infer the contribution to equilibrium of all the sampled states. The
method is implemented with two steering schedules. First, using non-autonomous
steering, we calculate the migration barrier of the vacancy in Fe-alpha.
Second, using an autonomous scheduling related to metadynamics and equivalent
to temperature-accelerated molecular dynamics, we accurately reconstruct the
two-dimensional free energy landscape of the 38-atom Lennard-Jones cluster as a
function of an orientational bond-order parameter and energy, down to the
solid-solid structural transition temperature of the cluster and without
maximum-likelihood post-processing.
Comment: Accepted manuscript in Journal of Computational Physics, 7 figures
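To make the phrase "importance sampling in ensembles of nonequilibrium trajectories" concrete, here is a minimal sketch of one well-known estimator of that family, the Jarzynski equality, dF = -kT ln <exp(-W/kT)>. It is not the Bayesian method proposed in the abstract; the function name `jarzynski_free_energy` and the choice of reduced units (kT = 1) are assumptions made for illustration.

```python
import math


def jarzynski_free_energy(work_values, kT=1.0):
    """Estimate dF = -kT * ln(mean(exp(-W / kT))) from sampled work values.

    Hypothetical helper: `work_values` holds the work W accumulated along
    each steered (nonequilibrium) trajectory, in the same units as kT.
    """
    if not work_values:
        raise ValueError("need at least one work value")
    exponents = [-w / kT for w in work_values]
    # Log-sum-exp shift keeps the exponential average numerically stable.
    shift = max(exponents)
    mean_exp = sum(math.exp(e - shift) for e in exponents) / len(exponents)
    return -kT * (shift + math.log(mean_exp))
```

Because the exponential average is dominated by rare low-work trajectories, the estimate never exceeds the mean work and converges slowly when the work distribution is broad, which is one motivation for the more elaborate reconstruction schemes the abstract refers to.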
Active actuator fault-tolerant control of a wind turbine benchmark model
This paper describes the design of an active fault-tolerant control scheme that is applied to the actuator of a
wind turbine benchmark. The methodology is based on adaptive filters obtained via the nonlinear geometric
approach, which yields a useful decoupling property with respect to the uncertainty affecting the
wind turbine system. The controller accommodation scheme exploits the on-line estimate of the actuator
fault signal generated by the adaptive filters. The nonlinearity of the wind turbine model is described by the
mapping from the tip-speed ratio and blade pitch angle to the power conversion ratio. This mapping represents
the aerodynamic uncertainty and is usually not known in analytical form; instead, it is generally represented by
approximate two-dimensional maps (i.e., look-up tables). Therefore, this paper suggests a scheme to
estimate this power conversion ratio in an analytical form by means of a two-dimensional polynomial, which
is subsequently used for designing the active fault-tolerant control scheme. The wind turbine power generating
unit of a grid is considered as a benchmark to show the design procedure, including the aspects of
the nonlinear disturbance decoupling method, as well as the viability of the proposed approach. Extensive
simulations of the benchmark process serve as practical tools for assessing the features of the
developed actuator fault-tolerant control scheme in the presence of modelling and measurement errors.
Comparisons with different fault-tolerant schemes serve to highlight the advantages and drawbacks of the
proposed methodology.
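The idea of replacing a look-up table with a two-dimensional polynomial can be sketched as an ordinary least-squares fit. This is only an illustration under stated assumptions: the abstract does not give the polynomial degree or the fitting procedure, so the total degree of 2, the monomial basis, and the names `poly_terms`, `fit_cp_polynomial` and `eval_cp` are all hypothetical, and any table passed in would be the user's own data rather than the benchmark's.

```python
from itertools import product


def poly_terms(lam, beta, degree):
    """Monomials lam**i * beta**j with i + j <= degree (assumed basis)."""
    return [lam**i * beta**j
            for i, j in product(range(degree + 1), repeat=2)
            if i + j <= degree]


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_cp_polynomial(samples, degree=2):
    """Least-squares fit to (tip_speed_ratio, pitch_angle, cp) table entries."""
    rows = [poly_terms(lam, beta, degree) for lam, beta, _ in samples]
    targets = [cp for _, _, cp in samples]
    m = len(rows[0])
    # Normal equations: (A^T A) c = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    aty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(m)]
    return solve_linear(ata, aty)


def eval_cp(coeffs, lam, beta, degree=2):
    """Evaluate the fitted polynomial at one operating point."""
    return sum(c * t for c, t in zip(coeffs, poly_terms(lam, beta, degree)))
```

An analytical expression of this kind can be differentiated and manipulated symbolically, which is what a design method such as the nonlinear geometric approach needs and what a raw look-up table cannot offer. For larger tables or higher degrees, a QR-based solver would be numerically safer than the normal equations used here.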
Belief State Planning for Autonomously Navigating Urban Intersections
Urban intersections represent a complex environment for autonomous vehicles
with many sources of uncertainty. The vehicle must plan in a stochastic
environment with potentially rapid changes in driver behavior. Providing an
efficient strategy to navigate through urban intersections is a difficult task.
This paper frames the problem of navigating unsignalized intersections as a
partially observable Markov decision process (POMDP) and solves it using a
Monte Carlo sampling method. Empirical results in simulation show that the
resulting policy outperforms a threshold-based heuristic strategy on several
relevant metrics that measure both safety and efficiency.
Comment: 6 pages, 6 figures, accepted to IV201
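In a POMDP the planner acts on a belief, that is, a probability distribution over hidden states, which is revised after every observation. The sketch below shows the exact discrete Bayes update for such a belief. It is an illustration only: the abstract's solver uses Monte Carlo sampling rather than this exact update, and the state names, observation names, and all probabilities in the usage lines are invented, with the dependence of the transition model on the ego vehicle's action left out for brevity.

```python
def belief_update(belief, transition, obs_likelihood, observation):
    """One step of a discrete Bayes filter.

    belief[s]              : prior probability of hidden state s
    transition[s][s2]      : P(next state s2 | current state s)
    obs_likelihood[s][o]   : P(observation o | state s)
    """
    states = list(belief)
    # Prediction: push the belief through the transition model.
    predicted = {s2: sum(belief[s1] * transition[s1][s2] for s1 in states)
                 for s2 in states}
    # Correction: weight by the likelihood of what was observed, then normalize.
    unnormalized = {s: obs_likelihood[s][observation] * predicted[s]
                    for s in states}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical example: is the other driver going to yield?
prior = {"yielding": 0.5, "not_yielding": 0.5}
transition = {"yielding": {"yielding": 0.9, "not_yielding": 0.1},
              "not_yielding": {"yielding": 0.1, "not_yielding": 0.9}}
obs_likelihood = {"yielding": {"decelerating": 0.8, "constant_speed": 0.2},
                  "not_yielding": {"decelerating": 0.2, "constant_speed": 0.8}}
posterior = belief_update(prior, transition, obs_likelihood, "decelerating")
```

Observing a deceleration shifts probability mass toward the "yielding" hypothesis, and a planner can then choose actions against that updated belief instead of against a single guessed state.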
The Coordinate Particle Filter - A novel Particle Filter for High Dimensional Systems
Parametric filters, such as the Extended Kalman Filter and the Unscented
Kalman Filter, typically scale well with the dimensionality of the problem, but
they are known to fail if the posterior state distribution cannot be closely
approximated by a density of the assumed parametric form. For nonparametric
filters, such as the Particle Filter, the converse holds. Such methods are able
to approximate any posterior, but the computational requirements scale
exponentially with the number of dimensions of the state space. In this paper,
we present the Coordinate Particle Filter which alleviates this problem. We
propose to compute the particle weights recursively, dimension by dimension.
This allows us to explore one dimension at a time, and resample after each
dimension if necessary. Experimental results on simulated as well as real data
confirm that the proposed method has a substantial performance advantage over
the Particle Filter in high-dimensional systems where not all dimensions are
highly correlated. We demonstrate the benefits of the proposed method for the
problem of multi-object and robotic manipulator tracking.
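The dimension-by-dimension weighting described above can be sketched for a single measurement update. This is a heavily simplified illustration, not the published Coordinate Particle Filter: it assumes a standard normal prior on every coordinate, an independent Gaussian observation of each coordinate, and no system dynamics. The function names, the particle count, and the resampling threshold of half the particle count are assumptions.

```python
import math
import random


def effective_sample_size(weights):
    """ESS of normalized weights; small values indicate weight degeneracy."""
    return 1.0 / sum(w * w for w in weights)


def coordinate_update(observation, n_particles=2000, obs_std=0.5, seed=0):
    """Sample and weight one coordinate at a time, resampling when needed."""
    rng = random.Random(seed)
    dim = len(observation)
    particles = [[0.0] * dim for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for d in range(dim):
        for k, particle in enumerate(particles):
            # Propose coordinate d from the assumed N(0, 1) prior.
            particle[d] = rng.gauss(0.0, 1.0)
            # Multiply in the likelihood of coordinate d only.
            err = (observation[d] - particle[d]) / obs_std
            weights[k] *= math.exp(-0.5 * err * err)
        total = sum(weights)
        weights = [w / total for w in weights]
        # Resample after this dimension if the weights have degenerated.
        if effective_sample_size(weights) < n_particles / 2:
            chosen = rng.choices(particles, weights=weights, k=n_particles)
            particles = [p[:] for p in chosen]
            weights = [1.0 / n_particles] * n_particles
    return particles, weights


def weighted_mean(particles, weights):
    """Posterior mean estimate, one value per coordinate."""
    dim = len(particles[0])
    return [sum(w * p[d] for p, w in zip(particles, weights))
            for d in range(dim)]
```

Weighting all coordinates at once would multiply every per-coordinate likelihood into a single weight before any resampling, so most particles would end up with negligible weight as the dimension grows. Resampling between coordinates keeps the particle set concentrated on values that are plausible for the coordinates processed so far.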
Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines
Learning strategies for imperfect information games from samples of
interaction is a challenging problem. A common method for this setting, Monte
Carlo Counterfactual Regret Minimization (MCCFR), can have slow long-term
convergence rates due to high variance. In this paper, we introduce a variance
reduction technique (VR-MCCFR) that applies to any sampling variant of MCCFR.
Using this technique, per-iteration estimated values and updates are
reformulated as a function of sampled values and state-action baselines,
similar to their use in policy gradient reinforcement learning. The new
formulation allows estimates to be bootstrapped from other estimates within the
same episode, propagating the benefits of baselines along the sampled
trajectory; the estimates remain unbiased even when bootstrapping from other
estimates. Finally, we show that given a perfect baseline, the variance of the
value estimates can be reduced to zero. Experimental evaluation shows that
VR-MCCFR brings an order of magnitude speedup, while the empirical variance
decreases by three orders of magnitude. The decreased variance allows CFR+ to be
used with sampling for the first time, increasing the speedup to two orders
of magnitude.
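The role of a baseline can be illustrated at a single decision point, as a control variate on an importance-sampled value estimate. The sketch below is a one-step simplification under stated assumptions: in the method described above the sampled value is itself a bootstrapped estimate from deeper in the trajectory, whereas here it is a known number so that the mean and variance can be computed exactly by enumeration. The function names and the action values are hypothetical.

```python
def baseline_corrected_values(sampled_action, sampled_value,
                              sampling_probs, baselines):
    """Value estimates for all actions after sampling a single action.

    Unsampled actions get their baseline; the sampled action gets the
    baseline plus an importance-weighted correction.
    """
    estimates = dict(baselines)
    a = sampled_action
    estimates[a] = baselines[a] + (sampled_value - baselines[a]) / sampling_probs[a]
    return estimates


def estimator_moments(true_values, sampling_probs, baselines, action):
    """Exact mean and variance of the estimate for `action`.

    Enumerates every action that could have been sampled, weighted by its
    sampling probability, instead of drawing random samples.
    """
    mean = 0.0
    second_moment = 0.0
    for a, q in sampling_probs.items():
        est = baseline_corrected_values(
            a, true_values[a], sampling_probs, baselines)[action]
        mean += q * est
        second_moment += q * est * est
    return mean, second_moment - mean * mean


# Hypothetical action values and sampling policy.
true_values = {"fold": -1.0, "call": 0.5, "raise": 2.0}
sampling_probs = {"fold": 0.2, "call": 0.3, "raise": 0.5}
zero_baseline = {a: 0.0 for a in true_values}   # plain importance sampling
perfect_baseline = dict(true_values)            # baseline equals the true value
```

With any baseline the expected estimate equals the true value, because the baseline term and its importance-weighted correction cancel in expectation. With a perfect baseline the correction is always zero, so the variance vanishes, which matches the zero-variance claim made in the abstract.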
Combining Planning and Deep Reinforcement Learning in Tactical Decision Making for Autonomous Driving
Tactical decision making for autonomous driving is challenging due to the
diversity of environments, the uncertainty in the sensor information, and the
complex interaction with other road users. This paper introduces a general
framework for tactical decision making, which combines the concepts of planning
and learning, in the form of Monte Carlo tree search and deep reinforcement
learning. The method is based on the AlphaGo Zero algorithm, which is extended
to a domain with a continuous state space where self-play cannot be used. The
framework is applied to two different highway driving cases in a simulated
environment and it is shown to perform better than a commonly used baseline
method. The strength of combining planning and learning is also illustrated by
a comparison to using the Monte Carlo tree search or the neural network policy
separately.
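One way planning and learning meet in an AlphaGo Zero style search is the action-selection rule inside the tree, where a neural network prior steers exploration and visit statistics supply the value estimate. The sketch below implements the PUCT rule from the original AlphaGo Zero algorithm, Q + c * P * sqrt(sum of visits) / (1 + visits). Whether the extended method described above uses exactly this rule is not stated in the abstract, and the dictionary layout, the action names, and the default `c_puct` are assumptions.

```python
import math


def puct_select(children, c_puct=1.0):
    """Pick the child action that maximizes value plus a prior-weighted bonus.

    children[action] is a dict with:
      "visits": number of times the action was tried in the search,
      "value" : mean value estimate from those visits,
      "prior" : probability assigned to the action by the policy network.
    """
    total_visits = sum(child["visits"] for child in children.values())

    def score(child):
        exploration = (c_puct * child["prior"] * math.sqrt(total_visits)
                       / (1 + child["visits"]))
        return child["value"] + exploration

    return max(children, key=lambda action: score(children[action]))


# Hypothetical statistics at one node of a highway-driving search tree.
node = {
    "stay_in_lane": {"visits": 10, "value": 0.5, "prior": 0.6},
    "change_left": {"visits": 0, "value": 0.0, "prior": 0.3},
    "change_right": {"visits": 2, "value": 0.4, "prior": 0.1},
}
chosen = puct_select(node)
```

The bonus term shrinks as an action accumulates visits, so the search starts out guided by the learned prior and gradually defers to the values it has measured itself.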