
    Free energy reconstruction from steered dynamics without post-processing

    Various methods that achieve importance sampling in ensembles of nonequilibrium trajectories make it possible to estimate free energy differences and, by maximum-likelihood post-processing, to reconstruct free energy landscapes. Here, based on Bayes' theorem, we propose a more direct method in which a posterior likelihood function is used both to construct the steered dynamics and to infer the contribution to equilibrium of all the sampled states. The method is implemented with two steering schedules. First, using non-autonomous steering, we calculate the migration barrier of the vacancy in Fe-alpha. Second, using an autonomous scheduling related to metadynamics and equivalent to temperature-accelerated molecular dynamics, we accurately reconstruct the two-dimensional free energy landscape of the 38-atom Lennard-Jones cluster as a function of an orientational bond-order parameter and energy, down to the solid-solid structural transition temperature of the cluster and without maximum-likelihood post-processing. Comment: Accepted manuscript in Journal of Computational Physics, 7 figures

    Active actuator fault-tolerant control of a wind turbine benchmark model

    This paper describes the design of an active fault-tolerant control scheme that is applied to the actuator of a wind turbine benchmark. The methodology is based on adaptive filters obtained via the nonlinear geometric approach, which yields a decoupling property with respect to the uncertainty affecting the wind turbine system. The controller accommodation scheme exploits the on-line estimate of the actuator fault signal generated by the adaptive filters. The nonlinearity of the wind turbine model is described by the mapping from the tip-speed ratio and blade pitch angle to the power conversion ratio. This mapping represents the aerodynamic uncertainty; it is usually not known in analytical form but is instead represented by approximate two-dimensional maps (i.e., look-up tables). Therefore, this paper suggests a scheme to estimate this power conversion ratio in analytical form by means of a two-dimensional polynomial, which is subsequently used for designing the active fault-tolerant control scheme. The wind turbine power generating unit of a grid is considered as a benchmark to show the design procedure, including the aspects of the nonlinear disturbance decoupling method, as well as the viability of the proposed approach. Extensive simulations of the benchmark process are used to assess the features of the developed actuator fault-tolerant control scheme in the presence of modelling and measurement errors. Comparisons with different fault-tolerant schemes serve to highlight the advantages and drawbacks of the proposed methodology.
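The idea of replacing a look-up table by a two-dimensional polynomial can be illustrated with an ordinary least-squares fit. The sketch below is only an illustration under stated assumptions: the function names `fit_poly2d` and `eval_poly2d`, the polynomial degree, and the synthetic table are all hypothetical and are not taken from the paper, whose estimation scheme may differ.

```python
import numpy as np

def fit_poly2d(x, y, z, degree=2):
    """Least-squares fit of z ~ sum_{i+j<=degree} c_ij * x**i * y**j."""
    x, y, z = (np.asarray(a, dtype=float).ravel() for a in (x, y, z))
    # All monomial exponents (i, j) with total degree at most `degree`.
    exponents = [(i, j) for i in range(degree + 1) for j in range(degree + 1 - i)]
    design = np.column_stack([x**i * y**j for i, j in exponents])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    return exponents, coeffs

def eval_poly2d(exponents, coeffs, x, y):
    """Evaluate the fitted polynomial at the points (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return sum(c * x**i * y**j for (i, j), c in zip(exponents, coeffs))

# Hypothetical look-up table: a synthetic surface, NOT real turbine data.
tsr = np.linspace(2.0, 12.0, 21)      # tip-speed ratio grid (assumed range)
pitch = np.linspace(0.0, 20.0, 21)    # pitch angle grid in degrees (assumed range)
T, B = np.meshgrid(tsr, pitch, indexing="ij")
table = 0.4 - 0.01 * (T - 8.0) ** 2 - 0.002 * B * T

exps, coef = fit_poly2d(T, B, table, degree=2)
max_err = np.max(np.abs(eval_poly2d(exps, coef, T, B) - table))
print(f"maximum absolute fit error: {max_err:.2e}")
```

Because the synthetic table is itself a quadratic surface, the fit reproduces it up to round-off; a real aerodynamic map would leave a residual that depends on the chosen degree.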

    Belief State Planning for Autonomously Navigating Urban Intersections

    Urban intersections represent a complex environment for autonomous vehicles with many sources of uncertainty. The vehicle must plan in a stochastic environment with potentially rapid changes in driver behavior. Providing an efficient strategy to navigate through urban intersections is a difficult task. This paper frames the problem of navigating unsignalized intersections as a partially observable Markov decision process (POMDP) and solves it using a Monte Carlo sampling method. Empirical results in simulation show that the resulting policy outperforms a threshold-based heuristic strategy on several relevant metrics that measure both safety and efficiency. Comment: 6 pages, 6 figures, accepted to IV201

    The Coordinate Particle Filter - A novel Particle Filter for High Dimensional Systems

    Parametric filters, such as the Extended Kalman Filter and the Unscented Kalman Filter, typically scale well with the dimensionality of the problem, but they are known to fail if the posterior state distribution cannot be closely approximated by a density of the assumed parametric form. For nonparametric filters, such as the Particle Filter, the converse holds. Such methods are able to approximate any posterior, but the computational requirements scale exponentially with the number of dimensions of the state space. In this paper, we present the Coordinate Particle Filter which alleviates this problem. We propose to compute the particle weights recursively, dimension by dimension. This allows us to explore one dimension at a time, and resample after each dimension if necessary. Experimental results on simulated as well as real data confirm that the proposed method has a substantial performance advantage over the Particle Filter in high-dimensional systems where not all dimensions are highly correlated. We demonstrate the benefits of the proposed method for the problem of multi-object and robotic manipulator tracking
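The "resample if necessary" step mentioned in the abstract is commonly decided in particle filters by monitoring the effective sample size. The sketch below shows that generic building block together with systematic resampling; it is not the Coordinate Particle Filter itself, and the function names and the example weights are assumptions chosen for illustration.

```python
import numpy as np

def effective_sample_size(weights):
    """Effective sample size 1 / sum(w_i^2) of the normalized weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

def systematic_resample(weights, rng):
    """Return particle indices drawn by systematic resampling."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    n = len(w)
    # One uniform offset shared by n evenly spaced positions in [0, 1).
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(w)
    cumulative[-1] = 1.0  # guard against round-off in the last bin
    return np.searchsorted(cumulative, positions)

rng = np.random.default_rng(0)
weights = np.array([0.7, 0.1, 0.1, 0.1])   # hypothetical particle weights
ess = effective_sample_size(weights)

# Resample only when the effective sample size drops below half the particles
# (the 0.5 threshold is a common convention, assumed here).
if ess < 0.5 * len(weights):
    indices = systematic_resample(weights, rng)
else:
    indices = np.arange(len(weights))
```

With these weights the effective sample size is below two, so resampling is triggered and the dominant particle is duplicated.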

    Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines

    Learning strategies for imperfect information games from samples of interaction is a challenging problem. A common method for this setting, Monte Carlo Counterfactual Regret Minimization (MCCFR), can have slow long-term convergence rates due to high variance. In this paper, we introduce a variance reduction technique (VR-MCCFR) that applies to any sampling variant of MCCFR. Using this technique, per-iteration estimated values and updates are reformulated as a function of sampled values and state-action baselines, similar to their use in policy gradient reinforcement learning. The new formulation allows estimates to be bootstrapped from other estimates within the same episode, propagating the benefits of baselines along the sampled trajectory; the estimates remain unbiased even when bootstrapping from other estimates. Finally, we show that given a perfect baseline, the variance of the value estimates can be reduced to zero. Experimental evaluation shows that VR-MCCFR brings an order of magnitude speedup, while the empirical variance decreases by three orders of magnitude. The decreased variance allows CFR+ to be used with sampling for the first time, increasing the speedup to two orders of magnitude.
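The role of a baseline can be illustrated on a single decision with a control-variate estimator: when one action is sampled with probability q(a), each action's value is estimated as b(a) + 1[a sampled] (v(a) - b(a)) / q(a). This is a minimal one-step sketch and not the full VR-MCCFR update; the function names, action values, and sampling probabilities are hypothetical. The mean and variance are computed exactly by enumerating the sampled action, so no randomness is involved.

```python
def baseline_estimate(values, baselines, probs, sampled):
    """Per-action estimate b[a] + 1[a == sampled] * (v[a] - b[a]) / q[a]."""
    return [
        b + ((v - b) / q if a == sampled else 0.0)
        for a, (v, b, q) in enumerate(zip(values, baselines, probs))
    ]

def exact_mean_and_variance(values, baselines, probs):
    """Exact mean and variance of each estimate over the sampling distribution."""
    n = len(values)
    mean = [0.0] * n
    second = [0.0] * n
    for sampled, q in enumerate(probs):
        estimate = baseline_estimate(values, baselines, probs, sampled)
        for a in range(n):
            mean[a] += q * estimate[a]
            second[a] += q * estimate[a] ** 2
    variance = [s - m * m for s, m in zip(second, mean)]
    return mean, variance

values = [1.0, -2.0, 0.5]   # hypothetical true action values
probs = [0.5, 0.3, 0.2]     # hypothetical sampling probabilities

mean_none, var_none = exact_mean_and_variance(values, [0.0, 0.0, 0.0], probs)
mean_perfect, var_perfect = exact_mean_and_variance(values, values, probs)

print("no baseline:      mean", mean_none, "variance", var_none)
print("perfect baseline: mean", mean_perfect, "variance", var_perfect)
```

Both estimators have the true values as their mean, but only the perfect baseline (b equal to v) removes the variance, which mirrors the zero-variance statement in the abstract for this simplified setting.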

    Combining Planning and Deep Reinforcement Learning in Tactical Decision Making for Autonomous Driving

    Tactical decision making for autonomous driving is challenging due to the diversity of environments, the uncertainty in the sensor information, and the complex interaction with other road users. This paper introduces a general framework for tactical decision making, which combines the concepts of planning and learning, in the form of Monte Carlo tree search and deep reinforcement learning. The method is based on the AlphaGo Zero algorithm, which is extended to a domain with a continuous state space where self-play cannot be used. The framework is applied to two different highway driving cases in a simulated environment and it is shown to perform better than a commonly used baseline method. The strength of combining planning and learning is also illustrated by a comparison to using the Monte Carlo tree search or the neural network policy separately