    Batch Reinforcement Learning on the Industrial Benchmark: First Experiences

    The Particle Swarm Optimization Policy (PSO-P) was recently introduced and shown to produce remarkable results on academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further examine its properties and its feasibility for real-world applications, this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims to be realistic by including a variety of aspects found in industrial applications, such as continuous state and action spaces, a high-dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on IB are compared to those of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is of interest not only for academic benchmarks but also for real-world industrial applications, since it also yielded the best-performing policy in our IB setting. Compared to other well-established RL techniques, PSO-P produced outstanding results in performance and robustness while requiring relatively little effort to find adequate parameters or make complex design decisions.

    Generating Interpretable Fuzzy Controllers using Particle Swarm Optimization and Genetic Programming

    Autonomously training interpretable control strategies, called policies, using pre-existing plant trajectory data is of great interest in industrial applications. Fuzzy controllers have been used in industry for decades as interpretable and efficient system controllers. In this study, we introduce a fuzzy genetic programming (GP) approach called fuzzy GP reinforcement learning (FGPRL) that can select the relevant state features, determine the size of the required fuzzy rule set, and automatically adjust all the controller parameters simultaneously. Each GP individual's fitness is computed using model-based batch reinforcement learning (RL), which first trains a model using available system samples and subsequently performs Monte Carlo rollouts to predict each policy candidate's performance. We compare FGPRL to an extended version of a related method called fuzzy particle swarm reinforcement learning (FPSRL), which uses swarm intelligence to tune the fuzzy policy parameters. Experiments using an industrial benchmark show that FGPRL is able to autonomously learn interpretable fuzzy policies with high control performance. Comment: Accepted at the Genetic and Evolutionary Computation Conference 2018 (GECCO '18).
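
    The fitness evaluation described in this abstract (score a policy candidate by Monte Carlo rollouts on a system model) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_model` is a hand-written stand-in for a model that would normally be trained from system samples, and `rollout_return`, `policy_fitness`, the horizon, and the discount factor are hypothetical choices, not the authors' implementation.

```python
import random


def toy_model(state, action, rng):
    """Hypothetical stand-in for a learned transition model.

    Returns (next_state, reward). A real setup would train this
    from recorded system samples instead of hard-coding it.
    """
    next_state = state + action + rng.gauss(0.0, 0.01)
    reward = -next_state ** 2  # reward is highest when the state is near 0
    return next_state, reward


def rollout_return(model, policy, start_state, horizon, rng, discount=0.99):
    """Simulate one trajectory on the model and return its discounted return."""
    state, total, weight = start_state, 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        state, reward = model(state, action, rng)
        total += weight * reward
        weight *= discount
    return total


def policy_fitness(model, policy, start_states, horizon=20, seed=0):
    """Average the rollout returns over a set of start states."""
    rng = random.Random(seed)
    returns = [rollout_return(model, policy, s, horizon, rng)
               for s in start_states]
    return sum(returns) / len(returns)


# Compare two hypothetical policy candidates on the same start states.
starts = [-1.0, 0.5, 2.0]
fitness_damping = policy_fitness(toy_model, lambda s: -0.5 * s, starts)
fitness_idle = policy_fitness(toy_model, lambda s: 0.0, starts)
```

    In an evolutionary search, a value like `fitness_damping` would serve as the fitness of one individual. In this toy setting the damping policy should score higher than the idle one because it drives the state toward zero.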

    Multi-objective climb path optimization for aircraft/engine integration using Particle Swarm Optimization

    In this article, a new multi-objective approach to the aircraft climb path optimization problem, based on the Particle Swarm Optimization algorithm, is introduced for use in aircraft–engine integration studies. It combines a simulation with a traditional Energy approach and incorporates, among other elements, a proposed path-tracking scheme for guidance in the Altitude–Mach plane. The adoption of a population-based solver simplifies case setup, allowing for direct interfaces between the optimizer and aircraft/engine performance codes. A two-level optimization scheme is employed and is shown to improve search performance compared to the basic PSO algorithm. The effectiveness of the proposed methodology is demonstrated in a hypothetical engine upgrade scenario for the F-4 aircraft, considering the replacement of the aircraft’s J79 engine with the EJ200; a clear advantage of the EJ200-equipped configuration is revealed, resulting, on average, in 15% faster climbs with 20% less fuel.

    Adaptive particle swarm optimization

    An adaptive particle swarm optimization (APSO) that features better search efficiency than classical particle swarm optimization (PSO) is presented. More importantly, it can perform a global search over the entire search space with faster convergence speed. The APSO consists of two main steps. First, by evaluating the population distribution and particle fitness, a real-time evolutionary state estimation procedure is performed to identify, in each generation, one of four defined evolutionary states: exploration, exploitation, convergence, and jumping out. It enables the automatic control of inertia weight, acceleration coefficients, and other algorithmic parameters at run time to improve the search efficiency and convergence speed. Then, an elitist learning strategy is performed when the evolutionary state is classified as the convergence state. The strategy acts on the globally best particle so that it can jump out of likely local optima. The APSO has been comprehensively evaluated on 12 unimodal and multimodal benchmark functions. The effects of parameter adaptation and elitist learning are studied. Results show that APSO substantially enhances the performance of the PSO paradigm in terms of convergence speed, global optimality, solution accuracy, and algorithm reliability. As APSO introduces only two new parameters to the PSO paradigm, it does not introduce additional design or implementation complexity.
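
    For orientation, the classical global-best PSO that APSO builds on can be sketched as follows. This is a minimal illustration with fixed parameters, not the adaptive method from the abstract: the function name `pso_minimize`, the constant values for `inertia`, `c1`, and `c2`, and the `sphere` test function are assumptions chosen for the example. APSO would adjust the inertia weight and acceleration coefficients at run time instead of holding them constant.

```python
import random


def pso_minimize(objective, bounds, n_particles=30, n_iters=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over box `bounds` with basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its personal best; the swarm shares a global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia term + cognitive pull + social pull.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move and clamp the particle to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Unimodal test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_val = pso_minimize(sphere, [(-5.0, 5.0)] * 2)
```

    The three parameters held constant here (`inertia`, `c1`, `c2`) correspond to the inertia weight and acceleration coefficients that the abstract says APSO controls automatically.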

    Decentralized MPC based Obstacle Avoidance for Multi-Robot Target Tracking Scenarios

    In this work, we consider the problem of decentralized multi-robot target tracking and obstacle avoidance in dynamic environments. Each robot executes a local motion planning algorithm which is based on model predictive control (MPC). The planner is designed as a quadratic program, subject to constraints on robot dynamics and obstacle avoidance. Repulsive potential field functions are employed to avoid obstacles. The novelty of our approach lies in embedding these non-linear potential field functions as constraints within a convex optimization framework. Our method convexifies non-convex constraints and dependencies by replacing them with pre-computed external input forces in the robot dynamics. The proposed algorithm additionally incorporates different methods to avoid the local minima problems associated with using potential field functions in planning. The motion planner does not enforce predefined trajectories or any formation geometry on the robots and is a comprehensive solution for cooperative obstacle avoidance in the context of multi-robot target tracking. We perform simulation studies in different environmental scenarios to showcase the convergence and efficacy of the proposed algorithm. Video of simulation studies: https://youtu.be/umkdm82Tt0M

    Force-imitated particle swarm optimization using the near-neighbor effect for locating multiple optima

    Multimodal optimization problems pose a great challenge of locating multiple optima simultaneously in the search space to the particle swarm optimization (PSO) community. In this paper, the motion principle of particles in PSO is extended by using the near-neighbor effect in mechanical theory, which is a universal phenomenon in nature and society. In the proposed near-neighbor effect based force-imitated PSO (NN-FPSO) algorithm, each particle explores the promising regions where it resides under the composite forces produced by the “near-neighbor attractor” and “near-neighbor repeller”, which are selected from the set of memorized personal best positions and the current swarm based on the principles of “superior-and-nearer” and “inferior-and-nearer”, respectively. These two forces pull and push a particle to search for the nearby optimum. Hence, particles can simultaneously locate multiple optima quickly and precisely. Experiments are carried out to investigate the performance of NN-FPSO in comparison with a number of state-of-the-art PSO algorithms for locating multiple optima over a series of multimodal benchmark test functions. The experimental results indicate that the proposed NN-FPSO algorithm can efficiently locate multiple optima in multimodal fitness landscapes. This work was supported in part by the Key Program of National Natural Science Foundation (NNSF) of China under Grant 70931001, Grant 70771021, and Grant 70721001, the National Natural Science Foundation (NNSF) of China for Youth under Grant 61004121, Grant 70771021, the Science Fund for Creative Research Group of NNSF of China under Grant 60821063, the PhD Programs Foundation of Ministry of Education of China under Grant 200801450008, and in part by the Engineering and Physical Sciences Research Council (EPSRC) of UK under Grant EP/E060722/1 and Grant EP/E060722/2.

    A fuzzified systematic adjustment of the robotic Darwinian PSO

    The Darwinian Particle Swarm Optimization (DPSO) is an evolutionary algorithm that extends the Particle Swarm Optimization using natural selection to enhance the ability to escape from sub-optimal solutions. An extension of the DPSO to multi-robot applications has been recently proposed and denoted as Robotic Darwinian PSO (RDPSO), benefiting from the dynamical partitioning of the whole population of robots, hence decreasing the amount of required information exchange among robots. This paper further extends the previously proposed algorithm by adapting the behavior of robots based on a set of context-based evaluation metrics. Those metrics are then used as inputs of a fuzzy system so as to systematically adjust the RDPSO parameters (i.e., outputs of the fuzzy system), thus improving its convergence rate, susceptibility to obstacles and communication constraints. The adapted RDPSO is evaluated in groups of physical robots and further explored using larger populations of simulated mobile robots within a larger scenario.

    Optimal Control of an Uninhabited Loyal Wingman

    As researchers strive to achieve autonomy in systems, many believe the goal is not that machines should attain full autonomy, but rather to obtain the right level of autonomy for an appropriate man-machine interaction. A common phrase for this interaction is manned-unmanned teaming (MUM-T), a subset of which, for unmanned aerial vehicles, is the concept of the loyal wingman. This work demonstrates the use of optimal control and stochastic estimation techniques as an autonomous near real-time dynamic route planner for the DoD concept of the loyal wingman. First, the optimal control problem is formulated for a static threat environment and a hybrid numerical method is demonstrated. The optimal control problem is transcribed to a nonlinear program using direct orthogonal collocation, and a heuristic particle swarm optimization algorithm is used to supply an initial guess to the gradient-based nonlinear programming solver. Next, a dynamic and measurement update model and a Kalman filter estimating tool are used to solve the loyal wingman optimal control problem in the presence of moving, stochastic threats. Finally, an algorithm is written to determine if and when the loyal wingman should dynamically re-plan the trajectory based on a critical distance metric which uses the speed and stochastics of the moving threat as well as the relative distance and angle of approach of the loyal wingman to the threat. These techniques are demonstrated through simulation for computing the global outer-loop optimal path for a minimum time rendezvous with a manned lead while avoiding static as well as moving, non-deterministic threats, then updating the global outer-loop optimal path based on changes in the threat mission environment. Results demonstrate a methodology for rapidly computing an optimal solution to the loyal wingman optimal control problem.